LLaDA2.0: Scaling Up Diffusion Language Models to 100B Paper • 2512.15745 • Published Dec 10, 2025 • 78
Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story Paper • 2511.15210 • Published Nov 19, 2025 • 89
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published Nov 17, 2025 • 136
Enhancing Vision-Language Model Training with Reinforcement Learning in Synthetic Worlds for Real-World Success Paper • 2508.04280 • Published Aug 6, 2025 • 35
Vadim21221/sae__mount_path_qwen2.5-7b_resid_post_layer_14_size_163840_mul_fractal_topk Updated Jul 29, 2025
Vadim21221/sae_Qwen_Qwen2.5-1.5B_resid_post_layer_14_size_131072_mul_fractal_jumprelu Updated Jul 29, 2025
Vadim21221/sae_Qwen_Qwen2.5-1.5B_resid_post_layer_14_size_65536_mul_fractal_jumprelu Updated Jul 29, 2025