shuoxing/qwen3-4b-full-pretrain-mix-low-tweet-1m-en-no-packing-new • Text Generation • 196k • Updated 6 days ago • 31
shuoxing/qwen3-4b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new • Text Generation • 196k • Updated 6 days ago • 34
shuoxing/qwen3-4b-full-pretrain-mix-high-tweet-1m-en-no-packing-new • Text Generation • 196k • Updated 6 days ago • 28
shuoxing/qwen3-4b-full-pretrain-control-tweet-1m-en-no-packing-new • Text Generation • 196k • Updated 6 days ago • 26
huseyinatahaninan/appworld_distillation_sft-SFT-Qwen3-4B-Instruct-2507 • Text Generation • 4B • Updated 1 day ago • 18