AlekseyKorshuk/twscrape-prepared-trl-sft-qwen-7b-grpo-1epochs Text Generation • 8B • Updated Mar 10 • 11
zijianh/DeepSeek-R1-Distill-Qwen-7B-RL-length-penalty-low-new Text Generation • 8B • Updated Mar 21 • 7
secmlr/VD-DS-DSFormat-Clean-8k_VD-DS-DSFormat-Clean-16k_DeepSeek-R1-Distill-Qwen-7B_full_sft_1e-5 Text Generation • 8B • Updated Mar 31 • 8