kureha295/deepseek-ai-DeepSeek-R1-Distill-Llama-8B-ortho-baseline-layer-11 8B • Updated about 1 month ago • 5
kureha295/deepseek-ai-DeepSeek-R1-Distill-Qwen-7B-ortho-baseline-layer-17 8B • Updated about 1 month ago • 1
Bochkov/growing-transformers-model-frozen-16-bit-baseline-monolyth-181m Text Generation • Updated 21 days ago • 23
Bochkov/growing-transformers-model-frozen-unicode-baseline-monolyth-247m Text Generation • Updated 21 days ago • 17
Bochkov/growing-transformers-model-unfrozen-baseline-monolyth-247m Text Generation • Updated 21 days ago • 14