Devstral 2 Collection A couple of agentic LLMs for software engineering tasks, excelling at using tools to explore codebases, edit multiple files, and power SWE Agents. • 3 items • Updated about 16 hours ago • 20
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 8 days ago • 118
view article Article Transformers v5: Simple model definitions powering the AI ecosystem +2 9 days ago • 232
Uni-MoE-2.0-Omni: Scaling Language-Centric Omnimodal Large Model with Advanced MoE, Training and Data Paper • 2511.12609 • Published 24 days ago • 102
VIDEOP2R: Video Understanding from Perception to Reasoning Paper • 2511.11113 • Published 26 days ago • 111
Souper-Model: How Simple Arithmetic Unlocks State-of-the-Art LLM Performance Paper • 2511.13254 • Published 23 days ago • 134
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published 23 days ago • 132
MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling Paper • 2511.11793 • Published 26 days ago • 159
Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation Paper • 2511.14993 • Published 21 days ago • 222
Olmo 3 Post-training Collection All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated about 8 hours ago • 39
view article Article Building for an Open Future - our new partnership with Google Cloud 27 days ago • 46
Kimi-K2 Collection Moonshot's MoE LLMs with 1 trillion parameters, exceptional on agentic intellegence • 5 items • Updated 26 days ago • 159