In a Training Loop 🔄

Arthur EDMOND

Shumatsurontek

AI & ML interests

LLM & Computer Vision

Recent Activity

liked a model about 19 hours ago

Qwen/Qwen3-4B-Instruct-2507

updated a model 1 day ago

Tiime/camembert-pcg-real-transactions

upvoted an article 7 days ago

Welcome EmbeddingGemma, Google's new efficient embedding model

Article • Sep 4 • 264

upvoted a paper 8 days ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published 17 days ago • 252

upvoted a paper 22 days ago

MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

Paper • 2511.11793 • Published 26 days ago • 159

upvoted a paper 29 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published Nov 9 • 129

upvoted 2 papers about 1 month ago

Every Activation Boosted: Scaling General Reasoner to 1 Trillion Open Language Foundation

Paper • 2510.22115 • Published Oct 25 • 83

DeepAgent: A General Reasoning Agent with Scalable Toolsets

Paper • 2510.21618 • Published Oct 24 • 99

upvoted 2 papers 2 months ago

Paper2Video: Automatic Video Generation from Scientific Papers

Paper • 2510.05096 • Published Oct 6 • 117

Apriel-1.5-15b-Thinker

Paper • 2510.01141 • Published Oct 1 • 117

upvoted a collection 2 months ago

Granite 4.0 Language Models

Collection • 13 items • Updated 23 days ago • 195

upvoted 3 papers 2 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29 • 140

TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning

Paper • 2509.25760 • Published Sep 30 • 55

EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning

Paper • 2509.22576 • Published Sep 26 • 134

upvoted an article 3 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • Jul 8 • 735

upvoted 3 papers 3 months ago

Why Language Models Hallucinate

Paper • 2509.04664 • Published Sep 4 • 193

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31 • 84

R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

Paper • 2508.21113 • Published Aug 28 • 110

upvoted a paper 4 months ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 256