- AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
  Paper • 2406.04151 • Published • 24
- DeepAnalyze: Agentic Large Language Models for Autonomous Data Science
  Paper • 2510.16872 • Published • 104
- Scaling Generalist Data-Analytic Agents
  Paper • 2509.25084 • Published • 18
- Scaling Agents via Continual Pre-training
  Paper • 2509.13310 • Published • 117
Collections
Collections including paper arxiv:2508.08189
- Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal
  Paper • 2508.05988 • Published • 19
- A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
  Paper • 2508.07407 • Published • 98
- Compressing Chain-of-Thought in LLMs via Step Entropy
  Paper • 2508.03346 • Published • 7
- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 29

- Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning
  Paper • 2505.15966 • Published • 53
- GRIT: Teaching MLLMs to Think with Images
  Paper • 2505.15879 • Published • 12
- Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models
  Paper • 2505.16854 • Published • 11
- VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
  Paper • 2505.16192 • Published • 12

- RL + Transformer = A General-Purpose Problem Solver
  Paper • 2501.14176 • Published • 28
- Towards General-Purpose Model-Free Reinforcement Learning
  Paper • 2501.16142 • Published • 30
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training
  Paper • 2501.17161 • Published • 123
- MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization
  Paper • 2412.12098 • Published • 4

- A Survey of Direct Preference Optimization
  Paper • 2503.11701 • Published
- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 29
- A Technical Survey of Reinforcement Learning Techniques for Large Language Models
  Paper • 2507.04136 • Published
- A Survey of Reinforcement Learning for Large Reasoning Models
  Paper • 2509.08827 • Published • 189

- Reinforcement Learning in Vision: A Survey
  Paper • 2508.08189 • Published • 29
- Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
  Paper • 2508.17437 • Published • 37
- Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation
  Paper • 2509.00428 • Published • 17
- F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions
  Paper • 2509.06951 • Published • 31

- OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation
  Paper • 2506.07977 • Published • 41
- Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers
  Paper • 2506.07986 • Published • 19
- STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis
  Paper • 2506.06276 • Published • 26
- Aligning Latent Spaces with Flow Priors
  Paper • 2506.05240 • Published • 27

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 526 • 98
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 36
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 88

- Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
  Paper • 2407.20798 • Published • 24
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
- REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
  Paper • 2501.03262 • Published • 103
- SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution
  Paper • 2502.18449 • Published • 75