Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2508.08189

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Paper • 2406.04151 • Published Jun 6, 2024 • 24
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

Paper • 2510.16872 • Published Oct 19 • 104
Scaling Generalist Data-Analytic Agents

Paper • 2509.25084 • Published Sep 29 • 18
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117

Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal

Paper • 2508.05988 • Published Aug 8 • 19
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10 • 98
Compressing Chain-of-Thought in LLMs via Step Entropy

Paper • 2508.03346 • Published Aug 5 • 7
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29

Papers que podrían ser interesantes

Hidden Dynamics of Massive Activations in Transformer Training

Paper • 2508.03616 • Published Aug 5 • 18
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29

Vision Reasoning

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

Paper • 2505.15966 • Published May 21 • 53
GRIT: Teaching MLLMs to Think with Images

Paper • 2505.15879 • Published May 21 • 12
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models

Paper • 2505.16854 • Published May 22 • 11
VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought

Paper • 2505.16192 • Published May 22 • 12

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

A Survey of Direct Preference Optimization

Paper • 2503.11701 • Published Mar 12
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29
A Technical Survey of Reinforcement Learning Techniques for Large Language Models

Paper • 2507.04136 • Published Jul 5
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 189

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published Aug 30 • 17
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8 • 31

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

about 11 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 526 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

Reinforcement learning

about 17 hours ago

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 103
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 75

AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Paper • 2406.04151 • Published Jun 6, 2024 • 24
DeepAnalyze: Agentic Large Language Models for Autonomous Data Science

Paper • 2510.16872 • Published Oct 19 • 104
Scaling Generalist Data-Analytic Agents

Paper • 2509.25084 • Published Sep 29 • 18
Scaling Agents via Continual Pre-training

Paper • 2509.13310 • Published Sep 16 • 117

A Survey of Direct Preference Optimization

Paper • 2503.11701 • Published Mar 12
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29
A Technical Survey of Reinforcement Learning Techniques for Large Language Models

Paper • 2507.04136 • Published Jul 5
A Survey of Reinforcement Learning for Large Reasoning Models

Paper • 2509.08827 • Published Sep 10 • 189

Pruning the Unsurprising: Efficient Code Reasoning via First-Token Surprisal

Paper • 2508.05988 • Published Aug 8 • 19
A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

Paper • 2508.07407 • Published Aug 10 • 98
Compressing Chain-of-Thought in LLMs via Step Entropy

Paper • 2508.03346 • Published Aug 5 • 7
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29
Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels

Paper • 2508.17437 • Published Aug 20 • 37
Mixture of Global and Local Experts with Diffusion Transformer for Controllable Face Generation

Paper • 2509.00428 • Published Aug 30 • 17
F1: A Vision-Language-Action Model Bridging Understanding and Generation to Actions

Paper • 2509.06951 • Published Sep 8 • 31

Papers que podrían ser interesantes

Hidden Dynamics of Massive Activations in Transformer Training

Paper • 2508.03616 • Published Aug 5 • 18
Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29

Image Generation

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Paper • 2506.07977 • Published Jun 9 • 41
Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Paper • 2506.07986 • Published Jun 9 • 19
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Paper • 2506.06276 • Published Jun 6 • 26
Aligning Latent Spaces with Flow Priors

Paper • 2506.05240 • Published Jun 5 • 27

Vision Reasoning

Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning

Paper • 2505.15966 • Published May 21 • 53
GRIT: Teaching MLLMs to Think with Images

Paper • 2505.15879 • Published May 21 • 12
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models

Paper • 2505.16854 • Published May 22 • 11
VLM-R^3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought

Paper • 2505.16192 • Published May 22 • 12

about 11 hours ago

lusxvr/nanoVLM-222M

Image-Text-to-Text • 0.2B • Updated May 8 • 526 • 98
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Paper • 2503.09516 • Published Mar 12 • 36
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time

Paper • 2505.24863 • Published May 30 • 97
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning

Paper • 2505.17667 • Published May 23 • 88

RL+reason model

RL + Transformer = A General-Purpose Problem Solver

Paper • 2501.14176 • Published Jan 24 • 28
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 123
MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Paper • 2412.12098 • Published Dec 16, 2024 • 4

Reinforcement learning

about 17 hours ago

Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning

Paper • 2407.20798 • Published Jul 30, 2024 • 24
Offline Reinforcement Learning for LLM Multi-Step Reasoning

Paper • 2412.16145 • Published Dec 20, 2024 • 38
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 103
SWE-RL: Advancing LLM Reasoning via Reinforcement Learning on Open Software Evolution

Paper • 2502.18449 • Published Feb 25 • 75

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs