TGPO: Temporal Grounded Policy Optimization for Signal Temporal Logic Tasks Paper • 2510.00225 • Published Sep 30, 2025 • 2
X-CoT: Explainable Text-to-Video Retrieval via LLM-based Chain-of-Thought Reasoning Paper • 2509.21559 • Published Sep 25, 2025 • 2
Lavida-O: Elastic Large Masked Diffusion Models for Unified Multimodal Understanding and Generation Paper • 2509.19244 • Published Sep 23, 2025 • 11
R&B: Domain Regrouping and Data Mixture Balancing for Efficient Foundation Model Training Paper • 2505.00358 • Published May 1, 2025 • 26
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision Paper • 2505.13427 • Published May 19, 2025 • 26
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think Paper • 2505.10185 • Published May 15, 2025 • 26
AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection Paper • 2505.07293 • Published May 12, 2025 • 28
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20, 2025 • 62