WebMMU: A Benchmark for Multimodal Multilingual Website Understanding and Code Generation Paper • 2508.16763 • Published Aug 22 • 2
BigCharts-R1: Enhanced Chart Reasoning with Visual Reinforcement Finetuning Paper • 2508.09804 • Published Aug 13
Scope: Selective Cross-modal Orchestration of Visual Perception Experts Paper • 2510.12974 • Published Oct 14
Grounding Computer Use Agents on Human Demonstrations Paper • 2511.07332 • Published 29 days ago • 104
Symbolic Graphics Programming with Large Language Models Paper • 2509.05208 • Published Sep 5 • 46 • 7
InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation Paper • 2407.06423 • Published Jul 8, 2024
UI-Vision: A Desktop-centric GUI Benchmark for Visual Perception and Interaction Paper • 2503.15661 • Published Mar 19 • 2
StarFlow: Generating Structured Workflow Outputs From Sketch Images Paper • 2503.21889 • Published Mar 27 • 2
Rendering-Aware Reinforcement Learning for Vector Graphics Generation Paper • 2505.20793 • Published May 27 • 12 • 3
Distilling semantically aware orders for autoregressive image generation Paper • 2504.17069 • Published Apr 23 • 7 • 2