MultiBanana: A Challenging Benchmark for Multi-Reference Text-to-Image Generation • Paper • 2511.22989 • Published 12 days ago • 15
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts • Paper • 2402.09727 • Published Feb 15, 2024 • 38
Multimodal Web Navigation with Instruction-Finetuned Foundation Models • Paper • 2305.11854 • Published May 19, 2023 • 5
A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis • Paper • 2307.12856 • Published Jul 24, 2023 • 36