Multimodal - a Sladwell Collection

updated Sep 25, 2025

Visual Representation Alignment for Multimodal Large Language Models

Paper • 2509.07979 • Published Sep 9, 2025 • 83

LatticeWorld: A Multimodal Large Language Model-Empowered Framework for Interactive Complex World Generation

Paper • 2509.05263 • Published Sep 5, 2025 • 10

Symbolic Graphics Programming with Large Language Models

Paper • 2509.05208 • Published Sep 5, 2025 • 46

OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15, 2025 • 105

Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge

Paper • 2509.06079 • Published Sep 7, 2025 • 6

Lost in Embeddings: Information Loss in Vision-Language Models

Paper • 2509.11986 • Published Sep 15, 2025 • 28

PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits

Paper • 2509.11362 • Published Sep 14, 2025 • 4

UI-S1: Advancing GUI Automation via Semi-online Reinforcement Learning

Paper • 2509.11543 • Published Sep 15, 2025 • 47

MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook

Paper • 2509.14142 • Published Sep 17, 2025 • 10

Qwen3-Omni Technical Report

Paper • 2509.17765 • Published Sep 22, 2025 • 143