Mwangi PRO

Benson

AI & ML interests

None yet

Recent Activity

upvoted a paper about 7 hours ago

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

liked a dataset about 9 hours ago

sarulab-speech/yodas2_sidon

upvoted a paper 1 day ago

PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing

Organizations

None yet

upvoted a paper about 7 hours ago

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published 5 days ago • 151

upvoted a paper 1 day ago

PaperDebugger: A Plugin-Based Multi-Agent System for In-Editor Academic Writing, Review, and Editing

Paper • 2512.02589 • Published 7 days ago • 46

upvoted a paper 6 days ago

TUNA: Taming Unified Visual Representations for Native Unified Multimodal Models

Paper • 2512.02014 • Published 7 days ago • 57

upvoted a paper 14 days ago

Phi-4-reasoning Technical Report

Paper • 2504.21318 • Published Apr 30 • 53

upvoted a paper 17 days ago

VIDEOP2R: Video Understanding from Perception to Reasoning

Paper • 2511.11113 • Published 25 days ago • 111

upvoted a paper 18 days ago

ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries

Paper • 2511.14349 • Published 21 days ago • 16

upvoted 4 papers about 1 month ago

Thinking with Video: Video Generation as a Promising Multimodal Reasoning Paradigm

Paper • 2511.04570 • Published Nov 6 • 208

ROVER: Benchmarking Reciprocal Cross-Modal Reasoning for Omnimodal Generation

Paper • 2511.01163 • Published Nov 3 • 31

World Simulation with Video Foundation Models for Physical AI

Paper • 2511.00062 • Published Oct 28 • 40

Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents

Paper • 2510.23691 • Published Oct 27 • 52

upvoted 2 collections about 1 month ago

Gauss Gym Datasets

Collection

Datasets used for the gauss gym photorealistic simulator • 4 items • Updated Oct 17 • 8

Qwen3-Omni

Collection

6 items • Updated Oct 9 • 168

upvoted an article about 2 months ago

Article

VR Forklift Simulation Data for RLHF - Skills Model and Indicators

Oct 2

upvoted a paper about 2 months ago

FlashWorld: High-quality 3D Scene Generation within Seconds

Paper • 2510.13678 • Published Oct 15 • 71

upvoted an article about 2 months ago

Article

Introduction to MedVideoCap-55K: A New, Large-Scale, High-Quality Medical Video-Caption Pair Dataset

Jun 25

upvoted a paper about 2 months ago

CommonForms: A Large, Diverse Dataset for Form Field Detection

Paper • 2509.16506 • Published Sep 20 • 19

upvoted a collection about 2 months ago

EgoLife

Collection

CVPR 2025 - EgoLife: Towards Egocentric Life Assistant. Homepage: https://egolife-ai.github.io/ • 10 items • Updated Mar 7 • 20

upvoted 2 papers 2 months ago

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 81

Lyra: Generative 3D Scene Reconstruction via Video Diffusion Model Self-Distillation

Paper • 2509.19296 • Published Sep 23 • 23

upvoted a paper 3 months ago

Statistical Methods in Generative AI

Paper • 2509.07054 • Published Sep 8 • 11
