Zekun Qi

qizekun

https://qizekun.github.io/

AI & ML interests

Embodied Intelligence, Large Language Models, 3D Computer Vision

Recent Activity

liked a Space about 1 month ago

yyfz233/Pi3

📈

Scalable Permutation-Equivariant Visual Geometry Learning

authored a paper about 2 months ago

Reasoning in Space via Grounding in the World

Paper • 2510.13800 • Published Oct 15 • 14

liked a model about 2 months ago

Qwen/Qwen3-VL-4B-Instruct

Image-Text-to-Text • 4B • Updated Oct 15 • 824k • 259

upvoted a collection about 2 months ago

GS-Reasoner

Collection

Collections of paper "Reasoning in Space via Grounding in the World" • 6 items • Updated Oct 20 • 2

upvoted a paper about 2 months ago

Reasoning in Space via Grounding in the World

Paper • 2510.13800 • Published Oct 15 • 14

commented a paper about 2 months ago

Reasoning in Space via Grounding in the World

Paper • 2510.13800 • Published Oct 15 • 14

updated a dataset 3 months ago

qizekun/OmniSpatial

Preview • Updated Sep 23 • 273 • 15

upvoted 2 papers 4 months ago

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Paper • 2508.18265 • Published Aug 25 • 208

ODYSSEY: Open-World Quadrupeds Exploration and Manipulation for Long-Horizon Tasks

Paper • 2508.08240 • Published Aug 11 • 45

liked 2 Spaces 4 months ago

Florence 2

📉

811

Generate captions and analyze images with various tasks

OmniPart

📚

Generate 3D models from 2D images with mask control

upvoted a paper 4 months ago

NextStep-1: Toward Autoregressive Image Generation with Continuous Tokens at Scale

Paper • 2508.10711 • Published Aug 14 • 144

liked a model 5 months ago

allenai/GraspMolmo

Robotics • 8B • Updated Jun 7 • 282 • 8

upvoted 2 papers 5 months ago

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Paper • 2507.13344 • Published Jul 17 • 57

Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

Paper • 2507.05255 • Published Jul 7 • 74

authored a paper 5 months ago

DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

Paper • 2507.04447 • Published Jul 6 • 44

upvoted a paper 5 months ago

DreamVLA: A Vision-Language-Action Model Dreamed with Comprehensive World Knowledge

Paper • 2507.04447 • Published Jul 6 • 44

liked 3 Spaces 5 months ago

Better Florence 2

🔥

198

Analyze images to detect objects, generate captions, or perform OCR

FLUX Prompt Generator

😻

1.35k

Launch a customizable user interface

Stable Diffusion 3.5 Large

🏃

1.96k

Generate images with SD3.5
