---
base_model:
- stabilityai/stable-video-diffusion-img2vid-xt-1-1
pipeline_tag: image-to-video
datasets:
- TaiMingLu/Genex-DB-World-Exploration
license: cc-by-4.0
---

# GenEx-World-Explorer

**GenEx World Explorer** is a video generation pipeline built on top of [Stable Video Diffusion (SVD)](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt-1-1). It takes a keyframe and generates a temporally consistent video. This explorer version extends SVD with a custom `UNetSpatioTemporalConditionModel`.

Given a panoramic input image, the diffusion model generates a forward-moving path through the scene, so the scene can be explored from that starting view.

## Usage

```python
from diffusers import UNetSpatioTemporalConditionModel, StableVideoDiffusionPipeline
import torch
from PIL import Image

model_id = 'genex-world/GenEx-World-Explorer'

# Load the custom UNet
unet = UNetSpatioTemporalConditionModel.from_pretrained(
    model_id,
    subfolder='unet',
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
)

# Load the full pipeline with the custom UNet
pipe = StableVideoDiffusionPipeline.from_pretrained(
    model_id,
    unet=unet,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float16,
    # Requires the repository to already be in the local cache;
    # drop this flag to download the files on first use.
    local_files_only=True,
).to('cuda')

# Explore the world!
# Resize the keyframe to the resolution passed to the pipeline below.
image = Image.open('example.png').resize((1024, 576), Image.BICUBIC).convert('RGB')

# Seed the RNG so repeated runs produce the same video.
generator = torch.manual_seed(-1)
with torch.inference_mode():
    frames = pipe(
        image,
        num_frames=25,
        width=1024,
        height=576,
        decode_chunk_size=8,
        generator=generator,
        motion_bucket_id=127,
        fps=7,
        num_inference_steps=30,
        noise_aug_strength=0.02,
    ).frames[0]
```
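
To inspect the result, the generated frames can be written to disk. The sketch below assumes `frames` is a list of PIL images, which is the pipeline's default output type. `save_frames_as_gif` is a hypothetical helper introduced here for illustration; it is not part of `diffusers` or of this repository, and it relies only on Pillow's `Image.save` options for animated GIFs.

```python
def save_frames_as_gif(frames, path, fps=7):
    """Write a sequence of PIL images to an animated GIF.

    Returns the per-frame duration in milliseconds that was requested.
    GIF timing is coarse, so playback speed only approximates `fps`.
    """
    if not frames:
        raise ValueError("no frames to save")
    if fps <= 0:
        raise ValueError("fps must be positive")

    # Convert frames per second to a per-frame display time.
    duration_ms = round(1000 / fps)

    # Pillow writes an animation when the first frame is saved with
    # save_all=True and the remaining frames are passed as append_images.
    first, rest = frames[0], list(frames[1:])
    first.save(
        path,
        save_all=True,
        append_images=rest,
        duration=duration_ms,
        loop=0,  # 0 means loop forever
    )
    return duration_ms
```

With the usage snippet's output, a call such as `save_frames_as_gif(frames, 'exploration.gif', fps=7)` would write the 25 frames as a clip of roughly 3.6 seconds.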

## Requirements

```
diffusers>=0.33.1
transformers
torch
numpy
pillow
```
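
Before loading the model, it can help to confirm that the installed packages satisfy the list above. The following is a minimal sketch using only the standard library. The helper names (`parse_version`, `meets_minimum`, `check_requirements`) are hypothetical, and the version parsing is deliberately simple: it compares only the leading numeric components and ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.33.1' into (0, 33, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.33' and '0.33.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_requirements(minimums):
    """Map each missing or outdated package name to a short problem description.

    `minimums` maps a package name to a minimum version string, or to None
    when any installed version is acceptable.
    """
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if minimum and not meets_minimum(installed, minimum):
            problems[name] = f"found {installed}, need >= {minimum}"
    return problems
```

For example, `check_requirements({'diffusers': '0.33.1', 'transformers': None, 'numpy': None, 'pillow': None})` returns an empty dictionary when everything is in place.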

## BibTeX

```
@misc{lu2025genexgeneratingexplorableworld,
      title={GenEx: Generating an Explorable World},
      author={Taiming Lu and Tianmin Shu and Junfei Xiao and Luoxin Ye and Jiahao Wang and Cheng Peng and Chen Wei and Daniel Khashabi and Rama Chellappa and Alan Yuille and Jieneng Chen},
      year={2025},
      eprint={2412.09624},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2412.09624},
}
```