DINO-Foresight: Looking into the Future with DINO (NeurIPS 2025)
DINO-Foresight is a novel framework that operates in the semantic feature space of pretrained Vision Foundation Models (VFMs) to predict future dynamics. It trains a masked feature transformer in a self-supervised manner to forecast the evolution of VFM features over time, enabling various scene understanding tasks through off-the-shelf, task-specific heads.
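To make the idea of masked feature forecasting concrete, here is a toy sketch in PyTorch. It is not the repository's architecture: the class `TinyFeatureForecaster`, the function `forecasting_loss`, all layer sizes, the Smooth L1 loss, and the random tensor standing in for frozen VFM features are assumptions chosen for illustration only. The sketch hides every token of the last frame and trains a small transformer to regress the hidden features from the visible context frames.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyFeatureForecaster(nn.Module):
    """Hypothetical stand-in for a masked feature transformer (illustration only)."""

    def __init__(self, feat_dim=32, num_frames=5, tokens_per_frame=16, width=64):
        super().__init__()
        self.proj_in = nn.Linear(feat_dim, width)
        # Learned embedding that replaces every hidden token.
        self.mask_token = nn.Parameter(torch.zeros(width))
        # One learned position embedding per (frame, token) slot.
        self.pos = nn.Parameter(torch.zeros(num_frames * tokens_per_frame, width))
        layer = nn.TransformerEncoderLayer(
            d_model=width, nhead=4, dim_feedforward=128, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.proj_out = nn.Linear(width, feat_dim)

    def forward(self, feats, mask):
        # feats: (B, T, N, C) feature tokens; mask: (B, T, N) bool, True = hidden.
        b, t, n, c = feats.shape
        x = self.proj_in(feats).reshape(b, t * n, -1)
        hidden = mask.reshape(b, t * n, 1)
        x = torch.where(hidden, self.mask_token, x)  # swap in the mask token
        x = self.encoder(x + self.pos)
        return self.proj_out(x).reshape(b, t, n, c)


def forecasting_loss(pred, target, mask):
    # Regress only the hidden (future) tokens; the loss choice is an assumption.
    return F.smooth_l1_loss(pred[mask], target[mask])


torch.manual_seed(0)
model = TinyFeatureForecaster()
feats = torch.randn(2, 5, 16, 32)  # placeholder for features of a frozen VFM
mask = torch.zeros(2, 5, 16, dtype=torch.bool)
mask[:, -1] = True  # hide the last frame, i.e. the one to forecast

pred = model(feats, mask)
loss = forecasting_loss(pred, feats, mask)
loss.backward()  # gradients reach the transformer; the feature extractor is not involved
print(pred.shape, float(loss))
```

In a real pipeline the predicted features for the future frame would then be passed to a task-specific head, as described above.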
Paper
DINO-Foresight: Looking into the Future with DINO
Project Page
https://dino-foresight.github.io
Code
The official implementation can be found on GitHub: https://github.com/Sta8is/DINO-Foresight
Sample Usage
The model is built with PyTorch. You can set up the environment and install dependencies as follows:
conda create -n dinof python=3.11
conda activate dinof
pip install torch==2.2.0 torchvision==0.17.0 torchaudio==2.2.0 --index-url https://download.pytorch.org/whl/cu121
git clone https://github.com/Sta8is/DINO-Foresight
cd DINO-Foresight
pip install -r requirements.txt
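After installation, a quick check that the pinned PyTorch packages are the ones actually present can save debugging time later. The snippet below is a minimal sketch using only the Python standard library; the helper name `check_versions` and the `EXPECTED` table are illustrative and not part of the repository.

```python
from importlib import metadata

# Versions pinned by the install commands above.
EXPECTED = {"torch": "2.2.0", "torchvision": "0.17.0", "torchaudio": "2.2.0"}


def check_versions(expected, get_version=metadata.version):
    """Return {package: (status, installed_version_or_None)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            report[name] = ("missing", None)
            continue
        # CUDA wheels carry a local suffix such as "+cu121"; compare the public part only.
        public = found.split("+", 1)[0]
        report[name] = ("ok" if public == wanted else "mismatch", found)
    return report


if __name__ == "__main__":
    for name, (status, found) in check_versions(EXPECTED).items():
        print(f"{name}: {status} (installed: {found})")
```

Run it inside the activated `dinof` environment; any `missing` or `mismatch` entry suggests the `pip install` step should be repeated.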
For detailed usage examples and model training, refer to the provided demos and the GitHub repository.
Demo
We provide 2 quick demos.
- Demo.
Citation
If you found DINO-Foresight useful in your research, please consider starring ⭐ us on GitHub and citing us in your research!
@inproceedings{karypidis2025dinoforesight,
title={{DINO}-Foresight: Looking into the Future with {DINO}},
author={Efstathios Karypidis and Ioannis Kakogeorgiou and Spyros Gidaris and Nikos Komodakis},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems},
year={2025},
url={https://arxiv.org/abs/2412.11673}
}
Acknowledgements
Our code is partially based on Maskgit-pytorch, a PyTorch implementation of MaskGIT by ValeoAI. We also thank the authors of DINOv2, DPT, DepthAnythingV2, and LOTUS for their work and open-source code.
