Official implementation of "Using Diffusion Priors for Video Amodal Segmentation" (CVPR 2025)
Kaihua Chen, Deva Ramanan, Tarasha Khurana
- Release the checkpoint and inference code
- Release evaluation code for SAIL-VOS and TAO-Amodal
- Release fine-tuning code for Diffusion-VAS
git clone https://github.com/Kaihua-Chen/diffusion-vas
cd diffusion-vas
conda create --name diffusion_vas python=3.10
conda activate diffusion_vas
pip install -r requirements.txt
We provide our Diffusion-VAS checkpoints, fine-tuned on SAIL-VOS, on Hugging Face. To download them, run:
mkdir checkpoints
cd checkpoints
git lfs install
git clone https://huggingface.co/kaihuac/diffusion-vas-amodal-segmentation
git clone https://huggingface.co/kaihuac/diffusion-vas-content-completion
cd ..
Note: Ignore any Windows-related warnings when downloading.
For Depth Anything V2's checkpoints, download the Pre-trained Models (e.g., Depth-Anything-V2-Large) from this link and place them inside the checkpoints/ folder.
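After these downloads, the checkpoints/ folder should look roughly like the sketch below. The exact Depth Anything V2 filename depends on which pre-trained model you pick; depth_anything_v2_vitl.pth is shown here as an assumption for the Large variant.
checkpoints/
├── diffusion-vas-amodal-segmentation/
├── diffusion-vas-content-completion/
└── depth_anything_v2_vitl.pth   # assumed filename for Depth-Anything-V2-Large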
To run inference, simply execute:
python demo.py
This will run inference on the birdcage example from demo_data/.
To try different examples, modify the seq_name argument:
python demo.py --seq_name <your_sequence_name>
You can also change the checkpoint path, data output paths, and other parameters as needed.
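Assuming demo.py exposes its options through a standard argparse interface (an assumption, not confirmed here), you can list the available parameters with:
python demo.py --help   # assumes demo.py uses argparse; otherwise inspect the script for its arguments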
Start with a video, use SAM2's web demo or its codebase to segment the target object, and extract frames, preferably at 8 FPS. Before running inference, make sure the output follows the same directory structure as the examples in demo_data/ (a frame-extraction sketch is given below).
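As a minimal sketch of the frame-extraction step, assuming your source video is input.mp4 and your_sequence is a placeholder sequence name (the exact subfolder layout, frame naming, and extension should mirror the birdcage example in demo_data/):
mkdir -p demo_data/your_sequence   # placeholder path; match the demo_data/ layout
ffmpeg -i input.mp4 -vf fps=8 demo_data/your_sequence/%05d.png   # extract frames at 8 FPS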
If you find this work helpful, please consider citing our paper:
@inproceedings{chen2025diffvas,
title={Using Diffusion Priors for Video Amodal Segmentation},
author={Kaihua Chen and Deva Ramanan and Tarasha Khurana},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2025}
}