Using Diffusion Priors for Video Amodal Segmentation

CVPR 2025

Official implementation of Using Diffusion Priors for Video Amodal Segmentation

Kaihua Chen, Deva Ramanan, Tarasha Khurana


Paper | Project Page

TODO 🤓

  • Release the checkpoint and inference code
  • Release evaluation code for SAIL-VOS and TAO-Amodal
  • Release fine-tuning code for Diffusion-VAS

Getting Started

Installation

1. Clone the repository

git clone https://github.com/Kaihua-Chen/diffusion-vas
cd diffusion-vas

2. Create a conda environment and install dependencies

conda create --name diffusion_vas python=3.10
conda activate diffusion_vas
pip install -r requirements.txt
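
To confirm that PyTorch was installed with working GPU support before downloading the large checkpoints (a quick sanity check we suggest, not part of the official steps), run:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

This should print the installed PyTorch version and True on a CUDA-capable machine.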

Download Checkpoints

We provide our Diffusion-VAS checkpoints, fine-tuned on SAIL-VOS, on Hugging Face. To download them, run:

mkdir checkpoints
cd checkpoints
git lfs install
git clone https://huggingface.co/kaihuac/diffusion-vas-amodal-segmentation
git clone https://huggingface.co/kaihuac/diffusion-vas-content-completion
cd ..

Note: Ignore any Windows-related warnings when downloading.
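
If git-lfs is unavailable on your machine, the same repositories can also be fetched with the huggingface_hub Python package. This is an alternative sketch rather than the official instructions; it assumes huggingface_hub is installed (pip install huggingface_hub):

from huggingface_hub import snapshot_download

# Fetch both Diffusion-VAS checkpoints into checkpoints/,
# mirroring the layout produced by the git clone commands above.
for repo_id in [
    "kaihuac/diffusion-vas-amodal-segmentation",
    "kaihuac/diffusion-vas-content-completion",
]:
    snapshot_download(repo_id=repo_id, local_dir=f"checkpoints/{repo_id.split('/')[-1]}")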

For Depth Anything V2's checkpoints, download the Pre-trained Models (e.g., Depth-Anything-V2-Large) from this link and place them inside the checkpoints/ folder.
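
For example, assuming the Large model is still published under the depth-anything/Depth-Anything-V2-Large repository on Hugging Face with its usual filename (verify against the link above before relying on this), a one-line download is:

wget -P checkpoints https://huggingface.co/depth-anything/Depth-Anything-V2-Large/resolve/main/depth_anything_v2_vitl.pth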

Inference

To run inference, simply execute:

python demo.py

This runs inference on the birdcage example from demo_data/.

To try different examples, modify the seq_name argument:

python demo.py --seq_name <your_sequence_name>

You can also change the checkpoint path, data output paths, and other parameters as needed.
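
Assuming demo.py exposes these options through a standard argparse interface (we have not listed the exact flag names here), you can inspect all available parameters with:

python demo.py --help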

Using custom data

Start with a video, segment the target object with SAM2's web demo or codebase, and extract the frames, preferably at 8 FPS (an ffmpeg sketch follows below). Ensure that the output follows the same directory structure as the examples in demo_data/ before running inference.
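
For the frame-extraction step, ffmpeg is a convenient choice. The sketch below uses a hypothetical input clip my_video.mp4 and sequence name my_sequence; adjust the output folder and file naming to match the layout of the examples in demo_data/:

mkdir -p demo_data/my_sequence
ffmpeg -i my_video.mp4 -vf fps=8 demo_data/my_sequence/%05d.png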

Citation

If you find this work helpful, please consider citing our paper:

@inproceedings{chen2025diffvas,
      title={Using Diffusion Priors for Video Amodal Segmentation},
      author={Kaihua Chen and Deva Ramanan and Tarasha Khurana},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
      year={2025}
}
