This is the official implementation of our paper Test-time Alignment of Diffusion Models without Reward Over-optimization
by Sunwoo Kim¹, Minkyu Kim², and Dongmin Park²
¹Seoul National University, ²KRAFTON AI
Diffusion models excel in generative tasks, but aligning them with specific objectives while maintaining their versatility remains challenging. Existing fine-tuning methods often suffer from reward over-optimization, while approximate guidance approaches fail to optimize target rewards effectively. Addressing these limitations, we propose a training-free sampling method based on Sequential Monte Carlo (SMC) to sample from the reward-aligned target distribution. Our approach, tailored for diffusion sampling and incorporating tempering techniques, achieves comparable or superior target rewards to fine-tuning methods while preserving diversity and cross-reward generalization. We demonstrate its effectiveness in single-reward optimization, multi-objective scenarios, and online black-box optimization. This work offers a robust solution for aligning diffusion models with diverse downstream objectives without compromising their general capabilities.
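The core idea above, sampling from a reward-tilted distribution with Sequential Monte Carlo plus tempering, can be illustrated on a toy one-dimensional problem. The sketch below is not the repository's implementation (which operates on diffusion sampling trajectories); it only shows the generic tempered-SMC loop of incremental reweighting, resampling, and a diversity-restoring MCMC move, with a standard normal "base model" and a quadratic toy reward.

```python
import numpy as np

rng = np.random.default_rng(0)

def reward(x):
    # toy reward peaked at x = 2
    return -(x - 2.0) ** 2

n = 5000
particles = rng.standard_normal(n)    # samples from the base (prior) model
betas = np.linspace(0.0, 1.0, 11)     # tempering schedule: prior -> reward-tilted target

for b_prev, b_next in zip(betas[:-1], betas[1:]):
    # incremental importance weights for raising the reward temperature
    logw = (b_next - b_prev) * reward(particles)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    # multinomial resampling toward the tilted distribution
    particles = particles[rng.choice(n, size=n, p=w)]
    # random-walk Metropolis move targeting the current tempered density,
    # restoring particle diversity lost to resampling
    prop = particles + 0.3 * rng.standard_normal(n)
    log_accept = (-0.5 * prop**2 + b_next * reward(prop)) \
               - (-0.5 * particles**2 + b_next * reward(particles))
    accept = np.log(rng.random(n)) < log_accept
    particles = np.where(accept, prop, particles)

# the tilted target N(0,1) * exp(reward) is Gaussian with mean 4/3,
# so the particle mean should approach roughly 1.33
print(particles.mean())
```

Here the exact target is tractable (a product of two Gaussians), which makes it easy to verify that the particle population lands on the tilted distribution rather than the prior.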
conda create -n das python=3.10
conda activate das
pip install -e .
pip install --no-deps image-reward
Install hpsv2 from the HPSv2 repository. We recommend method 2 (installing locally) to avoid errors.
DAS is implemented on top of the diffusers library, making it easy to use. Minimal code for usage with a single test prompt can be found in the examples folder:
python examples/sd.py
python examples/sdxl.py
python examples/lcm.py
To run the Aesthetic score experiment with Stable Diffusion 1.5:
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/sd.py:aesthetic
To run the PickScore experiment:
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/sd.py:pick
To run the multi-objective (Aesthetic score + CLIPScore) experiment:
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/sd.py:multi
where the ratio of the two rewards can be customized in the config file.
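A fragment of what that customization might look like is sketched below. The field names here are hypothetical, chosen only to illustrate weighting two rewards; check config/sd.py in the repository for the actual keys.

```python
# Hypothetical fragment of config/sd.py for the `multi` setting.
# Actual field names may differ -- see the repository's config files.
config.reward_fn = "multi"
config.aesthetic_weight = 1.0  # weight on Aesthetic score
config.clip_weight = 0.5       # weight on CLIPScore
```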
Similarly, to use SDXL or LCM:
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/sdxl.py:aesthetic
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/sdxl.py:pick
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/sdxl.py:multi
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/lcm.py:aesthetic
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/lcm.py:pick
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch DAS.py --config config/lcm.py:multi
Evaluation for cross-reward generalization and sample diversity can be performed using the eval.ipynb Jupyter notebook.
Online black-box optimization experiments can be conducted in the SEIKO folder, which uses code from the SEIKO repository. To use DAS for online black-box optimization with Aesthetic score or JPEG compressibility as the black-box reward:
cd SEIKO
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch online/online_main_smc.py --config config/UCB_smc.py:aesthetic
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch online/online_main_smc.py --config config/Bootstrap_smc.py:aesthetic
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch online/online_main_smc.py --config config/UCB_smc.py:jpeg
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch online/online_main_smc.py --config config/Bootstrap_smc.py:jpeg
The commands above save trained surrogate reward models. To generate samples, set config.reward_model_path to the final surrogate model checkpoint and run:
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch online/online_main_smc.py --config config/UCB_smc.py:evaluate
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch online/online_main_smc.py --config config/Bootstrap_smc.py:evaluate
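The checkpoint change described above might look like the fragment below. The path value is a placeholder, not a real checkpoint location; only the field name config.reward_model_path comes from this README.

```python
# In config/UCB_smc.py or config/Bootstrap_smc.py, before running `:evaluate`.
# The path below is a placeholder -- point it at your final surrogate checkpoint.
config.reward_model_path = "path/to/final_surrogate_checkpoint.pt"
```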
The Mixture of Gaussians and Swiss roll experiments can be reproduced using the Jupyter notebooks in the notebooks folder.
@inproceedings{
kim2025testtime,
title={Test-time Alignment of Diffusion Models without Reward Over-optimization},
author={Sunwoo Kim and Minkyu Kim and Dongmin Park},
booktitle={The Thirteenth International Conference on Learning Representations},
year={2025},
url={https://openreview.net/forum?id=vi3DjUhFVm}
}
We sincerely thank those who have open-sourced their work, including but not limited to the repositories below:
- https://github.com/huggingface/diffusers
- https://github.com/kvablack/ddpo-pytorch
- https://github.com/mihirp1998/AlignProp
- https://github.com/zhaoyl18/SEIKO
- https://github.com/DPS2022/diffusion-posterior-sampling
- https://github.com/vvictoryuki/FreeDoM/tree/main
- https://github.com/KellyYutongHe/mpgd_pytorch
- https://github.com/blt2114/twisted_diffusion_sampler
- https://github.com/nchopin/particles