value errors in convert to/from diffusers from original stable diffusion #11285

Open
ppbrown opened this issue Apr 10, 2025 · 0 comments
Labels
bug Something isn't working

ppbrown commented Apr 10, 2025

Describe the bug

Somewhere there is a hardcoded limit of 77 tokens, when the code should instead use the dimensions of what is actually in the model.

I have a diffusers-layout SD1.5 model, with LongCLIP.

https://huggingface.co/opendiffusionai/xllsd-alpha0

I can pull it locally and then convert it to single-file format with:

    python convert_diffusers_to_original_stable_diffusion.py \
        --use_safetensors \
        --model_path $SRCM \
        --checkpoint_path $DESTM

But when I try to convert it back, I get a shape error because the text encoder's position embedding does not have 77 positions.

I should point out that the model WORKS PROPERLY for diffusion when loaded in diffusers format, so this is not some funky broken model.
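
To confirm that the converted single file really carries the longer position embedding, the tensor shapes can be read straight from the safetensors header without loading any weights. This is a minimal sketch using only the standard library. The helper names `read_safetensors_shapes` and `position_embedding_shapes` are made up for illustration, and matching on the `position_embedding.weight` suffix is an assumption about how the key is named inside the checkpoint.

```python
import json
import struct


def read_safetensors_shapes(path):
    """Return {tensor_name: shape} from a .safetensors header (hypothetical helper)."""
    with open(path, "rb") as f:
        # The file starts with an unsigned 64-bit little-endian header length,
        # followed by that many bytes of JSON describing each tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    return {
        name: tuple(info["shape"])
        for name, info in header.items()
        if name != "__metadata__"  # optional metadata entry, not a tensor
    }


def position_embedding_shapes(path):
    """Pick out position embedding tensors by key suffix (an assumed naming)."""
    return {
        name: shape
        for name, shape in read_safetensors_shapes(path).items()
        if name.endswith("position_embedding.weight")
    }
```

Calling `position_embedding_shapes("XLLsd-phase0.safetensors")` and seeing a first dimension of 248 rather than 77 would suggest that the diffusers-to-single-file direction preserved the LongCLIP embedding, and that the mismatch only arises on the way back.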

Reproduction

from transformers import CLIPTextModel, CLIPTokenizer

from diffusers import StableDiffusionPipeline, AutoencoderKL
import torch

pipe = StableDiffusionPipeline.from_single_file(
    "XLLsd-phase0.safetensors",
    torch_dtype=torch.float32,
    use_safetensors=True,
)

outname = "XLLsd_recreate"
pipe.save_pretrained(outname, safe_serialization=False)

Logs

venv/lib/python3.12/site-packages/diffusers/models/model_loading_utils.py", line 230, in load_model_dict_into_meta
    raise ValueError(
ValueError: Cannot load  because text_model.embeddings.position_embedding.weight expected shape torch.Size([77, 768]), but got torch.Size([248, 768]). If you want to instead overwrite randomly initialized weights, please make sure to pass both `low_cpu_mem_usage=False` and `ignore_mismatched_sizes=True`. For more information, see also: https://github.com/huggingface/diffusers/issues/1619#issuecomment-1345604389 as an example.
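
One possible workaround while the 77-position default is in place is to skip rebuilding the text encoder from the checkpoint and hand the pipeline an already-loaded one. The sketch below assumes the original diffusers-layout directory (with `text_encoder` and `tokenizer` subfolders) is still available locally, and it relies on `from_single_file` accepting pipeline components as keyword arguments. The function name and its parameters are illustrative, and this has not been verified against the LongCLIP model.

```python
def load_with_existing_text_encoder(checkpoint_path, diffusers_dir):
    """Load a single-file checkpoint, reusing a text encoder saved in diffusers layout.

    Hypothetical helper; both arguments are local paths.
    """
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from diffusers import StableDiffusionPipeline
    from transformers import CLIPTextModel, CLIPTokenizer

    # Assumption: the saved configs in these subfolders describe the longer
    # (248-position) text encoder and its matching tokenizer.
    text_encoder = CLIPTextModel.from_pretrained(
        diffusers_dir, subfolder="text_encoder", torch_dtype=torch.float32
    )
    tokenizer = CLIPTokenizer.from_pretrained(diffusers_dir, subfolder="tokenizer")

    # Components passed as keyword arguments are intended to be used as given
    # instead of being rebuilt from the checkpoint.
    return StableDiffusionPipeline.from_single_file(
        checkpoint_path,
        text_encoder=text_encoder,
        tokenizer=tokenizer,
        torch_dtype=torch.float32,
    )
```

This sidesteps the error rather than fixing it: the text encoder weights stored in the single file are ignored, so it only helps when a diffusers-layout copy of the model exists.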

System Info

  • 🤗 Diffusers version: 0.32.2
  • Platform: Linux-6.8.0-55-generic-x86_64-with-glibc2.39
  • Running on Google Colab?: No
  • Python version: 3.12.3
  • PyTorch version (GPU?): 2.6.0+cu124 (True)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Huggingface_hub version: 0.29.3
  • Transformers version: 4.50.0
  • Accelerate version: 1.5.2
  • PEFT version: not installed
  • Bitsandbytes version: 0.45.2
  • Safetensors version: 0.5.3
  • xFormers version: not installed
  • Accelerator: NVIDIA GeForce RTX 4090, 24564 MiB

Who can help?

No response

@ppbrown ppbrown added the bug Something isn't working label Apr 10, 2025