
MultiControlNetModel is not supported for SD3ControlNetInpaintingPipeline #11208


Open
DanilaAniva opened this issue Apr 4, 2025 · 3 comments · May be fixed by #11251
Labels
bug · contributions-welcome · help wanted

Comments

@DanilaAniva

Describe the bug

When using StableDiffusion3ControlNetInpaintingPipeline with SD3MultiControlNetModel, I receive an error:

NotImplementedError: MultiControlNetModel is not supported for SD3ControlNetInpaintingPipeline.

Reproduction

Example reproduction code:

import os
import torch
from diffusers.utils import load_image
from diffusers.pipelines import StableDiffusion3ControlNetInpaintingPipeline
from diffusers.models import SD3ControlNetModel, SD3MultiControlNetModel
from diffusers import BitsAndBytesConfig, SD3Transformer2DModel
from transformers import T5EncoderModel

# Load images
image = load_image(
    "https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting/resolve/main/images/dog.png"
)
mask = load_image(
    "https://huggingface.co/alimama-creative/SD3-Controlnet-Inpainting/resolve/main/images/dog_mask.png"
)

# Initialize ControlNet models (loaded in bfloat16 to match the pipeline dtype)
controlnetA = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Pose", torch_dtype=torch.bfloat16)
controlnetB = SD3ControlNetModel.from_pretrained(
    "alimama-creative/SD3-Controlnet-Inpainting",
    use_safetensors=True,
    extra_conditioning_channels=1,
    torch_dtype=torch.bfloat16,
)
controlnet = SD3MultiControlNetModel([controlnetA, controlnetB])

# Load transformer and text encoder
nf4_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4", bnb_4bit_compute_dtype=torch.bfloat16)
model_id = "stabilityai/stable-diffusion-3.5-large-turbo"
model_nf4 = SD3Transformer2DModel.from_pretrained(model_id, subfolder="transformer", quantization_config=nf4_config, torch_dtype=torch.bfloat16)
t5_nf4 = T5EncoderModel.from_pretrained("diffusers/t5-nf4", torch_dtype=torch.bfloat16)

# Initialize pipeline
pipe = StableDiffusion3ControlNetInpaintingPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large-turbo",
    token=os.getenv("HF_TOKEN"),
    controlnet=controlnet,
    transformer=model_nf4,
    text_encoder_3=t5_nf4,
    torch_dtype=torch.bfloat16
)

pipe.enable_model_cpu_offload()

# This fails with NotImplementedError
result_image = pipe(
    prompt="a cute dog with a hat",
    negative_prompt="low quality, bad anatomy",
    control_image=[image, image],
    control_mask=mask,
    num_inference_steps=30,
    guidance_scale=7.5,
    controlnet_conditioning_scale=[1.0, 1.0],
    output_type="pil",
).images[0]

Logs

Error


NotImplementedError: MultiControlNetModel is not supported for SD3ControlNetInpaintingPipeline.


The error is raised in `diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet_inpainting.py` at line 1026. Full traceback:


---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Cell In[1], line 41
     38 pipe.enable_model_cpu_offload()
     40 # This fails with NotImplementedError
---> 41 result_image = pipe(
     42     prompt="a cute dog with a hat",
     43     negative_prompt="low quality, bad anatomy",
     44     control_image=[image, image],
     45     num_inference_steps=30,
     46     guidance_scale=7.5,
     47     controlnet_conditioning_scale=[1.0, 1.0],
     48     output_type="pil",
     49 ).images[0]

File ~/miniconda3/envs/bnb310/lib/python3.10/site-packages/torch/utils/_contextlib.py:115, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    112 @functools.wraps(func)
    113 def decorate_context(*args, **kwargs):
    114     with ctx_factory():
--> 115         return func(*args, **kwargs)

File ~/miniconda3/envs/bnb310/lib/python3.10/site-packages/diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet_inpainting.py:1026, in StableDiffusion3ControlNetInpaintingPipeline.__call__(self, prompt, prompt_2, prompt_3, height, width, num_inference_steps, sigmas, guidance_scale, control_guidance_start, control_guidance_end, control_image, control_mask, controlnet_conditioning_scale, controlnet_pooled_projections, negative_prompt, negative_prompt_2, negative_prompt_3, num_images_per_prompt, generator, latents, prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds, output_type, return_dict, joint_attention_kwargs, clip_skip, callback_on_step_end, callback_on_step_end_tensor_inputs, max_sequence_length)
   1023     width = latent_width * self.vae_scale_factor
   1025 elif isinstance(self.controlnet, SD3MultiControlNetModel):
-> 1026     raise NotImplementedError("MultiControlNetModel is not supported for SD3ControlNetInpaintingPipeline.")
   1027 else:
   1028     assert False

NotImplementedError: MultiControlNetModel is not supported for SD3ControlNetInpaintingPipeline.


Expected Behavior
I expect `StableDiffusion3ControlNetInpaintingPipeline` to support `SD3MultiControlNetModel`, just as the non-inpainting `StableDiffusion3ControlNetPipeline` already does.
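
For reference, the non-inpainting `StableDiffusion3ControlNetPipeline` already handles `SD3MultiControlNetModel` by preparing each control image in a loop. Below is a minimal sketch of what the corresponding branch in the inpainting pipeline could look like, assuming it reuses the existing `prepare_image_with_mask` helper; this is an illustration, not the actual patch from #11251:

# Hypothetical replacement for the NotImplementedError branch in
# StableDiffusion3ControlNetInpaintingPipeline.__call__, mirroring the
# multi-controlnet loop in StableDiffusion3ControlNetPipeline.
elif isinstance(self.controlnet, SD3MultiControlNetModel):
    control_images = []
    for control_image_ in control_image:
        control_image_ = self.prepare_image_with_mask(
            image=control_image_,
            mask=control_mask,
            width=width,
            height=height,
            batch_size=batch_size * num_images_per_prompt,
            num_images_per_prompt=num_images_per_prompt,
            device=device,
            dtype=dtype,
            do_classifier_free_guidance=self.do_classifier_free_guidance,
            guess_mode=False,
        )
        control_images.append(control_image_)
    control_image = control_images

Note that a real fix also has to decide how to handle the mask channel: only the inpainting ControlNet is trained with an extra conditioning channel, so concatenating the mask to every control image (as `prepare_image_with_mask` does) would not match a ControlNet such as the pose one.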

System Info

Versions

Python version: 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0]
PyTorch version: 2.2.0+cu118
CUDA version: 11.8
Diffusers version: 0.32.2
Transformers version: 4.50.3
Accelerate version: 1.7.0.dev0

Who can help?

@yiyixuxu @sayakpaul

@DanilaAniva added the bug label on Apr 4, 2025
@yiyixuxu (Collaborator) commented Apr 8, 2025

Do you have a use case where you would use multiple ControlNets with inpainting for SD3? cc @asomoza here too. Functionally, we should be able to support multi-controlnet.

@DanilaAniva (Author) commented

> Do you have a use case where you would use multiple ControlNets with inpainting for SD3? cc @asomoza here too. Functionally, we should be able to support multi-controlnet.

Yes, I have a specific use case requiring multiple ControlNets with SD3's inpainting capabilities. I would like to use depth control alongside inpainting to better preserve the anatomical features and structure of the original image.

Combining inpainting with depth control sometimes produces better results than using inpainting alone. This approach helps maintain the original image's spatial relationships while targeting specific areas for regeneration.

Here are examples of what I'm trying to do with SD3:

  1. I can use this approach with Flux models:
import torch
import numpy as np
from PIL import Image
from diffusers import FluxControlInpaintPipeline
from diffusers.utils import load_image
from image_gen_aux import DepthPreprocessor

pipe = FluxControlInpaintPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-Depth-dev",
    torch_dtype=torch.bfloat16,
)
# GPU optimization code...
pipe.to("cuda")

prompt = "a blue robot singing opera with human-like expressions"
image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")

# Create mask for the robot's head
head_mask = np.zeros_like(image)
head_mask[65:580,300:642] = 255
mask_image = Image.fromarray(head_mask)

# Process depth map
processor = DepthPreprocessor.from_pretrained("LiheYoung/depth-anything-large-hf")
control_image = processor(image)[0].convert("RGB")

output = pipe(
    prompt=prompt,
    image=image,
    control_image=control_image,
    mask_image=mask_image,
    num_inference_steps=30,
    strength=0.9,
    guidance_scale=10.0,
    generator=torch.Generator().manual_seed(42),
).images[0]

Example result: (image attached in the original issue)

  2. I can also use similar functionality with SD 1.5 models:
import torch
import numpy as np
from PIL import Image
import cv2
import controlnet_hinter
from diffusers import StableDiffusionControlNetInpaintPipeline, ControlNetModel

# Setup model
controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetInpaintPipeline.from_pretrained(
    "...inpaint_model...",
    controlnet=controlnet,
    torch_dtype=torch.float16
)
pipe.to("cuda")

# Load images
init_image = Image.open('image.jpg')
mask_img = Image.open('mask.png')

# Prepare mask
mask_array = np.array(mask_img) > 0
mask_array = cv2.resize(mask_array.astype(np.uint8), 
                     (init_image.size[0], init_image.size[1]), 
                     interpolation=cv2.INTER_NEAREST).astype(bool)
mask = Image.fromarray(mask_array.astype(np.uint8) * 255)

# Generate depth map
control_image = controlnet_hinter.hint_depth(init_image)
control_image = control_image.resize(init_image.size)

# Generation configuration
generator = torch.Generator("cuda").manual_seed(42)
config = {
    "negative_prompt": "bad quality, worst quality",
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "strength": 0.7,
    "controlnet_conditioning_scale": 0.6,
    "control_guidance_start": 0.6,
    "control_guidance_end": 0.8
}

# Generate image
output = pipe(
    prompt="Elegant blonde woman displaying refined style...",
    image=init_image,
    mask_image=mask,
    control_image=control_image,
    # other parameters...
).images[0]

Example result: (image attached in the original issue)

Additionally, my primary concern with SD3 inpainting is its significant issues with anatomical consistency. When using SD3 inpainting without depth control, I frequently encounter severe distortions in human faces and body proportions that reduce output quality. Adding depth control would help maintain proper structural integrity while inpainting, addressing anatomical problems that are more pronounced in SD3 than in previous SD generations.
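
For concreteness, here is a rough sketch of what the depth-plus-inpainting combination could look like once `SD3MultiControlNetModel` support lands. The depth ControlNet checkpoint ID is a placeholder, and the calling convention (one control image per ControlNet, a shared mask) is my assumption about the fixed pipeline, not a confirmed API:

import torch
from diffusers.pipelines import StableDiffusion3ControlNetInpaintingPipeline
from diffusers.models import SD3ControlNetModel, SD3MultiControlNetModel
from diffusers.utils import load_image
from image_gen_aux import DepthPreprocessor

# Placeholder checkpoint ID -- substitute a real SD3 depth ControlNet.
depth_controlnet = SD3ControlNetModel.from_pretrained("<sd3-depth-controlnet>", torch_dtype=torch.bfloat16)
inpaint_controlnet = SD3ControlNetModel.from_pretrained(
    "alimama-creative/SD3-Controlnet-Inpainting",
    use_safetensors=True,
    extra_conditioning_channels=1,
    torch_dtype=torch.bfloat16,
)
controlnet = SD3MultiControlNetModel([depth_controlnet, inpaint_controlnet])

pipe = StableDiffusion3ControlNetInpaintingPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3.5-large-turbo",
    controlnet=controlnet,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()

init_image = load_image("image.jpg")
mask = load_image("mask.png")

# Depth map from the same preprocessor used in the Flux example above.
processor = DepthPreprocessor.from_pretrained("LiheYoung/depth-anything-large-hf")
depth_map = processor(init_image)[0].convert("RGB")

# Assumed calling convention: the depth map conditions the depth ControlNet,
# the source image conditions the inpainting ControlNet, one shared mask.
result = pipe(
    prompt="...",
    control_image=[depth_map, init_image],
    control_mask=mask,
    controlnet_conditioning_scale=[0.6, 1.0],
    num_inference_steps=30,
).images[0]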

@yiyixuxu (Collaborator) commented Apr 8, 2025

Thanks @DanilaAniva. I've opened this up to the community; we will add it ourselves if no one picks it up.
