Model:
lllyasviel/control_v11e_sd15_shuffle
ControlNet v1.1 was released in lllyasviel/ControlNet-v1-1 by Lvmin Zhang.
This checkpoint is a conversion of the original checkpoint into diffusers format. It can be used in combination with Stable Diffusion, such as runwayml/stable-diffusion-v1-5.
For more details, see the Diffusers docs.
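ControlNet is a neural network structure that controls diffusion models by adding extra conditions.
This checkpoint corresponds to the ControlNet conditioned on shuffle images.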
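Developed by: Lvmin Zhang, Maneesh Agrawala
Model type: Diffusion-based text-to-image generation model
Language(s): English
License: The CreativeML OpenRAIL M license is an Open RAIL M license, adapted from the work that BigScience and the RAIL Initiative are jointly carrying out in the area of responsible AI licensing. See also the article about the BLOOM Open RAIL license, on which our license is based.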
Resources for more information: GitHub Repository, Paper.
Cite as:
    @misc{zhang2023adding,
      title={Adding Conditional Control to Text-to-Image Diffusion Models},
      author={Lvmin Zhang and Maneesh Agrawala},
      year={2023},
      eprint={2302.05543},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
    }
ControlNet was proposed in Adding Conditional Control to Text-to-Image Diffusion Models by Lvmin Zhang and Maneesh Agrawala.
The abstract reads as follows:
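We present a neural network structure, ControlNet, to control pretrained large diffusion models to support additional input conditions. ControlNet learns task-specific conditions in an end-to-end way, even when the training dataset is small ([…]
[…] = 0.16.0.dev0!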
$ pip install git+https://github.com/huggingface/diffusers.git transformers accelerate
    from pathlib import Path

    import torch
    from controlnet_aux import ContentShuffleDetector
    from diffusers import (
        ControlNetModel,
        StableDiffusionControlNetPipeline,
        UniPCMultistepScheduler,
    )
    from diffusers.utils import load_image

    checkpoint = "lllyasviel/control_v11e_sd15_shuffle"

    # Load the example input image from the model repository.
    image = load_image(
        "https://huggingface.co/lllyasviel/control_v11e_sd15_shuffle/resolve/main/images/input.png"
    )

    prompt = "New York"

    # Build the shuffled control image from the input image.
    processor = ContentShuffleDetector()
    control_image = processor(image)

    # Create the output directory first; PIL's save() does not create it.
    Path("images").mkdir(exist_ok=True)
    control_image.save("./images/control.png")

    # Load the ControlNet checkpoint and attach it to Stable Diffusion v1-5 in half precision.
    controlnet = ControlNetModel.from_pretrained(checkpoint, torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
    )

    pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)
    # Offload model components to the CPU while idle to reduce GPU memory use.
    pipe.enable_model_cpu_offload()

    # Fix the seed so the result is reproducible.
    generator = torch.manual_seed(33)
    image = pipe(prompt, num_inference_steps=30, generator=generator, image=control_image).images[0]

    image.save("images/image_out.png")
The authors released 14 different checkpoints, each trained with Stable Diffusion v1-5 on a different type of conditioning:
| Model Name | Control Image Overview | Condition Image | Control Image Example | Generated Image Example |
|---|---|---|---|---|
| 12319321 | Trained with canny edge detection | A monochrome image with white edges on a black background. | 12320321 | 12321321 |
| 12322321 | Trained with pixel to pixel instruction | No condition. | 12323321 | 12324321 |
| 12325321 | Trained with image inpainting | No condition. | 12326321 | 12327321 |
| 12328321 | Trained with multi-level line segment detection | An image with annotated line segments. | 12329321 | 12330321 |
| 12331321 | Trained with depth estimation | An image with depth information, usually represented as a grayscale image. | 12332321 | 12333321 |
| 12334321 | Trained with surface normal estimation | An image with surface normal information, usually represented as a color-coded image. | 12335321 | 12336321 |
| 12337321 | Trained with image segmentation | An image with segmented regions, usually represented as a color-coded image. | 12338321 | 12339321 |
| 12340321 | Trained with line art generation | An image with line art, usually black lines on a white background. | 12341321 | 12342321 |
| 12343321 | Trained with anime line art generation | An image with anime-style line art. | 12344321 | 12345321 |
| 12346321 | Trained with human pose estimation | An image with human poses, usually represented as a set of keypoints or skeletons. | 12347321 | 12348321 |
| 12349321 | Trained with scribble-based image generation | An image with scribbles, usually random or user-drawn strokes. | 12350321 | 12351321 |
| 12352321 | Trained with soft edge image generation | An image with soft edges, usually to create a more painterly or artistic effect. | 12353321 | 12354321 |
| 12355321 | Trained with image shuffling | An image with shuffled patches or regions. | 12356321 | 12357321 |
| 12358321 | Trained with image tiling | A blurry image or part of an image. | 12359321 | 12360321 |
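The shuffle row describes its condition image as "an image with shuffled patches or regions". As a rough illustration of that idea only, the sketch below permutes fixed-size tiles of a small pixel grid using the Python standard library. The function name `shuffle_patches`, the `patch` and `seed` parameters, and the tile-permutation approach are all assumptions made for this example; the `ContentShuffleDetector` used in the usage example may build its control image differently.

```python
import random


def shuffle_patches(pixels, patch, seed=0):
    """Return a copy of a 2D pixel grid with its patch x patch tiles permuted.

    Hypothetical helper for illustration; not part of diffusers or controlnet_aux.
    """
    height, width = len(pixels), len(pixels[0])
    if height % patch or width % patch:
        raise ValueError("grid size must be a multiple of the patch size")

    # Top-left corner of every tile, in row-major order.
    corners = [(r, c) for r in range(0, height, patch) for c in range(0, width, patch)]

    # Choose a destination tile for each source tile with a seeded permutation.
    targets = corners[:]
    random.Random(seed).shuffle(targets)

    out = [[None] * width for _ in range(height)]
    for (sr, sc), (tr, tc) in zip(corners, targets):
        # Copy the tile unchanged; only its position moves.
        for dr in range(patch):
            for dc in range(patch):
                out[tr + dr][tc + dc] = pixels[sr + dr][sc + dc]
    return out


# A 4x4 grid of distinct values standing in for an image.
grid = [[r * 4 + c for c in range(4)] for r in range(4)]
shuffled = shuffle_patches(grid, patch=2, seed=33)
```

Every pixel value survives the shuffle and each tile keeps its internal layout, so the result preserves local content while discarding the global arrangement. With a real image, the same idea would normally be applied to an array rather than nested lists.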
For more information, see the Diffusers ControlNet Blog Post and the official docs.