Update README.md
README.md
CHANGED
@@ -1,3 +1,162 @@
----
-license: apache-2.0
-
---
license: apache-2.0
tags:
- text_to_image
- diffusers
- controlnet
- controlnet-canny-sdxl-1.0
---

# Controlnet-Canny-Sdxl-1.0

Hello, I am very happy to announce the controlnet-canny-sdxl-1.0 model, a powerful ControlNet that helps you draw pictures from thin lines. The model was trained on a large amount of high-quality data that was carefully filtered and captioned. Several useful tricks were applied during training, including data augmentation, multiple losses, and multi-resolution training. With only one stage of training, it outperforms the other open-source canny models (a detailed analysis will be provided). I am releasing it in the hope of advancing the application of Stable Diffusion models. Canny is one of the most important models in the ControlNet series and can be applied to many jobs associated with drawing and design.

## Model Details

### Model Description

- **Developed by:** xinsir
- **Model type:** ControlNet_SDXL
- **License:** apache-2.0
- **Finetuned from model [optional]:** stabilityai/stable-diffusion-xl-base-1.0

### Model Sources [optional]

- **Paper [optional]:** https://arxiv.org/abs/2302.05543

## Uses

### Examples

prompt: A closeup of two day of the dead models, looking to the side, large flowered headdress, full dia de Los muertoe make up, lush red lips, butterflies, flowers, pastel colors, looking to the side, jungle, birds, color harmony , extremely detailed, intricate, ornate, motion, stunning, beautiful, unique, soft lighting



prompt: ghost with a plague doctor mask in a venice carnaval hyper realistic



prompt: A picture surrounded by blue stars and gold stars, glowing, dark navy blue and gray tones, distributed in light silver and gold, playful, festive atmosphere, pure fabric, chalk, FHD 8K



prompt: Delicious vegetarian pizza with champignon mushrooms, tomatoes, mozzarella, peppers and black olives, isolated on white background , transparent isolated white background , top down view, studio photo, transparent png, Clean sharp focus. High end retouching. Food magazine photography. Award winning photography. Advertising photography. Commercial photography



prompt: a blonde woman in a wedding dress in a maple forest in summer with a flower crown laurel. Watercolor painting in the style of John William Waterhouse. Romanticism. Ethereal light.



### Examples Anime (note that you need to change the base model to CounterfeitXL; everything else remains the same)














## How to Get Started with the Model

Use the code below to get started with the model.

```python
from diffusers import ControlNetModel, StableDiffusionXLControlNetPipeline, AutoencoderKL
from diffusers import DDIMScheduler, EulerAncestralDiscreteScheduler
from PIL import Image
import torch
import numpy as np
import cv2

controlnet_conditioning_scale = 1.0
prompt = "your prompt, the longer the better, you can describe it in as much detail as possible"
negative_prompt = 'longbody, lowres, bad anatomy, bad hands, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality'

eulera_scheduler = EulerAncestralDiscreteScheduler.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", subfolder="scheduler")

controlnet = ControlNetModel.from_pretrained(
    "xinsir/controlnet-canny-sdxl-1.0",
    torch_dtype=torch.float16
)

# When testing with another base model, you need to change the VAE as well.
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)

pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    safety_checker=None,
    torch_dtype=torch.float16,
    scheduler=eulera_scheduler,
)
# Assumption: a CUDA GPU is available. The original snippet never moved the
# float16 pipeline off the CPU.
pipe.to("cuda")

# Resize the image to roughly 1024 * 1024 pixels (or to the matching bucket
# resolution) to get the best performance.
controlnet_img = cv2.imread("your image path")
height, width, _ = controlnet_img.shape
ratio = np.sqrt(1024. * 1024. / (width * height))
new_width, new_height = int(width * ratio), int(height * ratio)
controlnet_img = cv2.resize(controlnet_img, (new_width, new_height))

# Extract Canny edges; the result is a single-channel uint8 edge map.
controlnet_img = cv2.Canny(controlnet_img, 100, 200)
# The original called an undefined helper, HWC3, here. Replicating the edge map
# across three channels gives the H x W x 3 image the pipeline expects.
controlnet_img = np.stack([controlnet_img] * 3, axis=-1)
controlnet_img = Image.fromarray(controlnet_img)

images = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=controlnet_img,
    controlnet_conditioning_scale=controlnet_conditioning_scale,
    width=new_width,
    height=new_height,
    num_inference_steps=30,
).images

# PNG usually preserves image quality better than JPG or WebP, at the cost of a
# much larger file.
images[0].save("your_image_save_path.png")
```
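
The snippet above keeps the aspect ratio and targets roughly 1024 * 1024 pixels; its comment also mentions resizing to a "bucket resolution". A minimal sketch of that alternative follows, assuming the bucket list commonly quoted for SDXL at about one megapixel. The model card does not list the buckets actually used to train this ControlNet, and `SDXL_BUCKETS` and `nearest_bucket` are hypothetical names introduced here for illustration.

```python
import math

# Assumed (width, height) buckets, all close to one megapixel. These are the
# values commonly quoted for SDXL, not values confirmed by this model card.
SDXL_BUCKETS = [
    (1024, 1024),
    (1152, 896), (896, 1152),
    (1216, 832), (832, 1216),
    (1344, 768), (768, 1344),
    (1536, 640), (640, 1536),
]


def nearest_bucket(width, height, buckets=SDXL_BUCKETS):
    """Return the bucket whose aspect ratio is closest to width / height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    # Compare in log space so that 2:1 and 1:2 are treated symmetrically.
    return min(buckets, key=lambda wh: abs(math.log(wh[0] / wh[1]) - target))


print(nearest_bucket(1920, 1080))  # (1344, 768)
print(nearest_bucket(800, 800))    # (1024, 1024)
```

The returned pair could be passed to `cv2.resize` in place of `(new_width, new_height)`. Resizing straight to a bucket changes the aspect ratio slightly unless the image is cropped first.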

## Training Details

The model was trained on high-quality data in a single training stage. The resolution setting is the same as for sdxl-base: 1024*1024.

### Training Data

The data comes from many sources, including Midjourney, LAION-5B, Danbooru, and so on. The data was carefully filtered and annotated.

### Evaluation

In our evaluation, the model achieved a better aesthetic score on real images than stabilityai/stable-diffusion-xl-base-1.0, and comparable performance on cartoon-style images.
The model shows better control ability when tested with perceptual similarity, owing to stronger data augmentation and more training steps.
In addition, the model is less likely to generate abnormal images, which tend to include malformed human anatomy.
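
The card does not say how control ability was measured. As a much simpler stand-in, and explicitly not the author's evaluation, one could compare the conditioning edge map with the edges re-extracted from the generated image. The sketch below computes an F1 overlap between two binary edge maps; `edge_f1` is a hypothetical helper, and the toy arrays stand in for real Canny outputs.

```python
import numpy as np


def edge_f1(condition_edges, generated_edges):
    """F1 overlap between two edge maps of the same shape (nonzero = edge)."""
    cond = np.asarray(condition_edges) > 0
    gen = np.asarray(generated_edges) > 0
    if cond.shape != gen.shape:
        raise ValueError("edge maps must have the same shape")
    true_pos = np.logical_and(cond, gen).sum()
    total = cond.sum() + gen.sum()
    # Two empty maps agree perfectly; this also avoids dividing by zero.
    return 1.0 if total == 0 else float(2.0 * true_pos / total)


# Toy 4x4 maps: the condition is one horizontal line. The "generated" map
# reproduces three of its four pixels and adds one stray pixel.
cond = np.zeros((4, 4), dtype=np.uint8)
cond[1, :] = 255
gen = np.zeros((4, 4), dtype=np.uint8)
gen[1, :3] = 255
gen[3, 0] = 255

print(edge_f1(cond, gen))  # 0.75
```

In practice the second argument would come from running `cv2.Canny` on the generated image with the same thresholds used for the conditioning image. A score near 1.0 suggests the output follows the input edges closely.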