Qwen-Image-Edit-2511 LoRA - Character References -> Scene
Turns one or more character reference images (neutral pose, plain light-gray background, eye-level) into a full illustrated scene with those characters, preserving identity and art style. Trained with DiffSynth-Studio on 179 reference->scene pairs (stylized Western digital art).
Prompt format (same as training):
Using the character(s) from Image 1, create a full illustrated scene: <scene description>
Multiple separate references are supported: ...from Image 1 and ... from Image 2...
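Keeping inference prompts identical to the training template is easy to get wrong by hand, so a small helper may be useful. The sketch below is illustrative only: `build_prompt` and its `several_characters` flag are hypothetical names, not part of DiffSynth-Studio, and it assumes that "character(s)" in the template means choosing between singular and plural. It covers only the single-reference-image template, because the exact multi-image wording is abbreviated above.

```python
# Prefixes copied from the training prompt template (single reference image).
# Assumption: "character(s)" stands for a singular/plural choice.
PREFIX_SINGLE = "Using the character from Image 1, create a full illustrated scene: "
PREFIX_PLURAL = "Using the characters from Image 1, create a full illustrated scene: "


def build_prompt(scene: str, several_characters: bool = False) -> str:
    """Return a prompt in the training format for one reference image.

    Hypothetical helper: `scene` is the free-text scene description, and
    `several_characters` selects the plural wording when the single
    reference image shows more than one character.
    """
    scene = scene.strip()
    if not scene:
        raise ValueError("scene description must not be empty")
    prefix = PREFIX_PLURAL if several_characters else PREFIX_SINGLE
    return prefix + scene
```

The returned string can be passed as the first argument to the pipeline call shown under Usage.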
Usage
import torch
from PIL import Image
from diffsynth.pipelines.qwen_image import QwenImagePipeline, ModelConfig

# Build the pipeline: the 2511 edit transformer (DiT), plus the text encoder
# and VAE from Qwen/Qwen-Image and the processor from Qwen/Qwen-Image-Edit.
pipe = QwenImagePipeline.from_pretrained(
    torch_dtype=torch.bfloat16,
    device="cuda",
    model_configs=[
        ModelConfig(model_id="Qwen/Qwen-Image-Edit-2511", origin_file_pattern="transformer/diffusion_pytorch_model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="text_encoder/model*.safetensors"),
        ModelConfig(model_id="Qwen/Qwen-Image", origin_file_pattern="vae/diffusion_pytorch_model.safetensors"),
    ],
    tokenizer_config=None,
    processor_config=ModelConfig(model_id="Qwen/Qwen-Image-Edit", origin_file_pattern="processor/"),
)

# Attach the LoRA weights to the DiT.
pipe.load_lora(pipe.dit, "checkpoints/epoch-4.safetensors")

# One reference image; edit_image takes a list of references.
ref = Image.open("reference.jpeg")
img = pipe(
    "Using the character from Image 1, create a full illustrated scene: ...",
    edit_image=[ref],
    seed=0,
    num_inference_steps=40,
    height=1328,
    width=1024,
    zero_cond_t=True,  # zero_cond_t is REQUIRED for 2511
)
Training config
| setting | value |
|---|---|
| base | Qwen/Qwen-Image-Edit-2511 (DiT only) |
| rank / lr | 32 / 1e-4 |
| epochs x steps | 5 x 537 (179 pairs, repeat 3) |
| resolution | dynamic, max_pixels 1048576 (native AR) |
| precision | bf16 + gradient checkpointing |
| special | --zero_cond_t (2511-specific, also required at inference) |
Full args: training_config.json
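To make the "dynamic, max_pixels 1048576 (native AR)" setting concrete, here is a minimal sketch of one way to cap an image at a pixel budget while keeping its aspect ratio. It is an assumption about the general technique, not the resizing code DiffSynth-Studio actually runs: the function name `fit_to_max_pixels`, the choice to never upscale, and the alignment to multiples of 16 are all hypothetical.

```python
import math

MAX_PIXELS = 1048576  # max_pixels from the training config (1024 * 1024)


def fit_to_max_pixels(width, height, max_pixels=MAX_PIXELS, multiple=16):
    """Scale (width, height) down so width * height <= max_pixels.

    Keeps the aspect ratio approximately and never upscales. `multiple` is an
    assumed alignment; the value used by the trainer is not stated here.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale factor that brings the area down to the budget (1.0 if already under).
    scale = min(1.0, math.sqrt(max_pixels / (width * height)))
    # Round down to the alignment so the budget is never exceeded.
    new_w = max(multiple, int(width * scale) // multiple * multiple)
    new_h = max(multiple, int(height * scale) // multiple * multiple)
    return new_w, new_h
```

Rounding down rather than to the nearest multiple keeps the result within the budget at the cost of a slightly smaller image.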
Loss
| epoch | step | EMA loss min |
|---|---|---|
| 0 | 525 | 0.0686 |
| 1 | 864 | 0.0688 |
| 2 | 1422 | 0.0670 <- global min |
| 3 | 2118 | 0.0680 |
| 4 | 2183 | 0.0682 |
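The step column can be cross-checked against the training config. The sketch below assumes that the step counter is global and zero-based and that each training sample is one step, which is what "5 x 537 (179 pairs, repeat 3)" suggests; the variable and function names are illustrative.

```python
PAIRS = 179
REPEAT = 3
EPOCHS = 5
STEPS_PER_EPOCH = PAIRS * REPEAT        # 179 * 3 = 537
TOTAL_STEPS = STEPS_PER_EPOCH * EPOCHS  # 537 * 5 = 2685

# (global step, EMA loss minimum) rows copied from the table above.
minima = [(525, 0.0686), (864, 0.0688), (1422, 0.0670), (2118, 0.0680), (2183, 0.0682)]


def epoch_of(step):
    """Map a zero-based global step to its epoch index (assumed convention)."""
    return step // STEPS_PER_EPOCH


# Each reported step should fall inside the epoch listed in its row.
epochs = [epoch_of(step) for step, _ in minima]

# Row with the lowest EMA loss across all epochs.
best_step, best_loss = min(minima, key=lambda row: row[1])
```

Under these assumptions each reported step falls in the epoch listed for it, and the lowest value is the epoch 2 row marked as the global minimum.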
Validation samples
val_samples/ contains 19 held-out prompts rendered with 3 checkpoints (epochs 2-4). The characters were never seen in training. The set covers single-character, multi-character composite (one input image) and multi-image (separate input images) modes. Prompts: val_samples/prompts.json. Examples (epoch-4):
Dataset sample
One training pair in dataset_example/: reference (model input), scene (target) and the caption (pair.json).





