TokForge: DreamShaper-7 + LCM GGUF (Q4_0)
A single, self-contained, Q4_0-quantized GGUF for stable-diffusion.cpp, packaging DreamShaper-7 (SD1.5, realistic finetune) with the LCM-LoRA fused into the UNet for fast, guidance-free few-step sampling.
This is the 6 GB-tier fast image route for the TokForge apps, the smaller sibling of TokForge-DreamShaper-LCM-GGUF (f16, ~2.1 GB). It renders the same coherent people and hands (no SD-Turbo body-horror) at the same few-step LCM speed class, but with a smaller download and RAM footprint, so it fits the 6 GB device tier where the f16 build is gated out.
Files
| File | Size | Precision | Contents |
|---|---|---|---|
| dreamshaper-7-lcm-q4_0.gguf | ~1.63 GB | Q4_0 (mixed) | CLIP text encoder (F16) + LCM-fused UNet (Q4_0 linears/1×1 convs, F16 3×3 convs) + VAE |
MD5SUMS and manifest.json carry the integrity hash and the render defaults.
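A download can be checked against the published hash with something like the sketch below. It assumes MD5SUMS follows the conventional md5sum layout (hex digest, whitespace, filename, one entry per line), which this card does not confirm; the helper names are illustrative only.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to keep RAM flat."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_md5sums(text):
    """Parse md5sum-style lines ("<hex>  <name>") into {name: hex}.

    Assumption: MD5SUMS uses the conventional md5sum layout.
    """
    expected = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            # md5sum marks binary mode with a leading "*" on the filename.
            expected[parts[1].strip().lstrip("*")] = parts[0].lower()
    return expected


def verify(model_path, md5sums_path="MD5SUMS"):
    """True if the file's MD5 matches its entry in MD5SUMS."""
    expected = parse_md5sums(Path(md5sums_path).read_text())
    name = Path(model_path).name
    return name in expected and expected[name] == md5_of_file(model_path)
```

Calling `verify("dreamshaper-7-lcm-q4_0.gguf")` from the download directory returns `True` only when the file name is listed and the digests match.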
Precision details (CLIP-safe quantization)
This GGUF was quantized with stable-diffusion.cpp's own -M convert --type q4_0, whose tensor_should_be_converted rule protects embeddings and norms: it keeps every tensor whose name contains embedding (and all .bias / .scale / norm tensors) at F16. So the CLIP text encoder's token_embedding.weight stays F16.
This is deliberate and required: a block-quantized CLIP token-embedding (the failure mode of
some external blanket-quantizers) produces empty CLIP conditioning through ggml_get_rows and
makes sd.cpp abort at conditioner.hpp GGML_ASSERT(!chunk_hidden_states.empty()). Keeping CLIP
at F16 avoids that entirely.
Note that the file is ~1.63 GB rather than ~0.6 GB because SD-1.5's UNet is dominated by 3×3 convolution weights (ne[0] = 3), whose row length is not a multiple of Q4_0's block size of 32, so ggml keeps them at F16 (the same reason the working gpustack Q4_0 SD-1.5 GGUF is ~1.75 GB). The Q4_0 savings come from the attention/projection linears and 1×1 convs. Verified type histogram: 690 F16 tensors + 440 Q4_0 tensors; CLIP token_embedding.weight = F16.
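The selection rule and the size arithmetic described above can be sketched as follows. This is a simplified illustration, not sd.cpp's actual tensor_should_be_converted; the function names and the example tensor names and shapes are hypothetical. The only format facts it relies on are that a Q4_0 block packs 32 weights into 18 bytes (one F16 scale plus 32 4-bit values) and that F16 uses 2 bytes per weight.

```python
Q4_0_BLOCK = 32          # weights per Q4_0 block
Q4_0_BLOCK_BYTES = 18    # 2-byte F16 scale + 16 bytes of packed 4-bit values


def stays_f16(name, ne0):
    """Simplified version of the rule described above (not sd.cpp's code).

    ne0 is the innermost dimension (ggml's ne[0]) of the tensor.
    """
    protected = (
        "embedding" in name
        or "norm" in name
        or name.endswith((".bias", ".scale"))
    )
    return protected or ne0 % Q4_0_BLOCK != 0


def tensor_bytes(name, shape):
    """Storage size of one tensor; shape is given ggml-style, ne[0] first."""
    n = 1
    for dim in shape:
        n *= dim
    if stays_f16(name, shape[0]):
        return n * 2
    return n // Q4_0_BLOCK * Q4_0_BLOCK_BYTES


# Hypothetical tensors, shapes chosen only for illustration.
conv3x3 = tensor_bytes("unet.block.conv.weight", (3, 3, 320, 320))
linear = tensor_bytes("unet.block.attn.to_q.weight", (320, 320))
print(conv3x3)  # 1843200 bytes: kept at F16 because ne[0] = 3
print(linear)   # 57600 bytes: Q4_0, versus 204800 bytes at F16
```

At 18 bytes per 32 weights, Q4_0 costs 4.5 bits per weight, about 28% of the F16 size, so the tensors that must stay at F16 dominate the final file.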
Recommended render settings (LCM, few-step, guidance-free)
sampler: lcm
scheduler: lcm
steps: 6 (4 = fast floor, 8 = extra refinement)
cfg-scale: 1.5
resolution: 512x512 (SD1.5 native; 256/384 presets also work)
stable-diffusion.cpp CLI example
sd -M img_gen \
-m dreamshaper-7-lcm-q4_0.gguf \
-p "a busy outdoor street market crowded with people shopping, candid street photo" \
--sampling-method lcm --scheduler lcm --steps 6 --cfg-scale 1.5 \
-W 512 -H 512 -o out.png
Provenance & how this was built
- Started from dreamshaper-7-lcm-f16.gguf in TokForge-DreamShaper-LCM-GGUF (DreamShaper-7 with LCM-LoRA fused into the UNet, exported to a single SD1.5 GGUF, f16).
- Quantized to Q4_0 with stable-diffusion.cpp (leejet): sd -M convert -m dreamshaper-7-lcm-f16.gguf -o dreamshaper-7-lcm-q4_0.gguf --type q4_0. sd.cpp's quantizer keeps CLIP / embeddings / norms at F16 (model_loader.cpp tensor_should_be_converted).
- Verified that the CLIP token_embedding.weight is F16 in the output header, and ran a 6-step LCM smoke render that produced a coherent on-prompt image.
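The header check in the last step could be reproduced with a small stand-alone reader like the sketch below. It is not the tool used for this card (the card does not say which was used); it assumes the file follows the GGUF v2-or-later header layout, and the function names are made up for illustration. Only three ggml type ids are named (F32 = 0, F16 = 1, Q4_0 = 2); anything else is reported by number.

```python
import struct
from collections import Counter

# ggml type ids for the types relevant here (from ggml's ggml_type enum).
GGML_TYPE_NAMES = {0: "F32", 1: "F16", 2: "Q4_0"}
# Byte widths of fixed-size GGUF metadata value types, keyed by type id.
SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1,
                10: 8, 11: 8, 12: 8}
GGUF_STRING, GGUF_ARRAY = 8, 9


def _read(fh, fmt):
    """Read one little-endian value described by a struct format char."""
    size = struct.calcsize("<" + fmt)
    return struct.unpack("<" + fmt, fh.read(size))[0]


def _read_string(fh):
    """Read a GGUF string: uint64 length followed by UTF-8 bytes."""
    length = _read(fh, "Q")
    return fh.read(length).decode("utf-8")


def _skip_value(fh, value_type):
    """Advance past one metadata value without decoding it."""
    if value_type == GGUF_STRING:
        fh.seek(_read(fh, "Q"), 1)
    elif value_type == GGUF_ARRAY:
        elem_type, count = _read(fh, "I"), _read(fh, "Q")
        for _ in range(count):
            _skip_value(fh, elem_type)
    else:
        fh.seek(SCALAR_SIZES[value_type], 1)


def tensor_types(path):
    """Return {tensor_name: type_name} from a GGUF (v2 or later) header."""
    with open(path, "rb") as fh:
        if fh.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        if _read(fh, "I") < 2:
            raise ValueError("GGUF v1 uses a different header layout")
        n_tensors, n_kv = _read(fh, "Q"), _read(fh, "Q")
        for _ in range(n_kv):
            _read_string(fh)                 # metadata key, ignored
            _skip_value(fh, _read(fh, "I"))  # metadata value, skipped
        types = {}
        for _ in range(n_tensors):
            name = _read_string(fh)
            n_dims = _read(fh, "I")
            fh.seek(8 * n_dims, 1)           # dimensions (uint64 each)
            type_id = _read(fh, "I")
            fh.seek(8, 1)                    # data offset (uint64)
            types[name] = GGML_TYPE_NAMES.get(type_id, f"type_{type_id}")
        return types


def summarize(types):
    """Type histogram plus the types of any token-embedding tensors."""
    histogram = Counter(types.values())
    embeddings = {name: t for name, t in types.items()
                  if name.endswith("token_embedding.weight")}
    return histogram, embeddings
```

Running `summarize(tensor_types("dreamshaper-7-lcm-q4_0.gguf"))` gives a histogram to compare against the 690 F16 + 440 Q4_0 counts reported above, along with the stored type of each token-embedding tensor.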
License & attribution
- License: CreativeML OpenRAIL-M (inherited from DreamShaper-7 / Stable Diffusion 1.5). Use is subject to the OpenRAIL-M use restrictions.
- Base model: DreamShaper-7 by Lykon: https://huggingface.co/Lykon/dreamshaper-7
- Adapter: LCM-LoRA SD1.5 by Latent Consistency: https://huggingface.co/latent-consistency/lcm-lora-sdv1-5
- Quantization tooling: stable-diffusion.cpp by leejet
- Built on top of Stable Diffusion 1.5 (Runway/CompVis/Stability).
No additional restrictions are imposed by this repackaging; the original OpenRAIL-M terms and attribution requirements propagate to this GGUF and any images generated with it.