Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,104 @@
| 1 |
+
---
|
| 2 |
+
license: creativeml-openrail-m
|
| 3 |
+
license_name: tokforge-sd15-ipadapter-bundle
|
| 4 |
+
tags:
|
| 5 |
+
- text-to-image
|
| 6 |
+
- image-to-image
|
| 7 |
+
- stable-diffusion
|
| 8 |
+
- ip-adapter
|
| 9 |
+
- dreamshaper
|
| 10 |
+
- reference-image
|
| 11 |
+
- identity
|
| 12 |
+
- gguf
|
| 13 |
+
- stable-diffusion-cpp
|
| 14 |
+
- tokforge
|
| 15 |
+
base_model:
|
| 16 |
+
- Lykon/dreamshaper-7
|
| 17 |
+
- h94/IP-Adapter
|
| 18 |
+
pipeline_tag: text-to-image
|
| 19 |
+
---
|
| 20 |
+
|
| 21 |
+
# TokForge: SD1.5 IP-Adapter (Reference Identity) bundle
|
| 22 |
+
|
| 23 |
+
The **reference-identity** image route for the [TokForge](https://tokforge.ai) Android app.
|
| 24 |
+
Attach a photo of a person, then render **that person** in any scene
|
| 25 |
+
(*"me as a superhero flying over New York"*). The **plus-face** IP-Adapter transfers
|
| 26 |
+
the **face only** while the **prompt drives the whole scene**.
|
| 27 |
+
|
| 28 |
+
This bundle runs on the on-device [`stable-diffusion.cpp`](https://github.com/leejet/stable-diffusion.cpp)
|
| 29 |
+
GGUF engine (TokForge's IP-Adapter port) on **CPU** and **Adreno OpenCL**. SD1.5 is
|
| 30 |
+
light enough for any 8 GB+ phone, making it the broadest-reach identity tier (lighter than the
|
| 31 |
+
SDXL PhotoMaker tier).
|
| 32 |
+
|
| 33 |
+
## Files
|
| 34 |
+
|
| 35 |
+
| File | Size | License | Contents |
|
| 36 |
+
|------|------|---------|----------|
|
| 37 |
+
| `sd15-base-f16.gguf` | ~2.2 GB | CreativeML-OpenRAIL-M | **DreamShaper-7** (SD1.5 realistic finetune): CLIP text encoder + UNet + VAE in one **f16** GGUF |
|
| 38 |
+
| `ip-adapter-plus-face_sd15.safetensors` | ~98 MB | Apache-2.0 | IP-Adapter **plus-face** (`h94/IP-Adapter`): 16-token Resampler + decoupled cross-attn |
|
| 39 |
+
| `ip_adapter_clip_vision_vith.safetensors` | ~2.5 GB | MIT | OpenCLIP **ViT-H-14** image encoder (the plus-face path needs ViT-H, not bigG) |
|
| 40 |
+
|
| 41 |
+
`manifest.json` and `MD5SUMS` carry the integrity hashes + render defaults.
|
| 42 |
+
|
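A downloaded bundle can be checked against `MD5SUMS` before first use. The sketch below assumes `MD5SUMS` follows the standard `md5sum` layout (`<hash>  <filename>`, paths relative to the bundle directory), which this README does not confirm; `verify_bundle` is a hypothetical helper name, not part of TokForge.

```bash
#!/usr/bin/env bash
# Verify every file listed in MD5SUMS inside a bundle directory.
# Assumption: MD5SUMS is in the standard `md5sum` output format.
verify_bundle() {
  local dir="$1"
  if [ ! -f "$dir/MD5SUMS" ]; then
    echo "no MD5SUMS in $dir" >&2
    return 2
  fi
  # Run in a subshell so the caller's working directory is untouched.
  # md5sum -c exits non-zero if any listed file is missing or mismatched.
  ( cd "$dir" && md5sum -c MD5SUMS )
}

# Usage: verify_bundle /path/to/bundle && echo "bundle OK"
```

MD5 only guards against truncated or corrupted downloads; it is not a defense against deliberate tampering.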
| 43 |
+
### Why this base, and why f16 (not Q4)
|
| 44 |
+
|
| 45 |
+
The base is the **standard, non-LCM DreamShaper-7**, the same realistic SD1.5 finetune
|
| 46 |
+
TokForge ships on its other image tiers. It is converted at **f16** (16-bit half precision, unquantized) so
|
| 47 |
+
the IP-Adapter's decoupled cross-attention and the face Resampler keep **subject quality**
|
| 48 |
+
high. A `q4_0`/emaonly base measurably weakens the transferred identity, so this bundle
|
| 49 |
+
deliberately uses f16.
|
| 50 |
+
|
| 51 |
+
### Why plus-face (not the base adapter)
|
| 52 |
+
|
| 53 |
+
The **base** `ip-adapter_sd15` projects the whole pooled CLIP embedding (4 tokens), so it
|
| 54 |
+
drags the reference's *entire scene* through (a car selfie came out *"the person in his car"*).
|
| 55 |
+
The **plus-face** Resampler extracts the **face only** (16 tokens from the ViT-H penultimate
|
| 56 |
+
hidden state), so identity is preserved while the **prompt** controls the scene. The TokForge
|
| 57 |
+
sd.cpp IP-Adapter loader auto-detects plus-face by the presence of `image_proj.latents`.
|
| 58 |
+
|
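To see which kind of adapter a file is, the tensor names in its header can be inspected directly. The sketch below relies on the safetensors layout (an 8-byte little-endian header length followed by a JSON header) and assumes a little-endian host with GNU `od`; `has_tensor` is a hypothetical helper and is not the TokForge loader's actual detection code.

```bash
#!/usr/bin/env bash
# Return 0 if a .safetensors file's JSON header lists the given tensor name.
# Layout relied on: bytes 1-8 = header length (u64, little-endian), then the JSON header.
# Assumption: little-endian host, since od reads the length in native byte order.
has_tensor() {
  local file="$1" name="$2" header_len
  header_len="$(od -An -t u8 -N 8 "$file" | tr -d '[:space:]')"
  [ -n "$header_len" ] || return 2
  # Skip the 8 length bytes, keep only the header, and look for the quoted key.
  tail -c +9 "$file" | head -c "$header_len" | grep -F -- "\"$name\"" >/dev/null
}

# Usage:
#   has_tensor ip-adapter-plus-face_sd15.safetensors image_proj.latents \
#     && echo "plus-face style adapter"
```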
| 59 |
+
## How TokForge uses it
|
| 60 |
+
|
| 61 |
+
In the app: **Image** model picker → download **"SD1.5 IP-Adapter (Reference Identity)"** →
|
| 62 |
+
attach a face photo as a reference under chat → prompt the scene. The engine is invoked as:
|
| 63 |
+
|
| 64 |
+
```bash
|
| 65 |
+
sd -M img_gen \
|
| 66 |
+
-m sd15-base-f16.gguf \
|
| 67 |
+
-p "as a superhero flying over New York" \
|
| 68 |
+
-n "<strong negative>" \
|
| 69 |
+
--clip_vision ip_adapter_clip_vision_vith.safetensors \
|
| 70 |
+
--ip-adapter ip-adapter-plus-face_sd15.safetensors \
|
| 71 |
+
--ip-adapter-image <your_face.jpg> \
|
| 72 |
+
--ip-adapter-scale 0.6 \
|
| 73 |
+
--cfg-scale 7.0 --sampling-method euler_a --scheduler discrete \
|
| 74 |
+
--steps 30 -H 512 -W 512
|
| 75 |
+
```
|
| 76 |
+
|
| 77 |
+
### Recommended render settings
|
| 78 |
+
|
| 79 |
+
| Setting | Value |
|
| 80 |
+
|---------|-------|
|
| 81 |
+
| sampler | `euler_a` |
|
| 82 |
+
| scheduler | `discrete` |
|
| 83 |
+
| steps | `30` (full quality; fewer = faster) |
|
| 84 |
+
| cfg-scale | `7.0` |
|
| 85 |
+
| ip-adapter-scale | `0.6` (≈0.5–0.6 keeps the scene with recognizable identity; ~0.8 reconstructs the reference) |
|
| 86 |
+
| resolution | `512×512` (SD1.5 native) |
|
| 87 |
+
|
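The recommended settings can be captured in a small wrapper so that only the photo, prompts, and identity strength vary per render. This is a minimal sketch: `build_sd_cmd` is a hypothetical helper that only prints the command, it uses exactly the flags shown in the invocation above, and the 0.8 warning threshold is taken from the table rather than from any engine behavior.

```bash
#!/usr/bin/env bash
# Print the documented `sd` invocation with the recommended defaults filled in.
# build_sd_cmd is a hypothetical helper, not part of TokForge or stable-diffusion.cpp.
build_sd_cmd() {
  local face="$1" prompt="$2" negative="$3" scale="${4:-0.6}"

  # Per the settings table, ~0.8 starts reconstructing the reference photo.
  if awk -v s="$scale" 'BEGIN { exit !(s >= 0.8) }'; then
    echo "warning: ip-adapter-scale $scale may override the prompt's scene" >&2
  fi

  local args=(
    sd -M img_gen
    -m sd15-base-f16.gguf
    -p "$prompt"
    -n "$negative"
    --clip_vision ip_adapter_clip_vision_vith.safetensors
    --ip-adapter ip-adapter-plus-face_sd15.safetensors
    --ip-adapter-image "$face"
    --ip-adapter-scale "$scale"
    --cfg-scale 7.0 --sampling-method euler_a --scheduler discrete
    --steps 30 -H 512 -W 512
  )
  # %q shell-quotes each argument so the printed line can be pasted back into bash.
  printf '%q ' "${args[@]}"
  printf '\n'
}

# Usage: build_sd_cmd my_face.jpg "as a superhero flying over New York" "blurry, deformed"
```

Printing instead of executing keeps the helper safe to try on a machine without the engine installed; pipe the output to `bash` to actually render.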
| 88 |
+
## Licenses
|
| 89 |
+
|
| 90 |
+
This is an aggregate of three independently-licensed components; each retains its own license:
|
| 91 |
+
|
| 92 |
+
- **DreamShaper-7 base** (`sd15-base-f16.gguf`): **CreativeML-OpenRAIL-M** ([Lykon/dreamshaper-7](https://huggingface.co/Lykon/dreamshaper-7)). Use must comply with the OpenRAIL-M use-based restrictions.
|
| 93 |
+
- **IP-Adapter plus-face** (`ip-adapter-plus-face_sd15.safetensors`): **Apache-2.0** ([h94/IP-Adapter](https://huggingface.co/h94/IP-Adapter)).
|
| 94 |
+
- **OpenCLIP ViT-H-14 image encoder** (`ip_adapter_clip_vision_vith.safetensors`): **MIT** (OpenCLIP / LAION ViT-H-14).
|
| 95 |
+
|
| 96 |
+
> The non-commercial **IP-Adapter-FaceID** / InsightFace path is **NOT** used here; only the
|
| 97 |
+
> Apache-2.0 base + plus-face adapters from `h94/IP-Adapter`.
|
| 98 |
+
|
| 99 |
+
## Provenance
|
| 100 |
+
|
| 101 |
+
- Base converted from `Lykon/dreamshaper-7` (diffusers) to a single f16 GGUF via the TokForge
|
| 102 |
+
`stable-diffusion.cpp` convert path (`-M convert --type f16`).
|
| 103 |
+
- Adapter + image encoder copied verbatim from `h94/IP-Adapter` (`models/ip-adapter-plus-face_sd15.safetensors`,
|
| 104 |
+
`models/image_encoder/model.safetensors`).
|