hotdogs/qwen3.6-35b-opus-to-kimi-lora

Text Generation

hotdogs committed on May 2

Commit e898ae1 · verified · 1 Parent(s): 0e96fd5

Upload README.md with huggingface_hub

Files changed (1)

README.md +34 -0

@@ -191,6 +191,40 @@ sudo docker run --rm -p 8080:8080 \
 | `-fa on` | Flash Attention enabled |
 | `--mlock` | Lock model in RAM (prevents swap) |
 | `--jinja` | Use Jinja2 chat templates |
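+| `--lora` | Apply a LoRA adapter at the default scale of 1.0 (applied first, before `--lora-scaled` adapters) |
+| `--lora-scaled` | Apply a LoRA adapter with an explicit scale, written as `FNAME:SCALE` (comma-separated for multiple adapters) |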
+---
+### 🛡️ 3-Layer Stack with Refusal Removal LoRA
+For the **purest uncensored stack** using weight-diff extracted LoRAs:
+| Layer | Component | Purpose |
+|-------|-----------|---------|
+| 1 | Opus GGUF (base model) | Qwen3.6-35B + Opus reasoning |
+| 2 | [refusal-removal-lora](https://huggingface.co/hotdogs/qwen3.6-35b-refusal-removal-lora) | 🛡️ Remove refusals (uncensored) |
+| 3 | opus-to-kimi-lora (scale 0.5) | 🎨 Kimi K2.6 verbose style |
+```bash
+docker run --gpus all -p 8080:8080 \
+  -v /path/to/models:/models \
+  ghcr.io/ggml-org/llama.cpp:server-cuda \
+  -m /models/lordx64_Qwen3.6-35B-A3B-Claude-4.7-Opus-Q6_K.gguf \
+  --lora /models/qwen3.6-35b-refusal-removal-lora.gguf \
+  --lora-scaled /models/qwen3.6-35b-opus-to-kimi-lora.gguf:0.5 \
+  --host 0.0.0.0 --port 8080 \
+  --n-gpu-layers 999 \
+  --ctx-size 131072 \
+  --batch-size 4096 \
+  -fa on
+```
+> 🔬 **Technical note**: The refusal-removal LoRA was extracted via weight-diff SVD: the weights of `huihui-ai/Huihui-Qwen3.6-35B-A3B-Claude-4.7-Opus-abliterated` minus those of `lordx64/...Opus`, factored into low-rank form. It modifies **only `o_proj`** in 10 layers (3, 7, 11, 15, 19, 23, 27, 31, 35, 39), an extremely sparse change compared with full distillation (the Kimi LoRA touches all 44 attention tensors).
+---
+**Old stack (uncensored GGUF base):**
 **Single GPU alternative:**
 ```bash