Duplicate from empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF
Browse filesCo-authored-by: Empero <empero-ai@users.noreply.huggingface.co>
- .gitattributes +47 -0
- Qwythos-9B-Claude-Mythos-5-1M-BF16.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-MTP-BF16.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-MTP-Q5_K_M.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-MTP-Q6_K.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-MTP-Q8_0.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-Q5_K_M.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-Q6_K.gguf +3 -0
- Qwythos-9B-Claude-Mythos-5-1M-Q8_0.gguf +3 -0
- README.md +247 -0
- SHA256SUMS +11 -0
- TEST_REPORT.md +53 -0
- mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf +3 -0
- mmproj-Qwythos-9B-Claude-Mythos-5-1M-f16.gguf +3 -0
.gitattributes
ADDED
|
@@ -0,0 +1,47 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
| 2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
| 3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
| 4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
| 5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
| 6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
| 7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
| 8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
| 9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
| 10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
| 11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
| 12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
| 13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
| 14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
| 15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
| 16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
| 18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
| 19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
| 20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
| 21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
| 22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
| 23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
| 24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
| 25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
| 28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
| 29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
| 30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
| 31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
| 32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
| 33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
Qwythos-9B-Claude-Mythos-5-1M-BF16.gguf filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 38 |
+
Qwythos-9B-Claude-Mythos-5-1M-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 39 |
+
Qwythos-9B-Claude-Mythos-5-1M-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
|
| 40 |
+
Qwythos-9B-Claude-Mythos-5-1M-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
|
| 41 |
+
mmproj-Qwythos-9B-Claude-Mythos-5-1M-f16.gguf filter=lfs diff=lfs merge=lfs -text
|
| 42 |
+
Qwythos-9B-Claude-Mythos-5-1M-MTP-BF16.gguf filter=lfs diff=lfs merge=lfs -text
|
| 43 |
+
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 44 |
+
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
|
| 45 |
+
mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf filter=lfs diff=lfs merge=lfs -text
|
| 46 |
+
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q6_K.gguf filter=lfs diff=lfs merge=lfs -text
|
| 47 |
+
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
|
Qwythos-9B-Claude-Mythos-5-1M-BF16.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c49e2e419c0fc9f7cca0a7e0699fa3075fa1ec2913bda6f6841194b19cf4ce29
|
| 3 |
+
size 17920697344
|
Qwythos-9B-Claude-Mythos-5-1M-MTP-BF16.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c704996e63c3e31a1e7040278b1204aabf8a6dad0b07f3c8da7b429d2ed30b31
|
| 3 |
+
size 18407321536
|
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:24ee22e0f5d9f0d3d615809607f365c728d9b0c3f3fb6eb19d8bd83a1c2933d8
|
| 3 |
+
size 5887668160
|
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q5_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c4e3c47b17a93cc1566abcee4a0c9a4413b172a6f8be7f2a8824495907e9e893
|
| 3 |
+
size 6726528960
|
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q6_K.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:eae2adb4b451f673218fba96ebdc024fb95c4026ef35a56837dc9af47f99ca39
|
| 3 |
+
size 7617818560
|
Qwythos-9B-Claude-Mythos-5-1M-MTP-Q8_0.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:34eada291b4f4d1b8592f53d1739c7a2e029011b5bc2624b979b0a544562a5e9
|
| 3 |
+
size 9786060736
|
Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:0de41ff56ab4eff26764437b276f0bd5d20f44232add34e14c4f593fa1aeb08f
|
| 3 |
+
size 5629109248
|
Qwythos-9B-Claude-Mythos-5-1M-Q5_K_M.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:51dee6e6ddaeb08a06bee258ac5c58f4186f55d156d6403c2db67cf2f3226280
|
| 3 |
+
size 6467970048
|
Qwythos-9B-Claude-Mythos-5-1M-Q6_K.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:05f2420a669a83ae239d1328c6e83d9fc19e8a4b8dab5d0673cab28cc161dfc5
|
| 3 |
+
size 7359259648
|
Qwythos-9B-Claude-Mythos-5-1M-Q8_0.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:ecf51ef3904694e04a1967a06fe7539619c34e6e81a721fc5f5415827081d8f6
|
| 3 |
+
size 9527501824
|
README.md
ADDED
|
@@ -0,0 +1,247 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
base_model: empero-ai/Qwythos-9B-Claude-Mythos-5-1M
|
| 4 |
+
base_model_relation: quantized
|
| 5 |
+
language:
|
| 6 |
+
- en
|
| 7 |
+
pipeline_tag: text-generation
|
| 8 |
+
library_name: gguf
|
| 9 |
+
tags:
|
| 10 |
+
- gguf
|
| 11 |
+
- llama.cpp
|
| 12 |
+
- quantized
|
| 13 |
+
- qwen3.5
|
| 14 |
+
- reasoning
|
| 15 |
+
- uncensored
|
| 16 |
+
- long-context
|
| 17 |
+
- 1M-context
|
| 18 |
+
- function-calling
|
| 19 |
+
- multimodal
|
| 20 |
+
- vision
|
| 21 |
+
- cybersecurity
|
| 22 |
+
- biomedical
|
| 23 |
+
- agentic
|
| 24 |
+
---
|
| 25 |
+
|
| 26 |
+
<p align="center">
|
| 27 |
+
<img src="https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M/resolve/main/assets/qwythos.png" alt="Qwythos-9B" width="640"/>
|
| 28 |
+
</p>
|
| 29 |
+
|
| 30 |
+
<table>
|
| 31 |
+
<tr>
|
| 32 |
+
<td>
|
| 33 |
+
|
| 34 |
+
## 🚨 v2 released — please redownload the GGUFs
|
| 35 |
+
|
| 36 |
+
The v2 GGUFs replace the original normal filenames and add explicit `-MTP-` variants. If you downloaded this repo before v2, please redownload your GGUF.
|
| 37 |
+
|
| 38 |
+
Fixes in v2:
|
| 39 |
+
|
| 40 |
+
- tokenizer metadata normalized for Qwen3.5 GGUF runtimes;
|
| 41 |
+
- embedded chat template updated for reliable tool/function calling and OpenCode-style agent loops;
|
| 42 |
+
- Qwythos/Empero identity prompt embedded in the template;
|
| 43 |
+
- MTP-enabled variants added as `Qwythos-9B-Claude-Mythos-5-1M-MTP-*.gguf`;
|
| 44 |
+
- Q4/Q8 tool-calling, MTP draft speculation, 1M-context allocation, and vision projector smoke-tested with current llama.cpp.
|
| 45 |
+
|
| 46 |
+
Use the normal files for maximum runtime compatibility. Use the `-MTP-` files when you want llama.cpp MTP draft speculation.
|
| 47 |
+
|
| 48 |
+
</td>
|
| 49 |
+
</tr>
|
| 50 |
+
</table>
|
| 51 |
+
|
| 52 |
+
# Qwythos-9B-Claude-Mythos-5-1M-GGUF
|
| 53 |
+
|
| 54 |
+
**Developed by [Empero](https://empero.org)**
|
| 55 |
+
|
| 56 |
+
GGUF quantizations of **[empero-ai/Qwythos-9B-Claude-Mythos-5-1M](https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M)** for [llama.cpp](https://github.com/ggml-org/llama.cpp), Ollama, LM Studio, jan, KoboldCpp, and other GGUF runtimes.
|
| 57 |
+
|
| 58 |
+
Qwythos-9B is a full-parameter reasoning model post-trained on over 500 million tokens of high-quality Claude Mythos / Claude Fable traces with chain-of-thought generated in-house by Empero AI's internal `rethink` tool. It dominates the base Qwen3.5-9B under matched evaluation (**+34 pts MMLU, +30 pts gsm8k-strict, +19 pts gsm8k-flex**), supports **native function calling** per the Qwen3.5 spec, and ships with a **1,048,576-token (1M) context window** via YaRN rope-scaling enabled by default.
|
| 59 |
+
|
| 60 |
+
For full training details, evaluation numbers, and capability writeup, see the **[base model card](https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M)**.
|
| 61 |
+
|
| 62 |
+
---
|
| 63 |
+
|
| 64 |
+
## Files
|
| 65 |
+
|
| 66 |
+
### Normal text weights — fixed v2 replacements
|
| 67 |
+
|
| 68 |
+
| File | Quant | Size | Notes |
|
| 69 |
+
|---|---|---|---|
|
| 70 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf` | Q4_K_M | 5.24 GiB / 5.63 GB | **recommended default** — fixed v2, best compatibility |
|
| 71 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q5_K_M.gguf` | Q5_K_M | 6.02 GiB / 6.47 GB | fixed v2, balanced quality / size |
|
| 72 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q6_K.gguf` | Q6_K | 6.85 GiB / 7.36 GB | fixed v2, high quality |
|
| 73 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q8_0.gguf` | Q8_0 | 8.87 GiB / 9.53 GB | fixed v2, near-lossless |
|
| 74 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-BF16.gguf` | BF16 | 16.69 GiB / 17.92 GB | fixed v2, full precision conversion base |
|
| 75 |
+
|
| 76 |
+
If you don't know which to pick, **Q4_K_M is the right starting point** — it's the smallest practical quant with good quality preservation.
|
| 77 |
+
|
| 78 |
+
### MTP-enabled text weights — v2 variants
|
| 79 |
+
|
| 80 |
+
These include the restored Qwen3.5-compatible MTP head inside the GGUF. Use them with llama.cpp builds that support MTP draft speculation, for example `--spec-type draft-mtp`.
|
| 81 |
+
|
| 82 |
+
| File | Quant | Size | Notes |
|
| 83 |
+
|---|---|---|---|
|
| 84 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf` | Q4_K_M + MTP | 5.48 GiB / 5.89 GB | **recommended MTP default** |
|
| 85 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q5_K_M.gguf` | Q5_K_M + MTP | 6.26 GiB / 6.73 GB | MTP, balanced quality / size |
|
| 86 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q6_K.gguf` | Q6_K + MTP | 7.09 GiB / 7.62 GB | MTP, high quality |
|
| 87 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q8_0.gguf` | Q8_0 + MTP | 9.11 GiB / 9.79 GB | MTP, near-lossless |
|
| 88 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-BF16.gguf` | BF16 + MTP | 17.14 GiB / 18.41 GB | MTP, full precision conversion base |
|
| 89 |
+
|
| 90 |
+
### Vision projector — for image input
|
| 91 |
+
|
| 92 |
+
| File | Size | Notes |
|
| 93 |
+
|---|---|---|
|
| 94 |
+
| `mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf` | 0.86 GiB / 0.92 GB | CLIP-style vision encoder + projector; **required for images**, pairs with any normal or MTP quant above |
|
| 95 |
+
|
| 96 |
+
Qwythos inherits its **vision tower from the Qwen3.5-9B base model** — the vision path was *frozen* during SFT (training was text-only), so the vision behavior is identical to base Qwen3.5-9B's multimodal capability. The mmproj is interchangeable with any community-built Qwen3.5-9B `mmproj-*.gguf`.
|
| 97 |
+
|
| 98 |
+
---
|
| 99 |
+
|
| 100 |
+
## Quick start
|
| 101 |
+
|
| 102 |
+
### llama.cpp (`llama-cli`)
|
| 103 |
+
|
| 104 |
+
```bash
|
| 105 |
+
llama-cli \
|
| 106 |
+
-m Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf \
|
| 107 |
+
-p "Walk through the biochemistry of how organophosphate nerve agents inhibit acetylcholinesterase." \
|
| 108 |
+
-n 8192 \
|
| 109 |
+
--temp 0.6 --top-p 0.95 --top-k 20 --repeat-penalty 1.05 \
|
| 110 |
+
-c 16384
|
| 111 |
+
```
|
| 112 |
+
|
| 113 |
+
### Ollama
|
| 114 |
+
|
| 115 |
+
```bash
|
| 116 |
+
ollama run hf.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M-GGUF:Q4_K_M
|
| 117 |
+
```
|
| 118 |
+
|
| 119 |
+
### LM Studio / jan / KoboldCpp
|
| 120 |
+
|
| 121 |
+
Drop any of the `.gguf` files into your runtime's model directory. Qwythos uses the standard Qwen3.5 chat template; modern GGUF runtimes load it automatically from the file.
|
| 122 |
+
|
| 123 |
+
### llama.cpp with MTP draft speculation
|
| 124 |
+
|
| 125 |
+
```bash
|
| 126 |
+
llama-server \
|
| 127 |
+
-m Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf \
|
| 128 |
+
--spec-type draft-mtp \
|
| 129 |
+
--spec-draft-n-max 6 \
|
| 130 |
+
-c 16384 --port 8080
|
| 131 |
+
```
|
| 132 |
+
|
| 133 |
+
MTP support requires a recent llama.cpp build. If your runtime does not support MTP yet, use the normal v2 files above.
|
| 134 |
+
|
| 135 |
+
---
|
| 136 |
+
|
| 137 |
+
## Vision (image input)
|
| 138 |
+
|
| 139 |
+
Qwythos supports **image input** out of the box. Download both a text quant and the `mmproj-*.gguf` file from this repo, then run with llama.cpp's multimodal CLI or server.
|
| 140 |
+
|
| 141 |
+
### llama.cpp (`llama-mtmd-cli`)
|
| 142 |
+
|
| 143 |
+
```bash
|
| 144 |
+
llama-mtmd-cli \
|
| 145 |
+
-m Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf \
|
| 146 |
+
--mmproj mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf \
|
| 147 |
+
--image ./photo.jpg \
|
| 148 |
+
-p "Describe this image in detail." \
|
| 149 |
+
--temp 0.6 --top-p 0.95 --top-k 20 \
|
| 150 |
+
-c 16384
|
| 151 |
+
```
|
| 152 |
+
|
| 153 |
+
### llama.cpp server (OpenAI-compatible API with images)
|
| 154 |
+
|
| 155 |
+
```bash
|
| 156 |
+
llama-server \
|
| 157 |
+
-m Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf \
|
| 158 |
+
--mmproj mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf \
|
| 159 |
+
-c 16384 --port 8080
|
| 160 |
+
```
|
| 161 |
+
|
| 162 |
+
Then POST to `/v1/chat/completions` with an image URL or base64 payload — the standard OpenAI vision API shape works.
|
| 163 |
+
|
| 164 |
+
### LM Studio
|
| 165 |
+
|
| 166 |
+
Load the text quant; LM Studio detects the matching `mmproj-*.gguf` in the same folder and enables the image-attach button automatically.
|
| 167 |
+
|
| 168 |
+
### What vision unlocks
|
| 169 |
+
|
| 170 |
+
Since Qwythos inherits its vision tower unchanged from Qwen3.5-9B base, expect Qwen3.5-9B's documented vision capabilities: detailed image description, OCR (printed + handwritten), chart/table reading, UI/document understanding, basic spatial reasoning.
|
| 171 |
+
|
| 172 |
+
**Honest note:** the SFT used to produce Qwythos was **text-only** — we did not fine-tune the vision tower or train on any image-paired data. Image-grounded reasoning therefore inherits the base model's behavior; it has not been independently evaluated as part of this release. If your application is *primarily* vision-driven, validate on your own use case first.
|
| 173 |
+
|
| 174 |
+
---
|
| 175 |
+
|
| 176 |
+
## Sampling recommendations
|
| 177 |
+
|
| 178 |
+
Qwythos is a reasoning model — every response opens with a `<think>...</think>` block before the final answer. Use these settings as defaults:
|
| 179 |
+
|
| 180 |
+
| Parameter | Value |
|
| 181 |
+
|---|---|
|
| 182 |
+
| `temperature` | 0.6 |
|
| 183 |
+
| `top_p` | 0.95 |
|
| 184 |
+
| `top_k` | 20 |
|
| 185 |
+
| `repeat_penalty` | 1.05 |
|
| 186 |
+
| `max_new_tokens` | 16384 (generous budget for `<think>` + answer) |
|
| 187 |
+
|
| 188 |
+
These match Qwen3.5's official thinking-mode recommendations. **Avoid greedy decoding and very-low-temperature sampling (T ≤ 0.3)** — both can cause repetition loops on long reasoning generations.
|
| 189 |
+
|
| 190 |
+
---
|
| 191 |
+
|
| 192 |
+
## Long context (1M tokens)
|
| 193 |
+
|
| 194 |
+
The GGUFs ship with YaRN rope-scaling baked in for a **1,048,576-token context window** (4× extension over the 262k native).
|
| 195 |
+
|
| 196 |
+
To use the full 1M window in `llama-cli`, set `-c 1010000` (or any context length up to that). For shorter prompts, lower `-c` to reduce KV-cache memory — at default settings llama.cpp will autosize.
|
| 197 |
+
|
| 198 |
+
A single H100/H200-class GPU comfortably handles **256k–512k**; the full 1M typically needs tensor-parallel multi-GPU or aggressive KV-cache offload.
|
| 199 |
+
|
| 200 |
+
---
|
| 201 |
+
|
| 202 |
+
## Capabilities (from the base model card)
|
| 203 |
+
|
| 204 |
+
- **+34 pts MMLU, +30 pts gsm8k-strict, +19 pts gsm8k-flex** vs. base Qwen3.5-9B under matched lm-eval-harness evaluation
|
| 205 |
+
- **Native function calling** per Qwen3.5's chat-template spec — emits `<tool_call><function=NAME><parameter=NAME>VAL</parameter></function></tool_call>` blocks ready for any tool-use loop
|
| 206 |
+
- **Self-correcting with tools**: in a 7-prompt tool-use harness (Python executor + DuckDuckGo search), Qwythos produced source-cited correct answers on 7/7, including 4/4 closed-book failure-modes from the original review
|
| 207 |
+
- **Uncensored** — engages seriously with technically demanding questions across cybersecurity, red-teaming, biology, pharmacology, and clinical medicine
|
| 208 |
+
- **1,048,576-token (1M) context** — YaRN rope-scaling enabled by default
|
| 209 |
+
|
| 210 |
+
For full eval transcripts and per-task numbers, see the [base model card's `evals/` folder](https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M/tree/main/evals).
|
| 211 |
+
|
| 212 |
+
---
|
| 213 |
+
|
| 214 |
+
## Limitations
|
| 215 |
+
|
| 216 |
+
- **Reasoning model.** Every answer opens with a `<think>` block; allow generous `max_new_tokens` and parse/strip `<think>...</think>` for end users.
|
| 217 |
+
- **Use recommended sampling.** Greedy / very-low-temp can cause repetition loops.
|
| 218 |
+
- **Verify specifics in safety-critical contexts.** Like all closed-book LLMs in this weight class, Qwythos can over-commit to specific identifiers (CVEs, hashcat modes, drug positions) it isn't certain about. Pair with retrieval or function calling in such deployments — the model uses tools cleanly when offered them.
|
| 219 |
+
- **Uncensored — add your own application-level review/safety layer** for end-user-facing deployments where that matters.
|
| 220 |
+
|
| 221 |
+
---
|
| 222 |
+
|
| 223 |
+
## Stay in the loop
|
| 224 |
+
|
| 225 |
+
Sign up for the Empero newsletter at **[empero.org](https://empero.org)** for releases, evals, and research notes.
|
| 226 |
+
|
| 227 |
+
## Support / Donate
|
| 228 |
+
|
| 229 |
+
If this model helped you, consider supporting the project:
|
| 230 |
+
|
| 231 |
+
- **BTC**: `bc1qx6zepu6sfkvshgdmc4ewu6pk6rpadvpgffpp7v`
|
| 232 |
+
- **LTC**: `ltc1qv2mefzps2vtjcpwfx8xxdrpplrcvltswm68r7x`
|
| 233 |
+
- **XMR**: `42Dbm5xg5Nq26fdyzfEU7KBnAJfhi7Cvz5J2ex5CzHXkfKuNEJzYCcmJ1GTbgjFZ5MBx72sdG1G9239Cd6rsZfv4QeDkYJY`
|
| 234 |
+
|
| 235 |
+
---
|
| 236 |
+
|
| 237 |
+
## Provenance & licensing
|
| 238 |
+
|
| 239 |
+
Weights are released under **Apache-2.0**, inherited from the Qwen3.5-9B base. Shared for research and experimentation, as-is.
|
| 240 |
+
|
| 241 |
+
## Acknowledgements
|
| 242 |
+
|
| 243 |
+
- Developed and released by [Empero](https://empero.org)
|
| 244 |
+
- Base model: [Qwen3.5-9B](https://huggingface.co/Qwen/Qwen3.5-9B) (Alibaba Qwen team)
|
| 245 |
+
- Quantization: [llama.cpp](https://github.com/ggml-org/llama.cpp) (ggml-org)
|
| 246 |
+
- Vision projector (`mmproj`): inherited from Qwen3.5-9B (vision tower unchanged); F16 GGUF re-hosted with thanks to [Unsloth](https://huggingface.co/unsloth) for the original conversion
|
| 247 |
+
- HF model: [empero-ai/Qwythos-9B-Claude-Mythos-5-1M](https://huggingface.co/empero-ai/Qwythos-9B-Claude-Mythos-5-1M)
|
SHA256SUMS
ADDED
|
@@ -0,0 +1,11 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
c49e2e419c0fc9f7cca0a7e0699fa3075fa1ec2913bda6f6841194b19cf4ce29 ./Qwythos-9B-Claude-Mythos-5-1M-BF16.gguf
|
| 2 |
+
c704996e63c3e31a1e7040278b1204aabf8a6dad0b07f3c8da7b429d2ed30b31 ./Qwythos-9B-Claude-Mythos-5-1M-MTP-BF16.gguf
|
| 3 |
+
24ee22e0f5d9f0d3d615809607f365c728d9b0c3f3fb6eb19d8bd83a1c2933d8 ./Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf
|
| 4 |
+
c4e3c47b17a93cc1566abcee4a0c9a4413b172a6f8be7f2a8824495907e9e893 ./Qwythos-9B-Claude-Mythos-5-1M-MTP-Q5_K_M.gguf
|
| 5 |
+
eae2adb4b451f673218fba96ebdc024fb95c4026ef35a56837dc9af47f99ca39 ./Qwythos-9B-Claude-Mythos-5-1M-MTP-Q6_K.gguf
|
| 6 |
+
34eada291b4f4d1b8592f53d1739c7a2e029011b5bc2624b979b0a544562a5e9 ./Qwythos-9B-Claude-Mythos-5-1M-MTP-Q8_0.gguf
|
| 7 |
+
0de41ff56ab4eff26764437b276f0bd5d20f44232add34e14c4f593fa1aeb08f ./Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf
|
| 8 |
+
51dee6e6ddaeb08a06bee258ac5c58f4186f55d156d6403c2db67cf2f3226280 ./Qwythos-9B-Claude-Mythos-5-1M-Q5_K_M.gguf
|
| 9 |
+
05f2420a669a83ae239d1328c6e83d9fc19e8a4b8dab5d0673cab28cc161dfc5 ./Qwythos-9B-Claude-Mythos-5-1M-Q6_K.gguf
|
| 10 |
+
ecf51ef3904694e04a1967a06fe7539619c34e6e81a721fc5f5415827081d8f6 ./Qwythos-9B-Claude-Mythos-5-1M-Q8_0.gguf
|
| 11 |
+
f977efc337a2ac2ba183eea0c73e25b75fc240d56c05ed4d9b56ab451f64c82c ./mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf
|
TEST_REPORT.md
ADDED
|
@@ -0,0 +1,53 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
# Qwythos GGUF v2 release verification
|
| 2 |
+
|
| 3 |
+
Release verification was performed on 2026-06-22 with llama.cpp `d0f9d2e5ac5d4f51763755958b8f353fed01aaa2` and an NVIDIA RTX 5090.
|
| 4 |
+
|
| 5 |
+
## Artifact manifest
|
| 6 |
+
|
| 7 |
+
| Artifact | Bytes | SHA-256 |
|
| 8 |
+
|---|---:|---|
|
| 9 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-BF16.gguf` | 17,920,697,344 | `c49e2e419c0fc9f7cca0a7e0699fa3075fa1ec2913bda6f6841194b19cf4ce29` |
|
| 10 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q4_K_M.gguf` | 5,629,109,248 | `0de41ff56ab4eff26764437b276f0bd5d20f44232add34e14c4f593fa1aeb08f` |
|
| 11 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q5_K_M.gguf` | 6,467,970,048 | `51dee6e6ddaeb08a06bee258ac5c58f4186f55d156d6403c2db67cf2f3226280` |
|
| 12 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q6_K.gguf` | 7,359,259,648 | `05f2420a669a83ae239d1328c6e83d9fc19e8a4b8dab5d0673cab28cc161dfc5` |
|
| 13 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-Q8_0.gguf` | 9,527,501,824 | `ecf51ef3904694e04a1967a06fe7539619c34e6e81a721fc5f5415827081d8f6` |
|
| 14 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-BF16.gguf` | 18,407,321,536 | `c704996e63c3e31a1e7040278b1204aabf8a6dad0b07f3c8da7b429d2ed30b31` |
|
| 15 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q4_K_M.gguf` | 5,887,668,160 | `24ee22e0f5d9f0d3d615809607f365c728d9b0c3f3fb6eb19d8bd83a1c2933d8` |
|
| 16 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q5_K_M.gguf` | 6,726,528,960 | `c4e3c47b17a93cc1566abcee4a0c9a4413b172a6f8be7f2a8824495907e9e893` |
|
| 17 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q6_K.gguf` | 7,617,818,560 | `eae2adb4b451f673218fba96ebdc024fb95c4026ef35a56837dc9af47f99ca39` |
|
| 18 |
+
| `Qwythos-9B-Claude-Mythos-5-1M-MTP-Q8_0.gguf` | 9,786,060,736 | `34eada291b4f4d1b8592f53d1739c7a2e029011b5bc2624b979b0a544562a5e9` |
|
| 19 |
+
| `mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf` | 918,165,472 | `f977efc337a2ac2ba183eea0c73e25b75fc240d56c05ed4d9b56ab451f64c82c` |
|
| 20 |
+
|
| 21 |
+
`shasum -a 256 -c SHA256SUMS` passed for all 11 GGUF artifacts after conversion and testing.
|
| 22 |
+
|
| 23 |
+
## Structural checks
|
| 24 |
+
|
| 25 |
+
All ten text GGUFs passed these shared assertions:
|
| 26 |
+
|
| 27 |
+
- GGUF v3 and `qwen35` architecture;
|
| 28 |
+
- 1,048,576-token declared context;
|
| 29 |
+
- Qwen3.6-derived chat template markers present;
|
| 30 |
+
- fixed Qwythos/Empero AI identity instruction present.
|
| 31 |
+
|
| 32 |
+
The five normal v2 replacement files additionally verified as 32-block trunk-only GGUFs with no `nextn_predict_layers` key and no MTP tensors.
|
| 33 |
+
|
| 34 |
+
The five `-MTP-` v2 variants additionally verified as 33-block GGUFs with `qwen35.nextn_predict_layers = 1`, all 15 restored MTP tensors present, and MTP matrices retained at Q8_0 in every quantized variant.
|
| 35 |
+
|
| 36 |
+
The MTP tensors come from pinned Qwen3.5-9B base commit `c202236235762e1c871ad0ccb60c8ee5ba337b9a`. The source fine-tune declares one MTP layer but does not publish its MTP tensors. Only the draft head is restored; the Qwythos trunk and output layers remain unchanged.
|
| 37 |
+
|
| 38 |
+
## Live tests
|
| 39 |
+
|
| 40 |
+
| Variant / mode | Identity | Generation | Tools | Other |
|
| 41 |
+
|---|---:|---:|---:|---|
|
| 42 |
+
| BF16 MTP | Pass | Pass | — | — |
|
| 43 |
+
| Q4_K_M MTP | Pass | Pass | Pass | Tool result round trip passed |
|
| 44 |
+
| Q5_K_M MTP | Pass | Pass | — | — |
|
| 45 |
+
| Q6_K MTP | Pass | Pass | — | — |
|
| 46 |
+
| Q8_0 MTP | Pass | Pass | Pass | Tool result round trip passed; no malformed output |
|
| 47 |
+
| Q4_K_M MTP + draft speculation | Pass | Pass | Pass | 76/150 draft tokens accepted (50.7%) |
|
| 48 |
+
| Q4_K_M MTP + 1M context | Pass | Pass | — | Server loaded `n_ctx = 1048576` |
|
| 49 |
+
| Q4_K_M MTP + F16 mmproj | Pass | Pass | — | Correctly identified a generated red-square image |
|
| 50 |
+
|
| 51 |
+
The normal v2 replacement files were rebuilt from the same fixed tokenizer/template source with `--no-mtp` and structurally verified after quantization. The 1M-context check verifies full context allocation and successful generation at that setting; it is not a million-token retrieval benchmark.
|
| 52 |
+
|
| 53 |
+
Machine-readable responses are in `reports/`; relevant llama.cpp logs are in `logs/`.
|
mmproj-Qwythos-9B-Claude-Mythos-5-1M-F16.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f977efc337a2ac2ba183eea0c73e25b75fc240d56c05ed4d9b56ab451f64c82c
|
| 3 |
+
size 918165472
|
mmproj-Qwythos-9B-Claude-Mythos-5-1M-f16.gguf
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:f70dc3509053962b0d0d3ee8a7eacebf5d60aa560cad78254ae8698516ae029f
|
| 3 |
+
size 918166080
|