---
license: mit
base_model:
- Qwen/Qwen3.5-27B
---

Optimized llama.cpp GGUF quants of Qwen 3.5 27B, made for 16 GB VRAM cards.

Made using [GGUF-Tool-Suite](https://github.com/Thireus/GGUF-Tool-Suite/) by Thireus, tuned by me :) (Gammaception)

iq3_xs target setup: headless, 80k context with KV cache at (q8_0, q5_0), `-ub 256`, q8_0 mmproj.

| Metric | unsloth ud-iq3_xxs | bartowski iq3_xs | GC-Thireus iq3_xs | mungert iq3_xs | bartowski iq3_m | GC-Thireus iq4_xs | unsloth ud-q3_k_xl | bartowski iq4_xs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Size (GiB) | 10.716 | 11.357 | 11.784 | 12.078 | 12.240 | 13.390 | 13.447 | 14.130 |
| PPL mean | 7.351227 | 7.274191 | 6.979385 | 7.144770 | 7.165072 | 6.936656 | 7.095631 | 6.932305 |
| KLD mean | 0.080498 | 0.066157 | 0.045386 | 0.053660 | 0.057634 | 0.025940 | 0.039891 | 0.019588 |
| Same top p (%) | 88.483 | 89.878 | 91.686 | 90.655 | 90.798 | 94.143 | 92.694 | 94.592 |

KLD reference logits are from unsloth ud-q6_k_xl (the best I can fit), measured on wikitext-2 at 512 ctx with bf16 KV cache.

![ppl_graph](charts2/pplMean.png "Chart showing PPL analysis of quants")

![kld_graph](charts2/kldMean.png "Chart showing KLD analysis of quants")

![tpp_graph](charts2/sameTopP.png "Chart showing Top P analysis of quants")
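
For anyone wondering what the KLD mean and Same top p numbers measure, here is a rough Python sketch of the idea: per token, compare the reference model's logits with the quant's logits, average the KL divergence, and count how often both pick the same top token. This is only an illustration under my own assumptions, not the code llama.cpp runs; the function names and the toy logits are made up for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one token's logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def kld_and_same_top(ref_logits, quant_logits):
    """Return (mean KL(ref || quant), % of positions with the same top token).

    Both arguments are lists of per-position logit lists (hypothetical layout).
    """
    total_kld = 0.0
    same_top = 0
    for ref, quant in zip(ref_logits, quant_logits):
        lp = log_softmax(ref)    # reference log-probabilities
        lq = log_softmax(quant)  # quantized-model log-probabilities
        # KL(P || Q) = sum_i P_i * (log P_i - log Q_i)
        total_kld += sum(math.exp(a) * (a - b) for a, b in zip(lp, lq))
        # Same top token: argmax of both logit vectors agrees
        ref_top = max(range(len(ref)), key=ref.__getitem__)
        quant_top = max(range(len(quant)), key=quant.__getitem__)
        if ref_top == quant_top:
            same_top += 1
    n = len(ref_logits)
    return total_kld / n, 100.0 * same_top / n


# Toy data: two positions, vocabulary of three tokens (invented values)
ref = [[2.0, 1.0, 0.0], [0.0, 3.0, 1.0]]
quant = [[2.0, 1.0, 0.0], [3.0, 0.0, 1.0]]
mean_kld, same_top_pct = kld_and_same_top(ref, quant)
print(f"KLD mean: {mean_kld:.6f}, same top: {same_top_pct:.1f}%")
```

Lower KLD mean and higher Same top p both mean the quant's output distribution stays closer to the reference.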