Josiefied-Qwen3.5-0.8B-gabliterated-v1-GGUF

This repository contains GGUF quants for Goekdeniz-Guelmez/Josiefied-Qwen3.5-0.8B-gabliterated-v1.

Available Files

| Filename | Quant Type | Size | Description |
|---|---|---|---|
| josiefied-qwen-0.8b.f16.gguf | F16 | 1.6 GB | Original high-precision weights. |
| josiefied-qwen-0.8b.Q8_0.gguf | Q8_0 | 0.9 GB | Near-lossless. Recommended for most users. |
| josiefied-qwen-0.8b.Q6_K.gguf | Q6_K | 0.65 GB | Great balance of size and quality. |
| josiefied-qwen-0.8b.Q4_K_M.gguf | Q4_K_M | 0.48 GB | Smallest viable version for extreme RAM constraints. |
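
As a rough aid for choosing a file, the sketch below picks the largest quant that fits a given memory budget. The filenames and sizes are copied from the list above; the function name `pick_quant` and the 0.5 GB headroom for context and runtime overhead are illustrative assumptions, not measured values, and actual memory use depends on context length and backend.

```python
# File sizes in GB, copied from the file list above, ordered largest to smallest.
QUANTS = [
    ("josiefied-qwen-0.8b.f16.gguf", "F16", 1.6),
    ("josiefied-qwen-0.8b.Q8_0.gguf", "Q8_0", 0.9),
    ("josiefied-qwen-0.8b.Q6_K.gguf", "Q6_K", 0.65),
    ("josiefied-qwen-0.8b.Q4_K_M.gguf", "Q4_K_M", 0.48),
]


def pick_quant(budget_gb, headroom_gb=0.5):
    """Return the largest file whose size plus headroom fits the budget.

    headroom_gb is an assumed allowance for KV cache and runtime overhead;
    tune it for your own context length. Returns None if nothing fits.
    """
    for filename, _quant, size_gb in QUANTS:
        if size_gb + headroom_gb <= budget_gb:
            return filename
    return None


print(pick_quant(1.5))  # prints josiefied-qwen-0.8b.Q8_0.gguf
```

With a 1.5 GB budget the F16 file (1.6 GB) is skipped and the Q8_0 file is chosen; with less than about 1 GB nothing fits under this headroom assumption and the function returns `None`.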

Quickstart (llama.cpp)

To run the Q8 version in your terminal:

./llama-cli -m josiefied-qwen-0.8b.Q8_0.gguf -p "<|im_start|>system\nYou are Josie, a helpful assistant.<|im_end|>\n<|im_start|>user\nTell me a story.<|im_end|>\n<|im_start|>assistant\n" -n 512
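
The prompt in the command above follows the ChatML layout (`<|im_start|>role ... <|im_end|>`). If you script the call, a small helper can assemble that string and the argument list for you. This is a minimal sketch: `build_chatml_prompt` and `llama_cli_args` are hypothetical helper names, and it assumes `llama-cli` sits in the current directory as in the command above. Because Python inserts real newlines, the prompt does not rely on the CLI interpreting `\n` escapes.

```python
def build_chatml_prompt(system, user):
    """Assemble a single-turn ChatML prompt ending at the assistant header."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def llama_cli_args(model_path, prompt, n_predict=512):
    """Build an argv list using the same flags as the quickstart command."""
    return ["./llama-cli", "-m", model_path, "-p", prompt, "-n", str(n_predict)]


prompt = build_chatml_prompt("You are Josie, a helpful assistant.", "Tell me a story.")
args = llama_cli_args("josiefied-qwen-0.8b.Q8_0.gguf", prompt)
print(args)

# To actually launch the model (requires llama-cli and the GGUF file locally):
# import subprocess
# subprocess.run(args, check=True)
```

Passing the arguments as a list avoids shell quoting problems with the `<|` and `|>` tokens.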

Model Details

Format: GGUF
Model size: 0.8B params
Architecture: qwen35