---
base_model:
  - z-lab/Qwen3.6-27B-DFlash
base_model_relation: quantized
quantized_by: Alittlehammmer
license: apache-2.0
license_link: https://huggingface.co/z-lab/Qwen3.6-27B-DFlash/blob/main/LICENSE
pipeline_tag: text-generation
library_name: gguf
tags:
  - gguf
  - quantized
  - llama.cpp
  - text-generation
  - agentic-coding
  - dflash
---

# Qwen3.6-27B-DFlash

GGUF quantizations of [z-lab/Qwen3.6-27B-DFlash](https://huggingface.co/z-lab/Qwen3.6-27B-DFlash).

Converted to BF16 using `convert_hf_to_gguf.py`, then quantized using `llama-quantize` from [llama.cpp](https://github.com/ggml-org/llama.cpp).
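
The two steps above can be sketched roughly as follows. This is an illustrative reconstruction, not the exact commands used for these files: the `HF_DIR` and `OUT_DIR` paths, the output file names, and the `run` dry-run helper are assumptions made for the example. By default the script only prints the commands; set `DRY_RUN=0` to execute them from a llama.cpp checkout with `llama-quantize` on your `PATH`.

```bash
# Sketch of the convert-then-quantize workflow (paths and names are assumptions).
set -eu

HF_DIR="${HF_DIR:-./Qwen3.6-27B-DFlash}"   # local copy of the original model (assumed path)
OUT_DIR="${OUT_DIR:-./gguf}"               # where the GGUF files are written (assumed path)
NAME="Qwen3.6-27B-DFlash"                  # file name prefix (assumed)
DRY_RUN="${DRY_RUN:-1}"                    # 1 = only print commands, 0 = run them

# Hypothetical helper: print a command, then execute it unless DRY_RUN=1.
run() {
  printf '+ %s\n' "$*"
  if [ "$DRY_RUN" != "1" ]; then "$@"; fi
}

# Step 1: convert the Hugging Face checkpoint to a BF16 GGUF file.
run python convert_hf_to_gguf.py "$HF_DIR" --outtype bf16 --outfile "$OUT_DIR/$NAME-BF16.gguf"

# Step 2: quantize the BF16 file to each type listed in the table below.
for q in Q4_K_M Q5_K Q6_K Q8_0; do
  run llama-quantize "$OUT_DIR/$NAME-BF16.gguf" "$OUT_DIR/$NAME-$q.gguf" "$q"
done
```

Quantizing from the BF16 file, rather than from an already quantized one, avoids compounding quantization error.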

## Available quants

| Quant  | Bits | Size     | Notes                                        |
| ------ | ---- | -------- | -------------------------------------------- |
| Q4_K_M | 4    | ~1.03 GB | Average quality                              |
| Q5_K   | 5    | ~1.22 GB | High quality                                 |
| Q6_K   | 6    | ~1.43 GB | Very high quality                            |
| Q8_0   | 8    | ~1.84 GB | Highest quality, near lossless (recommended) |
| BF16   | 16   | ~3.47 GB | Full precision, reference file               |

## Usage

Use these files as the draft model alongside an existing Qwen3.6-27B quant, which serves as the main model. Example config entry for `llama-server`:

```ini
[Qwen3.6-27B-Q8_0-DFlash]
sm = layer
model = /mnt/gguf/Qwen3.6-27B/Qwen3.6-27B-Q8_0.gguf
model-draft = /mnt/gguf/Qwen3.6-27B/Qwen3.6-27B-DFlash-Q8_0.gguf
spec-type = draft-dflash
spec-draft-n-max = 6
```

Note: I could not get `sm = tensor` to work; it crashes on launch, which is most likely an issue in llama.cpp. The example above uses `sm = layer` for that reason.


## Original model

See the [original model card](https://huggingface.co/z-lab/Qwen3.6-27B-DFlash)
for details on capabilities, benchmarks, and licensing.