Qwen3.6 35B A3B MTP GGUF - Tater NoThink

This is a Tater-curated NoThink variant of unsloth/Qwen3.6-35B-A3B-MTP-GGUF.

Included file:

  • Qwen3.6-35B-A3B-UD-Q4_K_M.gguf

What Changed

The embedded GGUF chat template was patched so generation starts in the no-thinking path by default. The tensor weights are unchanged from the upstream UD-Q4_K_M GGUF.

Tater Notes

  • Recommended provider: llama.cpp
  • Recommended quant: UD-Q4_K_M
  • MTP is built into this upstream model family, so no separate draft model is required.
  • In Tater, enable MTP and leave Draft Model set to No draft model.
  • A draft token count of 2 is a good starting point for this model.

Source

Base model: https://huggingface.co/unsloth/Qwen3.6-35B-A3B-MTP-GGUF

Downloads last month
400
GGUF
Model size
36B params
Architecture
qwen35moe
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for TaterTotterson/Qwen3.6-35B-A3B-MTP-GGUF-Tater-NoThink

Quantized
(5)
this model