Qwen3.6-27B-NVFP4 (NVFP4A16)
Overview
PNVFP4 version of Qwen/Qwen3.6-27B.
- No calibration dataset
- One-shot quantization
- Uses llmcompressor
Quantization
- Scheme: NVFP4
- Weights: 4bit
- Activations: 16bit
- Targets: Linear layers only
Ignored:
- lm_head
- visual
- linear_attn
- mtp
KLD Results:
- Mean KLD: 0.062114
Usage
vllm serve TheHouseOfTheDude/Qwen3.6-27B-NVFP4 \
--quantization compressed-tensors
Notes
- Requires vLLM
- Not for vanilla Transformers
- Downloads last month
- 65
Model tree for TheHouseOfTheDude/Qwen3.6-27B-NVFP4
Base model
Qwen/Qwen3.6-27B