Qwen3.6-27B-NVFP4 (NVFP4A16)

Overview

PNVFP4 version of Qwen/Qwen3.6-27B.

  • No calibration dataset
  • One-shot quantization
  • Uses llmcompressor

Quantization

  • Scheme: NVFP4
  • Weights: 4bit
  • Activations: 16bit
  • Targets: Linear layers only

Ignored:

  • lm_head
  • visual
  • linear_attn
  • mtp

KLD Results:

  • Mean KLD: 0.062114

Usage

vllm serve TheHouseOfTheDude/Qwen3.6-27B-NVFP4 \
  --quantization compressed-tensors

Notes

  • Requires vLLM
  • Not for vanilla Transformers
Downloads last month
65
Safetensors
Model size
19B params
Tensor type
F32
BF16
F8_E4M3
U8
Inference Providers NEW
This model isn't deployed by any Inference Provider. 馃檵 Ask for provider support

Model tree for TheHouseOfTheDude/Qwen3.6-27B-NVFP4

Base model

Qwen/Qwen3.6-27B
Quantized
(487)
this model