---
base_model: Qwen/Qwen3.5-397B-A17B
tags:
- quantized
- nvfp4
- compressed-tensors
- llm-compressor
- moe
- qwen3.5
quantized_by: Sehyo
---

# Qwen3.5-397B-A17B-NVFP4

This is a quantized version of [Qwen/Qwen3.5-397B-A17B](https://huggingface.co/Qwen/Qwen3.5-397B-A17B) using the **NVFP4** quantization scheme.

## Important

This model requires the following vLLM PR to work: https://github.com/vllm-project/vllm/pull/34723

At the time of writing, the PR is not yet included in the nightly build, so you might need to build vLLM from source. Alternatively, patch the latest nightly image yourself to include that PR.

## Note: Reuploaded weights with some issues fixed (20/02/2026)

## Creation

This model was created using [vLLM's LLM Compressor](https://github.com/vllm-project/llm-compressor) with Qwen3.5 MoE support added via [PR #2383](https://github.com/vllm-project/llm-compressor/pull/2383). The PR adds a custom `CalibrationQwen3MoeSparseMoeBlock` that routes calibration data to all experts during quantization, so that every expert receives calibration data for accurate NVFP4 quantization.
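
To illustrate why routing calibration data to all experts matters, here is a minimal toy sketch in plain Python. It is not the code from the PR: the function names, the router scores, and the sizes are all hypothetical, and it only counts how many calibration tokens each expert would observe. With ordinary top-k routing, an expert that the router rarely selects may see no calibration tokens at all, while all-expert routing shows every token to every expert.

```python
import random


def top_k_route(scores, k):
    """Return the indices of the k highest-scoring experts for one token."""
    ranked = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return ranked[:k]


def count_calibration_tokens(router_scores, k, calibrate_all_experts):
    """Count how many calibration tokens each expert observes."""
    num_experts = len(router_scores[0])
    seen = [0] * num_experts
    for scores in router_scores:
        if calibrate_all_experts:
            # Calibration mode: every expert sees every token.
            chosen = range(num_experts)
        else:
            # Normal sparse mode: only the top-k experts see the token.
            chosen = top_k_route(scores, k)
        for expert in chosen:
            seen[expert] += 1
    return seen


# Hypothetical sizes, far smaller than a real MoE model.
NUM_TOKENS, NUM_EXPERTS, TOP_K = 16, 8, 2

# Hypothetical router scores: the last expert always scores lowest,
# so top-k routing never selects it.
rng = random.Random(0)
router_scores = [
    [rng.random() for _ in range(NUM_EXPERTS - 1)] + [-1.0]
    for _ in range(NUM_TOKENS)
]

sparse_counts = count_calibration_tokens(router_scores, TOP_K, False)
dense_counts = count_calibration_tokens(router_scores, TOP_K, True)

print("top-k routing:     ", sparse_counts)
print("all-expert routing:", dense_counts)
```

In this toy setup the last expert receives zero tokens under top-k routing, so a quantizer would have no activation statistics for it. Under all-expert routing each expert observes all 16 tokens.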