GLM-5.2 NVFP4

Public NVFP4 quantization of zai-org/GLM-5.2.

Source checkpoint: zai-org/GLM-5.2 revision e32aaf0396e6987ee6dd2abb7f4d318b5f9b3cfe.
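
To reproduce against exactly this source revision, the download can be pinned to the commit hash above. The following is a minimal sketch that assumes the `huggingface_hub` package is installed; the helper name `fetch_source_checkpoint` is hypothetical and not part of this repository.

```python
# Repo id and revision are copied from this card.
SOURCE_REPO = "zai-org/GLM-5.2"
SOURCE_REVISION = "e32aaf0396e6987ee6dd2abb7f4d318b5f9b3cfe"


def fetch_source_checkpoint(local_dir=None):
    """Download the pinned source checkpoint and return its local path.

    Hypothetical helper for illustration; requires `huggingface_hub`.
    """
    # Imported lazily so the constants above can be used without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=SOURCE_REPO,
        revision=SOURCE_REVISION,  # pin to the exact commit, not a branch
        local_dir=local_dir,
    )
```

Pinning to a commit hash rather than a branch name means the files cannot change underneath a later re-download.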

Quantization artifact: GLM-5.2-NVFP4-agentic-v2-b300-lukescope-max-8k-512-r1.

Quantization scope: ONLY the routed MoE experts are quantized to NVFP4 (experts-only). Attention (MLA q/kv/o), shared experts, dense MLP (layers 0-2), DSA indexer, router gates, lm_head, and the MTP head all remain in BF16.
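
The experts-only scope can be illustrated as a filter over parameter names. This is a rough sketch, not the tooling used to produce this artifact: the tensor names and the regular expression below are assumptions made for illustration, and the real names in the checkpoint may differ.

```python
import re

# ASSUMED naming scheme: routed experts live under ".mlp.experts.<idx>.".
# The actual tensor names in the checkpoint may differ.
ROUTED_EXPERT = re.compile(
    r"\.mlp\.experts\.\d+\.(gate_proj|up_proj|down_proj)\.weight$"
)


def is_nvfp4(name: str) -> bool:
    """Return True if a parameter falls in the quantized (routed-expert) scope."""
    return ROUTED_EXPERT.search(name) is not None


# Hypothetical parameter names covering the categories listed above.
examples = [
    "model.layers.5.mlp.experts.17.up_proj.weight",      # routed expert
    "model.layers.5.mlp.shared_experts.up_proj.weight",  # shared expert
    "model.layers.0.mlp.up_proj.weight",                 # dense MLP
    "model.layers.5.mlp.gate.weight",                    # router gate
    "model.layers.5.self_attn.o_proj.weight",            # attention
    "lm_head.weight",                                    # output head
]

for name in examples:
    print(f"{name}: {'NVFP4' if is_nvfp4(name) else 'BF16'}")
```

Under these assumed names, only the first entry is classified as NVFP4; everything else stays in BF16, matching the scope described above.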

Downloads last month: 2,810

Safetensors: model size 381B params; tensor types F32, BF16, F8_E4M3.


Model tree for harmya-modal/glm-5.2-nvfp4-experts-only: base model zai-org/GLM-5.2; quantized (74), including this model.