Qwen3.5-122B-A10B-PrismaQuant-4.75bit-vllm / model-00001-of-00016.safetensors

Commit History

PrismaQuant: GPTQ + closed-form scale_sweep polish (replaces act_round). Measured ~67% per-Linear output-MSE reduction vs prior pipeline in bake-off.
f2011c4
verified

rdtand committed on

PrismaQuant final: visual NVFP4 (108/110 DP-placed), lm_head BF16 (vLLM ParallelLMHead runtime limit), pos_embed excluded
2f54276
verified

rdtand committed on

PrismaQuant Phase 2 incremental: body AWQ-safe GPTQ+act-round, visual Fisher-allocated (108/111 Linears via per-Linear DP)
bd4f6c5
verified

rdtand committed on