rdtand
PrismaQuant final: visual NVFP4 (108/110 DP-placed), lm_head BF16 (vLLM ParallelLMHead runtime limit), pos_embed excluded
2f54276 verified