---
library_name: mlx
tags:
- mlx
- oq
- quantized
---

# HY-MT1.5-1.8B-oQ8-fp16

This model was quantized using [oQ](https://github.com/jundot/omlx) (oMLX v0.3.9.dev2) mixed-precision quantization.

## Quantization details

- **Model type**: hunyuan_v1_dense
- **Bits**: 8
- **Group size**: 64
- **Format**: MLX safetensors

Tested on an M1 Max (32-core GPU) with 64 GB of memory, running macOS 26.5.

Note: fp16 gives ~20% faster prefill on M1/M2 Apple Silicon (native fp16). bfloat16 is safer on M3/M4 and for numerical stability.

| Model | Context | Prompt processing, PP (tok/s) | Token generation, TG (tok/s) |
|-------|---------|-------------------------------|------------------------------|
| HY-MT1.5-1.8B · 8bit | 1k | 1,096 | 116.0 |
| HY-MT1.5-1.8B · 8bit | 4k | 1,229 | 97.6 |
| HY-MT1.5-1.8B · 8bit | 8k | 1,074 | 80.3 |
| HY-MT1.5-1.8B · 8bit | 16k | 875.0 | 59.4 |
| HY-MT1.5-1.8B-oQ8-fp16 · 8bit | 1k | 1,614 | 121.0 |
| HY-MT1.5-1.8B-oQ8-fp16 · 8bit | 4k | 1,879 | 104.9 |
| HY-MT1.5-1.8B-oQ8-fp16 · 8bit | 8k | 1,501 | 91.1 |
| HY-MT1.5-1.8B-oQ8-fp16 · 8bit | 16k | 1,221 | 69.8 |
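
To compare the two sets of rows in the benchmark table above, the relative gains can be computed per context length. The following is a minimal sketch in plain Python, not part of oMLX: the throughput figures are copied from the table, while the names `BASELINE_8BIT`, `OQ8_FP16`, `relative_gain`, and `summarize` are hypothetical helpers introduced here for illustration.

```python
# Throughput figures (tokens/s) copied from the benchmark table above.
# Keys are context lengths; values are (PP, TG) pairs.
# The dictionary and function names are illustrative, not an oMLX API.
BASELINE_8BIT = {
    "1k": (1096.0, 116.0),
    "4k": (1229.0, 97.6),
    "8k": (1074.0, 80.3),
    "16k": (875.0, 59.4),
}
OQ8_FP16 = {
    "1k": (1614.0, 121.0),
    "4k": (1879.0, 104.9),
    "8k": (1501.0, 91.1),
    "16k": (1221.0, 69.8),
}


def relative_gain(new: float, old: float) -> float:
    """Return the percentage gain of `new` over `old`."""
    return (new / old - 1.0) * 100.0


def summarize(baseline: dict, candidate: dict) -> dict:
    """Map each context length to (PP gain %, TG gain %)."""
    rows = {}
    for ctx, (pp_old, tg_old) in baseline.items():
        pp_new, tg_new = candidate[ctx]
        rows[ctx] = (
            relative_gain(pp_new, pp_old),
            relative_gain(tg_new, tg_old),
        )
    return rows


if __name__ == "__main__":
    for ctx, (pp, tg) in summarize(BASELINE_8BIT, OQ8_FP16).items():
        print(f"{ctx:>3}: PP {pp:+.1f}%, TG {tg:+.1f}%")
```

On these figures, the oQ8-fp16 rows come out roughly 40 to 53 percent ahead on PP and roughly 4 to 18 percent ahead on TG, with the TG gap widening as the context grows. These are single-machine numbers from the M1 Max run reported above and may not carry over to other hardware.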