---
license: mit
base_model: zai-org/GLM-5.2
base_model_relation: quantized
pipeline_tag: text-generation
library_name: mlx
tags:
- mlx
- moe
- glm
- text-generation
---

# GLM-5.2-MLX-mixed-3_6bit

MLX (Apple Silicon) conversion of [zai-org/GLM-5.2](https://huggingface.co/zai-org/GLM-5.2) — a `glm_moe_dsa` MoE (256 experts, DeepSeek-V3.2-style sparse attention) — quantized to **mixed**.

## Quantizations
Part of the [**GLM-5.2 MLX** collection](https://huggingface.co/collections/pipenetwork/glm-52-mlx-6a31fa56e37a8ac73daf25b7).

| Variant | Notes |
|---|---|
| [8-bit](https://huggingface.co/pipenetwork/GLM-5.2-MLX-8bit) | 8-bit · ~800GB · needs ~1TB RAM · integrity-checked |
| [6-bit](https://huggingface.co/pipenetwork/GLM-5.2-MLX-6bit) | 6-bit · ~625GB · needs ~768GB RAM · integrity-checked |
| [5-bit](https://huggingface.co/pipenetwork/GLM-5.2-MLX-5bit) | 5-bit · ~530GB · needs ~640GB RAM · integrity-checked |
| [4-bit](https://huggingface.co/pipenetwork/GLM-5.2-MLX-4bit) | 4-bit · ~430GB · tight on 512GB · smoke-tested |
| **mixed** (this repo) | mixed · experts@3-bit / non-expert@6-bit · ~360GB · 512GB-fit · smoke-tested |

## Use with mlx-lm
```bash
pip install mlx-lm
python -m mlx_lm generate --model pipenetwork/GLM-5.2-MLX-mixed-3_6bit --prompt "Hello" -m 256
```

## Validation
Smoke-tested locally (loads + generates coherent text).

## License
MIT (inherited from base). Quantization config (excerpt): `{"group_size": 64, "bits": 6, "mode": "affine", "model.embed_tokens": {"group_size": 64, "bits": 6}, "model.layers.0.self_attn.q_a_proj": {"group_size": 64, "bits": 6}, "model.layers.0.self_attn.q_b_p`.