# Quantization Notes

- Release track: MLX 4-bit local build
- Intended for Apple Silicon local/offline inference
- Preserve the tokenizer, config, and chat template together with the weights
- If the local stack runs without its routers and validators, treat the outputs as bare-model behavior
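
The note about keeping the tokenizer, config, and chat template with the weights can be checked mechanically. Below is a minimal sketch, not part of the release tooling. It assumes conventional file names (`config.json`, `tokenizer.json` or `tokenizer.model`, `*.safetensors`, and a chat template stored either in `chat_template.jinja` or under a `chat_template` key in `tokenizer_config.json`). The function name `missing_companion_files` is hypothetical, and the file names should be adjusted to whatever this package actually ships.

```python
import json
from pathlib import Path


def missing_companion_files(model_dir):
    """Return a list of expected artifacts that are absent from model_dir."""
    root = Path(model_dir)
    missing = []

    # Assumed file names; adjust to match the actual package layout.
    if not (root / "config.json").is_file():
        missing.append("config")
    if not any((root / name).is_file() for name in ("tokenizer.json", "tokenizer.model")):
        missing.append("tokenizer")
    if not any(root.glob("*.safetensors")):
        missing.append("weights")

    # Assumption: the chat template lives either in a standalone file
    # or under a "chat_template" key in tokenizer_config.json.
    has_template = (root / "chat_template.jinja").is_file()
    tok_cfg = root / "tokenizer_config.json"
    if not has_template and tok_cfg.is_file():
        has_template = "chat_template" in json.loads(tok_cfg.read_text(encoding="utf-8"))
    if not has_template:
        missing.append("chat template")

    return missing
```

An empty list means every expected piece was found. Anything else names what to restore before distributing or loading the package.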

Conversion notes:

- Source snapshot: `../../pocket-polymath-e4b-mlx`
- This package should mirror the layout of the local, inference-ready MLX artifact
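
One way to confirm that the package mirrors the source snapshot is to compare the relative file paths of the two directories. The sketch below uses only the standard library. The helper names `relative_files` and `layout_diff` are hypothetical, and it compares paths only, not file contents.

```python
from pathlib import Path


def relative_files(root):
    """Collect the relative POSIX paths of all regular files under root."""
    base = Path(root)
    return {p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()}


def layout_diff(source_dir, package_dir):
    """Return (missing_from_package, extra_in_package) as sorted lists."""
    source = relative_files(source_dir)
    package = relative_files(package_dir)
    return sorted(source - package), sorted(package - source)
```

Calling `layout_diff("../../pocket-polymath-e4b-mlx", ".")` from the package root (assuming the package root is the current directory) would list files present in the snapshot but absent here, and the reverse. Documentation files added to the package, such as these notes, will naturally show up as extras.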