---
base_model: mikoy92/Unlimited-OCR-bf16-mlx
license: mit
language:
- multilingual
pipeline_tag: image-text-to-text
tags:
- mlx
- mlx-vlm
- ocr
- vision-language
- baidu
- deepseekocr
- quantized
- 4-bit
- affine
library_name: mlx
---
# Unlimited-OCR 4-bit MLX
This is a 4-bit affine MLX quantization of [`mikoy92/Unlimited-OCR-bf16-mlx`](https://huggingface.co/mikoy92/Unlimited-OCR-bf16-mlx), converted with `mlx-vlm`.

Quantization settings:

- mode: `affine`
- bits: `4`
- group size: `64`
- observed effective bits per weight during conversion: `5.883`

Because this is a vision-language OCR model, `mlx-vlm` does not quantize every multimodal tensor to 4 bits, so the effective bits per weight across the whole checkpoint is higher than the nominal 4.
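
The gap between the nominal `4` and the observed `5.883` can be sketched with simple arithmetic. The snippet below is an illustrative estimate, not how `mlx-vlm` computes or reports the figure. It assumes that affine quantization stores one 16-bit scale and one 16-bit bias per group, and that every tensor left unquantized is kept at 16 bits; the function names and the resulting fraction are hypothetical and only show how a mixed-precision checkpoint averages out above 4 bits.

```python
def affine_bits_per_weight(bits, group_size, scale_bits=16, bias_bits=16):
    """Bits per weight of an affine-quantized tensor, including the
    per-group scale and bias (assumed 16 bits each)."""
    return bits + (scale_bits + bias_bits) / group_size


def effective_bits(quantized_fraction, bits, group_size, full_bits=16):
    """Average bits per weight when only part of the model is quantized.
    The remainder is assumed to stay at `full_bits` (an assumption)."""
    q = affine_bits_per_weight(bits, group_size)
    return quantized_fraction * q + (1.0 - quantized_fraction) * full_bits


def implied_quantized_fraction(observed, bits, group_size, full_bits=16):
    """Invert effective_bits: which quantized fraction would yield `observed`?"""
    q = affine_bits_per_weight(bits, group_size)
    return (full_bits - observed) / (full_bits - q)


# 4-bit weights with group size 64: 4 + 32/64 = 4.5 bits per weight.
per_weight = affine_bits_per_weight(bits=4, group_size=64)

# Under the assumptions above, 5.883 effective bits would correspond to
# roughly 88% of the weights being quantized.
fraction = implied_quantized_fraction(5.883, bits=4, group_size=64)
print(per_weight, round(fraction, 3))
```

Under these assumptions, even a fully quantized model would sit at 4.5 bits per weight rather than 4, and the rest of the gap comes from tensors kept at higher precision.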
## Usage
```bash
pip install -U mlx-vlm
mlx_vlm.generate \
--model mikoy92/Unlimited-OCR-4bit-mlx \
--image /path/to/image.png \
--prompt "Extract all readable text from this image." \
--max-tokens 512 \
--temperature 0
```
## Validation
Before upload, this checkpoint was loaded locally with `mlx_vlm.generate` and produced OCR text/table output on a document-image smoke test.