mikoy92/Unlimited-OCR-4bit-mlx

Upload MLX quantized Unlimited-OCR

87f546a verified 6 days ago

	---
	base_model: mikoy92/Unlimited-OCR-bf16-mlx
	license: mit
	language:
	- multilingual
	pipeline_tag: image-text-to-text
	tags:
	- mlx
	- mlx-vlm
	- ocr
	- vision-language
	- baidu
	- deepseekocr
	- quantized
	- 4-bit
	- affine
	library_name: mlx
	---

	# Unlimited-OCR 4-bit MLX

	This is a 4-bit affine MLX quantization of [`mikoy92/Unlimited-OCR-bf16-mlx`](https://huggingface.co/mikoy92/Unlimited-OCR-bf16-mlx), converted with `mlx-vlm`.

	Quantization settings:

	- mode: `affine`
	- bits: `4`
	- group size: `64`
	- observed effective bits per weight during conversion: `5.883`

	Because this is a vision-language OCR model, `mlx-vlm` does not quantize every multimodal tensor to 4 bits, so some weights stay at higher precision. As a result, the effective bits per weight (`5.883` here) is higher than the nominal 4.
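The gap between the nominal 4 bits and the observed `5.883` can be roughly accounted for with a little arithmetic. The sketch below is an estimate, not a measurement of this checkpoint. It assumes that each group of 64 weights carries one 16-bit scale and one 16-bit bias, that every tensor left unquantized is stored at 16 bits (matching the bf16 base model), and that no other precisions are present. The function names are illustrative only.

```python
def affine_bits_per_weight(bits: int, group_size: int, meta_bits: int = 32) -> float:
    """Storage cost per weight of one affine-quantized tensor.

    meta_bits is the per-group overhead. Assumption: one 16-bit scale
    plus one 16-bit bias per group.
    """
    return bits + meta_bits / group_size


def quantized_fraction(observed: float, quantized: float, full: float = 16.0) -> float:
    """Solve observed = f * quantized + (1 - f) * full for f.

    Assumption: weights are either quantized or kept at `full` bits.
    """
    return (full - observed) / (full - quantized)


quantized_bpw = affine_bits_per_weight(bits=4, group_size=64)
fraction = quantized_fraction(observed=5.883, quantized=quantized_bpw)

print(f"cost of a quantized tensor: {quantized_bpw} bits per weight")
print(f"implied share of quantized weights: {fraction:.1%}")
```

Under these assumptions a quantized tensor costs 4.5 bits per weight, and an overall figure of 5.883 would correspond to roughly 88% of the weights being quantized, with the rest left at 16 bits.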

	## Usage

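	```bash
	# Install or upgrade mlx-vlm.
	pip install -U mlx-vlm

	# Run OCR on a single image; temperature 0 selects greedy, repeatable decoding.
	mlx_vlm.generate \
	  --model mikoy92/Unlimited-OCR-4bit-mlx \
	  --image /path/to/image.png \
	  --prompt "Extract all readable text from this image." \
	  --max-tokens 512 \
	  --temperature 0
	```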
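To process several images, the same command can be driven from a script. The following is a minimal sketch that reuses only the flags shown above; `build_command` and `ocr_folder` are hypothetical helper names, and the folder layout and `*.png` pattern are assumptions. The captured standard output is whatever the CLI prints, which may include log lines as well as the recognized text.

```python
import subprocess
from pathlib import Path

MODEL = "mikoy92/Unlimited-OCR-4bit-mlx"
PROMPT = "Extract all readable text from this image."


def build_command(image, max_tokens=512):
    """Return the mlx_vlm.generate argument list for one image."""
    return [
        "mlx_vlm.generate",
        "--model", MODEL,
        "--image", str(image),
        "--prompt", PROMPT,
        "--max-tokens", str(max_tokens),
        "--temperature", "0",
    ]


def ocr_folder(folder, pattern="*.png"):
    """Run the CLI once per matching image and collect its raw stdout.

    Hypothetical helper: starts one process per image, so the model is
    reloaded each time. Simple, but slow for large batches.
    """
    results = {}
    for image in sorted(Path(folder).glob(pattern)):
        proc = subprocess.run(
            build_command(image), capture_output=True, text=True, check=True
        )
        results[image.name] = proc.stdout
    return results
```

Calling `ocr_folder("/path/to/scans")` would return a dictionary mapping each file name to the text printed by the CLI for that image.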

	## Validation

	Before upload, this checkpoint was loaded locally with `mlx_vlm.generate` and produced OCR text and table output in a smoke test on a document image.