---
base_model: mikoy92/Unlimited-OCR-bf16-mlx
license: mit
language:
- multilingual
pipeline_tag: image-text-to-text
tags:
- mlx
- mlx-vlm
- ocr
- vision-language
- baidu
- deepseekocr
- quantized
- 8-bit
- affine
library_name: mlx
---
# Unlimited-OCR 8-bit MLX
This is an 8-bit affine MLX quantization of [`mikoy92/Unlimited-OCR-bf16-mlx`](https://huggingface.co/mikoy92/Unlimited-OCR-bf16-mlx), converted with `mlx-vlm`.
Quantization settings:
- mode: `affine`
- bits: `8`
- group size: `64`
- observed effective bits per weight during conversion: `9.408`
Because this is a vision-language OCR model, `mlx-vlm` does not quantize every multimodal tensor to 8 bits, so the effective bits per weight ends up above the nominal 8 (`9.408` for this conversion).
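The gap between the nominal 8 bits and the observed `9.408` can be roughly accounted for with some back-of-the-envelope arithmetic. The sketch below is an estimate, not a measurement. It assumes that each quantization group stores one 16-bit scale and one 16-bit bias, that the tensors left unquantized are stored at 16 bits per weight, and that the reported figure is an average over all weights. The helper names are hypothetical.

```python
# Settings taken from the quantization list above.
BITS = 8
GROUP_SIZE = 64
OBSERVED_BPW = 9.408

# Assumption: each group of weights carries one 16-bit scale and one 16-bit bias.
SCALE_BITS = 16
BIAS_BITS = 16
# Assumption: tensors that are not quantized stay at 16 bits per weight.
UNQUANTIZED_BITS = 16


def quantized_bits_per_weight(bits: int, group_size: int) -> float:
    """Bits per weight of a quantized tensor, including per-group overhead."""
    return bits + (SCALE_BITS + BIAS_BITS) / group_size


def unquantized_fraction(observed: float, quantized: float,
                         unquantized: float = UNQUANTIZED_BITS) -> float:
    """Share of weights that must be unquantized to reach the observed average."""
    return (observed - quantized) / (unquantized - quantized)


q_bpw = quantized_bits_per_weight(BITS, GROUP_SIZE)
frac = unquantized_fraction(OBSERVED_BPW, q_bpw)

print(f"quantized tensors: {q_bpw:.2f} bits per weight")
print(f"estimated share of unquantized weights: {frac:.1%}")
```

Under these assumptions, the quantized tensors cost 8.5 bits per weight, and about 12% of the weights would have to remain at 16 bits to produce the observed average.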
## Usage
```bash
pip install -U mlx-vlm
mlx_vlm.generate \
--model mikoy92/Unlimited-OCR-8bit-mlx \
--image /path/to/image.png \
--prompt "Extract all readable text from this image." \
--max-tokens 512 \
--temperature 0
```
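The same OCR call can also be made from Python. This is a minimal sketch, assuming the `load`, `generate`, `apply_chat_template`, and `load_config` helpers available in recent `mlx-vlm` releases. `run_ocr` is a hypothetical wrapper, and the return type of `generate` may differ between versions, so the last line handles both a plain string and a result object.

```python
MODEL_ID = "mikoy92/Unlimited-OCR-8bit-mlx"
PROMPT = "Extract all readable text from this image."


def run_ocr(image_path: str, max_tokens: int = 512) -> str:
    """Hypothetical wrapper: run one OCR pass over a single image."""
    # Imported lazily so this module can be imported without mlx-vlm installed.
    from mlx_vlm import load, generate
    from mlx_vlm.prompt_utils import apply_chat_template
    from mlx_vlm.utils import load_config

    # Load the quantized weights and the matching processor.
    model, processor = load(MODEL_ID)
    config = load_config(MODEL_ID)

    # Wrap the prompt in the model's chat template for a single image.
    formatted_prompt = apply_chat_template(
        processor, config, PROMPT, num_images=1
    )

    output = generate(
        model, processor, formatted_prompt, [image_path], max_tokens=max_tokens
    )
    # Assumption: newer releases return an object with a `.text` attribute,
    # older ones return the generated string directly.
    return getattr(output, "text", output)
```

Calling `print(run_ocr("/path/to/image.png"))` on Apple silicon should then print the extracted text.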
## Validation
Before upload, this checkpoint was loaded locally with `mlx_vlm.generate` and produced OCR text/table output on a document-image smoke test.