Unlimited-OCR CrispEmbed GGUF
GGUF conversions of baidu/Unlimited-OCR for use with CrispEmbed.
Model
Unlimited-OCR is a 3.3B parameter VLM for full-page OCR. Architecture:
- SAM ViT-B (12 layers, 768d): image encoder with windowed + global attention
- CLIP-L/14 (24 layers, 1024d): receives SAM features as patch embeddings (dual-encoder "DeepLIP")
- Fusion: concat CLIP + SAM features (2048d) → linear projection (1280d)
- DeepSeek-V2 MoE decoder (12 layers, 1280d, 64 routed experts top-6, 2 shared experts, layer 0 dense)
- Tokenizer: GPT-2 BPE, 129,280 vocab
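To make the "64 routed experts top-6" entry concrete, here is a minimal pure-Python sketch of top-k routing for a single token. It is illustrative only and not the CrispEmbed implementation: `route_token` is a hypothetical name, the logits are toy values, and leaving the gate weights un-renormalised after selection is an assumption. The 2 shared experts are not routed; they run for every token.

```python
import math

N_ROUTED = 64  # routed experts per MoE layer (from the architecture list)
TOP_K = 6      # routed experts selected per token (from the architecture list)


def route_token(router_logits, top_k=TOP_K):
    """Return [(expert_index, gate_weight), ...] for the top_k routed experts.

    Assumption: gate weights are softmax probabilities over all routed
    experts and are NOT renormalised after the top-k selection.
    """
    m = max(router_logits)
    exps = [math.exp(x - m) for x in router_logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    # Rank expert indices by probability, highest first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:top_k]]


# Toy logits: expert i gets logit i / 10, so the six highest indices win.
logits = [i / 10 for i in range(N_ROUTED)]
chosen = route_token(logits)
print([i for i, _ in chosen])  # [63, 62, 61, 60, 59, 58]
```

Because the selection depends on small differences between router logits, a noisy router can send a token to the wrong experts, which is one reason the router weights are kept at higher precision in these files.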
Files
| File | Quant | Size | Notes |
|---|---|---|---|
| unlimited-ocr-f16.gguf | F16 | 6.4 GB | Full precision, reference quality |
| unlimited-ocr-q8_0.gguf | Q8_0 | 3.5 GB | High quality, 2x compression |
| unlimited-ocr-q5_k.gguf | Q5_K | 2.4 GB | Best quantized quality (near-perfect pages) |
| unlimited-ocr-q4_k.gguf | Q4_K | 2.2 GB | Recommended: reads full pages, matches the HF model |
| unlimited-ocr-q3_k.gguf | Q3_K | 2.0 GB | Smaller; very good (slightly more char errors) |
All quantizations keep the vision encoder (SAM v.* + CLIP c.*), the MoE
router (*.mlp_gate.weight), the projector, the token embeddings, and the
lm_head at Q8_0 or higher. Keeping the lm_head at Q8_0 is essential: at Q4_K
the output projection flips a borderline greedy pick early in generation, which
snowballs into a hallucination, and the full-page OCR fails. With the lm_head
protected, the q4_k file reads full document pages identically to the
unquantized HF model.
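The protection rule described above can be sketched as a name-based lookup. This is a hedged illustration, not the converter's actual code: `quant_for`, `PROTECTED`, and `RANK` are hypothetical names, and only the patterns v.*, c.*, and *.mlp_gate.weight come from the text. The projector, embedding, and lm_head tensor names below are placeholders.

```python
from fnmatch import fnmatchcase

# Tensors kept at Q8_0 or better. The first three patterns come from the
# paragraph above; the last three names are HYPOTHETICAL placeholders for
# the projector, the token embeddings, and the lm_head.
PROTECTED = [
    "v.*",
    "c.*",
    "*.mlp_gate.weight",
    "proj.*",
    "token_embd.weight",
    "lm_head.weight",
]

# Rough precision ordering, lowest to highest.
RANK = {"Q2_K": 0, "Q3_K": 1, "Q4_K": 2, "Q5_K": 3, "Q8_0": 4, "F16": 5}


def quant_for(tensor_name, requested):
    """Return the quant type to use for one tensor.

    Protected tensors are raised to Q8_0 when the requested type is lower;
    everything else gets the requested type unchanged.
    """
    if any(fnmatchcase(tensor_name, pat) for pat in PROTECTED):
        return requested if RANK[requested] >= RANK["Q8_0"] else "Q8_0"
    return requested


print(quant_for("lm_head.weight", "Q4_K"))         # Q8_0
print(quant_for("blk.3.mlp_gate.weight", "Q3_K"))  # Q8_0
print(quant_for("blk.3.ffn_up.weight", "Q4_K"))    # Q4_K (hypothetical expert tensor)
```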
Quality vs size (the protected Q8_0 tensors dominate, so smaller quants save little): q5_k is best, q4_k (this file) excellent, q3_k good, q2_k not recommended (its 2-bit experts collapse into repetition on dense body text).
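A quick calculation on the file sizes in the table shows how little the smaller quants save. The sketch below derives the compression ratio against F16 and the implied average bits per weight. The sizes are the rounded values from the table and include file metadata, so the results are approximate; `summarize` is a hypothetical helper name.

```python
# File sizes in GB, copied from the Files table (rounded values).
SIZES_GB = {"f16": 6.4, "q8_0": 3.5, "q5_k": 2.4, "q4_k": 2.2, "q3_k": 2.0}


def summarize(sizes, baseline="f16", baseline_bits=16):
    """Map each file to (compression ratio vs baseline, average bits per weight).

    The average assumes the baseline stores every weight in baseline_bits
    bits, which ignores metadata and is only an approximation.
    """
    rows = {}
    for name, gb in sizes.items():
        ratio = sizes[baseline] / gb
        rows[name] = (round(ratio, 2), round(baseline_bits / ratio, 2))
    return rows


for name, (ratio, bits) in summarize(SIZES_GB).items():
    print(f"{name:5s} {ratio:5.2f}x  ~{bits:.2f} bits/weight")
```

By this estimate the q4_k file averages about 5.5 bits per weight and q3_k about 5.0, so stepping from q4_k to q3_k saves only about 0.2 GB.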
Usage with CrispEmbed
# Auto-download and run
crispembed --ocr-pipeline image.png --ocr-engine unlimited_ocr -m unlimited-ocr
# Or with explicit path
crispembed --ocr-pipeline image.png --ocr-engine unlimited_ocr \
--ocr-rec /path/to/unlimited-ocr-q4_k.gguf \
-m /path/to/unlimited-ocr-q4_k.gguf
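For scripting, the same invocations can be wrapped from Python. This is a minimal sketch that uses only the flags shown in the commands above. `build_ocr_command` and `run_ocr` are hypothetical helper names, and reading the recognized text from stdout is an assumption about the CLI's behavior.

```python
import shutil
import subprocess


def build_ocr_command(image, model="unlimited-ocr", rec=None, exe="crispembed"):
    """Build the argv list for one image, mirroring the commands above.

    With rec=None this matches the auto-download form; passing a GGUF path
    as both rec and model matches the explicit-path form.
    """
    cmd = [exe, "--ocr-pipeline", str(image), "--ocr-engine", "unlimited_ocr"]
    if rec is not None:
        cmd += ["--ocr-rec", str(rec)]
    cmd += ["-m", str(model)]
    return cmd


def run_ocr(image, **kwargs):
    """Run crispembed on one image and return whatever it prints to stdout.

    Assumption: the OCR result is written to stdout.
    """
    cmd = build_ocr_command(image, **kwargs)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


print(build_ocr_command("image.png"))
```

Building the argument list separately from running it lets you inspect the command, or loop over a directory of page images, without the binary installed.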
License
MIT (same as the original model)
Credits
- Original model: Baidu
- GGUF conversion: CrispEmbed