---
base_model: unsloth/Qwen3.5-0.8B
language:
- ar
library_name: peft
model_name: Katib-Qwen3.5-0.8B-0.1
tags:
- generated_from_trainer
- unsloth
- sft
- trl
- ocr
- arabic
- vision-language
- handwriting-recognition
license: apache-2.0
pipeline_tag: image-text-to-text
metrics:
- wer
- cer
- bleu
datasets:
- oddadmix/qari-0.2.2-news-dataset-large
- oddadmix/qari-0.2.2-diacritics-dataset-large
- johnlockejrr/KHATT_v1.0_dataset
---

# KATIB 0.8B v0.1: Arabic OCR Model

**KATIB** (كاتب) is a fine-tuned Arabic OCR model built on [Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B), designed to transcribe Arabic text from images, including printed documents and **handwritten content**.

Despite having only 0.8B parameters, KATIB outperforms the 2B-parameter Qari-OCR v0.3 and comes close to Qari-OCR v0.2.2.1 on the benchmark below, while running at **2× the speed** with **half the memory footprint**.

---

## ✨ Highlights

- 🏆 **Outperforms Qari-OCR v0.3** (2B) on WER, CER, and BLEU
- 🥈 **Competitive with Qari-OCR v0.2.2.1** (2B), a stronger model, at 40% of the parameter count
- ✍️ **Enhanced handwriting support** for better generalization to real-world Arabic scripts
- ⚡ **2× faster inference** compared to 2B-parameter alternatives
- 🪶 **Lightweight** and deployable on modest hardware

---

## 📊 Benchmark Results

Evaluated on an Arabic OCR test set. Lower WER/CER is better; higher BLEU is better.

| Model | Size | WER ↓ | CER ↓ | BLEU ↑ |
|---|---|---|---|---|
| **KATIB 0.8B v0.1** *(ours)* | **0.8B** | **0.2386** | **0.0648** | **0.5819** |
| NAMAA-Space/Qari-OCR-v0.3-VL-2B-Instruct | 2B | 0.2643 | 0.0782 | 0.5520 |
| NAMAA-Space/Qari-OCR-0.2.2.1-VL-2B-Instruct | 2B | 0.1993 | 0.0498 | 0.6402 |
| Qwen/Qwen3.5-0.8B *(base, no fine-tune)* | 0.8B | 2.5834 | 1.9487 | 0.0256 |

> **WER** = Word Error Rate | **CER** = Character Error Rate | **BLEU** = Bilingual Evaluation Understudy Score

### Key Takeaways

- KATIB beats **Qari v0.3** across all three metrics despite being 2.5× smaller.
- KATIB comes close to **Qari v0.2.2.1** on WER (0.2386 vs. 0.1993) and CER (0.0648 vs. 0.0498), with a BLEU gap of about 6 points (0.5819 vs. 0.6402). This is a strong result for a model of this size.
- The base Qwen model without fine-tuning is essentially unusable for Arabic OCR (WER > 2.5), demonstrating the value of domain-specific fine-tuning.

---

## 🚀 Quick Start

```python
from transformers import AutoProcessor, AutoModelForImageTextToText
from PIL import Image
import torch

model_id = "oddadmix/Katib-Qwen3.5-0.8B-0.1"

# Load the processor (tokenizer + image preprocessing) and the model weights.
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

image = Image.open("arabic_document.jpg")

# A single user turn: the image followed by the OCR instruction.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": image},
            {"type": "text", "text": "Free OCR"},
        ],
    }
]

# Render the chat prompt as text, then tokenize it together with the image.
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=512)

# Decode only the newly generated tokens, skipping the prompt.
result = processor.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(result)
```

---

## 🧪 Training Details

| Detail | Value |
|---|---|
| Base Model | Qwen/Qwen3.5-0.8B |
| Fine-tuning Method | Supervised Fine-Tuning (SFT) |
| Language | Arabic (Modern Standard + Handwritten) |
| Task | Optical Character Recognition (OCR) |
| Precision | float16 / bfloat16 |

---

## 📋 Intended Use

- ✅ Arabic document digitization
- ✅ Handwritten Arabic text recognition
- ✅ Arabic printed text extraction from images
- ✅ Low-resource / edge deployment scenarios
- ❌ Not intended for non-Arabic languages
- ❌ Not a general-purpose vision-language model

---

## ⚠️ Limitations

- Performance may degrade on very low-quality or heavily degraded scans.
- Dialectal Arabic and mixed-language (Arabic + Latin) text may reduce accuracy.
- Extreme cursive or stylized calligraphy has not been extensively evaluated.

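
The WER and CER figures in the benchmark table are edit-distance ratios, so they can be recomputed on your own images to check how the limitations above affect your data. The following is a minimal sketch in plain Python, not the evaluation script used for this card: the function names `edit_distance`, `wer`, and `cer` are hypothetical, and it assumes whitespace tokenization with no text normalization (no diacritic stripping or alef unification), which may differ from how the reported numbers were produced.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: insertions, deletions, substitutions each cost 1."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion from the reference
                    curr[j - 1] + 1,     # insertion into the hypothesis
                    prev[j - 1] + cost,  # substitution or exact match
                )
            )
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth and model output; one of four words differs.
    truth = "ذهب الولد إلى المدرسة"
    predicted = "ذهب الولد الى المدرسة"
    print(f"WER: {wer(truth, predicted):.4f}")  # one wrong word out of four
    print(f"CER: {cer(truth, predicted):.4f}")
```

Because insertions count as errors, both ratios can exceed 1.0 when the output contains far more text than the reference, which is how the base model's WER of 2.5834 in the table is possible.
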

---

## 📄 Citation

If you use KATIB in your research or application, please consider citing this model:

```bibtex
@misc{katib2025,
  title     = {KATIB 0.8B v0.1: A Lightweight Arabic OCR Model},
  author    = {oddadmix},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/oddadmix/Katib-Qwen3.5-0.8B-0.1}
}
```

---