File size: 1,614 Bytes
517b1c1 0fc2b6a 517b1c1 0fc2b6a 0c0e9c4 0fc2b6a 762aad7 0c0e9c4 517b1c1 0fc2b6a 0c0e9c4 0fc2b6a 0c0e9c4 f595ee6 0c0e9c4 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 | ---
base_model:
- Qwen/Qwen3-VL-2B-Instruct
tags:
- ocr
- qwen3_vl
- persian
- arabic
- text-line-ocr
- Optical Character Recognition
- vision-language
- vl
- persian-ocr
- arabic-ocr
- farsi
license: apache-2.0
language:
- en
- fa
- ar
datasets:
- mohajesmaeili/Persian_Arabic_TextLine_Image_Ocr_Small
pipeline_tag: image-to-text
---
# Persian/Arabic OCR - Qwen3-VL-2B-Instruct - v1.0
This is a **16-bit version** of **Qwen/Qwen3-VL-2B-Instruct** fine-tuned specifically for Persian text recognition (OCR) on **individual text lines**.
The model has been trained exclusively on cropped single-line text images and is **not designed for full-page OCR**.
## Training Details
- **Dataset**:
- 56,000 real Persian text line images
- 100,000 synthetic images (47 fonts in 3 diffrent sizes) with clean and noisy/colored backgrounds
- **Total Examples**: 156k text line images
- **Epochs**: 1
- **LoRA Rank**: 512
- **Batch Size**: 100
- **Learning Rate**: 2e-4
- **Trainable Parameters**: 759M Params
## Training Performance
Training loss decreased steadily, ending at approximately **0.071**.
| Step | Training Loss |
|--------|---------------|
| 0 | 2.2102 |
| 100 | 0.1577 |
| 200 | 0.1384 |
| 300 | 0.1330 |
| 400 | 0.1208 |
| 500 | 0.1078 |
| 600 | 0.1080 |
| 700 | 0.1065 |
| 800 | 0.0980 |
| 900 | 0.0871 |
| 1000 | 0.0827 |
| 1100 | 0.0802 |
| 1200 | 0.0898 |
| 1300 | 0.0866 |
| 1400 | 0.0811 |
| 1500 | 0.0774 |
| 1550 | 0.0714 | |