---
base_model:
- Qwen/Qwen3-VL-2B-Instruct
tags:
- ocr
- qwen3_vl
- persian
- arabic
- text-line-ocr
- Optical Character Recognition
- vision-language
- vl
- persian-ocr
- arabic-ocr
- farsi
license: apache-2.0
language:
- en
- fa
- ar
datasets:
- mohajesmaeili/Persian_Arabic_TextLine_Image_Ocr_Small
pipeline_tag: image-to-text
---

# Persian/Arabic OCR - Qwen3-VL-2B-Instruct - v1.0

This is a **16-bit version** of **Qwen/Qwen3-VL-2B-Instruct** fine-tuned specifically for Persian text recognition (OCR) on **individual text lines**. The model was trained exclusively on cropped single-line text images and is **not designed for full-page OCR**.

## Training Details

- **Dataset**:
  - 56,000 real Persian text line images
  - 100,000 synthetic images (47 fonts in 3 different sizes) with clean and noisy/colored backgrounds
- **Total Examples**: 156k text line images
- **Epochs**: 1
- **LoRA Rank**: 512
- **Batch Size**: 100
- **Learning Rate**: 2e-4
- **Trainable Parameters**: 759M

## Training Performance

Training loss fell from **2.2102** at step 0 to approximately **0.071** at step 1550, with only minor fluctuations along the way.

| Step | Training Loss |
|------|---------------|
| 0    | 2.2102        |
| 100  | 0.1577        |
| 200  | 0.1384        |
| 300  | 0.1330        |
| 400  | 0.1208        |
| 500  | 0.1078        |
| 600  | 0.1080        |
| 700  | 0.1065        |
| 800  | 0.0980        |
| 900  | 0.0871        |
| 1000 | 0.0827        |
| 1100 | 0.0802        |
| 1200 | 0.0898        |
| 1300 | 0.0866        |
| 1400 | 0.0811        |
| 1500 | 0.0774        |
| 1550 | 0.0714        |
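
As a rough sanity check on the figures above, the number of optimizer steps in one epoch can be estimated from the dataset size and batch size. This is only a sketch: it assumes the listed batch size of 100 is the effective (global) batch size and that all 156,000 examples were used for training, neither of which the card states explicitly. The variable names are illustrative.

```python
import math

# Figures taken from the Training Details and the loss table.
real_images = 56_000
synthetic_images = 100_000
batch_size = 100          # assumed to be the effective (global) batch size
epochs = 1

total_examples = real_images + synthetic_images
steps_per_epoch = math.ceil(total_examples / batch_size)
total_steps = steps_per_epoch * epochs

# Relative drop in training loss between the first and last logged steps.
first_loss, last_loss = 2.2102, 0.0714
relative_reduction = (first_loss - last_loss) / first_loss

print(f"total examples: {total_examples}")
print(f"estimated steps for {epochs} epoch(s): {total_steps}")
print(f"relative loss reduction: {relative_reduction:.1%}")
```

Under these assumptions the estimate comes to 1,560 steps, which is close to the last logged step (1550) in the table.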