---
base_model:
- Qwen/Qwen3-VL-2B-Instruct
tags:
- ocr
- qwen3_vl
- persian
- arabic
- text-line-ocr
- Optical Character Recognition
- vision-language
- vl
- persian-ocr
- arabic-ocr
- farsi
license: apache-2.0
language:
- en
- fa
- ar
datasets:
- mohajesmaeili/Persian_Arabic_TextLine_Image_Ocr_Small
pipeline_tag: image-to-text
---
# Persian/Arabic OCR - Qwen3-VL-2B-Instruct - v1.0

This model is a **16-bit version** of **Qwen/Qwen3-VL-2B-Instruct** fine-tuned specifically for Persian text recognition (OCR) on **individual text lines**.
It was trained exclusively on cropped single-line text images and is **not designed for full-page OCR**.

## Training Details

- **Dataset**:
  - 56,000 real Persian text line images
  - 100,000 synthetic images (47 fonts in 3 different sizes) with clean and noisy/colored backgrounds
- **Total Examples**: 156k text line images
- **Epochs**: 1
- **LoRA Rank**: 512
- **Batch Size**: 100
- **Learning Rate**: 2e-4
- **Trainable Parameters**: 759M
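
As a rough sanity check on the figures above, the number of optimizer steps in one epoch can be estimated from the dataset size and batch size. This is only a sketch: it assumes the listed batch size of 100 is the effective batch size per step and that all 156k examples were used for training with no held-out split, neither of which is stated explicitly here. The function name `steps_per_epoch` is illustrative.

```python
import math


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of optimizer steps needed to see every example once."""
    return math.ceil(num_examples / batch_size)


# Figures taken from the list above.
real_lines = 56_000
synthetic_lines = 100_000
total_examples = real_lines + synthetic_lines

# Assumption: 100 is the effective batch size per optimizer step.
steps = steps_per_epoch(total_examples, batch_size=100)

print(f"total examples: {total_examples}")
print(f"estimated steps for 1 epoch: {steps}")
```

Under those assumptions the estimate is on the order of 1,560 steps, which is in line with the last logged step (1550) in the training loss table.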

## Training Performance

Training loss decreased overall, with minor fluctuations around steps 600 and 1200, ending at approximately **0.071**.

| Step   | Training Loss |
|--------|---------------|
| 0      | 2.2102        |
| 100    | 0.1577        |
| 200    | 0.1384        |
| 300    | 0.1330        |
| 400    | 0.1208        |
| 500    | 0.1078        |
| 600    | 0.1080        |
| 700    | 0.1065        |
| 800    | 0.0980        |
| 900    | 0.0871        |
| 1000   | 0.0827        |
| 1100   | 0.0802        |
| 1200   | 0.0898        |
| 1300   | 0.0866        |
| 1400   | 0.0811        |
| 1500   | 0.0774        |
| 1550   | 0.0714        |
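
The logged values in the table can be inspected with a few lines of Python, for example to see where the loss rose between consecutive log points and how much it dropped after the first 100 steps. This is a minimal sketch using only the numbers from the table; the variable names are illustrative and nothing here comes from the actual training script.

```python
# (step, training loss) pairs copied from the table above.
loss_log = [
    (0, 2.2102), (100, 0.1577), (200, 0.1384), (300, 0.1330),
    (400, 0.1208), (500, 0.1078), (600, 0.1080), (700, 0.1065),
    (800, 0.0980), (900, 0.0871), (1000, 0.0827), (1100, 0.0802),
    (1200, 0.0898), (1300, 0.0866), (1400, 0.0811), (1500, 0.0774),
    (1550, 0.0714),
]

# Steps at which the logged loss was higher than at the previous log point.
upticks = [
    step
    for (_, prev_loss), (step, loss) in zip(loss_log, loss_log[1:])
    if loss > prev_loss
]

# Relative reduction from the step-100 value to the final logged value.
loss_at_100 = dict(loss_log)[100]
final_step, final_loss = loss_log[-1]
relative_drop = (loss_at_100 - final_loss) / loss_at_100

print(f"steps with a higher loss than the previous log point: {upticks}")
print(f"final loss at step {final_step}: {final_loss}")
print(f"relative drop since step 100: {relative_drop:.1%}")
```

Most of the reduction happens within the first 100 steps; after that the loss roughly halves over the remainder of the epoch.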