AitBAD committed on
Commit ee9e011 (verified) · 1 Parent(s): 659a931

Upload 2 files

Files changed (2)
  1. CHANGELOG.md +13 -0
  2. CREDITS.md +87 -0
CHANGELOG.md ADDED
@@ -0,0 +1,13 @@
+ ## v1.0 (2025-11-24)
+ - Initial release
+ - CER: 30.56%, WER: 72.94%
+ - 5k wordlist
+ - Support for all Kabyle diacritics
+ ```
+
+ ## v1.1 (2025-11-29)
+ - Optimal release
+ - CER: 5.08%, WER: 15.28%
+ - 30k wordlist
+ - Support for all Kabyle diacritics
+ ```
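
Note on the metrics: the changelog reports CER and WER without saying how they were measured. The sketch below shows the usual definitions, namely edit distance divided by reference length, computed over characters (CER) and over whitespace-separated words (WER). The function names and the NFC normalisation step are assumptions made for illustration; nothing in this commit confirms that the reported figures were produced this way.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def _nfc(text):
    # Assumption: compare composed forms so that a letter with a diacritic
    # counts as one character however it was encoded.
    return unicodedata.normalize("NFC", text)


def cer(reference, hypothesis):
    """Character error rate: character edits / reference length."""
    ref, hyp = _nfc(reference), _nfc(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def wer(reference, hypothesis):
    """Word error rate: word edits / number of reference words."""
    ref_words = _nfc(reference).split()
    hyp_words = _nfc(hypothesis).split()
    if not ref_words:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_words, hyp_words) / len(ref_words)


if __name__ == "__main__":
    ref = "the quick fox"
    hyp = "the quick fax"
    print(f"CER: {cer(ref, hyp):.2%}")
    print(f"WER: {wer(ref, hyp):.2%}")
```

A single wrong character makes its whole word count as an error, so WER is typically well above CER on the same output, as it is in both releases listed above.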
CREDITS.md ADDED
@@ -0,0 +1,87 @@
+ # Credits and Attributions
+
+ ## Training and Validation Texts
+
+ ### Primary Sources
+ - **Tagrest, urɣu**
+ - Author: Ɛmer Mezdad
+ - Publisher: Tiẓrigin Talantikit, 2018
+ - Usage: Validation dataset
+ - Paper book
+ - Files: ATagres_urghu_5.png
+
+ - **Times d waman**
+ - Author: M'hend Askeur
+ - Publisher: Tiẓrigt Pages Bleues, 2016
+ - Usage: Validation dataset
+ - Paper book
+ - Files: Times_d_waman.png, Times_d_waman_1.png, Times_d_waman_2.png, Times_d_waman_3.png, Times_d_waman_4.png, Times_d_waman_5.png
+
+ - **Asnekwu n yiḍrisen deg wungal Tawaɣit n tayri n Abdella Haman**
+ - Author: Ramdane Abdenbi
+ - Publisher: Editions le montagnard, 2017
+ - Usage: Validation dataset
+ - Paper book
+ - Files: Tawaghit_tayri.png
+
+ ## Wordlist Sources
+
+ - **kab.dic Hunspell**
+ - Compiler: Kamal Bouamara et al,
+ - Year: Release 0.61, 2020
+ - Source: https://extensions.libreoffice.org/fr/extensions/show/tira-n-teqbaylit
+ - License: MIT License
+
+ - **Inventaire des néologismes amazighes/ Asuṭṭen n yiwalnuten n tmaziɣt **
+ - Compiler: Habib Allah Mansouri
+ - Year: [2013]
+ - Source: http://www.ayamun.com/Telechargement.htm
+
+ - **Asegzawal n teqbaylit s teqbaylit - Textes introductifs**
+ - Compiler: Kamal Bouamara
+ - Year: Haut Commissariat à l'Amazighté, 2008
+ - Source: Haut Commissariat à l'Amazighté
+
+ - **ACTES Aslugen n Tira n tmaziɣt - IḌRISEN**
+ - Compiler: Haut Commissariat à l'Amazighté
+ - Year: Haut Commissariat à l'Amazighté, 2012
+ - Source: Editions du Haut Commissariat à l'Amazighité
+
+ - **ACTES Aslugen n Tira n tmaziɣt - Tijenṭaḍ**
+ - Compiler: Haut Commissariat à l'Amazighté
+ - Year: Haut Commissariat à l'Amazighté, 2012
+ - Source: Editions du Haut Commissariat à l'Amazighité
+
+ - **Deg Lqahwa - Tullist**
+ - Author: Mohammed Dib, Tasuqilt: Samir Tighzert
+ - Publisher: Tighzert.s, 2025
+ - PDF document
+
+ - **Apulée de Madaure - Aɣyul n wureɣ neɣ Tifelɣiwin - Adlis amezwaru**
+ - Author: Tasuqilt n Ḥabib-Allah Manṣuri
+ - Source: http://www.ayamun.com/Aghyul_n_wuregh_1.pdf
+ - PDF document
+
+ - **Wid i d-yufraren garaneɣ - Tullist**
+ - Author: Bouaziz Ait Driss
+ - Publisher: Taẓregt tafulmant ⵣ, 2025
+ - Source: https://archive.org/details/wid-i-d-yufraren-garane-bouaziz-ait-driss-978-2-9823384-5-6-numerique
+ - PDF document
+
+ ## Software and Tools
+
+ - **Tesseract OCR**
+ - License: Apache 2.0
+ - Link: https://github.com/tesseract-ocr/tesseract
+
+ - **Tesstrain**
+ - License: Apache 2.0
+ - Link: https://github.com/tesseract-ocr/tesstrain
+
+ This project was developed by [Bouaziz Ait Driss] as an open-source initiative to support the digitization and preservation of the Taqbaylit (Kabyle) language. It is freely available for use and contribution.
+ ---
+
+ If you are a copyright holder of any source material used and have concerns,
+ please contact: lbrlingo.2023@gmail.com
+
+ ```