prithivida/miniDense_arabic_v1

prithivida committed on Aug 6, 2024

Commit b0cc527 · verified · 1 parent: cd1ad1f

Update README.md

Files changed (1)

README.md +4 -4

@@ -63,10 +63,10 @@ pipeline_tag: sentence-similarity
 # Request, Terms, Disclaimers
+[https://github.com/sponsors/PrithivirajDamodaran](https://github.com/sponsors/PrithivirajDamodaran)
 <center>
   <img src="./ar_terms.png" width=250%/>
-  <b><p>[https://github.com/sponsors/PrithivirajDamodaran](https://github.com/sponsors/PrithivirajDamodaran)</p><b>
 </center>
@@ -173,7 +173,7 @@ The below numbers are with mDPR model, but miniDense_arabic_v1 should give a eve
 | Language  | ISO | nDCG@10 BM25 | nDCG@10 mDPR | nDCG@10 Hybrid |
 |-----------|-----|--------------|--------------|----------------|
-| **Arabic**     | **ar**  | **0.395**        | **0.499**        | **0.67.3**          |
+| **Arabic**     | **ar**  | **0.395**        | **0.499**        | **0.673**          |
 *Note: MIRACL paper shows a different (higher) value for BM25 Arabic, So we are taking that value from BGE-M3 paper, rest all are form the MIRACL paper.*
@@ -184,7 +184,7 @@ So it makes sense to evaluate our models in retrieval slice of the MTEB benchmar
 ##### Long Document Retrieval
 <center>
-<img src="./ar_metrics_4.png" width=100%/>
+<img src="./ar_metrics_4.png" width=80%/>
   <b><p>Table 3: Detailed Arabic retrieval performance on the MultiLongDoc dev set (measured by nDCG@10)</p></b>
 </center>
@@ -194,7 +194,7 @@ So it makes sense to evaluate our models in retrieval slice of the MTEB benchmar
 Almost all models below are monolingual arabic models based so they have no notion of any other languages.
 <center>
-<img src="./ar_metrics_5.png" width=100%/>
+<img src="./ar_metrics_5.png" width=80%/>
   <b><p>Table 4: Detailed Arabic retrieval performance on the 3 X-lingual test set (measured by nDCG@10)</p></b>
 </center>