prithivida/miniDense_arabic_v1

Tags: Sentence Similarity, sentence-transformers, feature-extraction, passage-retrieval, knowledge-distillation, middle-training, text-embeddings-inference

prithivida committed on Aug 11, 2024

Commit bfcc611 · verified · 1 Parent(s): 161a8f9

Update README.md

Files changed (1)

README.md +4 -4

@@ -187,7 +187,7 @@ Refer tables above
 #### Long Document Retrieval
-This is very ambitious eval because we have not trained for long context, the max_len was 512 for all the models below.
 <center>
 <img src="./ar_metrics_4.png" width=150%/>
@@ -197,11 +197,11 @@ This is very ambitious eval because we have not trained for long context, the ma
 #### X-lingual Retrieval
-Almost all models below are monolingual arabic models so they have no notion of any other languages. But the below table shows how our model excels in cross-lingual scenarios owing to its deep multilingual understanding.
-This also explains its competitive performance when compared to models lot larger.
 <center>
-<img src="./ar_metrics_5.png" width=80%/>
   <b><p>Table 4: Detailed Arabic retrieval performance on the 3 X-lingual test set (measured by nDCG@10)</p></b>
 </center>

 #### Long Document Retrieval
+This is a very ambitious eval because we have not trained for long context: max_len was 512 for all the models below except BGE-M3, which has an 8192-token context and was fine-tuned for long documents.
 <center>
 <img src="./ar_metrics_4.png" width=150%/>
 #### X-lingual Retrieval
+Except for BGE-M3, all models below are monolingual Arabic models, so they have no notion of any other language. But the table below shows how our model understands Arabic in context with other languages.
+This explains its overall competitive performance when compared to models that are a LOT larger.
 <center>
+<img src="./ar_metrics_5.png" width=120%/>
   <b><p>Table 4: Detailed Arabic retrieval performance on the 3 X-lingual test set (measured by nDCG@10)</p></b>
 </center>