prithivida/miniDense_arabic_v1

Sentence Similarity

sentence-transformers

feature-extraction

passage-retrieval

knowledge-distillation

middle-training

text-embeddings-inference


prithivida committed on Aug 9, 2024

Commit 0bd3af2 · verified · 1 Parent(s): 835ba8f

Update README.md

Files changed (1)

README.md +4 -4


@@ -51,7 +51,7 @@ pipeline_tag: sentence-similarity
     - [With Sentence Transformers:](#with-sentence-transformers)
     - [With Huggingface Transformers:](#with-huggingface-transformers)
 - [FAQs](#faqs)
-    - [How can I reduce overall inference cost ?](#how-can-i-reduce-overall-inference-cost)
+    - [How can I reduce overall inference cost?](#how-can-i-reduce-overall-inference-cost)
     - [How do I reduce vector storage cost?](#how-do-i-reduce-vector-storage-cost)
     - [How do I offer hybrid search to improve accuracy?](#how-do-i-offer-hybrid-search-to-improve-accuracy)
 - [MTEB numbers](#mteb-numbers)
@@ -161,13 +161,13 @@ for query, query_embedding in zip(queries, query_embeddings):
 # FAQs:
-#### How can I reduce overall inference cost ?
+#### How can I reduce overall inference cost?
 - You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashEmbed](https://github.com/PrithivirajDamodaran/flashembed) library.
-#### How do I reduce vector storage cost ?
+#### How do I reduce vector storage cost?
 [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)
-#### How do I offer hybrid search to improve accuracy ?
+#### How do I offer hybrid search to improve accuracy?
 MIRACL paper shows simply combining BM25 is a good starting point for a Hybrid option:
 The below numbers are with mDPR model, but miniDense_arabic_v1 should give a even better hybrid performance.
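
Note on the storage-cost FAQ touched by this change: the README only links to the quantisation blog post, so the following is a minimal NumPy sketch of what binary and scalar (int8) quantisation of embeddings can look like. It is not the model author's code and not the method from the linked post. The function names, the random stand-in vectors, and the 384-dimension size are all assumptions made for illustration; real embeddings would come from the model.

```python
import numpy as np


def binary_quantize(embeddings: np.ndarray) -> np.ndarray:
    """Keep only the sign of each dimension, packed 8 dimensions per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


def scalar_quantize(embeddings: np.ndarray, calibration: np.ndarray) -> np.ndarray:
    """Map each dimension linearly onto int8 using per-dimension min/max
    taken from a calibration set (hypothetical helper, not a library API)."""
    lo = calibration.min(axis=0)
    hi = calibration.max(axis=0)
    # Guard against constant dimensions to avoid division by zero.
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    quantized = np.round((embeddings - lo) / scale) - 128
    return np.clip(quantized, -128, 127).astype(np.int8)


def hamming_distance(a: np.ndarray, b: np.ndarray) -> int:
    """Number of differing bits between two packed binary vectors."""
    return int(np.unpackbits(np.bitwise_xor(a, b)).sum())


# Stand-in data: 4 random float32 vectors with an assumed dimension of 384.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(4, 384)).astype(np.float32)

binary_vectors = binary_quantize(vectors)
int8_vectors = scalar_quantize(vectors, calibration=vectors)

print(vectors.nbytes, int8_vectors.nbytes, binary_vectors.nbytes)
```

Under these assumptions the int8 copy takes a quarter of the float32 storage and the packed binary copy takes one thirty-second. Retrieval over binary vectors would typically rank by Hamming distance and then rescore a shortlist with the full-precision vectors; whether the accuracy trade-off is acceptable for this model would need to be measured.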