mmlw-retrieval-e5-small-onnx
ONNX-exported version of sdadas/mmlw-retrieval-e5-small for use with fastembed (ONNX Runtime backend).
Contents
- model.onnx: dynamic int8 quantized model (~113 MB)
- tokenizer.json, config.json: tokenizer and config (same as source model)
Embedding dimensions
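384 (the same dimension as the sentence-transformers MiniLM family, so a vector store sized for those models needs no schema change; the vectors themselves are not interchangeable and existing content must be re-embedded).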
Usage with fastembed
from fastembed import TextEmbedding
from fastembed.common.model_description import ModelSource, PoolingType
TextEmbedding.add_custom_model(
model="Infojura/mmlw-retrieval-e5-small-onnx",
dim=384,
sources=ModelSource(hf="Infojura/mmlw-retrieval-e5-small-onnx"),
pooling=PoolingType.MEAN,
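normalization=True,
# fastembed looks for "onnx/model.onnx" by default; if model.onnx sits at the
# repo root, pass model_file="model.onnx" here as well.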
)
model = TextEmbedding("Infojura/mmlw-retrieval-e5-small-onnx")
# E5 models expect a "query: " prefix on queries and a "passage: " prefix on documents:
query_embedding = next(model.embed(["query: Jaka jest sygnatura sprawy?"]))
passage_embeddings = list(model.embed(["passage: Sąd Okręgowy w Warszawie..."]))
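Because the model is registered with normalization=True, the dot product of two embeddings equals their cosine similarity. The following is a minimal sketch of prefixing and ranking built on that property; the helper names `as_query`, `as_passage`, and `rank_passages` are hypothetical and not part of fastembed.

```python
import numpy as np

QUERY_PREFIX = "query: "
PASSAGE_PREFIX = "passage: "


def as_query(text: str) -> str:
    """Prepend the E5 query prefix unless it is already present."""
    return text if text.startswith(QUERY_PREFIX) else QUERY_PREFIX + text


def as_passage(text: str) -> str:
    """Prepend the E5 passage prefix unless it is already present."""
    return text if text.startswith(PASSAGE_PREFIX) else PASSAGE_PREFIX + text


def rank_passages(query_vec: np.ndarray, passage_vecs: np.ndarray) -> list[tuple[int, float]]:
    """Return (passage index, score) pairs sorted by descending dot product.

    Assumes all vectors are already L2-normalized, so the score is the cosine.
    """
    scores = passage_vecs @ query_vec
    order = np.argsort(-scores)
    return [(int(i), float(scores[i])) for i in order]
```

With the fastembed model from the snippet above, the inputs could be built as `query_vec = next(model.embed([as_query(q)]))` and `passage_vecs = np.stack(list(model.embed([as_passage(p) for p in passages])))`.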
Export pipeline
uv run python -m optimum.exporters.onnx \
--model sdadas/mmlw-retrieval-e5-small \
--task feature-extraction \
./fp32/
mkdir -p ./int8
uv run python -c "
from onnxruntime.quantization import quantize_dynamic, QuantType
quantize_dynamic('./fp32/model.onnx', './int8/model.onnx', weight_type=QuantType.QInt8)
"
Exported with optimum==2.1.0, optimum-onnx==0.1.0, transformers==4.57.6.
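One way to sanity-check a quantized export is to compare its pooled embeddings with those of the fp32 model on the same inputs. The sketch below is an assumption about how such a check might look, not part of the export pipeline: it takes token-level outputs as NumPy arrays (for example, the hidden states returned by running each model through ONNX Runtime, which is not shown) and applies mean pooling followed by L2 normalization, mirroring the pooling=PoolingType.MEAN and normalization=True settings. The function names are hypothetical.

```python
import numpy as np


def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors over non-padding positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # avoid division by zero
    return summed / counts


def l2_normalize(vectors: np.ndarray) -> np.ndarray:
    """Scale each row to unit length."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)


def cosine_agreement(fp32_tokens: np.ndarray, int8_tokens: np.ndarray,
                     attention_mask: np.ndarray) -> np.ndarray:
    """Per-sentence cosine similarity between fp32 and int8 sentence embeddings."""
    a = l2_normalize(mean_pool(fp32_tokens, attention_mask))
    b = l2_normalize(mean_pool(int8_tokens, attention_mask))
    return (a * b).sum(axis=1)
```

Values close to 1.0 for every sentence suggest that quantization changed the embeddings very little; what counts as acceptable depends on the downstream retrieval task.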
Measured performance (on Polish legal retrieval)
Compared with the baseline sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 (baseline → this model):
- MRR on Polish SAOS corpus: 0.528 → 0.868 (+64%)
- R@1: 0.438 → 0.815 (+86%)
See Infojura/taado-converters ADR-004 for the evaluation harness.
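For readers unfamiliar with the two metrics, here is a minimal sketch of how MRR and R@1 are commonly computed. It is not the harness referenced above. It assumes each query comes with a ranked list of document IDs and a set of relevant IDs, and it treats R@1 as the share of queries whose top-ranked document is relevant (which equals recall at 1 when each query has exactly one relevant document).

```python
def reciprocal_rank(ranked_ids, relevant_ids) -> float:
    """Return 1/rank of the first relevant document, or 0.0 if none is retrieved."""
    relevant = set(relevant_ids)
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs) -> float:
    """runs: list of (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


def recall_at_1(runs) -> float:
    """Fraction of queries whose top-ranked document is relevant."""
    hits = sum(1 for ranked, rel in runs if ranked and ranked[0] in set(rel))
    return hits / len(runs)


# Three toy queries: relevant document ranked first, second, and not retrieved.
runs = [
    (["a", "b", "c"], ["a"]),
    (["b", "a", "c"], ["a"]),
    (["c", "b", "d"], ["a"]),
]
print(mean_reciprocal_rank(runs))  # (1 + 0.5 + 0) / 3
print(recall_at_1(runs))           # 1 of 3 queries
```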
License
Apache 2.0 (inherited from source model sdadas/mmlw-retrieval-e5-small).
Full license: https://www.apache.org/licenses/LICENSE-2.0
Attribution
Original model by Sławomir Dadas:
@misc{dadas2024mmlw,
title={Multilingual and multilabel extension of the Polish language models},
author={Dadas, Sławomir},
year={2024},
}