dleemiller/EttinX-nli-xxs

Text Classification

sentence-transformers

Generated from Trainer

dataset_size:942069

loss:PrecomputedDistillationLoss

Eval Results (legacy)

text-embeddings-inference

dleemiller committed on Sep 8, 2025

Commit e563719 · verified · 1 Parent(s): 35a01bd

Update README.md

Files changed (1)

README.md +3 -3

@@ -62,12 +62,12 @@ drop-in compatibility with comparable sentence transformers cross encoders.
 To train this model, I added teacher logits to the all-nli dataset `dleemiller/all-nli-distill` from the
 `dleemiller/ModernCE-large-nli` model. This significantly improves performance above standard training.
-This 17m architecture is based on ModernBERT and is an excellent candidate for lightweight CPU inference.
+This 17m architecture is based on ModernBERT and is an excellent candidate for lightweight **CPU inference**.
 ---
 ## Features
-- **High performing:** Achieves **80.19%** and 86.50% (Micro F1) on MNLI mismatched and SNLI test.
+- **High performing:** Achieves **80.19%** and **86.50%** (Micro F1) on MNLI mismatched and SNLI test.
 - **Efficient architecture:** Based on the Ettin-17m encoder design (17M parameters), offering faster inference speeds.
 - **Extended context length:** Processes sequences up to 8192 tokens, great for LLM output evals.
@@ -124,7 +124,7 @@ Details:
 Fine-tuning was performed on the `dleemiller/all-nli-distill` dataset.
 ### Validation Results
-The model achieved the following test set performance after fine-tuning:
+The model achieved the following test set micro f1 performance after fine-tuning:
 - **MNLI Unmatched:** 0.8019
 - **SNLI:** 0.8650
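
The change above labels the reported scores as micro F1. For single-label classification such as three-way NLI, micro-averaged F1 coincides with plain accuracy, because each wrong prediction counts as exactly one false positive and one false negative. A minimal sketch of that relationship follows; the label lists are toy values chosen for illustration, not data from this model's evaluation, and the helper names are hypothetical.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for single-label multi-class predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct: true positive for that class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    # Micro averaging pools the counts over all classes before computing F1.
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels (illustrative assumption, not the model's outputs).
y_true = ["entailment", "neutral", "contradiction", "neutral", "entailment"]
y_pred = ["entailment", "contradiction", "contradiction", "neutral", "neutral"]

print(micro_f1(y_true, y_pred), accuracy(y_true, y_pred))
```

Both values agree here, which is why a micro F1 of 0.8019 on a single-label task can be read the same way as an accuracy of 80.19%.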