dleemiller committed on
Commit e563719 · verified · 1 Parent(s): 35a01bd

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -62,12 +62,12 @@ drop-in compatibility with comparable sentence transformers cross encoders.
  To train this model, I added teacher logits to the all-nli dataset `dleemiller/all-nli-distill` from the
  `dleemiller/ModernCE-large-nli` model. This significantly improves performance above standard training.
 
- This 17m architecture is based on ModernBERT and is an excellent candidate for lightweight CPU inference.
+ This 17m architecture is based on ModernBERT and is an excellent candidate for lightweight **CPU inference**.
 
  ---
 
  ## Features
- - **High performing:** Achieves **80.19%** and 86.50% (Micro F1) on MNLI mismatched and SNLI test.
+ - **High performing:** Achieves **80.19%** and **86.50%** (Micro F1) on MNLI mismatched and SNLI test.
  - **Efficient architecture:** Based on the Ettin-17m encoder design (17M parameters), offering faster inference speeds.
  - **Extended context length:** Processes sequences up to 8192 tokens, great for LLM output evals.
 
@@ -124,7 +124,7 @@ Details:
  Fine-tuning was performed on the `dleemiller/all-nli-distill` dataset.
 
  ### Validation Results
- The model achieved the following test set performance after fine-tuning:
+ The model achieved the following test set micro f1 performance after fine-tuning:
  - **MNLI Unmatched:** 0.8019
  - **SNLI:** 0.8650
 
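
The change above labels the reported scores as micro F1. As a rough illustration of what that metric measures, the sketch below computes micro-averaged F1 in plain Python and compares it with accuracy. For single-label classification such as three-way NLI, each wrong prediction counts as one false positive and one false negative, so the two values coincide. The label strings and example predictions are made up for illustration and are not taken from the model or its evaluation data.

```python
from collections import Counter


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    tp_sum, fp_sum, fn_sum = sum(tp.values()), sum(fp.values()), sum(fn.values())
    denom = 2 * tp_sum + fp_sum + fn_sum
    return 2 * tp_sum / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels and predictions, for illustration only.
y_true = ["entailment", "neutral", "contradiction", "neutral", "entailment"]
y_pred = ["entailment", "contradiction", "contradiction", "neutral", "neutral"]

print(micro_f1(y_true, y_pred))
print(accuracy(y_true, y_pred))
```

Under this reading, a micro F1 of 0.8019 on a single-label test set corresponds to roughly 80.19% of examples being classified correctly.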