---
language:
- en
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:942069
- loss:PrecomputedDistillationLoss
base_model: jhu-clsp/ettin-encoder-17m
datasets:
- dleemiller/all-nli-distill
pipeline_tag: text-classification
library_name: sentence-transformers
metrics:
- f1_macro
- f1_micro
- f1_weighted
model-index:
- name: CrossEncoder based on jhu-clsp/ettin-encoder-17m
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: AllNLI dev
      type: AllNLI-dev
    metrics:
    - type: f1_macro
      value: 0.843215238686306
      name: F1 Macro
    - type: f1_micro
      value: 0.8435163046243068
      name: F1 Micro
    - type: f1_weighted
      value: 0.8438547382511594
      name: F1 Weighted
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: AllNLI test
      type: AllNLI-test
    metrics:
    - type: f1_macro
      value: 0.8442865676487733
      name: F1 Macro
    - type: f1_micro
      value: 0.8446784696784697
      name: F1 Micro
    - type: f1_weighted
      value: 0.8449960204914074
      name: F1 Weighted
---

# EttinX Cross-Encoder: Natural Language Inference (NLI)

This cross-encoder classifies sentence pairs as contradiction, neutral, or entailment.
It is drop-in compatible with comparable Sentence Transformers cross-encoders.

To train this model, I added teacher logits from the `dleemiller/ModernCE-large-nli` model to the all-nli dataset (`dleemiller/all-nli-distill`).
Training on these logits significantly improves performance over standard training.

The 17M-parameter architecture is based on ModernBERT and is an excellent candidate for lightweight **CPU inference**.

---

## Features

- **High performing:** Achieves **80.47%** and **86.95%** (Micro F1) on MNLI mismatched and SNLI test.
- **Efficient architecture:** Based on the Ettin-17m encoder design (17M parameters), offering fast inference.
- **Extended context length:** Processes sequences up to 8192 tokens, useful for evaluating LLM outputs.

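
For readers unfamiliar with the Micro F1 metric quoted above: it pools true positives, false positives, and false negatives across all three classes before computing F1, so for single-label classification it equals plain accuracy. The snippet below is a minimal, dependency-free sketch of that calculation. The function names and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
# Illustrative only: toy labels, not real model predictions.
LABELS = ["contradiction", "entailment", "neutral"]


def micro_f1(y_true, y_pred, labels=LABELS):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1  # correctly predicted this class
            elif pred == label:
                fp += 1  # predicted this class, but truth differs
            elif truth == label:
                fn += 1  # missed an example of this class
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = ["entailment", "contradiction", "neutral", "entailment", "neutral"]
y_pred = ["entailment", "contradiction", "entailment", "entailment", "neutral"]

print(micro_f1(y_true, y_pred))  # 0.8
print(accuracy(y_true, y_pred))  # 0.8
```

Every misclassified example counts once as a false positive (for the predicted class) and once as a false negative (for the true class), which is why the two numbers coincide.
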
---

## Performance

| Model                                  | MNLI Mismatched | SNLI Test | Context Length | # Parameters |
|----------------------------------------|-----------------|-----------|----------------|--------------|
| `dleemiller/ModernCE-large-nli`        | **0.9202**      | 0.9110    | 8192           | 395M         |
| `dleemiller/ModernCE-base-nli`         | 0.9034          | 0.9025    | 8192           | 149M         |
| `cross-encoder/deberta-v3-large`       | 0.9049          | 0.9220    | 512            | 435M         |
| `cross-encoder/deberta-v3-base`        | 0.9004          | 0.9234    | 512            | 184M         |
| `cross-encoder/nli-distilroberta-base` | 0.8398          | 0.8838    | 512            | 82M          |
| `dleemiller/EttinX-nli-xxs`            | 0.8047          | 0.8695    | 8192           | 17M          |

---

## Usage

To use EttinX for NLI tasks, load the model with the Hugging Face `sentence-transformers` library:

```python
from sentence_transformers import CrossEncoder

# Load EttinX model
model = CrossEncoder("dleemiller/EttinX-nli-xxs")

# Each input is a (premise, hypothesis) pair; predict returns one score per label
scores = model.predict([
    ('A man is eating pizza', 'A man eats something'),
    ('A black race car starts up in front of a crowd of people.', 'A man is driving down a lonely road.')
])

# Convert scores to labels
label_mapping = ['contradiction', 'entailment', 'neutral']
labels = [label_mapping[score_max] for score_max in scores.argmax(axis=1)]
# ['entailment', 'contradiction']
```

---

## Training Details

### Pretraining

We initialize the `` weights.

Details:
- **Batch size:** 512
- **Learning rate:** 1e-4
- **Attention dropout:** 0.1

### Fine-Tuning

Fine-tuning was performed on the `dleemiller/all-nli-distill` dataset.

### Validation Results

After fine-tuning, the model achieved the following Micro F1 scores on the test sets:
- **MNLI Mismatched:** 0.8047
- **SNLI:** 0.8695

---

## Model Card

- **Architecture:** Ettin-encoder-17m
- **Fine-Tuning Data:** `dleemiller/all-nli-distill`

---

## Thank You

Thanks to the Johns Hopkins team for providing the ModernBERT models, and the Sentence Transformers team for their leadership in transformer encoder models.

---

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{moderncenli2025,
  author = {Miller, D. Lee},
  title = {EttinX NLI: An NLI cross encoder model},
  year = {2025},
  publisher = {Hugging Face Hub},
  url = {https://huggingface.co/dleemiller/EttinX-nli-xxs},
}
```

---

## License

This model is licensed under the [MIT License](LICENSE).