Update model card with better formatting and related models
README.md
CHANGED
@@ -9,6 +9,8 @@ tags:
 - rag
 - fact-checking
 - token-classification
+- long-context
+- 32k
 datasets:
 - RAGTruth
 base_model: llm-semantic-router/modernbert-base-32k
@@ -23,7 +25,7 @@ model-index:
 name: Hallucination Detection
 dataset:
 type: RAGTruth
-name: RAGTruth
+name: RAGTruth Test Set
 metrics:
 - type: f1
 value: 77.49
@@ -33,7 +35,7 @@ model-index:
 name: Token-Level F1
 ---
 
-# ModernBERT-base-32k Hallucination Detector
+# 🥬 ModernBERT-base-32k Hallucination Detector
 
 A hallucination detection model fine-tuned on RAGTruth dataset using extended 32K context ModernBERT.
 
@@ -49,12 +51,21 @@ This model detects hallucinations in LLM-generated text by classifying each toke
 
 ## Performance
 
+Evaluated on **RAGTruth test set** (2,700 samples):
+
 | Metric | This Model | LettuceDetect BASE | LettuceDetect LARGE |
 |--------|------------|-------------------|---------------------|
-| **Example-Level F1** | **77.49%** | 75.99% | 79.22% |
+| **Example-Level F1** | **77.49%** ✅ | 75.99% | 79.22% |
 | Token-Level F1 | 51.47% | 56.27% | - |
+| Context Window | **32K** | 8K | 8K |
+
+### Key Results
+- ✅ **Beats LettuceDetect BASE** by +1.5% on example-level F1
+- ✅ **4x longer context** (32K vs 8K tokens)
+- ✅ **Same model size** as BASE (~150M parameters)
 
-
+### Related Model
+- [`modernbert-base-32k-haldetect-combined`](https://huggingface.co/llm-semantic-router/modernbert-base-32k-haldetect-combined) - Trained on RAGTruth + HaluEval (48K samples)
 
 ## Usage
 
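
As a quick sanity check on the Key Results added in this commit, the sketch below recomputes the headline gaps from the figures in the Performance table. The variable and function names are illustrative and not part of the model card. It also suggests that the "+1.5%" figure is a difference in percentage points rather than a relative improvement, and that the token-level comparison with LettuceDetect BASE goes the other way.

```python
# Figures copied from the Performance table in this commit (all in percent).
EXAMPLE_F1 = {"this_model": 77.49, "base": 75.99, "large": 79.22}
TOKEN_F1 = {"this_model": 51.47, "base": 56.27}
CONTEXT_K = {"this_model": 32, "base": 8, "large": 8}  # context window, in K tokens


def point_gap(a: float, b: float) -> float:
    """Difference a - b in percentage points, rounded to 2 decimals."""
    return round(a - b, 2)


# Example-level F1: this model vs. LettuceDetect BASE and LARGE.
gain_over_base = point_gap(EXAMPLE_F1["this_model"], EXAMPLE_F1["base"])
gap_to_large = point_gap(EXAMPLE_F1["large"], EXAMPLE_F1["this_model"])

# The same gain expressed as a relative improvement over BASE.
relative_gain = round(100 * gain_over_base / EXAMPLE_F1["base"], 2)

# Token-level F1: negative means this model scores below BASE.
token_gap_vs_base = point_gap(TOKEN_F1["this_model"], TOKEN_F1["base"])

# Context window ratio behind the "4x longer context" claim.
context_ratio = CONTEXT_K["this_model"] / CONTEXT_K["base"]

print(f"example-level F1 vs BASE:  +{gain_over_base} points ({relative_gain}% relative)")
print(f"example-level F1 vs LARGE: -{gap_to_large} points")
print(f"token-level F1 vs BASE:    {token_gap_vs_base} points")
print(f"context window ratio:      {context_ratio}x")
```

Keeping the point gap and the relative gain as separate values makes it explicit which one a summary bullet is quoting.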