Text Generation
PEFT
TensorBoard
Safetensors
English
medical
radiology
medical-coding
icd-10
cpt
llama-3
llama-3-70b
lora
healthcare
clinical
conversational
Instructions to use vineetdaniels/NYXMed-V17-Model with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use vineetdaniels/NYXMed-V17-Model with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("vineetdaniels/NYXMed-V16-Model")
model = PeftModel.from_pretrained(base_model, "vineetdaniels/NYXMed-V17-Model")
- Notebooks
- Google Colab
- Kaggle
Add real 500-record validation results: ICD recall 83.4%, CPT 90.6%, Modifier 97.0%
README.md
CHANGED
@@ -39,13 +39,14 @@ V17 is a **LoRA adapter** trained on top of [`vineetdaniels/NYXMed-V16-Model`](h
 
 | Metric | V16 (base) | **V17 (this)** | Δ |
 |---|---|---|---|
+| **CPT exact match** | ~85% | **90.6%** | **+5.6 pts** |
+| **Modifier exact match** | ~95% | **97.0%** | +2.0 pts |
+| **Mean ICD recall** | ~65% | **83.4%** | **+18.4 pts** |
 | **Final eval_loss** | ~0.25 | **0.0824** | **−67%** |
-| Base model | Llama-3-70B-Instruct | Same | — |
-| LoRA params trained | r=64, α=128 | r=64, α=128 | — |
 | Train examples | ~67K | **113,032** | +69% |
 | Adds Exam Description + Reason | ❌ | ✅ | — |
 
-
+V17 was trained to push **ICD recall above 80%** without regressing CPT, and both goals were achieved. The full metric breakdown is in **Evaluation** below.
 
 ---
 
@@ -90,16 +91,21 @@ Early stopping triggered at step 3,900 (1.1 epochs); `load_best_model_at_end=Tru
 
 ### Domain-specific accuracy
 
+Measured on **n = 500** randomly sampled held-out radiology reports (greedy decoding, batch=4, 4×H200):
+
 | Metric | V17 |
 |---|---|
-| CPT exact match |
-| Primary CPT match |
-| Modifier exact match |
-| ICD-10 exact match
-| ICD-10
-|
-
-
+| **CPT exact match** | **90.60%** |
+| Primary CPT match | 91.40% |
+| **Modifier exact match** | **97.00%** |
+| **ICD-10 exact match** (full set) | 69.60% |
+| ICD-10 any-overlap | 90.40% |
+| ICD-10 root-overlap (`A99.x`-level) | 92.20% |
+| **Mean ICD recall** | **83.37%** |
+| Mean ICD precision | 85.05% |
+| All-three exact (CPT + MOD + full ICD set) | 64.00% |
+
+V17's primary training objective, **raising ICD recall above 80%**, was met (83.37%), while CPT (90.6%) and Modifier (97.0%) both cleared the no-regression floor set by V16 (~85% and ~95%). The root-overlap metric shows that V17 identifies the correct *family* of ICD codes in 92.2% of reports, with most remaining errors being specificity refinements (e.g. predicting `M25.5` instead of `M25.511`) rather than wrong-diagnosis errors.
 
 ---
 
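The README names several set-based ICD-10 metrics (exact match, any-overlap, root-overlap, recall, precision) but does not show how they are scored. The sketch below is one plausible reading, not the author's evaluation script. It assumes each report has a predicted and a gold set of ICD-10 codes, that the "root" of a code is the part before the dot, and that an empty set scores 0.0. The function names `icd_root`, `icd_metrics`, and `aggregate`, and the three sample records, are hypothetical.

```python
def icd_root(code: str) -> str:
    """Return the category part of an ICD-10 code, e.g. 'M25.511' -> 'M25'.

    Assumption: 'root' means everything before the dot.
    """
    return code.strip().upper().split(".")[0]


def icd_metrics(pred: list[str], gold: list[str]) -> dict[str, float]:
    """Score one report's predicted ICD-10 codes against the gold codes."""
    p = {c.strip().upper() for c in pred}
    g = {c.strip().upper() for c in gold}
    hits = p & g
    root_hits = {icd_root(c) for c in p} & {icd_root(c) for c in g}
    return {
        "exact": float(p == g),              # full set must match
        "any_overlap": float(bool(hits)),    # at least one identical code
        "root_overlap": float(bool(root_hits)),  # at least one shared category
        # Assumption: an empty gold or predicted set scores 0.0.
        "recall": len(hits) / len(g) if g else 0.0,
        "precision": len(hits) / len(p) if p else 0.0,
    }


def aggregate(records: list[tuple[list[str], list[str]]]) -> dict[str, float]:
    """Average each per-report metric over all (pred, gold) pairs."""
    rows = [icd_metrics(pred, gold) for pred, gold in records]
    return {key: sum(r[key] for r in rows) / len(rows) for key in rows[0]}


# Illustrative records only; these are not taken from the validation set.
records = [
    (["M25.5"], ["M25.511"]),             # right family, too unspecific
    (["R07.9", "I10"], ["R07.9", "I10"]), # exact match
    (["J18.9"], ["J18.9", "R05.9"]),      # one gold code missed
]
summary = aggregate(records)
for name, value in summary.items():
    print(f"{name}: {value:.2%}")
```

Under these definitions the `M25.5` versus `M25.511` case counts toward root-overlap but contributes zero to exact match, any-overlap, recall, and precision, which is consistent with root-overlap (92.20%) sitting above any-overlap (90.40%) in the table.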