---
license: llama3
language:
  - en
library_name: peft
pipeline_tag: text-generation
base_model: vineetdaniels/NYXMed-V17-Merged
tags:
  - medical
  - radiology
  - medical-coding
  - icd-10
  - cpt
  - llama-3
  - lora
  - peft
  - healthcare
---

NYXMed V18 — Radiology Coding LoRA Adapter

LoRA adapter trained on top of vineetdaniels/NYXMed-V17-Merged, targeting primary-ICD accuracy with proximity-ranked retrieval candidates.

For a deployable single model, use vineetdaniels/NYXMed-V18-Merged.

Highlights

Best eval_loss: 0.0710 at step 1,400 (training early-stopped at step 1,700)
Trained on 59,170 coder-verified examples, weighted toward primary-ICD corrections (family-swaps: 47%)
Built on the proximity-ranking retrieval fix, which raised recall@10 of the correct primary ICD code by 21.5 pp on previously wrong records. The adapter must be deployed together with the matching preprocessor change.
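
For readers less familiar with the retrieval metric above: recall@10 can be read as the share of records whose correct primary code appears among the top 10 ranked candidates, and a "+21.5 pp" change is the difference between two such shares in percentage points. A minimal sketch follows; the function name and the records are hypothetical, and the codes are illustrative only, not taken from the training or evaluation data.

```python
def recall_at_k(candidate_lists, gold_codes, k=10):
    """Fraction of records whose gold primary code is among the top-k candidates."""
    if len(candidate_lists) != len(gold_codes):
        raise ValueError("candidate_lists and gold_codes must have the same length")
    if not gold_codes:
        return 0.0
    hits = sum(gold in cands[:k] for cands, gold in zip(candidate_lists, gold_codes))
    return hits / len(gold_codes)


# Hypothetical records: ranked candidate codes per report, plus the verified primary.
candidates = [
    ["R07.9", "R06.02", "J18.9"],
    ["M54.5", "M54.2", "M25.511"],
    ["S52.501A", "S62.101A"],
]
gold = ["J18.9", "M54.2", "S82.101A"]

# Two of the three gold codes appear somewhere in their candidate list.
print(recall_at_k(candidates, gold, k=10))
# None of them is ranked first, so recall@1 is zero.
print(recall_at_k(candidates, gold, k=1))
```

A better ranking moves the correct code higher in each list, which raises recall at small k without changing the candidate pool.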

Training


| Setting | Value |
|---|---|
| Base | `vineetdaniels/NYXMed-V17-Merged` |
| LoRA | r=64, α=128, dropout=0.05, targets q/k/v/o/gate/up/down_proj |
| Examples | 59,170 (weighted) |
| Effective batch | 32 |
| Hardware | 4× H200, ~10.9 h |
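
The hyperparameters above could map onto a `peft` configuration roughly as follows. This is a sketch, not the actual training script: the target names are the shorthand expanded to the usual Llama module names, `task_type` is inferred from the `text-generation` pipeline tag, and the per-device batch size and gradient accumulation steps are hypothetical values chosen only so that they multiply out to the stated effective batch of 32.

```python
# Values from the Training table; anything marked "assumption" is not from the card.
lora_kwargs = {
    "r": 64,
    "lora_alpha": 128,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: inferred from pipeline_tag
}
# With peft installed, this dict can be passed as: LoraConfig(**lora_kwargs)

# Standard LoRA scales the low-rank update by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]

# Effective batch = per-device batch x gradient accumulation steps x number of GPUs.
num_gpus = 4              # from the Hardware row
per_device_batch = 2      # assumption (hypothetical)
grad_accum_steps = 4      # assumption (hypothetical)
effective_batch = per_device_batch * grad_accum_steps * num_gpus

print(scaling, effective_batch)  # 2.0 32
```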

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the V17 merged base model in bfloat16, placed automatically across available devices.
base = AutoModelForCausalLM.from_pretrained(
    "vineetdaniels/NYXMed-V17-Merged", torch_dtype=torch.bfloat16, device_map="auto"
)
tok = AutoTokenizer.from_pretrained("vineetdaniels/NYXMed-V18-Model")
# Attach the V18 LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base, "vineetdaniels/NYXMed-V18-Model").eval()
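
Once the adapter is loaded, generation could look roughly like the sketch below. The helper name `generate_text` is hypothetical, and this card does not specify the prompt format: the prompt (report text plus proximity-ranked candidates) is assumed to come from the matching preprocessor, so it is left as an input here.

```python
def generate_text(model, tok, prompt, max_new_tokens=128):
    """Greedy-decode a completion for `prompt` and return only the newly generated text."""
    # Tokenize and move the tensors to the model's device.
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    # Deterministic decoding; sampling is disabled so repeated calls give the same codes.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens so only the completion is decoded.
    prompt_len = inputs["input_ids"].shape[1]
    return tok.decode(output_ids[0][prompt_len:], skip_special_tokens=True)


# Usage, assuming `model` and `tok` were loaded as shown above and `prompt`
# was built by the matching preprocessor (format not documented in this card):
# print(generate_text(model, tok, prompt))
```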

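eval_loss is measured on V18's own held-out split, so it is not directly comparable to V17's eval_loss. The authoritative metric is primary-ICD accuracy on a common held-out production set. Intended for radiology coding only, in a review-then-accept workflow where suggested codes are reviewed before they are accepted.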
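
Comparing two model versions on primary-ICD accuracy might be sketched as below: both are scored against the same gold labels on the same records, which is what makes the numbers comparable. The function names, the normalization rule, and all codes and predictions are hypothetical; no real evaluation results are shown.

```python
def normalize_code(code):
    """Compare codes case-insensitively and without the decimal point."""
    return code.strip().upper().replace(".", "")


def primary_icd_accuracy(predicted, gold):
    """Share of records where the predicted primary code equals the gold primary code."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must cover the same records")
    if not gold:
        return 0.0
    correct = sum(normalize_code(p) == normalize_code(g) for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical common held-out set and predictions from two model versions.
gold = ["J18.9", "S52.501A", "R91.8", "M25.511"]
model_a = ["J18.9", "S52.509A", "R91.1", "M25.511"]
model_b = ["j189", "S52.501A", "R91.1", "M25.511"]

acc_a = primary_icd_accuracy(model_a, gold)
acc_b = primary_icd_accuracy(model_b, gold)
print(f"model_a: {acc_a:.2%}, model_b: {acc_b:.2%}, delta: {(acc_b - acc_a) * 100:+.1f} pp")
```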