Ghana ADR Detection System

Detects adverse drug reactions (ADRs) in free-text clinical narratives, built on Ghanaian pharmacovigilance data. The models are fine-tuned from PubMedBERT after domain-adaptive pretraining (DAPT) on 128k Ghanaian biomedical sentences.

Model Description

This repository contains the DAPT backbone and two production heads trained on top of it:

Component	Path	Task
DAPT backbone	`dapt-backbone/`	PubMedBERT adapted with masked-language-model (MLM) pretraining on a Ghanaian biomedical corpus (perplexity 6.11 → 4.55)
CLF head (Phase 2b)	`checkpoints/clf_phase2b_{fold}/clf_best/`	Binary: `contains_adr` 0/1
NER head (Phase 7)	`checkpoints/ner_phase7_{fold}/ner_best/`	Token labels: `DRUG`, `ADR`, `SEVERITY`, `PATIENT_DEMO`

Production config (Phase 7 Hybrid): `clf_phase2b_cohort_study` + `ner_phase7_cohort_study`, with a decision threshold of 0.55 on the classifier's `contains_adr` probability.

Performance

Evaluated with Leave-One-Source-Out (LOSO) cross-validation: each of the four source domains is held out in turn as the test set while the model trains on the remaining three. The scores therefore measure generalisation to an unseen source domain across Ghanaian clinical writing styles, not in-domain fit.
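
The LOSO protocol can be sketched in a few lines. This is an illustration, not the repository's training code: `loso_splits` and the toy records are hypothetical, and only the four source names come from this card. scikit-learn's `LeaveOneGroupOut` offers the same splitting logic if the data is already in array form.

```python
from collections import defaultdict

# Source names as they appear in the results tables of this card
SOURCES = ["case_report", "cohort_study", "fda_newsletter", "qualitative_interview"]


def loso_splits(records, source_key="source"):
    """Yield (held_out_source, train_records, test_records), one fold per source."""
    by_source = defaultdict(list)
    for rec in records:
        by_source[rec[source_key]].append(rec)

    for held_out in sorted(by_source):
        test = by_source[held_out]
        # Train on every record whose source differs from the held-out one
        train = [r for src, recs in by_source.items() if src != held_out for r in recs]
        yield held_out, train, test


# Toy records (hypothetical): two sentences per source
records = [{"text": f"sentence {i}", "source": s} for i, s in enumerate(SOURCES * 2)]

for held_out, train, test in loso_splits(records):
    print(f"held out: {held_out:<22} train={len(train)} test={len(test)}")
```

Each fold's test set contains only the held-out source, so no writing style seen at test time was available during training.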

Classification (CLF)

Held-out source	N	F1
case_report	44	0.787
cohort_study	123	0.776
fda_newsletter	99	0.667
qualitative_interview	78	0.667
macro-avg	—	0.724
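
The macro-avg row is the unweighted mean of the four per-fold F1 scores, so the small case_report fold (N = 44) counts as much as cohort_study (N = 123). A quick check using only the values in the table above; the N-weighted mean is computed for contrast and is not a figure reported by this card.

```python
# Per-fold CLF results copied from the table: source -> (N, F1)
clf_results = {
    "case_report": (44, 0.787),
    "cohort_study": (123, 0.776),
    "fda_newsletter": (99, 0.667),
    "qualitative_interview": (78, 0.667),
}


def macro_f1(results):
    """Unweighted mean of per-fold F1: every fold counts equally."""
    return sum(f1 for _, f1 in results.values()) / len(results)


def weighted_f1(results):
    """Mean of per-fold F1 weighted by fold size N."""
    total_n = sum(n for n, _ in results.values())
    return sum(n * f1 for n, f1 in results.values()) / total_n


print(f"macro F1:      {macro_f1(clf_results):.3f}")
print(f"N-weighted F1: {weighted_f1(clf_results):.3f}")
```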

Named Entity Recognition (NER)

Held-out source	N	F1	DRUG F1	ADR F1
case_report	44	0.598	0.862	0.545
cohort_study	123	0.785	0.823	0.884
fda_newsletter	99	0.587	0.626	0.634
qualitative_interview	78	0.650	0.560	0.842
macro-avg	—	0.655	0.718	0.727

Batch regression against 95 curated hard cases (Pidgin idioms, dialect, regulatory register, clinical shorthand, minimal pairs): 85/95 pass (89.5%).

Training Data

Built from four Ghanaian pharmacovigilance source domains:

Source	Type
Ghana FDA DrugLens newsletters (5 issues)	PDF — regulatory
Ghana FDA Annual Report 2023 + ADR Guide	PDF — regulatory
PMC open-access case reports & cohort studies (9 articles)	JATS XML — clinical
Patient ADR interview transcripts	Qualitative — community

Gold dataset: 2,870 annotated sentences
Silver dataset: 2,105+ records (DailyMed weak supervision, DrugLens NER, OpenFDA ICSR, synthetic curriculum)

How to Use

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import AutoModelForTokenClassification
import torch

# Load DAPT backbone tokenizer
tokenizer = AutoTokenizer.from_pretrained("iamjamaal/ghana-adr-detection", subfolder="dapt-backbone")

# Load CLF head (production fold: cohort_study)
clf_model = AutoModelForSequenceClassification.from_pretrained(
    "iamjamaal/ghana-adr-detection",
    subfolder="checkpoints/clf_phase2b_cohort_study/clf_best"
)

# Load NER head (production fold: cohort_study)
ner_model = AutoModelForTokenClassification.from_pretrained(
    "iamjamaal/ghana-adr-detection",
    subfolder="checkpoints/ner_phase7_cohort_study/ner_best"
)

text = "Patient developed severe oculogyric crisis after starting haloperidol."

# CLF inference
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    logits = clf_model(**inputs).logits
prob_adr = torch.softmax(logits, dim=-1)[0][1].item()
contains_adr = prob_adr >= 0.55
print(f"ADR: {contains_adr} (p={prob_adr:.3f})")
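
The snippet above loads `ner_model` but only runs the classifier. A minimal sketch of token-level inference with the NER head follows. It assumes the label names live in the checkpoint's `config.id2label` and that they may carry `B-`/`I-` prefixes, which this card does not confirm. `group_entities` and `extract_entities` are hypothetical helpers, not part of the repository, and the demo tokens and labels are hand-written for illustration, not real model output.

```python
def group_entities(tokens, labels):
    """Merge consecutive tokens of the same entity type into (type, text) spans."""
    spans, cur_type, cur_words = [], None, []

    def flush():
        nonlocal cur_type, cur_words
        if cur_type is not None:
            spans.append((cur_type, " ".join(cur_words)))
        cur_type, cur_words = None, []

    for tok, lab in zip(tokens, labels):
        if tok in ("[CLS]", "[SEP]", "[PAD]"):
            continue  # skip BERT special tokens
        if tok.startswith("##"):
            # WordPiece continuation: glue onto the previous word of an open span
            if cur_words:
                cur_words[-1] += tok[2:]
            continue
        # Strip an optional B-/I- prefix; "O" means no entity
        ent = None if lab == "O" else lab.split("-", 1)[-1]
        if ent is None:
            flush()
        elif ent != cur_type or lab.startswith("B-"):
            flush()
            cur_type, cur_words = ent, [tok]
        else:
            cur_words.append(tok)
    flush()
    return spans


def extract_entities(text, tokenizer, ner_model):
    """Run the NER head on one sentence and return grouped entity spans."""
    import torch  # imported lazily so group_entities works without torch

    enc = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        logits = ner_model(**enc).logits
    pred_ids = logits.argmax(dim=-1)[0].tolist()
    tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist())
    labels = [ner_model.config.id2label[i] for i in pred_ids]
    return group_entities(tokens, labels)


# Hand-written tokens and labels (illustrative only, not real model output)
demo_tokens = ["[CLS]", "patient", "developed", "severe", "oculogyric", "crisis",
               "after", "starting", "halo", "##peri", "##dol", ".", "[SEP]"]
demo_labels = ["O", "O", "O", "B-SEVERITY", "B-ADR", "I-ADR",
               "O", "O", "B-DRUG", "I-DRUG", "I-DRUG", "O", "O"]
print(group_entities(demo_tokens, demo_labels))
```

With the objects loaded earlier, the call would be `extract_entities(text, tokenizer, ner_model)`. Inspect `ner_model.config.id2label` first to confirm the actual label scheme.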

Repo Structure

dapt-backbone/                          # DAPT backbone (config + safetensors)
checkpoints/
  clf_phase2b_case_report/clf_best/     # CLF checkpoint — case_report fold
  clf_phase2b_cohort_study/clf_best/    # CLF checkpoint — cohort_study fold ← production
  clf_phase2b_fda_newsletter/clf_best/
  clf_phase2b_qualitative_interview/clf_best/
  ner_phase7_case_report/ner_best/      # NER checkpoint — case_report fold
  ner_phase7_cohort_study/ner_best/     # NER checkpoint — cohort_study fold ← production
  ner_phase7_fda_newsletter/ner_best/
  ner_phase7_qualitative_interview/ner_best/
  ner_phase7_qualitative_interview_seed/ner_best/
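
Because the checkpoints follow a regular `checkpoints/{head}_{phase}_{fold}/{head}_best` pattern, the `subfolder` argument for any fold can be built programmatically. A small sketch: `checkpoint_subfolder` is a hypothetical helper, and the fold and phase names are taken from the listing above. The extra `ner_phase7_qualitative_interview_seed` directory is not covered.

```python
FOLDS = ["case_report", "cohort_study", "fda_newsletter", "qualitative_interview"]
PHASES = {"clf": "phase2b", "ner": "phase7"}
PRODUCTION_FOLD = "cohort_study"  # marked as production in the listing above


def checkpoint_subfolder(head, fold=PRODUCTION_FOLD):
    """Return the repo-relative subfolder for a head ('clf' or 'ner') and fold."""
    if head not in PHASES:
        raise ValueError(f"unknown head {head!r}; expected one of {sorted(PHASES)}")
    if fold not in FOLDS:
        raise ValueError(f"unknown fold {fold!r}; expected one of {FOLDS}")
    return f"checkpoints/{head}_{PHASES[head]}_{fold}/{head}_best"


# List every CLF and NER checkpoint covered by the pattern
for head in PHASES:
    for fold in FOLDS:
        print(checkpoint_subfolder(head, fold))
```

The returned string can be passed as `subfolder=` to `from_pretrained`, for example to compare a non-production fold against the production one.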

Code & Demo

Pipeline code: github.com/iamjamaal/ghana-pharmacovigilance-ai
Live demo: Flask app with single-sentence analysis, batch upload, and Yellow Card–style reporting

Limitations

Trained on Ghanaian pharmacovigilance sources; performance may degrade on clinical text from other regions.
NER F1 on SEVERITY and PATIENT_DEMO is lower than on DRUG and ADR because these labels are sparsely annotated.
Training on Ghanaian Pidgin and dialect constructions improves batch regression scores, but those gains may not carry over to other West African Pidgin variants.

Citation

@misc{ghana-adr-2026,
  title  = {Ghana ADR Detection System},
  author = {Nabila, Noah Jamal},
  year   = {2026},
  url    = {https://huggingface.co/iamjamaal/ghana-adr-detection}
}

License

Code: MIT | Model weights: MIT | Dataset annotations: CC-BY-4.0

Base model

microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext
