metadata
title: NoteGuard
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false

status: experimental | CI | RAP level: Gold | code style: ruff

NoteGuard: Trust Layer for Clinical AI

NHS clinical notes go in → a safe AI-drafted summary comes out → the model provably never sees a single identifier, with a live re-identification-risk number to prove it.

NoteGuard is a LangGraph agent (Gemini + Tavily) wrapped so that the language model and every tool only ever receive de-identified text. Real identifiers are restored only in the final, clinician-facing answer. The guarantee is enforced by assert_clean(), which raises before any PHI can reach the model.

Project history: NoteGuard began at the {Tech: Europe} London AI Hackathon. This is the post-hackathon 1.0 line: the codebase has been pruned to exactly the components that ship in the deployed app. See CHANGELOG.md.


What it does

messy NHS note  ──►  NoteGuard de-id  ──►  de-identified text
(synthetic)           (NHS-aware rules       + identifiers removed count
                       + vault from CSV)     + residual leakage %
                                │
                  Tavily (NICE/NHS guidance) ──►│
                                                ▼
                                         Gemini drafts
                                         compact eDischarge card
                                         (sees ONLY de-identified text)
                                                │
                         NoteGuard re-id  ◄─────┘
                         (surrogates → real names, clinician only)
                                │
                         Trust panel (de-id correctness only):
                         de-id PASS/FAIL · identifiers replaced
                         residual PII (model input) · reversible

The key technical guarantee: assert_clean() runs before Gemini or Tavily receive anything and raises if any identifier survives, so the model structurally cannot leak what it never saw.
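
The de-id and re-id steps in the diagram amount to a reversible substitution. The sketch below illustrates that idea only. It assumes a hypothetical `[ID_n]` surrogate format and plain string matching against a list of known identifiers; the function names are illustrative and are not the API of src/deid.py, whose recognisers are NHS-aware and more thorough.

```python
import re


def deidentify(text: str, known: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each known identifier with a surrogate token; return text and vault."""
    vault: dict[str, str] = {}
    # Longest first, so a full name is replaced before any shorter overlapping entry.
    ordered = sorted(set(known), key=lambda s: (-len(s), s))
    for i, ident in enumerate(ordered, start=1):
        token = f"[ID_{i}]"  # hypothetical surrogate format
        pattern = re.compile(re.escape(ident), re.IGNORECASE)
        if pattern.search(text):
            text = pattern.sub(token, text)
            vault[token] = ident
    return text, vault


def reidentify(text: str, vault: dict[str, str]) -> str:
    """Restore real values; intended for the clinician-facing answer only."""
    for token, ident in vault.items():
        text = text.replace(token, ident)
    return text


note = "Pt Margaret Okafor (NHS 485 777 3456) admitted post-fall."
clean, vault = deidentify(note, ["Margaret Okafor", "485 777 3456"])
restored = reidentify(clean, vault)
```

Only `clean` would cross the trust boundary in this sketch; `vault` stays on the NoteGuard side and is used once, at the end, to produce `restored`.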


Quickstart

# 1. Clone and create environment
git clone https://github.com/chaeyoonyunakim/noteguard-agent.git
cd noteguard-agent
python -m venv .venv
.venv\Scripts\activate          # Windows
# source .venv/Scripts/activate # Git Bash on Windows
# source .venv/bin/activate     # macOS/Linux

# 2. Install
pip install -e ".[dev]"

# 3. Configure
cp .env.example .env
# fill: GOOGLE_API_KEY  TAVILY_API_KEY  LANGSMITH_API_KEY

# 4. Smoke test (no API keys needed)
python src/deid.py

# 5. Interactive de-id demo (no API keys needed)
pip install -e ".[demo]"
streamlit run streamlit_app.py

# 6. Clinician web UI (full agent)
uvicorn app.api:app --reload --port 8000
# then open: http://localhost:8000

# 7. LangGraph dev server + Agent Chat UI
langgraph dev   # requires pip install -e ".[dev]"
# then open: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024

# 8. LangSmith evals
python -m eval.run_eval

Key files

| File | Purpose |
| --- | --- |
| src/deid.py | Dependency-free de-id core. python src/deid.py runs standalone. |
| src/fetch_dataset.py | Downloads the synthetic dataset into data/ (run once). |
| agent/graph.py | LangGraph graph exposed as noteguard for langgraph dev. |
| app/api.py | FastAPI backend: /, /health, /process, /summarise, /samples, /sample/{id}. |
| app/static/index.html | Single-file clinician web UI (vanilla JS, no build step). |
| streamlit_app.py | Interactive de-id demo; no API keys needed. |
| eval/run_eval.py | LangSmith evals: zero_phi_to_model (must be 1.0) + faithfulness. |
| langgraph.json | Graph manifest for langgraph dev. |
| .env.example | Required environment variables. |
| docs/tool_card.md | Five Safes, bias & fairness, use cases out of scope, DPIA prerequisites. |
| docs/report.md | gov.uk ATRS record (Tier 1 + Tier 2). |

The guarantee (non-negotiable)

deidentify_in → assert_clean() → [ONLY DE-IDENTIFIED TEXT] → Gemini / Tavily
                                                               ↓
                                                          reidentify_out → clinician

assert_clean() raises ValueError if any known identifier or regex pattern (NHS number, email, phone, GMC, NMC, postcode) survives de-identification. The LangSmith zero_phi_to_model evaluator verifies this on every run and must score 1.0.
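
A guard of this kind can be sketched in a few lines. The version below is an illustration under stated assumptions, not the project's implementation: the three regular expressions are simplified stand-ins, and the phone, GMC and NMC patterns are left out because their exact formats are not given here.

```python
import re

# Illustrative patterns only; the real recognisers in src/deid.py may differ.
PATTERNS = {
    "nhs_number": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b", re.IGNORECASE),
}


def assert_clean(text: str, known: list[str]) -> None:
    """Raise ValueError if any known identifier or pattern match survives."""
    lowered = text.lower()
    findings = [("known", k) for k in known if k and k.lower() in lowered]
    for label, pattern in PATTERNS.items():
        findings += [(label, m.group(0)) for m in pattern.finditer(text)]
    if findings:
        # Report only types and a count, so the error message carries no PHI.
        kinds = sorted({label for label, _ in findings})
        raise ValueError(
            f"{len(findings)} identifier(s) survived de-identification: {kinds}"
        )
```

Calling the guard as the last step before any model or tool call means a failure stops the request outright rather than degrading silently.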


Graph pipeline

deidentify_in → agent (Gemini + Tavily) → reidentify_out → compute_trust

| Node | Function |
| --- | --- |
| deidentify_in | NoteGuard.deidentify() + assert_clean(): strips PHI; raises if any identifier survives. |
| agent | create_react_agent (Gemini + Tavily): drafts the eDischarge card; sees only de-identified text. |
| reidentify_out | NoteGuard.reidentify(): restores surrogates → real names for the clinician only. |
| compute_trust | Audits de-id correctness: scan_pii(deid_text) for residual PII the model saw, plus a check for orphaned surrogate tokens (reversibility). |
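
The four stages can be traced end to end without LangGraph or any API key. The sketch below is a plain-Python stand-in, not the graph in agent/graph.py: `run_pipeline`, the `[ID_n]` token format, the single NHS-number pattern and the returned dictionary are all assumptions made for illustration, and the model is any callable that maps text to text.

```python
import re
from typing import Callable

TOKEN = re.compile(r"\[ID_\d+\]")  # hypothetical surrogate format
NHS_NUMBER = re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b")  # illustrative pattern


def run_pipeline(note: str, known: list[str], model: Callable[[str], str]) -> dict:
    # deidentify_in: swap known identifiers for surrogates, then guard the boundary.
    vault: dict[str, str] = {}
    deid = note
    for i, ident in enumerate(known, start=1):
        if ident in deid:
            token = f"[ID_{i}]"
            deid = deid.replace(ident, token)
            vault[token] = ident
    if NHS_NUMBER.search(deid):
        raise ValueError("NHS-number-like pattern survived de-identification")

    # agent: the model callable receives only the de-identified text.
    draft = model(deid)

    # reidentify_out: restore real values for the clinician-facing answer.
    final = draft
    for token, ident in vault.items():
        final = final.replace(token, ident)

    # compute_trust: a surrogate still present could not be resolved by the vault.
    orphaned = sorted(set(TOKEN.findall(final)))
    return {
        "ai_note": deid,
        "discharge_summary": final,
        "identifiers_removed": len(vault),  # distinct identifiers in this sketch
        "leaked_tokens": orphaned,
        "reversible": not orphaned,
    }
```

Because the guard sits between the first and second stage, a note that fails it never reaches the `model` callable at all, which is the property the graph is built around.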

REST API

POST /process

{
  "note": "Pt Margaret Okafor (NHS 485 777 3456) admitted post-fall.",
  "question": "Draft a discharge summary."
}

Response fields:

| Field | Description |
| --- | --- |
| clinician_note | Verbatim input note |
| ai_note | De-identified note the model saw (surrogate tokens) |
| identifiers | Original identifier strings that were redacted |
| discharge_summary | Gemini-drafted compact eDischarge card, re-identified for the clinician |
| metrics.deid_ok | Overall verdict: true only when nothing leaked and every surrogate is reversible |
| metrics.identifiers_removed | Count of PII spans pseudonymised this turn |
| metrics.residual_pii | List of {type, text}: suspected un-redacted PII the model still saw |
| metrics.residual_pii_count | Number of residual-PII findings (0 = de-identified) |
| metrics.reversible | true when every surrogate restores to a real value |
| metrics.leaked_tokens | Orphaned/unresolved surrogate tokens (the reversibility detail) |

A 422 response means assert_clean() detected surviving PHI: the request is rejected before the model sees anything.
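
A client might build the request and gate on the trust metrics roughly as follows. This is a sketch using only the standard library; the helper names are hypothetical, the base URL is whatever host serves the API (http://localhost:8000 in the Quickstart), and the field names are taken from the request and response shapes documented for POST /process.

```python
import json
import urllib.request
from typing import Optional


def build_process_request(base_url: str, note: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a POST /process request with a JSON body."""
    body = json.dumps({"note": note, "question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def usable_summary(payload: dict) -> Optional[str]:
    """Return the summary only when the trust metrics report a clean run."""
    metrics = payload["metrics"]
    if metrics["deid_ok"] and metrics["residual_pii_count"] == 0 and metrics["reversible"]:
        return payload["discharge_summary"]
    return None
```

Sending the request with `urllib.request.urlopen(req)` raises `urllib.error.HTTPError` for a 422, so a caller can catch that exception and check `err.code == 422` to tell a rejected note apart from other failures.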

GET /samples

Paginated list of synthetic notes with optional search and note_type filter.

GET /samples?q=COPD&note_type=Discharge&limit=50
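
The same query can be assembled programmatically. A small sketch, assuming the three parameters shown above (q, note_type, limit) are all optional; the helper name and the base URL are hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode


def samples_url(
    base_url: str,
    q: Optional[str] = None,
    note_type: Optional[str] = None,
    limit: Optional[int] = None,
) -> str:
    """Build a GET /samples URL, leaving out any filter that was not supplied."""
    params = {"q": q, "note_type": note_type, "limit": limit}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url.rstrip('/')}/samples" + (f"?{query}" if query else "")
```

Using `urlencode` rather than string concatenation keeps search terms containing spaces or punctuation correctly escaped.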

GET /sample/{clinical_note_id}

Returns the full note text for a given note ID (used by the note-picker UI).


Components

| Stage | Service | Role |
| --- | --- | --- |
| De-identification | src/deid.py | Dependency-free NHS-aware recognisers; the trust boundary. |
| Reasoning | Google Gemini | Drafts the discharge summary; sees only de-identified text. |
| Grounding | Tavily | Pulls NICE / NHS public guidance; never receives patient text. |
| Orchestration | LangGraph | Wires the de-id → agent → re-id → trust pipeline. |
| Observability | LangSmith | Traces + privacy & faithfulness evals. |

Hugging Face Spaces deployment

The app ships as a Docker Space: FastAPI + vanilla JS UI, served by uvicorn on port 7860. The image bundles Presidio + spaCy NER (en_core_web_md, via the [nlp] extra) so that free-text patient and clinician names with no vault entry are still de-identified; set NOTEGUARD_SPACY_MODEL=en_core_web_lg for higher recall at the cost of a larger image.

Required secrets (Space → Settings → Variables and secrets):

  • GOOGLE_API_KEY
  • TAVILY_API_KEY
  • LANGSMITH_API_KEY (optional; enables tracing)

Auto-deploy: .github/workflows/deploy-hf.yml pushes a fresh snapshot of main onto the Space (chaeyoona/noteguard-agent) on every push, which triggers a Docker rebuild. (A snapshot, i.e. a single orphan commit, is used so that historical binary blobs that HF's git backend rejects are never sent.) The workflow needs an HF_TOKEN repository secret: a write-scoped Hugging Face access token added under Settings → Secrets and variables → Actions. Once the secret is set, trigger the first deploy manually via the workflow's Run workflow button.


Data

NHSEDataScience/synthetic_clinical_notes (Hugging Face, MIT licence, fully synthetic).

python src/fetch_dataset.py   # downloads patients.csv, admissions.csv, synthetic_clinical_notes.csv into data/

load_known_from_csv("data/patients.csv", "data/admissions.csv") builds the identifier vault from both structured tables (patient names and clinician names), so residual leakage is measured against ground truth.
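
Building a vault from two tables can be pictured as below. This is a sketch rather than the project's loader: the function name and the column names first_name, last_name and clinician_name are hypothetical and should be checked against the real CSV headers. In-memory CSV text stands in for the files so the example runs without the dataset.

```python
import csv
import io


def known_from_tables(patients_csv, admissions_csv) -> list[str]:
    """Collect distinct names from two open CSV files (hypothetical column names)."""
    known: set[str] = set()
    for row in csv.DictReader(patients_csv):
        full = f"{row['first_name']} {row['last_name']}".strip()
        if full:
            known.add(full)
    for row in csv.DictReader(admissions_csv):
        name = row["clinician_name"].strip()
        if name:
            known.add(name)
    # Longest first, so longer names are matched before shorter overlapping ones.
    return sorted(known, key=lambda s: (-len(s), s))


# Stand-ins for data/patients.csv and data/admissions.csv.
patients = io.StringIO("first_name,last_name\nMargaret,Okafor\n")
admissions = io.StringIO("clinician_name\nDr A Patel\nDr A Patel\n")
known = known_from_tables(patients, admissions)
```

Because the list comes from structured ground truth rather than from the notes themselves, any of these names found in de-identified text can be counted as a genuine leak.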


Ethics

  • Pseudonymised ≠ anonymous: still personal data under UK GDPR; don't over-claim.
  • Synthetic ≠ real: frame as methodology, not a finished product.
  • Clinician stays in the loop and signs off every summary.
  • See docs/tool_card.md for the full Five Safes mapping and bias & fairness statement.