title: NoteGuard
emoji: 🏥
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
NoteGuard: Trust Layer for Clinical AI
NHS clinical notes go in and a safe AI-drafted summary comes out. The model provably never sees a single identifier, and a live re-identification-risk number backs that up.
NoteGuard is a LangGraph agent (Gemini + Tavily) wrapped so that the language
model and every tool only ever receive de-identified text. Real identifiers
are restored only in the final, clinician-facing answer. The guarantee is
enforced by assert_clean(), which raises before any PHI can reach the model.
Project history: NoteGuard began at the {Tech: Europe} London AI Hackathon. This is the post-hackathon
1.0 line: the codebase has been pruned to exactly the components that ship in the deployed app. See CHANGELOG.md.
What it does
messy NHS note ──► NoteGuard de-id ──► de-identified text
(synthetic)        (NHS-aware rules    + identifiers removed count
                    + vault from CSV)  + residual leakage %
                                              │
                Tavily (NICE/NHS guidance) ──►│
                                              ▼
                                       Gemini drafts
                                       compact eDischarge card
                                       (sees ONLY de-identified text)
                                              │
                   NoteGuard re-id ◄──────────┘
                   (surrogates → real names, clinician only)
                          │
                   Trust panel (de-id correctness only):
                   de-id PASS/FAIL · identifiers replaced
                   residual PII (model input) · reversible
The key technical guarantee: assert_clean() raises before Gemini or Tavily
receive anything, so the model structurally cannot leak what it never saw.
Quickstart
# 1. Clone and create environment
git clone https://github.com/chaeyoonyunakim/noteguard-agent.git
cd noteguard-agent
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/Scripts/activate # bash
# source .venv/bin/activate # macOS/Linux
# 2. Install
pip install -e ".[dev]"
# 3. Configure
cp .env.example .env
# fill: GOOGLE_API_KEY TAVILY_API_KEY LANGSMITH_API_KEY
# 4. Smoke test (no API keys needed)
python src/deid.py
# 5. Interactive de-id demo (no API keys needed)
pip install -e ".[demo]"
streamlit run streamlit_app.py
# 6. Clinician web UI (full agent)
uvicorn app.api:app --reload --port 8000
# then open: http://localhost:8000
# 7. LangGraph dev server + Agent Chat UI
langgraph dev # requires pip install -e ".[dev]"
# then open: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:2024
# 8. LangSmith evals
python -m eval.run_eval
Key files
| File | Purpose |
|---|---|
| src/deid.py | Dependency-free de-id core. python src/deid.py runs standalone. |
| src/fetch_dataset.py | Downloads the synthetic dataset into data/ (run once). |
| agent/graph.py | LangGraph graph exposed as noteguard for langgraph dev. |
| app/api.py | FastAPI backend: /, /health, /process, /summarise, /samples, /sample/{id}. |
| app/static/index.html | Single-file clinician web UI (vanilla JS, no build step). |
| streamlit_app.py | Interactive de-id demo; no API keys needed. |
| eval/run_eval.py | LangSmith evals: zero_phi_to_model (must be 1.0) + faithfulness. |
| langgraph.json | Graph manifest for langgraph dev. |
| .env.example | Required environment variables. |
| docs/tool_card.md | Five Safes, bias & fairness, use cases out of scope, DPIA prerequisites. |
| docs/report.md | gov.uk ATRS record (Tier 1 + Tier 2). |
The guarantee (non-negotiable)
deidentify_in → assert_clean() → [ONLY DE-IDENTIFIED TEXT] → Gemini / Tavily
                                                                   ↓
                                                             reidentify_out → clinician
assert_clean() raises ValueError if any known identifier or regex pattern (NHS
number, email, phone, GMC, NMC, postcode) survives de-identification. The LangSmith
zero_phi_to_model evaluator verifies this on every run and must score 1.0.
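
The check described above can be sketched as a small guard function. This is a minimal illustration, not the code in src/deid.py: the function name assert_clean_sketch, its signature, the three regexes, and the bracketed surrogate format in the usage note are all assumptions, and only a subset of the listed pattern types is covered.

```python
import re

# Hypothetical patterns for illustration only; the project's own
# recognisers live in src/deid.py and may differ.
PII_PATTERNS = {
    "nhs_number": re.compile(r"\b\d{3}[ -]?\d{3}[ -]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "uk_postcode": re.compile(r"\b[A-Z]{1,2}\d[A-Z\d]?\s?\d[A-Z]{2}\b"),
}


def assert_clean_sketch(text: str, known_identifiers: set[str]) -> None:
    """Raise ValueError if a known identifier or a PII-shaped pattern survives."""
    lowered = text.lower()
    for identifier in known_identifiers:
        if identifier and identifier.lower() in lowered:
            # The message deliberately omits the identifier itself.
            raise ValueError("known identifier survived de-identification")
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(text):
            raise ValueError(f"{label} pattern survived de-identification")
```

Calling assert_clean_sketch("Pt [PERSON_1] admitted post-fall.", {"Margaret Okafor"}) returns silently, while passing the original note raises. Keeping the offending value out of the error message avoids leaking PHI into logs or traces.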
Graph pipeline
deidentify_in → agent (Gemini + Tavily) → reidentify_out → compute_trust
| Node | Function |
|---|---|
| deidentify_in | NoteGuard.deidentify() + assert_clean(): strips PHI; raises if any identifier survives. |
| agent | create_react_agent (Gemini + Tavily): drafts the eDischarge card; sees only de-identified text. |
| reidentify_out | NoteGuard.reidentify(): restores surrogates → real names for the clinician only. |
| compute_trust | Audits de-id correctness: scan_pii(deid_text) for residual PII the model saw, plus orphaned surrogate tokens for reversibility. |
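
To make the data flow concrete, here is a dependency-free sketch of the four stages as plain functions. It does not use LangGraph, Gemini, or Tavily: the agent is a stand-in, the [PERSON_n] surrogate format is an assumption, and the trust step covers only the orphaned-token half of the audit (not scan_pii).

```python
import re

# Assumed surrogate format for this sketch, e.g. [PERSON_1].
TOKEN = re.compile(r"\[[A-Z]+_\d+\]")


def deidentify_in(note: str, known: list[str]) -> tuple[str, dict[str, str]]:
    """Swap each known identifier for a surrogate; return text and vault."""
    vault: dict[str, str] = {}
    text = note
    # Longest first, so a full name is replaced before any shorter part of it.
    for i, real in enumerate(sorted(known, key=len, reverse=True), start=1):
        token = f"[PERSON_{i}]"
        if real in text:
            text = text.replace(real, token)
            vault[token] = real
    return text, vault


def agent(deid_text: str) -> str:
    """Stand-in for the model step; it only ever receives deid_text."""
    return f"Summary: {deid_text}"


def reidentify_out(draft: str, vault: dict[str, str]) -> str:
    """Restore real values for the clinician-facing answer."""
    for token, real in vault.items():
        draft = draft.replace(token, real)
    return draft


def compute_trust(final_text: str) -> dict:
    """Tokens left after re-identification are orphans (not reversible)."""
    orphans = TOKEN.findall(final_text)
    return {"reversible": not orphans, "leaked_tokens": orphans}


def run(note: str, known: list[str]) -> tuple[str, str, dict]:
    deid_text, vault = deidentify_in(note, known)
    draft = agent(deid_text)
    final = reidentify_out(draft, vault)
    return deid_text, final, compute_trust(final)
```

The vault never leaves the process: only deid_text crosses into agent(), which is the property the real graph enforces with assert_clean().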
REST API
POST /process
{
"note": "Pt Margaret Okafor (NHS 485 777 3456) admitted post-fall.",
"question": "Draft a discharge summary."
}
Response fields:
| Field | Description |
|---|---|
| clinician_note | Verbatim input note |
| ai_note | De-identified note the model saw (surrogate tokens) |
| identifiers | Original identifier strings that were redacted |
| discharge_summary | Gemini-drafted compact eDischarge card, re-identified for the clinician |
| metrics.deid_ok | Overall verdict: true only when nothing leaked and every surrogate is reversible |
| metrics.identifiers_removed | Count of PII spans pseudonymised this turn |
| metrics.residual_pii | List of {type, text}: suspected un-redacted PII the model still saw |
| metrics.residual_pii_count | Number of residual-PII findings (0 = de-identified) |
| metrics.reversible | true when every surrogate restores to a real value |
| metrics.leaked_tokens | Orphaned/unresolved surrogate tokens (the reversibility detail) |
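
The metrics fields relate to one another in a simple way, which the sketch below spells out. The function name trust_metrics and its arguments are hypothetical; the derivations follow the field descriptions in the table rather than the project's code.

```python
def trust_metrics(
    identifiers_removed: int,
    residual_pii: list[dict],
    leaked_tokens: list[str],
) -> dict:
    """Assemble a metrics object from the three underlying findings."""
    # Reversible means no surrogate token was left unresolved.
    reversible = not leaked_tokens
    return {
        # Overall verdict: nothing leaked AND every surrogate is reversible.
        "deid_ok": (not residual_pii) and reversible,
        "identifiers_removed": identifiers_removed,
        "residual_pii": residual_pii,
        "residual_pii_count": len(residual_pii),
        "reversible": reversible,
        "leaked_tokens": leaked_tokens,
    }
```

A client that only needs a pass/fail signal can read deid_ok; the other fields explain why a run failed.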
A 422 response means assert_clean() detected surviving PHI, so the request is
rejected before the model sees anything.
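
A client can call the endpoint with the standard library alone. The sketch below assumes the server from the Quickstart is running at a base URL you supply; the helper names build_process_request and process_note are hypothetical.

```python
import json
import urllib.error
import urllib.request


def build_process_request(base_url: str, note: str, question: str) -> urllib.request.Request:
    """Build the POST /process request without sending it."""
    body = json.dumps({"note": note, "question": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def process_note(base_url: str, note: str, question: str) -> dict:
    """Send the note and return the parsed JSON response."""
    request = build_process_request(base_url, note, question)
    try:
        with urllib.request.urlopen(request, timeout=60) as response:
            return json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 422:
            # Per this README, 422 signals that PHI survived de-identification.
            raise RuntimeError("request rejected with 422 by /process") from err
        raise
```

For example, process_note("http://localhost:8000", note, "Draft a discharge summary.") returns a dict with the response fields listed above. FastAPI also returns 422 for request-validation failures by default, so inspecting the response body before assuming the cause is a reasonable precaution.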
GET /samples
Paginated list of synthetic notes with optional search and note_type filter.
GET /samples?q=COPD&note_type=Discharge&limit=50
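
Building the query string programmatically avoids escaping mistakes in the parameters. A minimal sketch, where samples_url is a hypothetical helper and the parameter names are taken from the example request:

```python
from urllib.parse import urlencode


def samples_url(base_url, q=None, note_type=None, limit=None):
    """Return the /samples URL, including only the filters that were given."""
    candidates = {"q": q, "note_type": note_type, "limit": limit}
    params = {key: value for key, value in candidates.items() if value is not None}
    query = urlencode(params)  # percent-encodes values and joins with '&'
    url = f"{base_url.rstrip('/')}/samples"
    return f"{url}?{query}" if query else url
```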
GET /sample/{clinical_note_id}
Returns the full note text for a given note ID (used by the note-picker UI).
Components
| Stage | Service | Role |
|---|---|---|
| De-identification | src/deid.py | Dependency-free NHS-aware recognisers; the trust boundary. |
| Reasoning | Google Gemini | Drafts the discharge summary; sees only de-identified text. |
| Grounding | Tavily | Pulls NICE / NHS public guidance; never receives patient text. |
| Orchestration | LangGraph | Wires the de-id → agent → re-id → trust pipeline. |
| Observability | LangSmith | Traces + privacy & faithfulness evals. |
Hugging Face Spaces deployment
The app ships as a Docker Space: FastAPI + vanilla JS UI, served by uvicorn on port 7860.
The image bundles Presidio + spaCy NER (en_core_web_md, via the [nlp] extra) so
free-text patient/clinician names with no vault entry are de-identified; set
NOTEGUARD_SPACY_MODEL=en_core_web_lg for higher recall at a larger image size.
Required secrets (Space → Settings → Variables and secrets):
- GOOGLE_API_KEY
- TAVILY_API_KEY
- LANGSMITH_API_KEY (optional; enables tracing)
Auto-deploy: .github/workflows/deploy-hf.yml
pushes a fresh snapshot of main onto the Space (chaeyoona/noteguard-agent) on
every push, which triggers a Docker rebuild. (A snapshot, i.e. a single orphan commit,
is used so historical binary blobs that HF's git backend rejects are never sent.)
It needs an HF_TOKEN repository secret: a write-scoped
Hugging Face access token added under
Settings → Secrets and variables → Actions. Trigger the first deploy manually via
the workflow's Run workflow button once the secret is set.
Data
NHSEDataScience/synthetic_clinical_notes (Hugging Face, MIT licence, fully synthetic).
python src/fetch_dataset.py # downloads patients.csv, admissions.csv, synthetic_clinical_notes.csv into data/
load_known_from_csv("data/patients.csv", "data/admissions.csv") builds the
identifier vault from both structured tables (patient names and clinician names)
so residual leakage is measured against ground truth.
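
The idea of measuring leakage against a ground-truth vault can be sketched as follows. This is not load_known_from_csv itself: the helper names, the CSV column names in the usage note, and the definition of the leakage percentage (share of known identifiers still present in the de-identified text) are assumptions made for illustration.

```python
import csv


def load_names(csv_file, columns):
    """Collect the non-empty values of the given columns from one CSV table."""
    names = set()
    for row in csv.DictReader(csv_file):
        for column in columns:
            value = (row.get(column) or "").strip()
            if value:
                names.add(value)
    return names


def residual_leakage(deid_text, known):
    """Percentage of known identifiers still present in de-identified text."""
    if not known:
        return 0.0
    lowered = deid_text.lower()
    leaked = [name for name in known if name.lower() in lowered]
    return 100.0 * len(leaked) / len(known)
```

With a table opened via open("data/patients.csv", newline="") and a hypothetical column such as "patient_name", the resulting set can be passed straight to residual_leakage(). A result of 0.0 means none of the structured identifiers survived.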
Ethics
- Pseudonymised ≠ anonymous: still personal data under UK GDPR; don't over-claim.
- Synthetic ≠ real: frame as methodology, not a finished product.
- Clinician stays in the loop and signs off every summary.
- See docs/tool_card.md for the full Five Safes mapping and bias & fairness statement.