License
This release contains CheXanatomy adapter weights for a PaliGemma-based model. Use of these weights is subject to:
- the license and usage terms of the underlying PaliGemma/Gemma base model
- the license and terms governing the training data and derived data pipeline
- the license terms in the CheXanatomy code repository for accompanying code
This release is provided for research use.
CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs
This repository contains the released CheXanatomy model for anatomy-aware vision-language modeling on chest radiographs.
CheXanatomy is described in:
CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs
Sergios Gatidis, Curtis Langlotz, Christian Bluethgen
Model Description
CheXanatomy augments a pretrained vision-language model with explicit anatomical supervision in token space. Instead of introducing task-specific decoder heads, the model is trained autoregressively to generate anatomical localization and segmentation outputs as structured tokens.
The model supports anatomy-aware tasks such as:
- anatomical detection
- bounding-box generation
- anatomical segmentation
- anatomy token identification
- transfer to related localization tasks
The training approach uses synthetic chest radiographs generated from CT volumes, with forward-projected anatomical labels providing anatomically consistent 2D supervision.
Intended Use
This model is intended for research use in:
- anatomy-aware medical vision-language modeling
- chest radiograph localization
- chest radiograph anatomical segmentation
- representation learning for radiology imaging
This release is not intended for:
- direct clinical deployment
- autonomous diagnosis
- unsupervised medical decision-making
Training Data
CheXanatomy is trained on synthetic chest radiographs derived from CT volumes. Such radiographs can be generated from the CT-RATE dataset using the CheXsynth pipeline:
- CT-RATE: https://huggingface.co/datasets/ibrahimhamamci/CT-RATE
- CheXsynth: https://github.com/sergiosgatidis/CheXsynth
The CheXanatomy code repository is available at:
Model Inputs and Outputs
The model takes:
- a chest radiograph image
- a textual prompt
Example prompts:
- detect heart
- segment left lung
- segment aorta
- caption <loc0400><loc0312><loc0703><loc0625>
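The last prompt above embeds a region as four location tokens. The sketch below shows one way such a prompt could be built from a pixel-space box. It assumes the base PaliGemma convention (1024 bins per axis, token order y_min, x_min, y_max, x_max); whether CheXanatomy uses exactly this ordering and rounding is an assumption, and the helper names `box_to_loc_tokens` and `caption_prompt` are hypothetical, not part of the released code.

```python
# Assumption: PaliGemma-style location tokens, 1024 bins per axis,
# emitted in the order y_min, x_min, y_max, x_max.
LOC_BINS = 1024


def box_to_loc_tokens(box, image_width, image_height):
    """Convert an (x_min, y_min, x_max, y_max) pixel box to four <locNNNN> tokens."""
    x_min, y_min, x_max, y_max = box
    pairs = [
        (y_min, image_height),
        (x_min, image_width),
        (y_max, image_height),
        (x_max, image_width),
    ]
    tokens = []
    for value, size in pairs:
        # Scale to the bin range and clip to the valid indices 0..1023.
        index = min(LOC_BINS - 1, max(0, int(value / size * LOC_BINS)))
        tokens.append(f"<loc{index:04d}>")
    return "".join(tokens)


def caption_prompt(box, image_width, image_height):
    """Build a hypothetical 'caption <loc...>' prompt for a pixel-space box."""
    return "caption " + box_to_loc_tokens(box, image_width, image_height)
```

For example, `caption_prompt((0, 0, 112, 112), 224, 224)` yields a prompt whose box covers the upper-left quadrant of a 224x224 image.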
The model generates autoregressive token outputs that may include:
- location tokens such as <loc0123>
- segmentation tokens such as <seg045>
- anatomical labels
Segmentation masks may require postprocessing and token decoding using the utilities released in the CheXanatomy code repository.
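As a rough illustration of decoding location tokens, the sketch below turns them into pixel-space bounding boxes. It assumes the base PaliGemma convention, in which each box is four tokens in the order y_min, x_min, y_max, x_max with coordinates quantized to 1024 bins; whether CheXanatomy outputs follow this exactly is an assumption, and `parse_boxes` is a hypothetical helper. Segmentation tokens are not handled here and should be decoded with the utilities in the CheXanatomy code repository.

```python
import re

# Assumption: PaliGemma-style location tokens, 1024 bins per axis,
# emitted in the order y_min, x_min, y_max, x_max.
LOC_BINS = 1024
LOC_PATTERN = re.compile(r"<loc(\d{4})>")


def parse_boxes(decoded, image_width, image_height):
    """Return (x_min, y_min, x_max, y_max) pixel boxes found in decoded text."""
    values = [int(v) for v in LOC_PATTERN.findall(decoded)]
    boxes = []
    # Consume tokens in groups of four; an incomplete trailing group is ignored.
    for i in range(0, len(values) - 3, 4):
        y_min, x_min, y_max, x_max = values[i:i + 4]
        boxes.append((
            x_min / LOC_BINS * image_width,
            y_min / LOC_BINS * image_height,
            x_max / LOC_BINS * image_width,
            y_max / LOC_BINS * image_height,
        ))
    return boxes
```

The returned coordinates are relative to whatever width and height are passed in, so passing the original image size maps the boxes back onto the unresized radiograph.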
Usage
Example
import torch
from peft import PeftConfig, PeftModel
from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor

model_id = "StanfordAIMI/chexanatomy-paligemma-10b-224"

# Read the adapter config to find the base model it was trained on.
peft_config = PeftConfig.from_pretrained(model_id)
base_model_id = peft_config.base_model_name_or_path

# Load the base model, then attach the CheXanatomy adapter weights.
base_model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, model_id)
processor = PaliGemmaProcessor.from_pretrained(base_model_id)

# Path to a chest radiograph image.
image = Image.open("ct.png").convert("RGB")
prompt = "segment left lung"
inputs = processor(images=image, text=prompt, return_tensors="pt")

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
inputs = {k: v.to(device) for k, v in inputs.items()}

# Greedy decoding without gradient tracking.
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
    )

decoded = processor.decode(outputs[0], skip_special_tokens=True)
print(decoded)
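In the example above, `generate` returns the input sequence followed by the newly generated tokens, so the decoded string also contains the prompt. A minimal sketch of dropping the prompt before decoding is shown below; `new_tokens_only` is a hypothetical helper, not part of the released code, and it works on any sliceable sequence, including a 1-D tensor.

```python
def new_tokens_only(output_ids, prompt_length):
    """Drop the first prompt_length entries (the echoed input) from one sequence."""
    return output_ids[prompt_length:]


# With the variables from the example above, this could be used as:
#   prompt_length = inputs["input_ids"].shape[-1]
#   answer_ids = new_tokens_only(outputs[0], prompt_length)
#   answer = processor.decode(answer_ids, skip_special_tokens=True)
```

Decoding only the new tokens makes downstream parsing simpler, since any location or segmentation tokens in the result then come from the model rather than from the prompt.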
Model tree for StanfordAIMI/chexanatomy-paligemma-10b-224
Base model
google/paligemma2-10b-pt-224