Instructions for using StanfordAIMI/chexanatomy-paligemma-10b-224 with libraries and local apps.

Libraries

How to use StanfordAIMI/chexanatomy-paligemma-10b-224 with PEFT:

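from peft import PeftModel
from transformers import PaliGemmaForConditionalGeneration

# PaliGemma is an image-and-text model, so load the base with its dedicated class
base_model = PaliGemmaForConditionalGeneration.from_pretrained("google/paligemma2-10b-pt-224")
# Attach the CheXanatomy adapter weights to the base model
model = PeftModel.from_pretrained(base_model, "StanfordAIMI/chexanatomy-paligemma-10b-224")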

Transformers

How to use StanfordAIMI/chexanatomy-paligemma-10b-224 with Transformers:

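# Use a pipeline as a high-level helper
from transformers import pipeline

# The model takes an image and a text prompt, hence the image-text-to-text task
pipe = pipeline("image-text-to-text", model="StanfordAIMI/chexanatomy-paligemma-10b-224")

# Load model directly
# (this repository holds adapter weights, so loading relies on the PEFT integration in Transformers)
from transformers import AutoModelForImageTextToText
model = AutoModelForImageTextToText.from_pretrained("StanfordAIMI/chexanatomy-paligemma-10b-224", dtype="auto")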

Local Apps

vLLM

How to use StanfordAIMI/chexanatomy-paligemma-10b-224 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "StanfordAIMI/chexanatomy-paligemma-10b-224"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StanfordAIMI/chexanatomy-paligemma-10b-224",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker

docker model run hf.co/StanfordAIMI/chexanatomy-paligemma-10b-224

SGLang

How to use StanfordAIMI/chexanatomy-paligemma-10b-224 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "StanfordAIMI/chexanatomy-paligemma-10b-224" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "StanfordAIMI/chexanatomy-paligemma-10b-224",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "StanfordAIMI/chexanatomy-paligemma-10b-224" \
        --host 0.0.0.0 \
        --port 30000

License

This release contains CheXanatomy adapter weights for a PaliGemma-based model. Use of these weights is subject to:

the license and usage terms of the underlying PaliGemma/Gemma base model
the license and terms governing the training data and derived data pipeline
the license terms in the CheXanatomy code repository for accompanying code

This release is provided for research use.

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs

This repository contains the released CheXanatomy model for anatomy-aware vision-language modeling on chest radiographs.

CheXanatomy is described in:

CheXanatomy: Anatomy-Aware Vision-Language Modeling for Chest Radiographs
Sergios Gatidis, Curtis Langlotz, Christian Bluethgen

Model Description

CheXanatomy augments a pretrained vision-language model with explicit anatomical supervision in token space. Instead of introducing task-specific decoder heads, the model is trained autoregressively to generate anatomical localization and segmentation outputs as structured tokens.

The model supports anatomy-aware tasks such as:

anatomical detection
bounding-box generation
anatomical segmentation
anatomy token identification
transfer to related localization tasks

The training approach uses synthetic chest radiographs generated from CT volumes, with forward-projected anatomical labels providing anatomically consistent 2D supervision.
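The idea of forward-projecting a CT volume and its 3D labels into a 2D radiograph-like image can be illustrated with a minimal sketch. This is not the CheXsynth pipeline: it assumes a simple parallel-beam geometry, a crude Hounsfield-unit-to-attenuation mapping, and hypothetical names (`forward_project`, `ct_hu`, `organ_mask`) chosen only for illustration.

```python
import numpy as np


def forward_project(ct_hu, organ_mask, axis=1):
    """Project a CT volume and a 3D label mask onto a 2D plane.

    ct_hu:      3D array of Hounsfield units.
    organ_mask: boolean 3D array with the same shape as ct_hu.
    axis:       ray direction (assumed here to be anterior-posterior).
    """
    # Crude attenuation proxy: air (-1000 HU) maps to 0, water (0 HU) maps to 1
    mu = np.clip((ct_hu + 1000.0) / 1000.0, 0.0, None)
    # Sum attenuation along each ray (parallel-beam line integral)
    line_integral = mu.sum(axis=axis)
    # Scale the projection to [0, 1] for display
    peak = line_integral.max()
    image = line_integral / peak if peak > 0 else line_integral
    # A pixel carries the label if any voxel along its ray does
    mask_2d = organ_mask.any(axis=axis)
    return image, mask_2d


# Toy volume: air everywhere, with a block of water in the middle
ct = np.full((4, 4, 4), -1000.0)
ct[1:3, :, 1:3] = 0.0
mask = ct > -500.0

image, mask_2d = forward_project(ct, mask)
print(image.shape, int(mask_2d.sum()))
```

Because the image and the label are projected along the same rays, the 2D mask stays aligned with the synthetic radiograph by construction, which is the property the training approach relies on.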

Intended Use

This model is intended for research use in:

anatomy-aware medical vision-language modeling
chest radiograph localization
chest radiograph anatomical segmentation
representation learning for radiology imaging

This release is not intended for:

direct clinical deployment
autonomous diagnosis
unsupervised medical decision-making

Training Data

CheXanatomy uses synthetic chest radiograph training data derived from CT volumes. Synthetic chest radiographs can be generated from the CT-RATE dataset together with the CheXsynth pipeline:

CT-RATE: https://huggingface.co/datasets/ibrahimhamamci/CT-RATE
CheXsynth: https://github.com/sergiosgatidis/CheXsynth

The CheXanatomy code repository is available at:

https://github.com/sergiosgatidis/CheXanatomy

Model Inputs and Outputs

The model takes:

a chest radiograph image
a textual prompt

Example prompts:

detect heart
segment left lung
segment aorta
caption <loc0400><loc0312><loc0703><loc0625>
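A region prompt like the last one can be built from pixel coordinates. The sketch below assumes the standard PaliGemma convention (1024 bins per axis, coordinates ordered y_min, x_min, y_max, x_max); whether CheXanatomy uses exactly this ordering should be checked against the code repository. The helper name `box_to_loc_tokens` is hypothetical.

```python
def box_to_loc_tokens(y_min, x_min, y_max, x_max, height, width, bins=1024):
    """Encode a pixel-space box as four <locNNNN> tokens.

    Assumes PaliGemma-style ordering (y_min, x_min, y_max, x_max) and
    coordinates normalized by image size into `bins` discrete values.
    """
    def to_bin(value, size):
        # Normalize to [0, 1), scale to the bin range, and clamp
        index = int(value / size * bins)
        return min(max(index, 0), bins - 1)

    coords = (
        to_bin(y_min, height),
        to_bin(x_min, width),
        to_bin(y_max, height),
        to_bin(x_max, width),
    )
    return "".join(f"<loc{c:04d}>" for c in coords)


# On a 1024 x 1024 image, pixel coordinates coincide with bin indices
prompt = "caption " + box_to_loc_tokens(400, 312, 703, 625, height=1024, width=1024)
print(prompt)
```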

The model generates autoregressive token outputs that may include:

location tokens such as <loc0123>
segmentation tokens such as <seg045>
anatomical labels

Segmentation masks may require postprocessing and token decoding using the utilities released in the CheXanatomy code repository.
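As a rough illustration of the first decoding step, the sketch below pulls location and segmentation tokens out of a decoded string. It assumes the PaliGemma convention of four location tokens per box (y_min, x_min, y_max, x_max, 1024 bins per axis); the function names are hypothetical, and turning segmentation token indices into a mask still requires the utilities in the CheXanatomy code repository.

```python
import re

LOC_PATTERN = re.compile(r"<loc(\d{4})>")
SEG_PATTERN = re.compile(r"<seg(\d{3})>")


def parse_boxes(decoded, height, width, bins=1024):
    """Return pixel-space boxes (y_min, x_min, y_max, x_max) found in the text.

    Assumes location tokens come in groups of four; a trailing
    incomplete group is ignored.
    """
    values = [int(v) for v in LOC_PATTERN.findall(decoded)]
    boxes = []
    for i in range(0, len(values) - len(values) % 4, 4):
        y0, x0, y1, x1 = values[i:i + 4]
        boxes.append((
            y0 / bins * height,
            x0 / bins * width,
            y1 / bins * height,
            x1 / bins * width,
        ))
    return boxes


def parse_seg_indices(decoded):
    """Return the integer codebook indices of all <segNNN> tokens, in order."""
    return [int(v) for v in SEG_PATTERN.findall(decoded)]


# Made-up output string, for illustration only
sample = "<loc0256><loc0128><loc0768><loc0512><seg045><seg003> left lung"
boxes = parse_boxes(sample, height=224, width=224)
seg_indices = parse_seg_indices(sample)
print(boxes, seg_indices)
```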

Usage

Example

import torch
from peft import PeftConfig, PeftModel
from PIL import Image
from transformers import PaliGemmaForConditionalGeneration, PaliGemmaProcessor

model_id = "StanfordAIMI/chexanatomy-paligemma-10b-224"

# Read the adapter config to find the base model it was trained on
peft_config = PeftConfig.from_pretrained(model_id)
base_model_id = peft_config.base_model_name_or_path

# Load the base model, attach the adapter, and load the matching processor
base_model = PaliGemmaForConditionalGeneration.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, model_id)
processor = PaliGemmaProcessor.from_pretrained(base_model_id)

# Placeholder path: replace with a chest radiograph image
image = Image.open("chest_radiograph.png").convert("RGB")
prompt = "segment left lung"

inputs = processor(image, prompt, return_tensors="pt")

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
inputs = {k: v.to(device) for k, v in inputs.items()}
input_len = inputs["input_ids"].shape[-1]

# Greedy decoding without gradient tracking
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        do_sample=False,
    )

# Decode only the newly generated tokens, not the echoed prompt
decoded = processor.decode(outputs[0][input_len:], skip_special_tokens=True)
print(decoded)

Model tree

StanfordAIMI/chexanatomy-paligemma-10b-224 is an adapter for the base model google/paligemma2-10b-pt-224.