| --- |
| license: other |
| license_name: license |
| license_link: LICENSE |
| tags: |
| - medical |
| - radiology |
| - swin-unetr |
| --- |
| |
| <div align="center"> |
|
|
| <br /> |
|
|
| <img src="./assets/banner-with-name.png"> |
|
|
| **CoralBay**: A Self-Supervised CT Foundation Model |
|
|
| <br /> |
|
|
| </div> |
|
|
|
|
| ### Quick Highlights |
|
|
| - **True 3D CT**: Moves beyond 2D slice models to understand full volumetric anatomy and spatial relationships. |
| - **Data Efficient**: Top-tier classification and segmentation performance after self-supervised pretraining on just 11K unlabeled CT scans. |
| - **One Model, Many Tasks**: From fine-grained analysis to global understanding and downstream applications. |
| - **Open & Ready for Impact**: Open-source, robust across CT tasks, and built to accelerate the future of medical AI. |
|
|
| ### Quick Start |
|
|
| ```python |
| import urllib.request |
| import torch |
| |
| from monai import transforms, inferers |
| from transformers import AutoModel |
| |
| # 1. Download sample CT scan |
| url = "https://github.com/neurolabusc/niivue-images/raw/refs/heads/main/CT_Abdo.nii.gz" |
| urllib.request.urlretrieve(url, "CT_Abdo.nii.gz") |
| |
| # 2. Preprocess volume |
| preprocess = transforms.Compose([ |
|     transforms.LoadImage(image_only=True), |
|     transforms.EnsureChannelFirst(), |
| |
|     # resample to 1.5 mm isotropic voxels |
|     transforms.Spacing( |
|         pixdim=(1.5, 1.5, 1.5), |
|         mode="bilinear", |
|     ), |
| |
|     # dev-only crop for speed |
|     transforms.CenterSpatialCrop(roi_size=192), |
| |
|     # any HU window (a_min, a_max) will do |
|     transforms.ScaleIntensityRange( |
|         a_min=-1000, a_max=1000, |
|         b_min=0.0, b_max=1.0, |
|         clip=True, |
|     ), |
| ]) |
| |
| x = preprocess("CT_Abdo.nii.gz").float().unsqueeze(0) # (1, 1, D, H, W) |
| |
| # 3. Load model |
| device = "cuda" if torch.cuda.is_available() else "cpu" |
| |
| model = AutoModel.from_pretrained( |
|     "kaiko-ai-user/coralbay", |
|     trust_remote_code=True, |
|     # None returns the aggregated feature vector; |
|     # set out_indices=6 for the multi-scale feature pyramid |
|     out_indices=None, |
| ).eval().to(device) |
| |
| # 4. Attach sliding-window inference |
| model.encoder._inferer = inferers.SlidingWindowInferer( |
|     roi_size=(96, 96, 96), |
|     sw_batch_size=2, |
|     overlap=0.0,       # use 0.75 for better results (more windows, slower) |
|     mode="gaussian", |
|     sw_device=device,  # run windows on GPU |
|     device="cpu",      # stitch results on CPU |
| ) |
| |
| # 5. Run inference |
| with torch.no_grad(): |
|     features = model(x.to(device)) |
| |
| # 6. Output |
| # With `out_indices=6`, features is a multi-scale feature pyramid: |
| #   [0] (1, 192, 96, 96, 96) |
| #   [1] (1, 192, 48, 48, 48) |
| #   [2] (1, 384, 24, 24, 24) |
| #   [3] (1, 768, 12, 12, 12) |
| #   [4] (1, 1536, 6, 6, 6) |
| #   [5] (1, 3072, 3, 3, 3) |
| # With `out_indices=None`, features is a tensor of shape (1, 4608). |
| ``` |
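
The `overlap` setting in step 4 trades speed for quality. The sketch below estimates how many windows the sliding-window pass visits for the 192-voxel crop used above. It assumes windows are placed at a stride of `roi_size * (1 - overlap)` along each axis, which is a simplification of what the inferer does internally; `windows_per_axis` and `total_windows` are hypothetical helper names, not part of MONAI or CoralBay.

```python
import math


def windows_per_axis(image_size: int, roi_size: int, overlap: float) -> int:
    """Window positions along one axis, assuming stride-based placement."""
    if image_size <= roi_size:
        return 1  # a single (padded) window covers the whole axis
    stride = max(int(roi_size * (1 - overlap)), 1)
    return math.ceil((image_size - roi_size) / stride) + 1


def total_windows(image_shape, roi_shape, overlap: float) -> int:
    """Total window count for a volume: product of the per-axis counts."""
    count = 1
    for size, roi in zip(image_shape, roi_shape):
        count *= windows_per_axis(size, roi, overlap)
    return count


volume = (192, 192, 192)  # spatial size after the dev-only crop
roi = (96, 96, 96)        # roi_size passed to the inferer

fast = total_windows(volume, roi, overlap=0.0)
dense = total_windows(volume, roi, overlap=0.75)
print(fast, dense)  # 8 125
```

Under these assumptions, moving from `overlap=0.0` to `overlap=0.75` raises the window count from 8 to 125 for this crop, so expect a correspondingly longer runtime.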
|
|
| ### Quantitative Performance |
|
|
| The benchmark below reports classification (multiclass accuracy or binary AUROC) and segmentation (Dice score) results, evaluated with the [eva](https://github.com/kaiko-ai/eva) framework. |
|
|
| <div align="center"> |
| <img src="./assets/coralbay-benchmark.png" alt="Radiology Leaderboard"> |
| </div> |
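
For readers unfamiliar with the segmentation metric, here is a minimal sketch of the Dice score on flat binary masks, using the standard definition `2 * |A ∩ B| / (|A| + |B|)`. The function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for illustration; the eva framework's own implementation may differ in details such as per-class averaging.

```python
def dice_score(pred, target):
    """Dice = 2 * |pred ∩ target| / (|pred| + |target|) for flat 0/1 masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


pred = [1, 1, 0, 0, 1, 0]    # predicted foreground voxels
target = [1, 0, 0, 0, 1, 1]  # ground-truth foreground voxels

print(round(dice_score(pred, target), 3))  # 0.667
```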
| |
| ## Citation |
|
|
| If you use this model, please cite it as follows: |
|
|
| ```bibtex |
| @misc{gatopoulos2026coralbayselfsupervisedctfoundation, |
| title={CoralBay: A Self-Supervised CT Foundation Model}, |
| author={Ioannis Gatopoulos and Nicolas Känzig and Sebastian Otálora and Fei Tang}, |
| year={2026}, |
| eprint={2606.03888}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.CV}, |
| url={https://arxiv.org/abs/2606.03888}, |
| } |
| ``` |
|
|
| <br /> |
|
|
| <div align="center"> |
| <img src="./assets/kaiko-logo.png" width="200"> |
| </div> |
|
|