BrainScope Disease Fine-Tuned scGPT

This repository contains our disease-adapted fine-tuned scGPT model for brain single-cell / single-nucleus RNA-seq analysis.

It is intended to be used with the companion pipeline repository:

  • Pipeline repo: YOUR_USERNAME/brainscope-scgpt-pipeline

Model summary

This model starts from the original scGPT backbone and is then fine-tuned on disease-related brain single-cell / single-nucleus RNA-seq data for downstream annotation workflows.

This packaged release is intended for:

  • disease-aware cell-type annotation
  • embedding generation
  • comparison against the original scGPT baseline
  • downstream error analysis and reproducible model sharing

Data context

This release is associated with workflows built on:

  • LIBD for smaller pilot experiments and rapid iteration
  • BrainScope for larger-scale disease-focused fine-tuning and evaluation

The goal of this model family is to improve robustness on disease-altered cell states relative to healthy-only baselines.


Included files

Typical contents of this repository include:

  • model.pt
  • config.yaml
  • preprocessing.json
  • vocab.json
  • label_map.json
  • metrics.json
  • requirements.txt
  • inference.py
  • small example input / output files
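
As a quick sanity check after downloading, the file list above can be compared against a local snapshot directory. This is a minimal sketch: the helper name `missing_files` is hypothetical, and it assumes the release ships every file named in the list, which may not hold for all versions.

```python
from pathlib import Path

# File names copied from the list above. Treating all of them as required
# is an assumption; a given release may omit some.
EXPECTED_FILES = [
    "model.pt",
    "config.yaml",
    "preprocessing.json",
    "vocab.json",
    "label_map.json",
    "metrics.json",
    "requirements.txt",
    "inference.py",
]


def missing_files(repo_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from repo_dir."""
    root = Path(repo_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Passing the path of a downloaded snapshot to `missing_files` returns an empty list when everything is present.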

Intended use

This model is intended for:

  • disease-aware annotation of sc/snRNA-seq data
  • controlled comparisons with the original scGPT baseline
  • reproducible research workflows on brain disease datasets

This release is for research use only and is not a clinical model.


Example usage

Download with the Hub

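from huggingface_hub import snapshot_download

# Download every file in the repository to the local Hugging Face cache
# and get back the path of the downloaded snapshot.
repo_dir = snapshot_download("YOUR_USERNAME/brainscope-scgpt-disease")
print(repo_dir)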

Run through the pipeline

python -m brainscope_scgpt annotate \
  --input data/query.h5ad \
  --model-repo YOUR_USERNAME/brainscope-scgpt-disease \
  --output results/query_annotated.h5ad \
  --mode small

Large dataset mode:

python -m brainscope_scgpt annotate \
  --input data/brainscope_full.h5ad \
  --model-repo YOUR_USERNAME/brainscope-scgpt-disease \
  --output results/brainscope_full_annotated.h5ad \
  --mode large

Evaluation

Benchmark numbers for this release have not been filled in yet. The fields below are placeholders for the figures to be reported in the public model card.

Planned structure:

Main metrics

  • Accuracy:
  • Precision:
  • Recall:
  • Macro F1:
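
For reference, the four metrics listed above could be computed from true and predicted cell-type labels roughly as follows. This is a sketch, not the pipeline's evaluation code: the function name `classification_metrics` is hypothetical, and macro averaging for precision and recall is an assumption, since the card only specifies it for F1.

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1.

    y_true and y_pred are equal-length lists of label strings.
    Macro averaging for precision and recall is an assumption.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # pred was assigned to a cell of another type
            fn[truth] += 1  # the true type was missed

    labels = sorted(set(y_true) | set(y_pred))
    prec_sum = rec_sum = f1_sum = 0.0
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        prec = tp[label] / predicted if predicted else 0.0
        rec = tp[label] / actual if actual else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        prec_sum += prec
        rec_sum += rec
        f1_sum += f1

    n = len(labels)
    return {
        "accuracy": sum(tp.values()) / len(y_true),
        "precision": prec_sum / n,
        "recall": rec_sum / n,
        "macro_f1": f1_sum / n,
    }
```

Macro averaging weights every cell type equally, so rare disease-associated populations count as much as abundant ones. Accuracy alone can hide poor performance on those rare types.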

Benchmark setting

  • Train / validation / test split:
  • Label space:
  • Small or large mode:
  • Any freeze / unfreeze strategy:
  • Whether MoE was used:

Whether this release corresponds to one of the final selected models has not yet been stated here.


Comparison to the original baseline

This model is intended to be compared against:

  • original scGPT baseline
  • MoE-enhanced variants
  • alternative architectures such as Mamba-based approaches

Points to be summarized here once results are finalized:

  • which cell types improved the most
  • which confusions remained
  • whether disease-aware fine-tuning improved performance on disease-shifted cells
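
The first two points above can be explored with a small per-cell-type comparison between baseline and fine-tuned predictions. The sketch below is illustrative only: `per_label_recall`, `recall_gain`, and `top_confusions` are hypothetical helper names, and the inputs are assumed to be plain lists of label strings aligned cell by cell.

```python
from collections import Counter


def per_label_recall(y_true, y_pred):
    """Fraction of cells of each true type that were predicted correctly."""
    total, correct = Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    return {label: correct[label] / total[label] for label in total}


def recall_gain(y_true, baseline_pred, tuned_pred):
    """Per-type recall of the fine-tuned model minus that of the baseline."""
    base = per_label_recall(y_true, baseline_pred)
    tuned = per_label_recall(y_true, tuned_pred)
    return {label: tuned[label] - base[label] for label in base}


def top_confusions(y_true, y_pred, k=5):
    """Most frequent (true type, predicted type) mistakes, largest first."""
    pairs = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return pairs.most_common(k)
```

Sorting the output of `recall_gain` by value shows which cell types improved the most. `top_confusions` applied to the fine-tuned predictions lists the confusions that remain.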

Limitations

  • Performance depends on preprocessing consistency and gene-vocabulary alignment.
  • Performance may change if label definitions differ across datasets.
  • This repository does not include all large intermediate artifacts used during training.
  • Reference mapping still depends on an external FAISS index if you use the RM workflow.
  • This is a research model and not validated for clinical use.
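
Gene-vocabulary alignment, the first limitation above, can be checked before running annotation. The following sketch assumes that `vocab.json` holds a mapping from gene symbol to token id; that layout is an assumption about this release, and `vocab_coverage` is a hypothetical helper name.

```python
import json


def load_vocab(path):
    """Load a gene vocabulary, assumed to be a JSON object of gene -> token id."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def vocab_coverage(query_genes, vocab):
    """Return (fraction of unique query genes found in vocab, missing genes)."""
    unique = list(dict.fromkeys(query_genes))  # de-duplicate, keep order
    missing = [gene for gene in unique if gene not in vocab]
    if not unique:
        return 0.0, missing
    return (len(unique) - len(missing)) / len(unique), missing
```

A low coverage fraction usually points to a naming mismatch, for example Ensembl IDs in the query against gene symbols in the vocabulary, and is worth resolving before interpreting any predictions.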

Citation

Please cite the original scGPT paper and, once it is available, the paper for this project.

@article{cui2024scgpt,
  title={scGPT: toward building a foundation model for single-cell multi-omics using generative AI},
  author={Cui, Haotian and Wang, Chloe and Maan, Hassaan and others},
  journal={Nature Methods},
  year={2024}
}

Contact

Yuesong Huang
University of Rochester
Email: yhu116@ur.rochester.edu
