Nucleotide Transformer v2 50M, fine-tuned on GenomicBenchmarks
This is a full, weight-level fine-tune of InstaDeepAI/nucleotide-transformer-v2-50m-multi-species for binary DNA sequence classification on two GenomicBenchmarks tasks. All parameters are updated; no LoRA adapters and no frozen backbone are used. Evaluation follows a leakage-free train/validation/test protocol, with three seeds on the enhancer task and a single seed on the promoter task.
Weights are not redistributed: this is a card and code release
The base model is licensed CC-BY-NC-SA-4.0 (non-commercial, share-alike), and any fine-tuned derivative inherits those terms. To respect that license, the trained checkpoints are not hosted here. The full training and evaluation code is open source and the run is reproducible, with every seed, split, and hyperparameter fixed.
Code, reproduction steps, figures, and the full model card: github.com/ankurgenomics/genome-ft
Running the documented commands regenerates the checkpoints described below.
Results
All numbers are on the held-out test set, evaluated exactly once on the best checkpoint (selected by validation MCC). The base model is the pretrained backbone with an untrained classification head, measured under the identical pipeline.
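For readers less familiar with the three reported metrics, the sketch below computes accuracy, F1, and MCC from binary labels using their standard confusion-matrix definitions. The function name `binary_metrics` and the toy labels are illustrative assumptions, not taken from the repository, which may well use a library implementation instead.

```python
import math


def binary_metrics(y_true, y_pred):
    """Accuracy, F1 and MCC for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)

    # F1 is the harmonic mean of precision and recall; define it as 0
    # when there are no positives in either labels or predictions.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four cells of the confusion matrix; define it as 0
    # when any marginal is empty (e.g. a constant predictor).
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Toy labels (made up for illustration, not benchmark data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.75, f1 0.75, mcc 0.5

# A predictor that always outputs the negative class on this balanced toy set.
constant = binary_metrics(y_true, [0] * len(y_true))
print(constant)  # accuracy 0.5, f1 0.0, mcc 0.0
```

On a balanced set, a constant predictor scores accuracy 0.5 with F1 and MCC of 0, so an MCC near zero is the chance-level reference point for the tables that follow.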
Enhancers: human_enhancers_cohn (6,948 test examples, 3 seeds: 42, 0, 123)
| Model | Accuracy | F1 | MCC |
|---|---|---|---|
| Base (pretrained backbone, untrained head) | 0.499 | 0.021 | −0.009 |
| Fine-tuned (this work) | 0.735 ± 0.004 | 0.745 ± 0.016 | 0.478 ± 0.003 |
Per-seed test MCC: 0.481, 0.478, 0.474 (σ = 0.003), indicating a stable result across initialisations.
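The summary statistics can be checked directly from the three per-seed values. The card does not say which standard-deviation estimator was used; in the sketch below, the population form reproduces the reported 0.003 at three decimals, while the sample (n − 1) form rounds to 0.004. The card also does not say which MCC belongs to which seed, so the values are kept as a plain list.

```python
import statistics

# Per-seed test MCC on human_enhancers_cohn, as reported above.
per_seed_mcc = [0.481, 0.478, 0.474]

mean_mcc = statistics.fmean(per_seed_mcc)
pop_std = statistics.pstdev(per_seed_mcc)    # divides by n
sample_std = statistics.stdev(per_seed_mcc)  # divides by n - 1

print(f"mean       = {mean_mcc:.3f}")    # mean       = 0.478
print(f"population = {pop_std:.3f}")     # population = 0.003
print(f"sample     = {sample_std:.3f}")  # sample     = 0.004
```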
Promoters: human_nontata_promoters (9,034 test examples, seed 42)
| Model | Accuracy | F1 | MCC |
|---|---|---|---|
| Base (pretrained backbone, untrained head) | 0.451 | 0.064 | −0.044 |
| Fine-tuned (this work) | 0.872 | 0.878 | 0.747 |
Where it lands (published reference numbers, accuracy on enhancers)
| Model | Accuracy | Note |
|---|---|---|
| GB-CNN (Grešová et al. 2023) | 0.69 | published reference |
| DNABERT (Ji et al. 2021) | 0.706 | published reference |
| This work — NT-v2 50M full fine-tune | 0.735 | measured here, 3 seeds |
| HyenaDNA tiny-1k (Nguyen et al. 2023) | 0.74 | published reference |
| NT-v2 500M (Dalla-Torre et al. 2023) | 0.776 | published reference |
| DNABERT-2 (Zhou et al. 2023) | 0.785 | published reference |
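On enhancers, the 50M model scores above the published CNN and DNABERT baselines, roughly level with HyenaDNA tiny-1k (0.735 vs. 0.74), and below NT-v2 500M and DNABERT-2. The gap to NT-v2 500M is consistent with the roughly tenfold difference in parameter count. The 50M model does not outperform NT-v2 500M or DNABERT-2, and is not intended to.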
How it was trained
- Method: full fine-tuning of all 53.8M parameters with a 2-layer classification head
- Optimizer: AdamW, learning rate 1e-5, weight decay 0.01
- Schedule: linear warmup (300 steps) with cosine decay, gradient clipping (max-norm 1.0)
- Augmentation: reverse-complement
- Protocol: 15% of training data held out for validation; checkpoint selected by validation MCC; test set evaluated once; 3 seeds for the primary task
- Best epoch: validation MCC peaks at epoch 1–2 and declines afterwards; the low learning rate and validation-based checkpoint selection keep later-epoch overfitting out of the reported model
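The pieces of the recipe above that do not depend on a deep-learning framework can be sketched in plain Python. This is an illustration under stated assumptions, not the repository's code: all function names are hypothetical, `total_steps` is not given in the card, the warmup is assumed to start at zero and the cosine to decay to zero, the split is assumed to be a simple seeded shuffle (the card does not say whether it is stratified), and the card does not specify how often reverse-complement augmentation is applied.

```python
import math
import random

# Complement table for DNA; N (unknown base) maps to itself.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def lr_at_step(step, total_steps, base_lr=1e-5, warmup_steps=300):
    """Linear warmup to base_lr, then cosine decay to zero (assumed endpoints)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


def split_train_val(n_examples, val_fraction=0.15, seed=42):
    """Hold out a seeded random fraction of training indices for validation."""
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)
    n_val = int(round(n_examples * val_fraction))
    return indices[n_val:], indices[:n_val]


def best_epoch(val_mcc_by_epoch):
    """Index (0-based) of the epoch with the highest validation MCC."""
    return max(range(len(val_mcc_by_epoch)), key=val_mcc_by_epoch.__getitem__)


print(reverse_complement("AACGT"))           # ACGTT
print(lr_at_step(150, total_steps=1000))     # halfway through warmup
train_idx, val_idx = split_train_val(1000)
print(len(train_idx), len(val_idx))          # 850 150

# Made-up validation curve that peaks early and then declines.
print(best_epoch([0.40, 0.47, 0.46, 0.44]))  # 1
```

Selecting the checkpoint on the validation split only, as `best_epoch` does here, is what allows the test set to be scored a single time without leaking into model selection.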
Intended use & limitations
- Intended use: research and educational demonstration of adapting a genomic foundation model at the weight level on standard DNA classification benchmarks.
- Not for: clinical, diagnostic, or any commercial use (the CC-BY-NC-SA-4.0 license forbids commercial use).
- Scope: two GenomicBenchmarks tasks only; results reflect those datasets and may not transfer to other genomic tasks, species, or sequence lengths.
License & attribution
- Fine-tuned weights: CC-BY-NC-SA-4.0 (inherited from the base model; non-commercial, share-alike, attribution).
- Training/evaluation code: MIT (see the GitHub repository).
- Attribution: base model by InstaDeep; benchmark datasets by the GenomicBenchmarks authors (Grešová et al. 2023).
Citation / related work
For the broader research context, see Fesser, Zhang, Li, Zitnik et al., How Post-Training Shapes Biological Reasoning Models (arXiv:2606.16517, 2026). Its Finding 1, that supervised fine-tuning improves in-domain accuracy while out-of-domain performance peaks early and declines, has an in-domain counterpart in this work: validation MCC peaked at epoch 1–2 and then declined, which motivates the use of a low learning rate and validation-based checkpoint selection rather than longer training. This is a single controlled fine-tune, not a comparable research program; the connection is one of shared principle, not scope.
Developed by Ankur Sharma — GitHub · LinkedIn
Code and full results: github.com/ankurgenomics/genome-ft
This is a personal open-source project, developed independently in a personal capacity. It is not affiliated with, endorsed by, or representative of any current or former employer, and uses only public models and public benchmark datasets. All views and results are the author's own.
Datasets used to train ankur0050/nucleotide-transformer-v2-50m-genomicbenchmarks-ft
katarinagresova/Genomic_Benchmarks_human_nontata_promoters
Evaluation results
- Test MCC (mean of 3 seeds) on GenomicBenchmarks human_enhancers_cohn (self-reported): 0.478
- Test accuracy (mean of 3 seeds) on GenomicBenchmarks human_enhancers_cohn (self-reported): 0.735
- Test F1 (mean of 3 seeds) on GenomicBenchmarks human_enhancers_cohn (self-reported): 0.745
- Test MCC (seed 42) on GenomicBenchmarks human_nontata_promoters (self-reported): 0.747
- Test accuracy (seed 42) on GenomicBenchmarks human_nontata_promoters (self-reported): 0.872
- Test F1 (seed 42) on GenomicBenchmarks human_nontata_promoters (self-reported): 0.878