---
language:
- en
license: apache-2.0
library_name: timm
tags:
- vision
- image-classification
- vit
- mnist
- computer-vision
datasets:
- mnist
metrics:
- accuracy
model-index:
- name: SOTA-Blitz-997
  results:
  - task:
      type: image-classification
      name: Image Classification
    dataset:
      name: MNIST
      type: mnist
    metrics:
    - type: accuracy
      value: 99.72
      name: Test Accuracy
---

# SOTA-Blitz-997
**Near-SOTA Accuracy | 7-Minute T4 Training | Safetensors Native**

---

### Model Overview
**SOTA-Blitz-997** is a Vision Transformer (ViT) trained for MNIST handwritten digit classification. Many top-ranked MNIST results rely on large ensembles and hours of GPU compute. **SOTA-Blitz-997** instead targets high accuracy from a single training run of about 7 minutes on one standard NVIDIA T4, relying on the global self-attention of the Transformer block.

### Performance & Proof
The model reaches **99.72% test accuracy** on the 10,000-image MNIST test set, which corresponds to **28 misclassified images** (see epoch 30 in the training log below). This exceeds the human baseline quoted for this card (~97.5%) and shows that a ViT can approach saturation on a classic computer vision benchmark with a small compute budget.
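
The relationship between the accuracy figure and the error count is simple arithmetic, and can be checked with a short sketch. The helper names below are illustrative and not part of the released code; the only inputs are the 99.72% figure from this card and the standard MNIST test set size of 10,000.

```python
def errors_from_accuracy(accuracy_percent: float, num_samples: int) -> int:
    """Number of misclassified samples implied by an accuracy percentage."""
    return round(num_samples * (100.0 - accuracy_percent) / 100.0)


def accuracy_from_errors(num_errors: int, num_samples: int) -> float:
    """Accuracy percentage implied by a count of misclassified samples."""
    return 100.0 * (num_samples - num_errors) / num_samples


MNIST_TEST_SIZE = 10_000  # size of the standard MNIST test split

print(errors_from_accuracy(99.72, MNIST_TEST_SIZE))  # 28
print(accuracy_from_errors(28, MNIST_TEST_SIZE))     # 99.72
```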

#### Training Logs (Verified Convergence)
| Epoch | Loss | Train Acc | Test Acc | Best Acc |
| :--- | :--- | :--- | :--- | :--- |
| 05/30 | 0.6235 | 95.068% | 98.440% | 98.590% |
| 10/30 | 0.5923 | 96.287% | 98.840% | 99.030% |
| 15/30 | 0.5683 | 97.107% | 99.220% | 99.230% |
| 20/30 | 0.5485 | 97.927% | 99.460% | 99.550% |
| 25/30 | 0.5345 | 98.460% | 99.660% | 99.660% |
| **30/30** | **0.5296** | **98.700%** | **99.720%** | **99.720%** |

**Final Performance:** 28 errors / 10,000 digits, with test-time augmentation (TTA) enabled.
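
This card does not say which TTA scheme was used. As an illustration only, one common form of TTA runs the model on several slightly perturbed copies of each image and averages the class probabilities before taking the argmax. The sketch below assumes that form and works on plain lists of logits; `softmax`, `tta_predict`, and the example values are hypothetical and are not taken from the training code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tta_predict(logits_per_view):
    """Average class probabilities over augmented views.

    logits_per_view: one list of class logits per augmented copy of the
    same image. Returns (predicted_class, mean_probabilities).
    """
    probs = [softmax(view) for view in logits_per_view]
    n_views = len(probs)
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / n_views for c in range(n_classes)]
    predicted = max(range(n_classes), key=mean.__getitem__)
    return predicted, mean


# Hypothetical logits for three views of one image (3 classes for brevity).
# The second view disagrees with the other two; averaging smooths this out.
views = [
    [2.0, 1.0, 0.1],
    [0.5, 2.5, 0.3],
    [2.2, 0.9, 0.0],
]
predicted, mean_probs = tta_predict(views)
print(predicted, [round(p, 3) for p in mean_probs])
```

Averaging probabilities rather than taking a majority vote keeps the confidence of each view in play, so a single strongly wrong view can still be outweighed by several moderately confident correct ones.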

### Technical Specifications
- **Architecture:** Vision Transformer (ViT) with patch embedding and multi-head self-attention.
- **Training Hardware:** NVIDIA T4 GPU (Kaggle).
- **Training Time:** ~7 Minutes.
- **Format:** `.safetensors` (zero-copy loading; no pickle, so loading the file does not execute arbitrary code).
- **License:** Apache 2.0.
- **Architecture Note:** Based on a timm ViT-Small backbone with a custom 1-channel patch embedding layer and 32x32 input resolution.
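
The patch size is not stated in this card, so the token count seen by the Transformer is unknown. The sketch below shows how the sequence length and the size of a 1-channel patch embedding would vary for a 32x32 input under a few candidate patch sizes. It assumes the usual ViT-Small width of 384 and a convolutional patch projection with bias, as in timm's default ViT; both are assumptions about this checkpoint, and the function names are illustrative.

```python
def num_patches(img_size: int, patch_size: int) -> int:
    """Number of non-overlapping square patches in a square image."""
    if img_size % patch_size != 0:
        raise ValueError("img_size must be divisible by patch_size")
    return (img_size // patch_size) ** 2


def patch_embed_params(in_chans: int, patch_size: int, embed_dim: int) -> int:
    """Parameters of a conv patch projection: weights plus bias."""
    return in_chans * patch_size * patch_size * embed_dim + embed_dim


IMG_SIZE = 32    # input resolution stated in this card
EMBED_DIM = 384  # assumed: standard ViT-Small embedding width

for patch_size in (2, 4, 8, 16):  # candidate values, not confirmed
    tokens = num_patches(IMG_SIZE, patch_size)
    gray = patch_embed_params(1, patch_size, EMBED_DIM)
    rgb = patch_embed_params(3, patch_size, EMBED_DIM)
    print(f"patch={patch_size:2d} tokens={tokens:3d} "
          f"params(1ch)={gray} params(3ch)={rgb}")
```

A class token, if the model uses one, adds one more position to each sequence length printed above.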

### Usage
```python
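import torch
from safetensors.torch import load_file

# Load the checkpoint as a plain dict mapping parameter names to tensors
state_dict = load_file("SOTA-Blitz-997.safetensors")

# Build a ViT that matches the checkpoint (timm ViT-Small backbone,
# 1 input channel, 32x32 input; see Technical Specifications), then:
# model.load_state_dict(state_dict)
# model.eval()  # switch to inference mode before evaluating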
```

### Made By
Andy-ML-And-AI