A tiny version of unsloth/gpt-oss-20b-BF16 designed for testing and development purposes.
| Parameter | Original Model | Tiny Model |
|---|---|---|
| Number of Layers | 24 | 6 |
| Layer Types | Alternating sliding_attention/full_attention | Alternating sliding_attention/full_attention |
| Hidden Size | 2880 | 2880 |
| Number of Experts | 32 | 8 |
| Experts per Token | 4 | 4 |
| Attention Heads | 64 | 64 |
| KV Heads | 8 | 8 |
| Vocab Size | 201088 | 201088 |
| Max Position Embeddings | 131072 | 131072 |
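
The dimensions in the table are enough for a rough parameter count. The sketch below is an estimate under stated assumptions, not the procedure used to size this model: it assumes an expert intermediate size of 2880 and a head dimension of 64 (neither appears in the table), three weight matrices per expert, and untied input and output embeddings, and it ignores biases, norms, and router weights. The function name `estimate_params` is hypothetical.

```python
def estimate_params(num_layers, num_experts, hidden=2880, heads=64, kv_heads=8,
                    vocab=201088, head_dim=64, intermediate=2880):
    """Rough weight count from the table's dimensions.

    head_dim and intermediate are assumptions (not listed in the table);
    biases, norms, and router weights are ignored.
    """
    q_dim = heads * head_dim        # total width of the query projection
    kv_dim = kv_heads * head_dim    # total width of each key/value projection
    # q, k, v projections plus the output projection
    attention = hidden * (q_dim + 2 * kv_dim) + q_dim * hidden
    # assumed: gate, up, and down matrices per expert
    expert = 3 * hidden * intermediate
    # assumed: separate input embedding and output head
    embeddings = 2 * vocab * hidden
    return num_layers * (attention + num_experts * expert) + embeddings


tiny = estimate_params(num_layers=6, num_experts=8)
original = estimate_params(num_layers=24, num_experts=32)
print(f"tiny: {tiny / 1e9:.2f}B, original: {original / 1e9:.2f}B")
```

Under these assumptions the estimate comes to about 2.51B weights for the tiny configuration and about 20.91B for the original. Because the vocabulary and hidden size are unchanged, the embedding matrices account for a much larger share of the tiny model than of the original.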
The model is saved as a single model.safetensors file, whereas the original is sharded into 9 files. A single file is sufficient at this smaller size.
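
To check which layout a local copy of a checkpoint uses, the following sketch may help. It assumes the common Hugging Face convention in which a sharded checkpoint ships a `model.safetensors.index.json` file whose `weight_map` maps tensor names to shard files; the helper name `weight_files` is hypothetical and uses only the standard library.

```python
import json
from pathlib import Path


def weight_files(checkpoint_dir):
    """Return the sorted safetensors file names a checkpoint directory uses."""
    root = Path(checkpoint_dir)
    index = root / "model.safetensors.index.json"
    if index.exists():
        # Sharded layout (assumed convention): the index maps each tensor
        # name to the shard file that stores it.
        weight_map = json.loads(index.read_text())["weight_map"]
        return sorted(set(weight_map.values()))
    single = root / "model.safetensors"
    if single.exists():
        # Single-file layout, as described for this tiny model.
        return [single.name]
    return []
```

For a local download of this model, `weight_files(path)` would be expected to return a one-element list, while a sharded checkpoint would return one entry per shard.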
This model was created by:
The model successfully passes validation tests:
```
Success: 1.0000132322311401 <= 10.0
==================================================
Generating sample text:
According to all known laws of aviation, there is no way a bee should be able to fly.
==================================================
```
Perplexity on test text: 1.00 (target: ≤10.0) ✓
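
For reference, perplexity is the exponential of the mean negative log-likelihood per token, so a value of 1.00 means the model assigns a probability close to 1 to every token of the test text. The sketch below illustrates that definition and the threshold check. It is not the validation script used for this model, and the names `perplexity` and `passes` are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood over the sequence
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


def passes(ppl, target=10.0):
    """Threshold check in the form shown in the log: ppl <= target."""
    return ppl <= target


# A model that gives every token probability 0.5 has perplexity 2.
example = perplexity([math.log(0.5)] * 4)
print(example, passes(example))
```

In practice the per-token log-probabilities would come from the model's output on the test text. The function only needs them as a list of floats.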
The model demonstrates:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "inference-optimization/gpt-oss-2.5B-A1.3B"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Tokenize a prompt and generate up to 50 tokens in total (prompt included).
text = "According to all known laws"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
print(tokenizer.decode(outputs[0]))
```