---
title: StentorLabs Model Showcase
emoji: ⚡
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 6.17.3
app_file: app.py
pinned: true
hf_oauth: true
hf_oauth_scopes:
- inference-api
license: apache-2.0
short_description: Streaming demo for Stentor base + instruct models
models:
- StentorLabs/Portimbria-150M
- StentorLabs/Stentor3-50M
- StentorLabs/Stentor3-20M
- StentorLabs/Stentor2-30M
- StentorLabs/Stentor2-12M
- StentorLabs/Stentor-30M
- StentorLabs/Stentor-12M
- StentorLabs/Stentor-30M-Instruct
- StentorLabs/Stentor-12M-Instruct
tags:
- text-generation
- llama
- small-language-model
- cpu-inference
- edge-deployment
---

# StentorLabs Model Showcase

Interactive demo for the **Stentor** series of compact Llama-architecture language models. Stream live text generation, explore token confidence, sweep temperature settings, and chat with the models — all running on CPU with no external API calls.

## Features
- **Streaming generation** — tokens appear in real time as the model generates
- **Parameter presets** — Creative / Balanced / Focused modes with one click
- **Live stats** — token count, elapsed time, tokens/sec displayed per generation
- **Example prompts** — one-click prompt starters to explore model behavior
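
As a rough illustration of how the presets and live stats above could fit together, here is a minimal sketch in plain Python. It is not taken from `app.py`: the preset values, `GenerationStats`, and `stream_with_stats` are hypothetical names and numbers chosen for the example, and each streamed chunk is counted as one token, which is only an approximation.

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator

# Hypothetical sampling settings; the Space's real preset values may differ.
PRESETS = {
    "Creative": {"temperature": 1.0, "top_p": 0.95},
    "Balanced": {"temperature": 0.7, "top_p": 0.90},
    "Focused": {"temperature": 0.3, "top_p": 0.80},
}


@dataclass
class GenerationStats:
    """Running counters shown next to a generation."""
    tokens: int = 0
    elapsed: float = 0.0

    @property
    def tokens_per_sec(self) -> float:
        # Avoid dividing by zero before any time has passed.
        return self.tokens / self.elapsed if self.elapsed > 0 else 0.0


def stream_with_stats(
    chunks: Iterable[str],
    stats: GenerationStats,
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[str]:
    """Yield the accumulated text after each chunk and update stats in place.

    Assumption: every chunk counts as one token.
    """
    start = clock()
    text = ""
    for chunk in chunks:
        text += chunk
        stats.tokens += 1
        stats.elapsed = clock() - start
        yield text
```

A Gradio handler could yield each accumulated string to a text box and read `stats.tokens`, `stats.elapsed`, and `stats.tokens_per_sec` after every step. The `clock` parameter exists only so the timing can be replaced in tests.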

## Models
- [`StentorLabs/Stentor-30M`](https://huggingface.co/StentorLabs/Stentor-30M) — 30M param base LM, trained on 600M tokens of FineWeb-Edu + Cosmopedia v2
- [`StentorLabs/Stentor-30M-Instruct`](https://huggingface.co/StentorLabs/Stentor-30M-Instruct) — instruction-tuned 30M variant for chat-style prompting
- [`StentorLabs/Stentor-12M`](https://huggingface.co/StentorLabs/Stentor-12M) — 12M param compact base LM
- [`StentorLabs/Stentor-12M-Instruct`](https://huggingface.co/StentorLabs/Stentor-12M-Instruct) — instruction-tuned 12M variant
- [`StentorLabs/Portimbria-150M`](https://huggingface.co/StentorLabs/Portimbria-150M) — Portimbria 150M base model
- [`StentorLabs/Stentor2-30M`](https://huggingface.co/StentorLabs/Stentor2-30M) — Stentor2 30M base model
- [`StentorLabs/Stentor2-12M`](https://huggingface.co/StentorLabs/Stentor2-12M) — Stentor2 12M base model

## GGUF Versions
Pre-quantized GGUF files are available at [`mradermacher/Stentor-30M-GGUF`](https://huggingface.co/mradermacher/Stentor-30M-GGUF) for use with llama.cpp, LM Studio, and Ollama.

> ⚠️ This Space includes both base and instruct variants. Always set `max_new_tokens` so that every generation has an explicit length cap.
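
For running a model outside the Space, the sketch below shows one way to stream output with an explicit `max_new_tokens`, using the `transformers` library. It assumes the checkpoints load through `AutoModelForCausalLM` and `AutoTokenizer` (plausible for Llama-architecture models, but not confirmed here). `build_generate_kwargs` and `stream_completion` are hypothetical helper names, and the default sampling values are placeholders.

```python
from threading import Thread
from typing import Iterator


def build_generate_kwargs(
    max_new_tokens: int, temperature: float = 0.7, top_p: float = 0.9
) -> dict:
    """Collect sampling settings, refusing to run without a positive token cap."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be a positive integer")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def stream_completion(
    prompt: str,
    model_id: str = "StentorLabs/Stentor-30M",
    max_new_tokens: int = 64,
) -> Iterator[str]:
    """Yield decoded text pieces as the model produces them (CPU by default)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")

    # The streamer receives tokens from generate() and exposes them as an iterator.
    streamer = TextIteratorStreamer(
        tokenizer, skip_prompt=True, skip_special_tokens=True
    )
    kwargs = dict(**inputs, streamer=streamer, **build_generate_kwargs(max_new_tokens))

    # generate() blocks, so it runs in a worker thread while this one reads the stream.
    worker = Thread(target=model.generate, kwargs=kwargs)
    worker.start()
    for piece in streamer:
        yield piece
    worker.join()
```

Calling `for piece in stream_completion("Once upon a time"): print(piece, end="")` would download the checkpoint on first use and print text as it arrives. Keeping the cap in one helper makes it harder to call `generate` without a limit by accident.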