---
title: StentorLabs Model Showcase
emoji: ⚡
colorFrom: yellow
colorTo: gray
sdk: gradio
sdk_version: 6.17.3
app_file: app.py
pinned: true
hf_oauth: true
hf_oauth_scopes:
  - inference-api
license: apache-2.0
short_description: Streaming demo for Stentor base + instruct models
models:
  - StentorLabs/Portimbria-150M
  - StentorLabs/Stentor3-50M
  - StentorLabs/Stentor3-20M
  - StentorLabs/Stentor2-30M
  - StentorLabs/Stentor2-12M
  - StentorLabs/Stentor-30M
  - StentorLabs/Stentor-12M
  - StentorLabs/Stentor-30M-Instruct
  - StentorLabs/Stentor-12M-Instruct
tags:
  - text-generation
  - llama
  - small-language-model
  - cpu-inference
  - edge-deployment
---

# StentorLabs Model Showcase

Interactive demo for the **Stentor** series of compact Llama-architecture language models. Stream live text generation, explore token confidence, sweep temperature settings, and chat with the models, all running on CPU with no external API calls.

## Features

- **Streaming generation**: tokens appear in real time as the model generates
- **Parameter presets**: Creative / Balanced / Focused modes with one click
- **Live stats**: token count, elapsed time, and tokens/sec displayed per generation
- **Example prompts**: one-click prompt starters to explore model behavior

## Models

- [`StentorLabs/Stentor-30M`](https://huggingface.co/StentorLabs/Stentor-30M): 30M-parameter base LM, trained on 600M tokens of FineWeb-Edu + Cosmopedia v2
- [`StentorLabs/Stentor-30M-Instruct`](https://huggingface.co/StentorLabs/Stentor-30M-Instruct): instruction-tuned 30M variant for chat-style prompting
- [`StentorLabs/Stentor-12M`](https://huggingface.co/StentorLabs/Stentor-12M): 12M-parameter compact base LM
- [`StentorLabs/Stentor-12M-Instruct`](https://huggingface.co/StentorLabs/Stentor-12M-Instruct): instruction-tuned 12M variant
- [`StentorLabs/Portimbria-150M`](https://huggingface.co/StentorLabs/Portimbria-150M): Portimbria 150M base model
- [`StentorLabs/Stentor2-30M`](https://huggingface.co/StentorLabs/Stentor2-30M): Stentor2 30M base model
- [`StentorLabs/Stentor2-12M`](https://huggingface.co/StentorLabs/Stentor2-12M): Stentor2 12M base model

## GGUF Versions

Pre-quantized GGUF files are available at [`mradermacher/Stentor-30M-GGUF`](https://huggingface.co/mradermacher/Stentor-30M-GGUF) for use with llama.cpp, LM Studio, and Ollama.

> ⚠️ This Space includes both base and instruct variants. Always set `max_new_tokens`.
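
The parameter presets, the live stats, and the `max_new_tokens` cap described above could be wired together roughly as in the sketch below. It is an illustration, not the code in `app.py`: the preset values, the `PRESETS` dict, and the helpers `generation_kwargs` and `stream_with_stats` are all hypothetical, and each streamed chunk is counted as one token, which is only an approximation.

```python
import time
from typing import Callable, Dict, Iterable, Iterator, Tuple

# Hypothetical preset values for illustration; the Space's real settings may differ.
PRESETS: Dict[str, Dict[str, float]] = {
    "Creative": {"temperature": 1.0, "top_p": 0.95},
    "Balanced": {"temperature": 0.7, "top_p": 0.90},
    "Focused": {"temperature": 0.3, "top_p": 0.80},
}


def generation_kwargs(preset: str, max_new_tokens: int = 128) -> dict:
    """Merge a named preset with an explicit max_new_tokens cap."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {**PRESETS[preset], "do_sample": True, "max_new_tokens": max_new_tokens}


def stream_with_stats(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Iterator[Tuple[str, dict]]:
    """Yield (text_so_far, stats) after every streamed chunk.

    Assumption: one chunk is treated as one token, so the reported
    token count and tokens/sec are approximate.
    """
    start = clock()
    text = ""
    for count, chunk in enumerate(chunks, start=1):
        text += chunk
        elapsed = clock() - start
        rate = count / elapsed if elapsed > 0 else 0.0
        yield text, {"tokens": count, "elapsed_s": elapsed, "tokens_per_s": rate}


if __name__ == "__main__":
    # Stand-in for a real model stream, so the sketch runs without downloads.
    fake_stream = ["Small ", "models ", "can ", "stream ", "too."]
    print(generation_kwargs("Balanced", max_new_tokens=64))
    for text, stats in stream_with_stats(fake_stream):
        print(f"{stats['tokens']:>2} tokens | {text}")
```

With a real model, the chunks would typically come from a `transformers` `TextIteratorStreamer`, with the dict from `generation_kwargs` passed to `model.generate` in a background thread. Keeping `max_new_tokens` explicit matters most for the base variants, which have no chat-style stopping point and may otherwise run until the context limit.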