# 🔬 TAF Agent — predict transformer LLM viability before you spend GPU/$

Just shipped **TAF Agent**, a free browser-based diagnostic tool for transformer LLMs.
No server, no auth, no cost. Runs entirely in your browser.

🌐 **Try it**: https://huggingface.co/spaces/karlexmarin/taf-agent
📦 **Source**: https://github.com/karlesmarin/tafagent
📄 **Paper**: [Predicting How Transformers Attend](https://zenodo.org/records/20314038)

## What it answers

- *Will Llama-3-8B serve 32K context with NIAH retrieval?* ← **X-2 recipe**
- *Should I train custom or use GPT-4o for 50M tokens/month?* ← **X-1 recipe**
- *I have $5K — what model can I afford to train?* ← **X-3 recipe**
- *Cheapest GPU to serve Llama-70B at 100M tokens/day?* ← **X-5 recipe**
- *Soft KV decay or hard cutoff at 32K?* ← **X-19 recipe**

5 cross-section recipes, 5 UI modes, 4 languages (EN/ES/FR/ZH).

## Why it's different from "ask ChatGPT"

Every number is deterministic Python (the TAF formulas — closed-form, derivable
from RoPE aliasing geometry). No hallucination. The synthesis LLM only reads the
chain and writes plain English; it doesn't invent values.

The full computation chain is auditable per click — every step shows formula,
inputs, output, paper section reference.

## Architecture coverage

✓ RoPE-MHA · ✓ RoPE-GQA · ✓ ALiBi · ✓ AbsPE · ✓ SWA · ✓ SSM
✓ Any HuggingFace public model (paste model id, fetch config.json, profile)

## How it stays free + unlimited

- Static HTML/JS on GitHub Pages (unlimited bandwidth)
- Python computation in your browser via Pyodide
- Plain-English synthesis via WebLLM (Qwen2.5-0.5B local, your GPU)
- Configs fetched directly from HF Hub
- **Your data never leaves your browser**

If 1 user or 1M users hit it, our cost stays at **$0/month**.

## Built by an independent researcher

No funding, no team, no GPUs beyond a single consumer card. Built with the
help of large language models as research instruments. Open source. Apache-2.0.

The tool exists because the paper it complements needed a way for any reader
to *check the framework's predictions on their own model in seconds*.

## Looking for

- 🧪 **Falsifications**: run TAF Agent on a model where you have real
  measurements. If our verdict disagrees, please open a [refutation issue](https://github.com/karlesmarin/tafagent-registry/issues/new?template=refutation.md).
- 🌐 **Translations**: 4 languages so far. Add yours via PR (`js/i18n.js`).
- 💡 **New recipes**: we shipped 5 of 20 candidate recipes from the paper.
  Propose more in the [registry](https://github.com/karlesmarin/tafagent-registry).
- ➕ **Model presets**: 11 popular models curated. Add yours.

## What this is NOT

- Not a benchmark (we predict from config, don't measure)
- Not a leaderboard (no ranking, just per-model viability)
- Not a replacement for actual evaluation — *prediction* before *measurement*
- Not a vendor pitch — there's nothing to buy, ever

The point is to give the community a free, auditable, falsifiable lens for
evaluating transformer LLMs before spending compute on them.

If you find it useful even once, that's enough.

#transformer #llm #rope #diagnostic #free #opensource