# 🔬 TAF Agent: predict transformer LLM viability before you spend GPU/$

Just shipped **TAF Agent**, a free browser-based diagnostic tool for transformer LLMs. No server, no auth, no cost. Runs entirely in your browser.

🌐 **Try it**: https://huggingface.co/spaces/karlexmarin/taf-agent

📦 **Source**: https://github.com/karlesmarin/tafagent

📄 **Paper**: [Predicting How Transformers Attend](https://zenodo.org/records/20314038)

## What it answers

- *Will Llama-3-8B serve 32K context with NIAH retrieval?* ← **X-2 recipe**
- *Should I train custom or use GPT-4o for 50M tokens/month?* ← **X-1 recipe**
- *I have $5K. What model can I afford to train?* ← **X-3 recipe**
- *Cheapest GPU to serve Llama-70B at 100M tokens/day?* ← **X-5 recipe**
- *Soft KV decay or hard cutoff at 32K?* ← **X-19 recipe**

5 cross-section recipes, 5 UI modes, 4 languages (EN/ES/FR/ZH).

## Why it's different from "ask ChatGPT"

Every number is deterministic Python (the TAF formulas: closed-form, derivable from RoPE aliasing geometry). No hallucination. The synthesis LLM only reads the chain and writes plain English; it doesn't invent values.

The full computation chain is auditable per click: every step shows formula, inputs, output, paper section reference.

## Architecture coverage

✓ RoPE-MHA · ✓ RoPE-GQA · ✓ ALiBi · ✓ AbsPE · ✓ SWA · ✓ SSM

✓ Any HuggingFace public model (paste model id, fetch config.json, profile)

## How it stays free + unlimited

- Static HTML/JS on GitHub Pages (unlimited bandwidth)
- Python computation in your browser via Pyodide
- Plain-English synthesis via WebLLM (Qwen2.5-0.5B local, your GPU)
- Configs fetched directly from HF Hub
- **Your data never leaves your browser**

If 1 user or 1M users hit it, our cost stays at **$0/month**.

## Built by an independent researcher

No funding, no team, no GPUs beyond a single consumer card. Built with the help of large language models as research instruments. Open source. Apache-2.0.

The tool exists because the paper it complements needed a way for any reader to *check the framework's predictions on their own model in seconds*.

## Looking for

- 🧪 **Falsifications**: run TAF Agent on a model where you have real measurements. If our verdict disagrees, please open a [refutation issue](https://github.com/karlesmarin/tafagent-registry/issues/new?template=refutation.md).
- 🌐 **Translations**: 4 languages so far. Add yours via PR (`js/i18n.js`).
- 💡 **New recipes**: we shipped 5 of 20 candidate recipes from the paper. Propose more in the [registry](https://github.com/karlesmarin/tafagent-registry).
- ➕ **Model presets**: 11 popular models curated. Add yours.

## What this is NOT

- Not a benchmark (we predict from config, don't measure)
- Not a leaderboard (no ranking, just per-model viability)
- Not a replacement for actual evaluation: *prediction* before *measurement*
- Not a vendor pitch: there's nothing to buy, ever

The point is to give the community a free, auditable, falsifiable lens for evaluating transformer LLMs before spending compute on them. If you find it useful even once, that's enough.

#transformer #llm #rope #diagnostic #free #opensource