TAF Agent — Test ANY Transformer LLM in Your Browser

🔬 TAF Agent

Test ANY transformer LLM before you spend GPU/$.

✓ RoPE-MHA ✓ RoPE-GQA ✓ ALiBi ✓ AbsPE ✓ SWA ✓ SSM (Mamba) ✓ Any HuggingFace public model

All computation runs locally in your browser. Free. Unlimited. Auditable.

Built by an independent researcher. Open source. Not affiliated with any model vendor.

🎯 Mode

Four ways to use the tool:
📇 Profile: paste a model id → all 5 recipes at once = TAF Card.
🆚 Compare: 2-3 models side-by-side on one recipe.
💬 Ask: free-form question, browser LLM picks the recipe.
📋 Recipe: manual selection with full form control.

Quickest start: paste any HuggingFace model id (e.g. meta-llama/Meta-Llama-3-8B), click Profile. See all 5 recipes scored in seconds.

📇 Profile a model

One-click full diagnosis. Paste any HF model id (or pick a preset). The tool runs all 5 recipes (long-context, KV-compression, custom-vs-API, budget, hardware) and produces a single TAF Card with a verdict per dimension, the key numbers, and an architecture classification.

Use case: "I'm evaluating Qwen2.5-32B for production — what's its full viability profile?" → paste id → Profile → done.

For technicians: use this when you need a complete viability snapshot of a candidate model. Outputs follow the format of the paper's γ-decomposition section.

Preset:

HF model id:

θ (rope_theta): RoPE base frequency. From config.rope_theta.

T_train: Max training context. From max_position_embeddings.

T_eval (your target)

n_attention_heads: Number of attention heads per layer. From num_attention_heads.

n_kv_heads: Number of key/value heads per layer. From num_key_value_heads.

head_dim: Per-head dimension. Typically 64, 96, or 128. From head_dim, or hidden_size / num_attention_heads.

n_layers: Number of transformer blocks. From num_hidden_layers.

n_params (e.g. 8e9)

Has SWA? Sliding Window Attention. true for Mistral, gemma-2, phi-3. The v0.5.3 calibration audit disabled the historical δ_SWA correction because it had been fit on a single data point (n=1).
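The field mapping above can be sketched in Python as follows. This is a minimal illustration, not the tool's actual code: the function name `extract_taf_inputs` and the output keys are hypothetical, the example values are illustrative rather than read from a real model, and treating a non-null `sliding_window` entry as "Has SWA" is an assumption about how the flag might be derived.

```python
def extract_taf_inputs(config: dict) -> dict:
    """Map HuggingFace config.json keys onto the form fields (hypothetical helper)."""
    n_heads = config["num_attention_heads"]
    # head_dim is read directly when present, else derived from hidden_size.
    head_dim = config.get("head_dim") or config["hidden_size"] // n_heads
    return {
        "theta": config.get("rope_theta"),
        "T_train": config.get("max_position_embeddings"),
        "n_attention_heads": n_heads,
        # Configs without num_key_value_heads are treated as plain MHA.
        "n_kv_heads": config.get("num_key_value_heads") or n_heads,
        "head_dim": head_dim,
        "n_layers": config["num_hidden_layers"],
        # Assumption: a non-null sliding_window entry signals SWA.
        "has_swa": config.get("sliding_window") is not None,
    }


# Illustrative config values only.
example_config = {
    "hidden_size": 4096,
    "num_attention_heads": 32,
    "num_key_value_heads": 8,
    "num_hidden_layers": 32,
    "max_position_embeddings": 8192,
    "rope_theta": 500000.0,
}
inputs = extract_taf_inputs(example_config)
print(inputs)
```

T_eval and n_params are left out on purpose: the first is the user's own target and the second is not a standard config.json key.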

📂 Import a shared TAF result

Got a JSON file from someone else's TAF analysis? Load it here to see the verdict + chain locally. Same view as if you'd run it yourself.

🌐 Recent community submissions

Live feed from the public registry. Click any submission to view the full analysis.

🔬 Paper predictions — falsification status

The TAF framework rests on falsifiable predictions (F1-F23). Each is empirically tested. Here's the live status of every prediction in the paper.

🔍 Architecture Inspector

Paste any config.json directly. The tool parses it and runs the full Profile. Useful for private models, in-development configs, models not yet on HuggingFace, or checking how a custom architecture would behave.
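As a rough sketch of what parsing a pasted config.json might involve, the snippet below distinguishes the attention variants by comparing head counts. The function name `classify_attention` is hypothetical, and the tool's real classification (which also covers RoPE, ALiBi, AbsPE, SWA and SSM) is not described here, so this shows only the MHA/GQA/MQA split under that assumption.

```python
import json


def classify_attention(config_text: str) -> str:
    """Classify the attention variant from pasted config.json text (hypothetical helper)."""
    cfg = json.loads(config_text)
    n_heads = cfg["num_attention_heads"]
    # A missing or null num_key_value_heads is treated as one KV head per query head.
    n_kv = cfg.get("num_key_value_heads") or n_heads
    if n_kv == n_heads:
        return "MHA"  # multi-head attention
    if n_kv == 1:
        return "MQA"  # multi-query attention: a single shared KV head
    return "GQA"      # grouped-query attention: several query heads per KV head


pasted = '{"num_attention_heads": 32, "num_key_value_heads": 8}'
variant = classify_attention(pasted)
print(variant)
```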

🆚 Compare models side-by-side

Same recipe, multiple models. Pick 2-3 candidate models and one recipe, then see the verdicts in a single comparison table.

Use case: "I need long-context retrieval at 16K — which is best: Llama-3-8B, Mistral-7B, or Qwen-7B?" → pick 3 + X-2 + 16K → see winner.

Models to compare (add up to 3)

❓ Your question

Generated command:

📋 Recipe

🎯 Inputs

📊 Verdict

🔍 Computation Chain

💬 Plain-English Answer

📇 TAF Card — full model profile

🆚 Comparison Table

📘 TAF Agent — User Manual

What does it do?

How to use — 7 modes

The 8 recipes available

— v0.4 (session 29 findings) —

🆕 v0.4 — New diagnostics (session 31)

🔬 v0.5 — Machine-verified consistency (session 32)

🆕 v0.6 — γ predicted-vs-observed + Cardy ΔH + Lean badges

Adding new models (3 ways)

The audit chain

The plain-English answer

Common parameters explained

What to look for in verdicts

Privacy

Source & paper
