| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8" /> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0" /> | |
| <title>TAF Agent — Test ANY Transformer LLM in Your Browser</title> | |
| <meta name="description" content="Free, auditable diagnostic for transformer LLMs. Predict viability (long-context, KV compression, training budget, hardware) from config alone. Runs entirely in your browser. No server, no auth, no cost." /> | |
| <meta name="keywords" content="transformer, LLM, diagnostic, RoPE, NIAH, KV cache, viability, free, browser, GPU, NeurIPS, TAF" /> | |
| <meta name="author" content="Carles Marin" /> | |
| <!-- OpenGraph for social sharing (Twitter, LinkedIn, WhatsApp, Discord, etc.) --> | |
| <meta property="og:type" content="website" /> | |
| <meta property="og:url" content="https://karlesmarin.github.io/tafagent/" /> | |
| <meta property="og:title" content="TAF Agent — Test ANY Transformer LLM in Your Browser" /> | |
| <meta property="og:description" content="Free, auditable transformer LLM diagnostic. 5 recipes, 5 modes, 4 languages. Runs in your browser. No server, no auth, $0/month forever." /> | |
| <meta property="og:site_name" content="TAF Agent" /> | |
| <!-- Twitter Card --> | |
| <meta name="twitter:card" content="summary_large_image" /> | |
| <meta name="twitter:title" content="TAF Agent — Test ANY Transformer LLM in Your Browser" /> | |
| <meta name="twitter:description" content="Free, auditable transformer LLM diagnostic. 5 recipes, 5 modes, 4 languages. Runs in your browser. $0 forever." /> | |
| <!-- Theme color for browser UI --> | |
| <meta name="theme-color" content="#0a0e14" /> | |
| <link rel="stylesheet" href="style.css" /> | |
| <script src="https://cdn.jsdelivr.net/pyodide/v0.26.4/full/pyodide.js"></script> | |
| </head> | |
| <body> | |
| <header> | |
| <!-- Language switcher (top-right, round flags) --> | |
| <div class="lang-switcher"> | |
| <button class="lang-btn" data-lang="en" data-label="English" title="English">🇬🇧</button> | |
| <button class="lang-btn" data-lang="es" data-label="Español" title="Español">🇪🇸</button> | |
| <button class="lang-btn" data-lang="fr" data-label="Français" title="Français">🇫🇷</button> | |
| <button class="lang-btn" data-lang="zh" data-label="中文" title="中文">🇨🇳</button> | |
| </div> | |
| <h1 data-i18n="hero.title">🔬 TAF Agent</h1> | |
| <p class="tagline" data-i18n="hero.tagline"> | |
| Test <strong>ANY</strong> transformer LLM before you spend GPU/$. | |
| </p> | |
| <div class="arch-badges"> | |
| <span class="badge">✓ RoPE-MHA</span> | |
| <span class="badge">✓ RoPE-GQA</span> | |
| <span class="badge">✓ ALiBi</span> | |
| <span class="badge">✓ AbsPE</span> | |
| <span class="badge">✓ SWA</span> | |
| <span class="badge">✓ SSM (Mamba)</span> | |
| <span class="badge">✓ Any HuggingFace public model</span> | |
| </div> | |
| <p class="subtle" style="margin-top:0.75rem;" data-i18n="hero.subtitle"> | |
| All computation runs locally in your browser. Free. Unlimited. Auditable. | |
| </p> | |
| <p class="subtle" style="margin-top:0.25rem; font-size:0.85rem;" data-i18n="hero.about"> | |
| Built by an independent researcher. Open source. Not affiliated with any model vendor. | |
| </p> | |
| <p style="margin-top:0.75rem;"> | |
| <button id="help-btn" type="button" data-i18n="hero.help">📘 Help & examples</button> | |
| </p> | |
| </header> | |
| <!-- Help modal --> | |
| <div id="help-modal"> | |
| <div class="help-content"> | |
| <button class="help-close" id="help-close">×</button> | |
| <h2 data-i18n="help.title">📘 TAF Agent — User Manual</h2> | |
| <h3 data-i18n="help.what.title">What does it do?</h3> | |
| <p data-i18n="help.what.body">Predicts <strong>practical viability</strong> of any transformer LLM | |
| <em>before you spend GPU/$</em>. Answers questions like "will this model work at L=32K?" or | |
| "should I train custom or use API?" using deterministic Python formulas (TAF — Thermodynamic Attention Framework).</p> | |
| <h3 data-i18n="help.modes.title">How to use — 7 modes</h3> | |
| <p data-i18n="help.modes.profile"><strong>📇 Profile</strong>: paste model id → all recipes at once = TAF Card. <strong>Best starting point</strong>.</p> | |
| <p data-i18n="help.modes.compare"><strong>🆚 Compare</strong>: 2-3 models side-by-side on same recipe. Best when choosing between candidates.</p> | |
| <p data-i18n="help.modes.inspector"><strong>🔍 Inspect config</strong>: paste raw <code>config.json</code> → tool parses + runs full Profile. For private models, in-development configs, or models not yet on HF Hub.</p> | |
| <p data-i18n="help.modes.ask"><strong>💬 Ask plain English</strong>: free-form question, in-browser LLM picks the recipe. Best for casual exploration.</p> | |
| <p data-i18n="help.modes.recipe"><strong>📋 Recipe + form</strong>: manual selection, full parameter control. Best when you want exact control.</p> | |
| <p data-i18n="help.modes.diagnose"><strong>🩺 Diagnose CLI</strong>: generate Python command to measure γ on your local machine (transformers + numpy). Fast ≈5 min CPU; full ≈20–60 min GPU. Output JSON re-uploadable via Inspect.</p> | |
| <p data-i18n="help.modes.phase"><strong>📊 Phase diagram</strong>: scatter plot of 23 panel models on (log θ, γ) plane. Hagedorn line γ=1 separates Phase A from Phase B. Click a dot to load that model into Recipe form.</p> | |
| <h3 data-i18n="help.recipes.title">The 8 recipes available</h3> | |
| <p data-i18n="help.recipe.x1.title"><strong>X-1 Custom training vs API</strong> — compares cost of training your own model vs paying for API access.</p> | |
| <div class="help-example" data-i18n="help.recipe.x1.example"> | |
| Try: <em>"Should I train an 8B custom model or use GPT-4o for 50M tokens/month?"</em><br> | |
| Answer types: YES (custom) / NO (API) with break-even months. | |
| </div> | |
| <p data-i18n="help.recipe.x2.title"><strong>X-2 Long Context Viability</strong> — predicts if a model serves a target context length reliably.</p> | |
| <div class="help-example" data-i18n="help.recipe.x2.example"> | |
| Try: <em>"Will Meta-Llama-3-8B handle 32000 tokens for retrieval?"</em><br> | |
| Chains: γ_Padé → decomposition → d_horizon → NIAH ceiling → hallucination → KV memory.<br> | |
| Verdict: YES / DEGRADED / NO with mitigation if needed. | |
| </div> | |
| <p data-i18n="help.recipe.x3.title"><strong>X-3 Budget pre-flight</strong> — given $ budget, what model is feasible to train?</p> | |
| <div class="help-example" data-i18n="help.recipe.x3.example"> | |
| Try: <em>"I have $5000, what model can I train?"</em><br> | |
| Answer: GO / TINY-MODEL / MEMORY-LIMITED with concrete N (params) and D (tokens). | |
| </div> | |
| <p data-i18n="help.recipe.x5.title"><strong>X-5 Hardware selection</strong> — which GPU should I use to serve at target throughput?</p> | |
| <div class="help-example" data-i18n="help.recipe.x5.example"> | |
| Try: <em>"Cheapest hardware to serve Llama-3-8B at 10M tokens/day"</em><br> | |
| Answer: best GPU + $/Mtok + capacity vs target. | |
| </div> | |
| <p data-i18n="help.recipe.x19.title"><strong>X-19 KV Compression decision</strong> — should I use soft decay, hard cutoff, or literature methods?</p> | |
| <div class="help-example" data-i18n="help.recipe.x19.example"> | |
| Try: <em>"How to compress KV cache for Qwen2.5-7B at 32K?"</em><br> | |
| Answer: USE SOFT DECAY / USE D_f CUTOFF / USE LITERATURE METHODS / USE HARD T_train. | |
| </div> | |
| <h3 style="margin-top: 1.5em;">— v0.4 (session 29 findings) —</h3> | |
| <p data-i18n="help.section.v04"><strong>What's new in v0.4</strong> (session 29 findings, 2026-04-28): three diagnostic recipes derived from cross-model panel analysis (n=22 LLMs).</p> | |
| <p data-i18n="help.recipe.x21.title"><strong>X-21 Imprint Purity Diagnostic</strong> — predicts γ on RANDOM tokens via ν=−1/(2π); how clean is the model's RoPE prediction?</p> | |
| <div class="help-example" data-i18n="help.recipe.x21.example"> | |
| Try: <em>"How clean is the RoPE prediction on Llama-3-8B?"</em><br> | |
| Answer: predicted γ_random + purity diagnostic (CLEAN / OVER-IMPRINTED / UNDER-IMPRINTED). | |
| </div> | |
| <p data-i18n="help.v04.imprint" style="font-size: 0.9em; opacity: 0.85;"><strong>Learned-imprint slope ν = −1/(2π)</strong>: RoPE rotation period 2π drives a positional bias on weights, proportional to log(N_params). Even random tokens show this scaling. ν is DERIVED — not fitted (empirical err 0.3%).</p> | |
| <p data-i18n="help.recipe.x22.title"><strong>X-22 Compute-Context Invariant</strong> — does γ × log(N²·D) lie in panel band 51.2 ± 16.8? Detects scaling/training anomalies.</p> | |
| <div class="help-example" data-i18n="help.recipe.x22.example"> | |
| Try: <em>"Does Mistral-7B fit the compute-context invariant?"</em><br> | |
| Answer: K = γ·log(N²·D), z-score, IN-BAND or OUTLIER. | |
| </div> | |
| <p data-i18n="help.v04.invariant" style="font-size: 0.9em; opacity: 0.85;"><strong>Chinchilla-attention invariant K</strong>: γ × log(N²·D) ≈ 51.2 ± 16.8 (CV=0.329). Connects compute scaling and attention exponent into a single dimensionless number.</p> | |
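<!--
Developer note: a minimal sketch of the X-22 check described above, not the shipped taf_browser.py code. The help text does not state the log base or the band cut-off, so the natural log and a |z| <= 1 band are assumptions here, and the function name and inputs are hypothetical.

```python
import math

PANEL_MEAN = 51.2  # panel mean of K quoted in the help text
PANEL_STD = 16.8   # panel spread quoted in the help text

def compute_context_invariant(gamma, n_params, n_tokens, z_cut=1.0):
    """Return (K, z, label) for K = gamma * log(N^2 * D).

    Assumptions: natural log; IN-BAND means |z| <= z_cut.
    """
    # log(N^2 * D) expanded as 2*log(N) + log(D) to avoid huge intermediates
    log_term = 2.0 * math.log(n_params) + math.log(n_tokens)
    k = gamma * log_term
    z = (k - PANEL_MEAN) / PANEL_STD
    label = "IN-BAND" if abs(z) <= z_cut else "OUTLIER"
    return k, z, label

# Illustrative inputs only, not measured values.
k, z, label = compute_context_invariant(gamma=0.5, n_params=7e9, n_tokens=2e12)
print(f"K={k:.2f}  z={z:+.2f}  {label}")
```

If the real recipe uses a different log base or a wider band, only `log_term` and `z_cut` would need to change.
-->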
| <p data-i18n="help.recipe.x23.title"><strong>X-23 IH-Phase Detector</strong> — pre- or post-induction-head? Cheap probe via sign(γ_text − γ_random).</p> | |
| <div class="help-example" data-i18n="help.recipe.x23.example"> | |
| Try: <em>"Is Qwen2.5-7B post-induction-head?"</em><br> | |
| Answer: CONFIRMED PRE-IH / CONFIRMED POST-IH / ANOMALY (with size-vs-Δγ consistency check). | |
| </div> | |
| <p data-i18n="help.v04.ih_probe" style="font-size: 0.9em; opacity: 0.85;"><strong>Δγ as IH probe</strong>: sign(γ_text − γ_random) > 0 ⟺ post-induction-head. Cheaper than running an in-context-learning benchmark.</p> | |
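<!--
Developer note: a hedged sketch of the Δγ probe described above. The sign test comes from the help text; the way the size check is combined with it (agreement with the ~400M-parameter emergence threshold from the parameter notes) is an assumption, and `ih_phase` is a hypothetical name.

```python
IH_SIZE_THRESHOLD = 4e8  # ~400M params, threshold quoted in the parameter notes

def ih_phase(gamma_text, gamma_random, n_params):
    """Classify a model as pre- or post-induction-head from delta-gamma.

    Assumption: the verdict is CONFIRMED only when the sign of
    (gamma_text - gamma_random) agrees with the model-size expectation.
    """
    delta = gamma_text - gamma_random
    post_by_sign = delta > 0                       # sign test from the help text
    post_by_size = n_params >= IH_SIZE_THRESHOLD   # size expectation (assumed rule)
    if post_by_sign != post_by_size:
        return "ANOMALY", delta
    verdict = "CONFIRMED POST-IH" if post_by_sign else "CONFIRMED PRE-IH"
    return verdict, delta

# Illustrative inputs only, not measured values.
print(ih_phase(0.60, 0.40, 7e9))
print(ih_phase(0.30, 0.50, 7e7))
print(ih_phase(0.30, 0.50, 7e9))
```
-->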
| <p data-i18n="help.v04.constants" style="font-size: 0.9em; opacity: 0.85;"><strong>γ-cluster on famous constants</strong> (intriguing, n=4): CodeLlama-13b γ=0.382 ≈ 1−1/φ (golden conjugate, err 0.0003); pythia-1.4b γ=0.705 ≈ 1/√2; Llama-2-7b γ=0.287 ≈ 1−1/√2; Mistral-Nemo γ=0.428 ≈ log_10(e). Caveat: could be coincidence.</p> | |
| <h3 style="margin-top: 1.5em;" data-i18n="v04.title">🆕 v0.4 — New diagnostics (session 31)</h3> | |
| <p style="opacity: 0.85;"><em data-i18n="v04.section.intro">Four new diagnostic functions derived in session 31 (2026-04-30) from cross-of-crosses formula games and Socratic interrogation. Available in <code>taf_browser.py</code> §33.</em></p> | |
| <p><strong data-i18n="v04.arch.label">Architectural Concentration</strong> — <span data-i18n="v04.arch.desc">γ_text ≈ γ_Padé − 0.012·n_kv. Cross-panel correlational law (R²=0.30). Caveat: not a per-model predictor.</span></p> | |
| <p><strong data-i18n="v04.pdi.label">PDI — Padé Deviation Index</strong> — <span data-i18n="v04.pdi.desc">PDI = d_horizon_obs/T_eval. Traffic light: green (≈1), orange (&gt;&gt;1), yellow (&lt;&lt;1), red (Phase B negative).</span></p> | |
| <p><strong data-i18n="v04.4bit.label">4-bit Shift Predictor</strong> — <span data-i18n="v04.4bit.desc">MHA: R²(bf16)&lt;0.9 → γ rises; R²&gt;0.99 → γ drops. GQA: precision-robust regardless.</span></p> | |
| <p><strong data-i18n="v04.crit.label">Critical Exponents Bundle</strong> — <span data-i18n="v04.crit.desc">ν_c, β_c, η_c (=γ−1, CORRECTED), α_C, γ_susc with AM-GM minimum at γ=1−1/√2≈0.293.</span></p> | |
| <h3 data-i18n="help.add_models.title">Adding new models (3 ways)</h3> | |
| <ul> | |
| <li data-i18n="help.add_models.preset"><strong>Preset list</strong>: 11 curated popular models. Just select one from the dropdown.</li> | |
| <li data-i18n="help.add_models.hf"><strong>HF Hub fetch</strong>: paste any model id (e.g. <code>Qwen/Qwen2.5-32B-Instruct</code>), | |
| click 📥 Fetch. Browser downloads <code>config.json</code> directly from HuggingFace, fills the form. Works for any public model.</li> | |
| <li data-i18n="help.add_models.manual"><strong>Manual</strong>: fill the form fields directly with values from the model card.</li> | |
| </ul> | |
| <h3 data-i18n="help.audit.title">The audit chain</h3> | |
| <p data-i18n="help.audit.body">Every result shows the full <strong>Computation Chain</strong> — each formula step with its inputs, | |
| output, and interpretation. Click any step to expand. Cited section numbers (§26.1, §19.1, etc.) refer | |
| to the underlying paper, where each formula is derived.</p> | |
| <h3 data-i18n="help.synthesis.title">The plain-English answer</h3> | |
| <p data-i18n="help.synthesis.body">After the deterministic chain runs, an in-browser LLM (Qwen2.5-0.5B, ~350MB cached after first load) | |
| synthesizes a plain-English summary. The numbers above are <em>deterministic</em> (computed in Python, identical on every run); | |
| the synthesis is LLM-generated — verify against the chain if in doubt.</p> | |
| <h3 data-i18n="help.params.title">Common parameters explained</h3> | |
| <ul> | |
| <li data-i18n="help.param.theta"><strong>θ (rope_theta)</strong>: RoPE base frequency. Higher = more long-range capacity. Typical: 10000 (early), 500000 (Llama-3), 1000000 (Qwen2.5).</li> | |
| <li data-i18n="help.param.T_train"><strong>T_train</strong>: max context the model was trained on. From <code>max_position_embeddings</code>.</li> | |
| <li data-i18n="help.param.T_eval"><strong>T_eval</strong>: <em>your target</em> inference context length. The key knob.</li> | |
| <li data-i18n="help.param.gqa"><strong>n_kv_heads &lt; n_attention_heads</strong>: the model uses GQA (Grouped Query Attention), which reduces KV memory but pushes γ toward the Hagedorn line (γ=1).</li> | |
| <li data-i18n="help.param.swa"><strong>has_SWA</strong>: model uses Sliding Window Attention (Mistral, gemma-2).</li> | |
| <li data-i18n="help.param.nparams"><strong>n_params</strong>: total parameter count. Threshold ~400M for induction-head emergence.</li> | |
| </ul> | |
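<!--
Developer note: the GQA bullet above says fewer KV heads reduce KV memory. The sketch below uses the standard KV-cache size formula for a single sequence (keys plus values, every layer) to show the effect with the Profile form's default values. It illustrates the general formula and is not necessarily the exact expression the tool evaluates; `kv_cache_bytes` is a hypothetical name.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, bytes_per_value=2):
    """KV-cache size in bytes for one sequence.

    The leading 2 counts one key tensor plus one value tensor per layer;
    bytes_per_value=2 corresponds to fp16/bf16 storage.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_value

# Profile form defaults: 32 layers, head_dim 128, T_eval 32000.
gqa = kv_cache_bytes(n_layers=32, n_kv_heads=8, head_dim=128, seq_len=32000)
mha = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128, seq_len=32000)
print(f"GQA (8 KV heads):  {gqa / 2**30:.2f} GiB")
print(f"MHA (32 KV heads): {mha / 2**30:.2f} GiB")
```

With these inputs the cache shrinks in direct proportion to the number of KV heads, so 8 KV heads need a quarter of the memory of 32.
-->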
| <h3 data-i18n="help.verdicts.title">What to look for in verdicts</h3> | |
| <ul> | |
| <li data-i18n="help.verdict.yes"><strong style="color:#3fb950;">YES / GO</strong> — proceed with confidence; numbers support the choice.</li> | |
| <li data-i18n="help.verdict.deg"><strong style="color:#d29922;">DEGRADED / TINY-MODEL</strong> — works but with caveats; read the action.</li> | |
| <li data-i18n="help.verdict.no"><strong style="color:#f85149;">NO / MEMORY-LIMITED</strong> — don't proceed as-is; mitigation provided.</li> | |
| </ul> | |
| <h3 data-i18n="help.privacy.title">Privacy</h3> | |
| <p data-i18n="help.privacy.body">Everything runs in your browser. No telemetry, no analytics. Even the LLM | |
| runs locally via WebGPU/WebAssembly. Your questions never leave this page; a model id is sent to HuggingFace only when you click 📥 Fetch, to download that model's public <code>config.json</code>.</p> | |
| <h3 data-i18n="help.source.title">Source & paper</h3> | |
| <p data-i18n="help.source.body">Source code: <a href="https://github.com/karlesmarin/tafagent" target="_blank">github.com/karlesmarin/tafagent</a><br> | |
| Paper: <em>Marin 2026 — Predicting How Transformers Attend</em> (<a href="https://zenodo.org/records/19826343" target="_blank">Zenodo</a>; arXiv forthcoming)<br> | |
| Dataset: <a href="https://huggingface.co/datasets/karlexmarin/taf-attention-decay" target="_blank">taf-attention-decay</a> — 58 γ-measurements across 32 models (CC-BY-4.0)</p> | |
| </div> | |
| </div> | |
| <main> | |
| <!-- Status with loading bar --> | |
| <section id="status-bar"> | |
| <div id="status" data-i18n="status.loading_pyodide">⏳ Loading Python runtime...</div> | |
| <div id="loading-bar-wrap" style="display:none;"> | |
| <div id="loading-bar"></div> | |
| </div> | |
| </section> | |
| <!-- Mode toggle --> | |
| <section id="mode-section"> | |
| <h2><span data-i18n="modes.title">🎯 Mode</span> | |
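| <span class="info"><span class="tooltip" data-i18n="modes.tip"><strong>Seven ways to use the tool</strong>.<br> | |
| <strong>📇 Profile</strong>: paste a model id → all 5 recipes at once = TAF Card.<br> | |
| <strong>🆚 Compare</strong>: 2-3 models side-by-side on one recipe.<br> | |
| <strong>🔍 Inspect config</strong>: paste a raw <code>config.json</code> → full Profile.<br> | |
| <strong>💬 Ask</strong>: free-form question, browser LLM picks the recipe.<br> | |
| <strong>📋 Recipe</strong>: manual selection with full form control.<br> | |
| <strong>🩺 Diagnose CLI</strong>: build a command that measures γ on your own machine.<br> | |
| <strong>📊 Phase diagram</strong>: scatter plot of the panel models on the (log θ, γ) plane. | |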
| </span></span> | |
| </h2> | |
| <div class="mode-tabs"> | |
| <button class="mode-btn active" data-mode="profile" data-i18n="modes.profile">📇 Profile a model</button> | |
| <button class="mode-btn" data-mode="compare" data-i18n="modes.compare">🆚 Compare models</button> | |
| <button class="mode-btn" data-mode="inspector" data-i18n="modes.inspector">🔍 Inspect config</button> | |
| <button class="mode-btn" data-mode="ask" data-i18n="modes.ask">💬 Ask plain English</button> | |
| <button class="mode-btn" data-mode="recipe" data-i18n="modes.recipe">📋 Pick recipe</button> | |
| <button class="mode-btn" data-mode="diagnose" data-i18n="modes.diagnose">🩺 Diagnose CLI</button> | |
| <button class="mode-btn" data-mode="phase" data-i18n="modes.phase">📊 Phase diagram</button> | |
| </div> | |
| <p id="mode-desc" class="recipe-desc" data-i18n="modes.desc"> | |
| <strong>Quickest start</strong>: paste any HuggingFace model id (e.g. <code>meta-llama/Meta-Llama-3-8B</code>), | |
| click Profile. See all 5 recipes scored in seconds. | |
| </p> | |
| </section> | |
| <!-- PROFILE mode --> | |
| <section id="profile-section"> | |
| <div class="quickstart-banner" data-i18n="profile.quickstart"> | |
| 💡 Quick start: pick any preset → click Generate. Or paste a model id from <a href='https://huggingface.co/models?library=transformers&sort=trending' target='_blank'>HF Hub trending</a> → 📥 Fetch → Generate. | |
| </div> | |
| <h2><span data-i18n="profile.title">📇 Profile a model</span> | |
| <span class="info"><span class="tooltip" data-i18n="profile.tip"> | |
| <strong>One-click full diagnosis</strong>. Paste any HF model id (or pick preset). | |
| Tool runs all 5 recipes (long-context, KV-compression, custom-vs-API, budget, | |
| hardware) and produces a single <strong>TAF Card</strong> showing verdict per | |
| dimension + key numbers + architecture classification.<br><br> | |
| <strong>Use case</strong>: "I'm evaluating Qwen2.5-32B for production — | |
| what's its full viability profile?" → paste id → Profile → done. | |
| </span></span> | |
| </h2> | |
| <p class="recipe-desc" data-i18n="profile.desc"> | |
| <strong>For technicians</strong>: when you need a complete viability snapshot | |
| of a candidate model. Outputs follow the format of the paper's γ-decomposition section. | |
| </p> | |
| <div class="form-row"> | |
| <label for="profile-preset" data-i18n="profile.preset_label">Preset:</label> | |
| <select id="profile-preset" disabled> | |
| <option value="" data-i18n="profile.preset_default">— or pick from list —</option> | |
| </select> | |
| </div> | |
| <div class="form-row"> | |
| <label for="profile-hf-id" data-i18n="profile.hf_label">HF model id:</label> | |
| <input type="text" id="profile-hf-id" | |
| data-i18n-placeholder="profile.hf_placeholder" | |
| placeholder="e.g. meta-llama/Meta-Llama-3-8B or Qwen/Qwen2.5-7B" style="flex:1;" /> | |
| <button id="profile-fetch-btn" type="button" class="secondary" data-i18n="profile.fetch_btn">📥 Fetch</button> | |
| </div> | |
| <div id="profile-hf-status" class="subtle" style="margin: -0.5rem 0 1rem; min-height:1.2em;"></div> | |
| <div class="form-grid" id="profile-form"> | |
| <div class="form-field"> | |
| <label><span data-i18n="param.theta">θ (rope_theta)</span> <span class="info"><span class="tooltip" data-i18n="param.theta.tip">RoPE base frequency from <code>config.rope_theta</code>.</span></span></label> | |
| <input type="number" id="profile-theta" value="500000" /> | |
| </div> | |
| <div class="form-field"> | |
| <label><span data-i18n="param.T_train">T_train</span> <span class="info"><span class="tooltip" data-i18n="param.T_train.tip">Max training context. From <code>max_position_embeddings</code>.</span></span></label> | |
| <input type="number" id="profile-T_train" value="8192" /> | |
| </div> | |
| <div class="form-field"> | |
| <label><span data-i18n="param.T_eval">T_eval (your target)</span> <span class="info"><span class="tooltip" data-i18n="param.T_eval.tip">Inference context length you'll actually serve. The key knob.</span></span></label> | |
| <input type="number" id="profile-T_eval" value="32000" /> | |
| </div> | |
| <div class="form-field"> | |
| <label data-i18n="param.n_attn">n_attention_heads</label> | |
| <input type="number" id="profile-n_attn" value="32" /> | |
| </div> | |
| <div class="form-field"> | |
| <label data-i18n="param.n_kv">n_kv_heads</label> | |
| <input type="number" id="profile-n_kv" value="8" /> | |
| </div> | |
| <div class="form-field"> | |
| <label data-i18n="param.d_head">head_dim</label> | |
| <input type="number" id="profile-d_head" value="128" /> | |
| </div> | |
| <div class="form-field"> | |
| <label data-i18n="param.n_layers">n_layers</label> | |
| <input type="number" id="profile-n_layers" value="32" /> | |
| </div> | |
| <div class="form-field"> | |
| <label data-i18n="param.n_params">n_params (e.g. 8e9)</label> | |
| <input type="text" id="profile-n_params" value="8e9" /> | |
| </div> | |
| <div class="form-field"> | |
| <label data-i18n="param.has_swa">Has SWA?</label> | |
| <select id="profile-has_swa"> | |
| <option value="false" selected data-i18n="common.no">No</option> | |
| <option value="true" data-i18n="common.yes">Yes</option> | |
| </select> | |
| </div> | |
| </div> | |
| <button id="profile-btn" disabled data-i18n="profile.btn">🚀 Generate full profile</button> | |
| </section> | |
| <!-- INSPECTOR mode (paste config.json directly) --> | |
| <section id="inspector-section" style="display:none;"> | |
| <div class="quickstart-banner" data-i18n="inspector.quickstart"> | |
| 💡 Use case: you have a private model not on HF Hub, or a config you're designing. Paste the raw JSON below and get a full TAF profile. | |
| </div> | |
| <h2><span data-i18n="inspector.title">🔍 Architecture Inspector</span> | |
| <span class="info"><span class="tooltip" data-i18n="inspector.tip"> | |
| <strong>Paste any config.json directly</strong>. Tool parses it and runs the full Profile. | |
| Useful for: private models, in-development configs, models not yet on HuggingFace, | |
| or comparing what your custom architecture would do. | |
| </span></span> | |
| </h2> | |
| <p class="recipe-desc" data-i18n="inspector.desc"> | |
| Paste the raw <code>config.json</code> contents. The tool extracts the architectural | |
| parameters and runs the full 5-recipe Profile. | |
| </p> | |
| <textarea id="inspector-json" rows="12" | |
| data-i18n-placeholder="inspector.placeholder" | |
| placeholder='{ | |
| "model_type": "llama", | |
| "rope_theta": 500000, | |
| "max_position_embeddings": 8192, | |
| "num_attention_heads": 32, | |
| "num_key_value_heads": 8, | |
| "hidden_size": 4096, | |
| "num_hidden_layers": 32, | |
| "vocab_size": 128256 | |
| }'></textarea> | |
| <div class="form-row" style="margin-top:0.5rem;"> | |
| <label for="inspector-T_eval" data-i18n="inspector.T_eval">T_eval (your target context):</label> | |
| <input type="number" id="inspector-T_eval" value="32000" /> | |
| </div> | |
| <button id="inspector-btn" disabled data-i18n="inspector.btn">🚀 Inspect & profile</button> | |
| <span id="inspector-status" class="subtle" style="margin-left:0.75rem;"></span> | |
| </section> | |
| <!-- COMPARE mode --> | |
| <section id="compare-section" style="display:none;"> | |
| <div class="quickstart-banner" data-i18n="compare.example"> | |
| 💡 Try: paste 3 popular 7-8B models (meta-llama/Meta-Llama-3-8B, mistralai/Mistral-7B-v0.1, Qwen/Qwen2.5-7B), pick recipe X-2, T_eval=16000. See which best handles long context. | |
| </div> | |
| <h2><span data-i18n="compare.title">🆚 Compare models side-by-side</span> | |
| <span class="info"><span class="tooltip" data-i18n="compare.tip"> | |
| <strong>Same recipe, multiple models</strong>. Pick 2-3 candidate models and | |
| one recipe. See verdicts in a single comparison table.<br><br> | |
| <strong>Use case</strong>: "I need long-context retrieval at 16K — which is | |
| best: Llama-3-8B, Mistral-7B, or Qwen-7B?" → pick 3 + X-2 + 16K → see winner. | |
| </span></span> | |
| </h2> | |
| <p class="recipe-desc" data-i18n="compare.desc"> | |
| <strong>For technicians</strong>: when choosing between 2-3 candidate models for | |
| a specific deployment scenario. Compare their verdicts on the same recipe. | |
| </p> | |
| <div class="form-row"> | |
| <label for="compare-recipe" data-i18n="compare.recipe_label">Recipe:</label> | |
| <select id="compare-recipe" disabled> | |
| <option value="" data-i18n="recipe.default">— pick a recipe —</option> | |
| </select> | |
| </div> | |
| <div class="form-row"> | |
| <label for="compare-T_eval" data-i18n="compare.T_eval_label">T_eval (target context):</label> | |
| <input type="number" id="compare-T_eval" value="16000" style="flex:1;" /> | |
| <span class="info" style="margin-top:0.5rem;"><span class="tooltip"> | |
| For X-2 / X-19 only. The context length all compared models will be | |
| evaluated at. Other recipes use their own params. | |
| </span></span> | |
| </div> | |
| <div id="compare-models"> | |
| <h3 style="margin-top:1rem;" data-i18n="compare.models_title">Models to compare (add up to 3)</h3> | |
| <div class="compare-slot" data-slot="1"> | |
| <input type="text" class="compare-hf-id" | |
| data-i18n-placeholder="compare.slot1_placeholder" | |
| placeholder="HF model id (e.g. meta-llama/Meta-Llama-3-8B)" /> | |
| <select class="compare-preset"> | |
| <option value="" data-i18n="compare.preset_default">— or preset —</option> | |
| </select> | |
| </div> | |
| <div class="compare-slot" data-slot="2"> | |
| <input type="text" class="compare-hf-id" | |
| data-i18n-placeholder="compare.slot2_placeholder" | |
| placeholder="HF model id #2" /> | |
| <select class="compare-preset"> | |
| <option value="" data-i18n="compare.preset_default">— or preset —</option> | |
| </select> | |
| </div> | |
| <div class="compare-slot" data-slot="3"> | |
| <input type="text" class="compare-hf-id" | |
| data-i18n-placeholder="compare.slot3_placeholder" | |
| placeholder="HF model id #3 (optional)" /> | |
| <select class="compare-preset"> | |
| <option value="" data-i18n="compare.preset_default">— or preset —</option> | |
| </select> | |
| </div> | |
| </div> | |
| <button id="compare-btn" disabled style="margin-top:1rem;" data-i18n="compare.btn">🚀 Compare</button> | |
| </section> | |
| <!-- ASK mode (free-form question) --> | |
| <section id="ask-section" style="display:none;"> | |
| <h2 data-i18n="ask.title">❓ Your question</h2> | |
| <textarea id="question" rows="3" | |
| data-i18n-placeholder="ask.placeholder" | |
| placeholder="e.g. Will Mistral-7B handle 16K NIAH retrieval? Or: I have $5,000, what model can I train? Or: Cheapest GPU to serve Llama-70B at 100M tokens/day?"></textarea> | |
| <div style="display:flex; gap:0.5rem; margin-top:0.5rem; flex-wrap:wrap;"> | |
| <button id="ask-btn" disabled data-i18n="ask.btn">🚀 Analyze</button> | |
| <button id="example-btn" type="button" class="secondary" data-i18n="ask.example_btn">💡 Try an example</button> | |
| </div> | |
| </section> | |
| <!-- Diagnose mode: build the CLI command for diagnose_model.py --> | |
| <section id="diagnose-section" style="display:none;"> | |
| <h2><span data-i18n="diagnose.title">🩺 Diagnose CLI Command Builder</span> | |
| <span class="info"><span class="tooltip" data-i18n="diagnose.tip"> | |
| <strong>Measure γ_obs (not predict)</strong>. The browser tool predicts γ from | |
| config alone (Padé). To <em>measure</em> the actual decay on a real model | |
| you need a local Python environment (CPU is enough for <code>--fast</code> mode; full mode expects a GPU). This builder produces the exact CLI command you | |
| run locally; the script is shipped in this repository at | |
| <code>cli/diagnose_model.py</code>.<br><br> | |
| <strong>Output</strong>: γ_obs, R², phase, KV cache budget D_90, KL anomaly, | |
| full thermodynamic profile (Z, U, S, F, C_V, χ). Saved as JSON. | |
| </span></span> | |
| </h2> | |
| <p class="recipe-desc" data-i18n="diagnose.desc"> | |
| Pick options below, then copy the generated command and run it on your local | |
| machine (Python + transformers + numpy). Total wall time ≈ 5 min in | |
| <code>--fast</code> mode on CPU; full mode 20–60 min on GPU. | |
| </p> | |
| <div class="form-row"> | |
| <label for="diag-model" data-i18n="diagnose.model_label">HF model id:</label> | |
| <input type="text" id="diag-model" placeholder="EleutherAI/pythia-70m" value="EleutherAI/pythia-70m"> | |
| </div> | |
| <div class="form-row"> | |
| <label for="diag-theta" data-i18n="diagnose.theta_label">θ (auto if blank):</label> | |
| <input type="number" id="diag-theta" placeholder="auto-detect"> | |
| </div> | |
| <div class="form-row"> | |
| <label for="diag-N" data-i18n="diagnose.n_label">Context N:</label> | |
| <input type="number" id="diag-N" value="2000" min="100" max="32000"> | |
| </div> | |
| <div class="form-row"> | |
| <label data-i18n="diagnose.options_label">Options:</label> | |
| <span> | |
| <label><input type="checkbox" id="diag-fast" checked> | |
| <span data-i18n="diagnose.opt_fast">--fast (CPU, ~5 min)</span></label><br> | |
| <label><input type="checkbox" id="diag-cpu"> | |
| <span data-i18n="diagnose.opt_cpu">--cpu (force CPU)</span></label><br> | |
| <label><input type="checkbox" id="diag-4bit"> | |
| <span data-i18n="diagnose.opt_4bit">--load_in_4bit (≥7B models)</span></label> | |
| </span> | |
| </div> | |
| <div class="form-row"> | |
| <label for="diag-local" data-i18n="diagnose.local_label">--local path (optional):</label> | |
| <input type="text" id="diag-local" placeholder="/path/to/local/weights"> | |
| </div> | |
| <button id="diag-build-btn" data-i18n="diagnose.build_btn">📋 Build command</button> | |
| <div id="diag-output" style="display:none; margin-top:1em;"> | |
| <h3 data-i18n="diagnose.cmd_title">Generated command:</h3> | |
| <pre id="diag-cmd" class="diag-cmd-box"></pre> | |
| <button id="diag-copy-btn" data-i18n="diagnose.copy_btn">📋 Copy to clipboard</button> | |
| <p class="recipe-desc" data-i18n="diagnose.next_steps"> | |
| <strong>Next steps</strong>: | |
| (1) <code>git clone https://github.com/karlesmarin/tafagent</code> | |
| (2) <code>cd tafagent && pip install torch transformers numpy</code> | |
| (3) Run the command above. | |
| (4) Result JSON lands in <code>./diagnose_results/</code> — upload it | |
| to the <strong>📋 Pick recipe</strong> mode (or paste in <strong>🔍 Inspect config</strong>) for full TAF analysis. | |
| </p> | |
| </div> | |
| </section> | |
| <!-- Phase diagram mode: live scatter of measured γ vs θ --> | |
| <section id="phase-section" style="display:none;"> | |
| <h2><span data-i18n="phase.title">📊 Phase diagram (γ × θ)</span> | |
| <span class="info"><span class="tooltip" data-i18n="phase.tip"> | |
| Each dot is one model from the paper's empirical panel | |
| (data/master_gamma_results.json). The x-axis is RoPE base θ | |
| on log scale; y-axis is measured γ. | |
| The Hagedorn line γ=1 separates Phase A (γ&lt;1, global) from | |
| Phase B (γ&gt;1, local-collapsed). | |
| Hover dots for details; click to populate the recipe form. | |
| </span></span> | |
| </h2> | |
| <p class="recipe-desc" data-i18n="phase.desc"> | |
| 23 models in the panel; the Padé curve (line) is | |
| γ_pred(θ) = (2θ−T√2)/(2θ+T√2) at T=2000. | |
| </p> | |
| <canvas id="phase-canvas" width="900" height="500" style="max-width:100%; background: var(--card-bg); border-radius: 6px;"></canvas> | |
| <div id="phase-info" class="recipe-desc" style="margin-top:0.6em;"></div> | |
| </section> | |
| <!-- Recipe selector (mode=recipe) --> | |
| <section id="recipe-section" style="display:none;"> | |
| <h2 data-i18n="recipe.title">📋 Recipe</h2> | |
| <select id="recipe-select" disabled> | |
| <option value="" data-i18n="recipe.default">— select a recipe —</option> | |
| </select> | |
| <p id="recipe-desc-display" class="recipe-desc"></p> | |
| </section> | |
| <!-- Form (mode=recipe) --> | |
| <section id="form-section" style="display:none;"> | |
| <h2 data-i18n="recipe.input_title">🎯 Inputs</h2> | |
| <div class="form-row"> | |
| <label for="preset" data-i18n="profile.preset_label">Preset model:</label> | |
| <select id="preset" disabled> | |
| <option value="" data-i18n="profile.preset_default">— select to autofill —</option> | |
| </select> | |
| </div> | |
| <div class="form-row"> | |
| <label for="hf-id" data-i18n="profile.hf_label">Or any HF model:</label> | |
| <input type="text" id="hf-id" | |
| data-i18n-placeholder="profile.hf_placeholder" | |
| placeholder="e.g. Qwen/Qwen2.5-32B-Instruct" style="flex:1;" /> | |
| <button id="hf-fetch-btn" type="button" class="secondary" data-i18n="profile.fetch_btn">📥 Fetch</button> | |
| </div> | |
| <div id="hf-status" class="subtle" style="margin: -0.5rem 0 1rem; min-height:1.2em;"></div> | |
| <div id="dynamic-form" class="form-grid"></div> | |
| <button id="run-btn" disabled data-i18n="ask.btn">🚀 Analyze</button> | |
| </section> | |
| <!-- Output (single-recipe verdict + chain) --> | |
| <section id="output-section" style="display:none;"> | |
| <h2 data-i18n="verdict.title">📊 Verdict</h2> | |
| <div id="verdict-box"></div> | |
| <div class="share-bar"> | |
| <button id="share-btn" class="secondary" type="button" data-i18n="share.btn">🔗 Copy share link</button> | |
| <button id="recipe-download-btn" class="secondary" type="button" data-i18n="share.download">💾 Download JSON</button> | |
| <button id="recipe-submit-btn" class="secondary" type="button" data-i18n="share.submit">📤 Submit to registry</button> | |
| <span id="share-status" class="subtle"></span> | |
| </div> | |
| <h2 data-i18n="chain.title">🔍 Computation Chain</h2> | |
| <p class="subtle" data-i18n="chain.desc">Every number below is computed by deterministic Python code. Click a step to expand it.</p> | |
| <div id="chain-box"></div> | |
| <h2 id="answer-header" style="display:none;" data-i18n="answer.title">💬 Plain-English Answer</h2> | |
| <div id="answer-box" style="display:none;"></div> | |
| </section> | |
| <!-- Profile output --> | |
| <section id="profile-output" style="display:none;"> | |
| <h2 data-i18n="tafcard.title">📇 TAF Card — full model profile</h2> | |
| <div id="profile-box"></div> | |
| </section> | |
| <!-- Compare output --> | |
| <section id="compare-output" style="display:none;"> | |
| <h2 data-i18n="compare.title_out">🆚 Comparison Table</h2> | |
| <div id="compare-box"></div> | |
| </section> | |
| <!-- Hidden file input for JSON upload (shared by all import buttons) --> | |
| <input type="file" id="import-file" accept=".json,application/json" style="display:none;" /> | |
| <!-- Floating import bar (always visible) --> | |
| <section id="import-section"> | |
| <h2 data-i18n="share.import_title">📂 Import a shared TAF result</h2> | |
| <p class="recipe-desc" data-i18n="share.import_desc"> | |
| Got a JSON file from someone else's TAF analysis? Load it here to see the verdict and computation chain locally, | |
| in the same view you would get if you had run it yourself. | |
| </p> | |
| <button id="import-btn" class="secondary" type="button" data-i18n="share.import_btn">📂 Load shared JSON</button> | |
| <span id="import-status" class="subtle" style="margin-left:0.75rem;"></span> | |
| </section> | |
| <!-- Browse community submissions (live from GitHub Issues) --> | |
| <section id="community-section"> | |
| <h2 data-i18n="community.title">🌐 Recent community submissions</h2> | |
| <p class="recipe-desc" data-i18n="community.desc"> | |
| Live feed from the public registry. Click any submission to view its full analysis. | |
| <a href="https://github.com/karlesmarin/tafagent-registry/issues" target="_blank" rel="noopener noreferrer" data-i18n="community.browse_all">Browse all →</a> | |
| </p> | |
| <div id="community-feed" class="subtle"><span data-i18n="community.loading">Loading...</span></div> | |
| </section> | |
| <!-- FALSIFICATION dashboard (paper predictions status) --> | |
| <section id="falsification-section"> | |
| <h2 data-i18n="falsification.title">🔬 Paper predictions — falsification status</h2> | |
| <p class="recipe-desc" data-i18n="falsification.desc"> | |
| The TAF framework rests on falsifiable predictions (F1–F23), each of which is tested empirically. | |
| Below is the live status of every prediction in the paper. | |
| </p> | |
| <div id="falsification-table"></div> | |
| </section> | |
| </main> | |
| <footer> | |
| <p data-i18n="footer.text"> | |
| © 2026 Carles Marin · Apache-2.0 · independent research · the tool that closes the loop on the paper. | |
| </p> | |
| <p> | |
| <a href="https://github.com/karlesmarin/tafagent" target="_blank" rel="noopener noreferrer">Source on GitHub</a> | |
| · | |
| <a href="https://github.com/karlesmarin/NeurIPS" target="_blank" rel="noopener noreferrer">Paper repo</a> | |
| </p> | |
| <p class="subtle"> | |
| Computation: Pyodide · Synthesis: WebLLM (Qwen2.5-0.5B local) · Hosting: GitHub Pages · Cost: $0 | |
| </p> | |
| </footer> | |
| <script type="module" src="js/main.js"></script> | |
| </body> | |
| </html> | |