Spaces:

karlexmarin
/

taf-agent

Running

App Files Files Community

taf-agent / index.html

karlexmarin

v0.4 help/info: 7 modes (was 4) + v0.4 sesion-31 section, 4 langs

c969a03 about 1 month ago

raw

history blame

37.8 kB

	<!DOCTYPE html>
	<html lang="en">
	<head>
	<meta charset="UTF-8" />
	<meta name="viewport" content="width=device-width, initial-scale=1.0" />
	<title>TAF Agent — Test ANY Transformer LLM in Your Browser</title>
	<meta name="description" content="Free, auditable diagnostic for transformer LLMs. Predict viability (long-context, KV compression, training budget, hardware) from config alone. Runs entirely in your browser. No server, no auth, no cost." />
	<meta name="keywords" content="transformer, LLM, diagnostic, RoPE, NIAH, KV cache, viability, free, browser, GPU, NeurIPS, TAF" />
	<meta name="author" content="Carles Marin" />

	<!-- OpenGraph for social sharing (Twitter, LinkedIn, WhatsApp, Discord, etc.) -->
	<meta property="og:type" content="website" />
	<meta property="og:url" content="https://karlesmarin.github.io/tafagent/" />
	<meta property="og:title" content="TAF Agent — Test ANY Transformer LLM in Your Browser" />
	<meta property="og:description" content="Free, auditable transformer LLM diagnostic. 5 recipes, 5 modes, 4 languages. Runs in your browser. No server, no auth, $0/month forever." />
	<meta property="og:site_name" content="TAF Agent" />

	<!-- Twitter Card -->
	<meta name="twitter:card" content="summary_large_image" />
	<meta name="twitter:title" content="TAF Agent — Test ANY Transformer LLM in Your Browser" />
	<meta name="twitter:description" content="Free, auditable transformer LLM diagnostic. 5 recipes, 5 modes, 4 languages. Runs in your browser. $0 forever." />

	<!-- Theme color for browser UI -->
	<meta name="theme-color" content="#0a0e14" />

	<link rel="stylesheet" href="style.css" />
	<script src="https://cdn.jsdelivr.net/pyodide/v0.26.4/full/pyodide.js"></script>
	</head>
	<body>
	<header>
	<!-- Language switcher (top-right, round flags) -->
	<div class="lang-switcher">
	<button class="lang-btn" data-lang="en" data-label="English" title="English">🇬🇧</button>
	<button class="lang-btn" data-lang="es" data-label="Español" title="Español">🇪🇸</button>
	<button class="lang-btn" data-lang="fr" data-label="Français" title="Français">🇫🇷</button>
	<button class="lang-btn" data-lang="zh" data-label="中文" title="中文">🇨🇳</button>
	</div>

	<h1 data-i18n="hero.title">🔬 TAF Agent</h1>
	<p class="tagline" data-i18n="hero.tagline">
	Test <strong>ANY</strong> transformer LLM before you spend GPU/$.
	</p>
	<div class="arch-badges">
	<span class="badge">✓ RoPE-MHA</span>
	<span class="badge">✓ RoPE-GQA</span>
	<span class="badge">✓ ALiBi</span>
	<span class="badge">✓ AbsPE</span>
	<span class="badge">✓ SWA</span>
	<span class="badge">✓ SSM (Mamba)</span>
	<span class="badge">✓ Any HuggingFace public model</span>
	</div>
	<p class="subtle" style="margin-top:0.75rem;" data-i18n="hero.subtitle">
	All computation runs locally in your browser. Free. Unlimited. Auditable.
	</p>
	<p class="subtle" style="margin-top:0.25rem; font-size:0.85rem;" data-i18n="hero.about">
	Built by an independent researcher. Open source. Not affiliated with any model vendor.
	</p>
	<p style="margin-top:0.75rem;">
	<button id="help-btn" type="button" data-i18n="hero.help">📘 Help & examples</button>
	</p>
	</header>

	<!-- Help modal -->
	<div id="help-modal">
	<div class="help-content">
	<button class="help-close" id="help-close">×</button>
	<h2 data-i18n="help.title">📘 TAF Agent — User Manual</h2>

	<h3 data-i18n="help.what.title">What does it do?</h3>
	<p data-i18n="help.what.body">Predicts <strong>practical viability</strong> of any transformer LLM
	<em>before you spend GPU/$</em>. Answers questions like "will this model work at L=32K?" or
	"should I train custom or use API?" using deterministic Python formulas (TAF — Thermodynamic Attention Framework).</p>

	<h3 data-i18n="help.modes.title">How to use — 7 modes</h3>
	<p data-i18n="help.modes.profile"><strong>📇 Profile</strong>: paste model id → all recipes at once = TAF Card. <strong>Best starting point</strong>.</p>
	<p data-i18n="help.modes.compare"><strong>🆚 Compare</strong>: 2-3 models side-by-side on same recipe. Best when choosing between candidates.</p>
	<p data-i18n="help.modes.inspector"><strong>🔍 Inspect config</strong>: paste raw <code>config.json</code> → tool parses + runs full Profile. For private models, in-development configs, or models not yet on HF Hub.</p>
	<p data-i18n="help.modes.ask"><strong>💬 Ask plain English</strong>: free-form question, in-browser LLM picks the recipe. Best for casual exploration.</p>
	<p data-i18n="help.modes.recipe"><strong>📋 Recipe + form</strong>: manual selection, full parameter control. Best when you want exact control.</p>
	<p data-i18n="help.modes.diagnose"><strong>🩺 Diagnose CLI</strong>: generate Python command to measure γ on your local machine (transformers + numpy). Fast ≈5 min CPU; full ≈20–60 min GPU. Output JSON re-uploadable via Inspect.</p>
	<p data-i18n="help.modes.phase"><strong>📊 Phase diagram</strong>: scatter plot of 23 panel models on (log θ, γ) plane. Hagedorn line γ=1 separates Phase A from Phase B. Click a dot to load that model into Recipe form.</p>

	<h3 data-i18n="help.recipes.title">The 8 recipes available</h3>

	<p data-i18n="help.recipe.x1.title"><strong>X-1 Custom training vs API</strong> — compares cost of training your own model vs paying for API access.</p>
	<div class="help-example" data-i18n="help.recipe.x1.example">
	Try: <em>"Should I train an 8B custom model or use GPT-4o for 50M tokens/month?"</em><br>
	Answer types: YES (custom) / NO (API) with break-even months.
	</div>

	<p data-i18n="help.recipe.x2.title"><strong>X-2 Long Context Viability</strong> — predicts if a model serves a target context length reliably.</p>
	<div class="help-example" data-i18n="help.recipe.x2.example">
	Try: <em>"Will Meta-Llama-3-8B handle 32000 tokens for retrieval?"</em><br>
	Chains: γ_Padé → decomposition → d_horizon → NIAH ceiling → hallucination → KV memory.<br>
	Verdict: YES / DEGRADED / NO with mitigation if needed.
	</div>

	<p data-i18n="help.recipe.x3.title"><strong>X-3 Budget pre-flight</strong> — given $ budget, what model is feasible to train?</p>
	<div class="help-example" data-i18n="help.recipe.x3.example">
	Try: <em>"I have $5000, what model can I train?"</em><br>
	Answer: GO / TINY-MODEL / MEMORY-LIMITED with concrete N (params) and D (tokens).
	</div>

	<p data-i18n="help.recipe.x5.title"><strong>X-5 Hardware selection</strong> — which GPU should I use to serve at target throughput?</p>
	<div class="help-example" data-i18n="help.recipe.x5.example">
	Try: <em>"Cheapest hardware to serve Llama-3-8B at 10M tokens/day"</em><br>
	Answer: best GPU + $/Mtok + capacity vs target.
	</div>

	<p data-i18n="help.recipe.x19.title"><strong>X-19 KV Compression decision</strong> — should I use soft decay, hard cutoff, or literature methods?</p>
	<div class="help-example" data-i18n="help.recipe.x19.example">
	Try: <em>"How to compress KV cache for Qwen2.5-7B at 32K?"</em><br>
	Answer: USE SOFT DECAY / USE D_f CUTOFF / USE LITERATURE METHODS / USE HARD T_train.
	</div>

	<h3 style="margin-top: 1.5em;">— v0.4 (sesión 29 findings) —</h3>

	<p data-i18n="help.section.v04"><strong>What's new in v0.4</strong> (sesión 29 findings 2026-04-28): three diagnostic recipes derived from cross-model panel analysis (n=22 LLMs).</p>

	<p data-i18n="help.recipe.x21.title"><strong>X-21 Imprint Purity Diagnostic</strong> — predicts γ on RANDOM tokens via ν=−1/(2π); how clean is the model's RoPE prediction?</p>
	<div class="help-example" data-i18n="help.recipe.x21.example">
	Try: <em>"How clean is the RoPE prediction on Llama-3-8B?"</em><br>
	Answer: predicted γ_random + purity diagnostic (CLEAN / OVER-IMPRINTED / UNDER-IMPRINTED).
	</div>
	<p data-i18n="help.v04.imprint" style="font-size: 0.9em; opacity: 0.85;"><strong>Learned-imprint slope ν = −1/(2π)</strong>: RoPE rotation period 2π drives a positional bias on weights, proportional to log(N_params). Even random tokens show this scaling. ν is DERIVED — not fitted (empirical err 0.3%).</p>

	<p data-i18n="help.recipe.x22.title"><strong>X-22 Compute-Context Invariant</strong> — does γ × log(N²·D) lie in panel band 51.2 ± 16.8? Detects scaling/training anomalies.</p>
	<div class="help-example" data-i18n="help.recipe.x22.example">
	Try: <em>"Does Mistral-7B fit the compute-context invariant?"</em><br>
	Answer: K = γ·log(N²·D), z-score, IN-BAND or OUTLIER.
	</div>
	<p data-i18n="help.v04.invariant" style="font-size: 0.9em; opacity: 0.85;"><strong>Chinchilla-attention invariant K</strong>: γ × log(N²·D) ≈ 51.2 ± 16.8 (CV=0.329). Connects compute scaling and attention exponent into a single dimensionless number.</p>

	<p data-i18n="help.recipe.x23.title"><strong>X-23 IH-Phase Detector</strong> — pre- or post-induction-head? Cheap probe via sign(γ_text − γ_random).</p>
	<div class="help-example" data-i18n="help.recipe.x23.example">
	Try: <em>"Is Qwen2.5-7B post-induction-head?"</em><br>
	Answer: CONFIRMED PRE-IH / CONFIRMED POST-IH / ANOMALY (with size-vs-Δγ consistency check).
	</div>
	<p data-i18n="help.v04.ih_probe" style="font-size: 0.9em; opacity: 0.85;"><strong>Δγ as IH probe</strong>: sign(γ_text − γ_random) > 0 ⟺ post-induction-head. Cheaper than running an in-context-learning benchmark.</p>

	<p data-i18n="help.v04.constants" style="font-size: 0.9em; opacity: 0.85;"><strong>γ-cluster on famous constants</strong> (intriguing, n=4): CodeLlama-13b γ=0.382 ≈ 1−1/φ (golden conjugate, err 0.0003); pythia-1.4b γ=0.705 ≈ 1/√2; Llama-2-7b γ=0.287 ≈ 1−1/√2; Mistral-Nemo γ=0.428 ≈ log_10(e). Caveat: could be coincidence.</p>

	<h3 style="margin-top: 1.5em;" data-i18n="v04.title">🆕 v0.4 — New diagnostics (sesion 31)</h3>
	<p style="opacity: 0.85;"><em data-i18n="v04.section.intro">Four new diagnostic functions derived sesion 31 (2026-04-30) from cross-of-crosses formula games + Sócratic interrogation. Available in <code>taf_browser.py</code> §33.</em></p>

	<p><strong data-i18n="v04.arch.label">Architectural Concentration</strong> — <span data-i18n="v04.arch.desc">γ_text ≈ γ_Padé − 0.012·n_kv. Cross-panel correlational law (R²=0.30). Caveat: not per-model predictor.</span></p>

	<p><strong data-i18n="v04.pdi.label">PDI — Padé Deviation Index</strong> — <span data-i18n="v04.pdi.desc">PDI = d_horizon_obs/T_eval. Traffic light: green (≈1), orange (>>1), yellow (<<1), red (Phase B negative).</span></p>

	<p><strong data-i18n="v04.4bit.label">4-bit Shift Predictor</strong> — <span data-i18n="v04.4bit.desc">MHA: R²(bf16)<0.9 → γ rises; R²>0.99 → γ drops. GQA: precision-robust regardless.</span></p>

	<p><strong data-i18n="v04.crit.label">Critical Exponents Bundle</strong> — <span data-i18n="v04.crit.desc">ν_c, β_c, η_c (=γ−1, CORRECTED), α_C, γ_susc with AM-GM minimum at γ=1−1/√2≈0.293.</span></p>

	<h3 data-i18n="help.add_models.title">Adding new models (3 ways)</h3>
	<ul>
	<li data-i18n="help.add_models.preset"><strong>Preset list</strong>: 11 popular models curated. Just select from dropdown.</li>
	<li data-i18n="help.add_models.hf"><strong>HF Hub fetch</strong>: paste any model id (e.g. <code>Qwen/Qwen2.5-32B-Instruct</code>),
	click 📥 Fetch. Browser downloads <code>config.json</code> directly from HuggingFace, fills the form. Works for any public model.</li>
	<li data-i18n="help.add_models.manual"><strong>Manual</strong>: fill the form fields directly with values from the model card.</li>
	</ul>

	<h3 data-i18n="help.audit.title">The audit chain</h3>
	<p data-i18n="help.audit.body">Every result shows the full <strong>Computation Chain</strong> — each formula step with its inputs,
	output, and interpretation. Click any step to expand. Cite section numbers (§26.1, §19.1, etc.) refer
	to the underlying paper for derivation.</p>

	<h3 data-i18n="help.synthesis.title">The plain-English answer</h3>
	<p data-i18n="help.synthesis.body">After the deterministic chain runs, an in-browser LLM (Qwen2.5-0.5B, ~350MB cached after first load)
	synthesizes a plain-English summary. The numbers above are <em>always correct</em> (deterministic Python);
	the synthesis is LLM-generated — verify against the chain if in doubt.</p>

	<h3 data-i18n="help.params.title">Common parameters explained</h3>
	<ul>
	<li data-i18n="help.param.theta"><strong>θ (rope_theta)</strong>: RoPE base frequency. Higher = more long-range capacity. Typical: 10000 (early), 500000 (Llama-3), 1000000 (Qwen2.5).</li>
	<li data-i18n="help.param.T_train"><strong>T_train</strong>: max context the model was trained on. From <code>max_position_embeddings</code>.</li>
	<li data-i18n="help.param.T_eval"><strong>T_eval</strong>: <em>your target</em> inference context length. The key knob.</li>
	<li data-i18n="help.param.gqa"><strong>n_kv_heads < n_attention_heads</strong>: model uses GQA (Grouped Query Attention). Reduces KV memory but pushes γ toward Hagedorn.</li>
	<li data-i18n="help.param.swa"><strong>has_SWA</strong>: model uses Sliding Window Attention (Mistral, gemma-2).</li>
	<li data-i18n="help.param.nparams"><strong>n_params</strong>: total parameter count. Threshold ~400M for induction-head emergence.</li>
	</ul>

	<h3 data-i18n="help.verdicts.title">What to look for in verdicts</h3>
	<ul>
	<li data-i18n="help.verdict.yes"><strong style="color:#3fb950;">YES / GO</strong> — proceed with confidence; numbers support the choice.</li>
	<li data-i18n="help.verdict.deg"><strong style="color:#d29922;">DEGRADED / TINY-MODEL</strong> — works but with caveats; read the action.</li>
	<li data-i18n="help.verdict.no"><strong style="color:#f85149;">NO / MEMORY-LIMITED</strong> — don't proceed as-is; mitigation provided.</li>
	</ul>

	<h3 data-i18n="help.privacy.title">Privacy</h3>
	<p data-i18n="help.privacy.body">Everything runs in your browser. No telemetry, no analytics, no data sent anywhere. Even the LLM model
	runs locally via WebGPU/WebAssembly. Your model_ids and questions never leave this page.</p>

	<h3 data-i18n="help.source.title">Source & paper</h3>
	<p data-i18n="help.source.body">Source code: <a href="https://github.com/karlesmarin/tafagent" target="_blank">github.com/karlesmarin/tafagent</a><br>
	Paper: <em>Marin 2026 — Predicting How Transformers Attend</em> (<a href="https://zenodo.org/records/19826343" target="_blank">Zenodo</a>; arXiv forthcoming)<br>
	Dataset: <a href="https://huggingface.co/datasets/karlexmarin/taf-attention-decay" target="_blank">taf-attention-decay</a> — 58 γ-measurements across 32 models (CC-BY-4.0)</p>
	</div>
	</div>

	<main>
	<!-- Status with loading bar -->
	<section id="status-bar">
	<div id="status" data-i18n="status.loading_pyodide">⏳ Loading Python runtime...</div>
	<div id="loading-bar-wrap" style="display:none;">
	<div id="loading-bar"></div>
	</div>
	</section>

	<!-- Mode toggle -->
	<section id="mode-section">
	<h2><span data-i18n="modes.title">🎯 Mode</span>
	<span class="info"><span class="tooltip" data-i18n="modes.tip"><strong>Four ways to use the tool</strong>.<br>
	<strong>📇 Profile</strong>: paste a model id → all 5 recipes at once = TAF Card.<br>
	<strong>🆚 Compare</strong>: 2-3 models side-by-side on one recipe.<br>
	<strong>💬 Ask</strong>: free-form question, browser LLM picks the recipe.<br>
	<strong>📋 Recipe</strong>: manual selection with full form control.
	</span></span>
	</h2>
	<div class="mode-tabs">
	<button class="mode-btn active" data-mode="profile" data-i18n="modes.profile">📇 Profile a model</button>
	<button class="mode-btn" data-mode="compare" data-i18n="modes.compare">🆚 Compare models</button>
	<button class="mode-btn" data-mode="inspector" data-i18n="modes.inspector">🔍 Inspect config</button>
	<button class="mode-btn" data-mode="ask" data-i18n="modes.ask">💬 Ask plain English</button>
	<button class="mode-btn" data-mode="recipe" data-i18n="modes.recipe">📋 Pick recipe</button>
	<button class="mode-btn" data-mode="diagnose" data-i18n="modes.diagnose">🩺 Diagnose CLI</button>
	<button class="mode-btn" data-mode="phase" data-i18n="modes.phase">📊 Phase diagram</button>
	</div>
	<p id="mode-desc" class="recipe-desc" data-i18n="modes.desc">
	<strong>Quickest start</strong>: paste any HuggingFace model id (e.g. <code>meta-llama/Meta-Llama-3-8B</code>),
	click Profile. See all 5 recipes scored in seconds.
	</p>
	</section>

	<!-- PROFILE mode -->
	<section id="profile-section">
	<div class="quickstart-banner" data-i18n="profile.quickstart">
	💡 Quick start: pick any preset → click Generate. Or paste a model id from <a href='https://huggingface.co/models?library=transformers&sort=trending' target='_blank'>HF Hub trending</a> → 📥 Fetch → Generate.
	</div>
	<h2><span data-i18n="profile.title">📇 Profile a model</span>
	<span class="info"><span class="tooltip" data-i18n="profile.tip">
	<strong>One-click full diagnosis</strong>. Paste any HF model id (or pick preset).
	Tool runs all 5 recipes (long-context, KV-compression, custom-vs-API, budget,
	hardware) and produces a single <strong>TAF Card</strong> showing verdict per
	dimension + key numbers + architecture classification.<br><br>
	<strong>Use case</strong>: "I'm evaluating Qwen2.5-32B for production —
	what's its full viability profile?" → paste id → Profile → done.
	</span></span>
	</h2>
	<p class="recipe-desc" data-i18n="profile.desc">
	<strong>For technicians</strong>: when you need a complete viability snapshot
	of a candidate model. Outputs match paper §sec:gamma_decomposition format.
	</p>

	<div class="form-row">
	<label for="profile-preset" data-i18n="profile.preset_label">Preset:</label>
	<select id="profile-preset" disabled>
	<option value="" data-i18n="profile.preset_default">— or pick from list —</option>
	</select>
	</div>

	<div class="form-row">
	<label for="profile-hf-id" data-i18n="profile.hf_label">HF model id:</label>
	<input type="text" id="profile-hf-id"
	data-i18n-placeholder="profile.hf_placeholder"
	placeholder="e.g. meta-llama/Meta-Llama-3-8B or Qwen/Qwen2.5-7B" style="flex:1;" />
	<button id="profile-fetch-btn" type="button" class="secondary" data-i18n="profile.fetch_btn">📥 Fetch</button>
	</div>
	<div id="profile-hf-status" class="subtle" style="margin: -0.5rem 0 1rem; min-height:1.2em;"></div>

	<div class="form-grid" id="profile-form">
	<div class="form-field">
	<label><span data-i18n="param.theta">θ (rope_theta)</span> <span class="info"><span class="tooltip" data-i18n="param.theta.tip">RoPE base frequency from <code>config.rope_theta</code>.</span></span></label>
	<input type="number" id="profile-theta" value="500000" />
	</div>
	<div class="form-field">
	<label><span data-i18n="param.T_train">T_train</span> <span class="info"><span class="tooltip" data-i18n="param.T_train.tip">Max training context. From <code>max_position_embeddings</code>.</span></span></label>
	<input type="number" id="profile-T_train" value="8192" />
	</div>
	<div class="form-field">
	<label><span data-i18n="param.T_eval">T_eval (your target)</span> <span class="info"><span class="tooltip" data-i18n="param.T_eval.tip">Inference context length you'll actually serve. The key knob.</span></span></label>
	<input type="number" id="profile-T_eval" value="32000" />
	</div>
	<div class="form-field">
	<label data-i18n="param.n_attn">n_attention_heads</label>
	<input type="number" id="profile-n_attn" value="32" />
	</div>
	<div class="form-field">
	<label data-i18n="param.n_kv">n_kv_heads</label>
	<input type="number" id="profile-n_kv" value="8" />
	</div>
	<div class="form-field">
	<label data-i18n="param.d_head">head_dim</label>
	<input type="number" id="profile-d_head" value="128" />
	</div>
	<div class="form-field">
	<label data-i18n="param.n_layers">n_layers</label>
	<input type="number" id="profile-n_layers" value="32" />
	</div>
	<div class="form-field">
	<label data-i18n="param.n_params">n_params (e.g. 8e9)</label>
	<input type="text" id="profile-n_params" value="8e9" />
	</div>
	<div class="form-field">
	<label data-i18n="param.has_swa">Has SWA?</label>
	<select id="profile-has_swa">
	<option value="false" selected data-i18n="common.no">No</option>
	<option value="true" data-i18n="common.yes">Yes</option>
	</select>
	</div>
	</div>

	<button id="profile-btn" disabled data-i18n="profile.btn">🚀 Generate full profile</button>
	</section>

	<!-- INSPECTOR mode (paste config.json directly) -->
	<section id="inspector-section" style="display:none;">
	<div class="quickstart-banner" data-i18n="inspector.quickstart">
	💡 Use case: you have a private model not on HF Hub, or a config you're designing. Paste the raw JSON below and get a full TAF profile.
	</div>
	<h2><span data-i18n="inspector.title">🔍 Architecture Inspector</span>
	<span class="info"><span class="tooltip" data-i18n="inspector.tip">
	<strong>Paste any config.json directly</strong>. Tool parses it and runs the full Profile.
	Useful for: private models, in-development configs, models not yet on HuggingFace,
	or comparing what your custom architecture would do.
	</span></span>
	</h2>
	<p class="recipe-desc" data-i18n="inspector.desc">
	Paste the raw <code>config.json</code> contents. The tool extracts the architectural
	parameters and runs the full 5-recipe Profile.
	</p>
	<textarea id="inspector-json" rows="12"
	data-i18n-placeholder="inspector.placeholder"
	placeholder='{
	"model_type": "llama",
	"rope_theta": 500000,
	"max_position_embeddings": 8192,
	"num_attention_heads": 32,
	"num_key_value_heads": 8,
	"hidden_size": 4096,
	"num_hidden_layers": 32,
	"vocab_size": 128256
	}'></textarea>
	<div class="form-row" style="margin-top:0.5rem;">
	<label for="inspector-T_eval" data-i18n="inspector.T_eval">T_eval (your target context):</label>
	<input type="number" id="inspector-T_eval" value="32000" />
	</div>
	<button id="inspector-btn" disabled data-i18n="inspector.btn">🚀 Inspect & profile</button>
	<span id="inspector-status" class="subtle" style="margin-left:0.75rem;"></span>
	</section>

	<!-- COMPARE mode -->
	<section id="compare-section" style="display:none;">
	<div class="quickstart-banner" data-i18n="compare.example">
	💡 Try: paste 3 popular 7-8B models (Meta-Llama-3-8B, Mistral-7B-v0.1, Qwen/Qwen2.5-7B), pick recipe X-2, T_eval=16000. See which best handles long context.
	</div>
	<h2><span data-i18n="compare.title">🆚 Compare models side-by-side</span>
	<span class="info"><span class="tooltip" data-i18n="compare.tip">
	<strong>Same recipe, multiple models</strong>. Pick 2-3 candidate models and
	one recipe. See verdicts in a single comparison table.<br><br>
	<strong>Use case</strong>: "I need long-context retrieval at 16K — which is
	best: Llama-3-8B, Mistral-7B, or Qwen-7B?" → pick 3 + X-2 + 16K → see winner.
	</span></span>
	</h2>
	<p class="recipe-desc" data-i18n="compare.desc">
	<strong>For technicians</strong>: when choosing between 2-3 candidate models for
	a specific deployment scenario. Compare their verdicts on the same recipe.
	</p>

	<div class="form-row">
	<label for="compare-recipe" data-i18n="compare.recipe_label">Recipe:</label>
	<select id="compare-recipe" disabled>
	<option value="" data-i18n="recipe.default">— pick a recipe —</option>
	</select>
	</div>

	<div class="form-row">
	<label for="compare-T_eval" data-i18n="compare.T_eval_label">T_eval (target context):</label>
	<input type="number" id="compare-T_eval" value="16000" style="flex:1;" />
	<span class="info" style="margin-top:0.5rem;"><span class="tooltip">
	For X-2 / X-19 only. The context length all compared models will be
	evaluated at. Other recipes use their own params.
	</span></span>
	</div>

	<div id="compare-models">
	<h3 style="margin-top:1rem;" data-i18n="compare.models_title">Models to compare (add up to 3)</h3>
	<div class="compare-slot" data-slot="1">
	<input type="text" class="compare-hf-id"
	data-i18n-placeholder="compare.slot1_placeholder"
	placeholder="HF model id (e.g. meta-llama/Meta-Llama-3-8B)" />
	<select class="compare-preset">
	<option value="" data-i18n="compare.preset_default">— or preset —</option>
	</select>
	</div>
	<div class="compare-slot" data-slot="2">
	<input type="text" class="compare-hf-id"
	data-i18n-placeholder="compare.slot2_placeholder"
	placeholder="HF model id #2" />
	<select class="compare-preset">
	<option value="" data-i18n="compare.preset_default">— or preset —</option>
	</select>
	</div>
	<div class="compare-slot" data-slot="3">
	<input type="text" class="compare-hf-id"
	data-i18n-placeholder="compare.slot3_placeholder"
	placeholder="HF model id #3 (optional)" />
	<select class="compare-preset">
	<option value="" data-i18n="compare.preset_default">— or preset —</option>
	</select>
	</div>
	</div>

	<button id="compare-btn" disabled style="margin-top:1rem;" data-i18n="compare.btn">🚀 Compare</button>
	</section>

	<!-- ASK mode (free-form question) -->
	<section id="ask-section" style="display:none;">
	<h2 data-i18n="ask.title">❓ Your question</h2>
	<textarea id="question" rows="3"
	data-i18n-placeholder="ask.placeholder"
	placeholder="e.g. Will Mistral-7B handle 16K NIAH retrieval? Or: I have $5,000, what model can I train? Or: Cheapest GPU to serve Llama-70B at 100M tokens/day?"></textarea>
	<div style="display:flex; gap:0.5rem; margin-top:0.5rem; flex-wrap:wrap;">
	<button id="ask-btn" disabled data-i18n="ask.btn">🚀 Analyze</button>
	<button id="example-btn" type="button" class="secondary" data-i18n="ask.example_btn">💡 Try an example</button>
	</div>
	</section>

	<!-- Diagnose mode: build the CLI command for diagnose_model.py -->
	<section id="diagnose-section" style="display:none;">
	<h2><span data-i18n="diagnose.title">🩺 Diagnose CLI Command Builder</span>
	<span class="info"><span class="tooltip" data-i18n="diagnose.tip">
	<strong>Measure γ_obs (not predict)</strong>. The browser tool predicts γ from
	config alone (Padé). To <em>measure</em> the actual decay on a real model
	you need GPU + Python. This builder produces the exact CLI command you
	run locally; the script is shipped in this repository at
	<code>cli/diagnose_model.py</code>.<br><br>
	<strong>Output</strong>: γ_obs, R², phase, KV cache budget D_90, KL anomaly,
	full thermodynamic profile (Z, U, S, F, C_V, χ). Saved as JSON.
	</span></span>
	</h2>
	<p class="recipe-desc" data-i18n="diagnose.desc">
	Pick options below and copy-paste the generated command on your local
	machine (Python + transformers + numpy). Total wall time ≈ 5 min in
	<code>--fast</code> mode on CPU; full mode 20–60 min on GPU.
	</p>

	<div class="form-row">
	<label for="diag-model" data-i18n="diagnose.model_label">HF model id:</label>
	<input type="text" id="diag-model" placeholder="EleutherAI/pythia-70m" value="EleutherAI/pythia-70m">
	</div>

	<div class="form-row">
	<label for="diag-theta" data-i18n="diagnose.theta_label">θ (auto if blank):</label>
	<input type="number" id="diag-theta" placeholder="auto-detect">
	</div>

	<div class="form-row">
	<label for="diag-N" data-i18n="diagnose.n_label">Context N:</label>
	<input type="number" id="diag-N" value="2000" min="100" max="32000">
	</div>

	<div class="form-row">
	<label data-i18n="diagnose.options_label">Options:</label>
	<span>
	<label><input type="checkbox" id="diag-fast" checked>
	<span data-i18n="diagnose.opt_fast">--fast (CPU, ~5 min)</span></label><br>
	<label><input type="checkbox" id="diag-cpu">
	<span data-i18n="diagnose.opt_cpu">--cpu (force CPU)</span></label><br>
	<label><input type="checkbox" id="diag-4bit">
	<span data-i18n="diagnose.opt_4bit">--load_in_4bit (≥7B models)</span></label>
	</span>
	</div>

	<div class="form-row">
	<label for="diag-local" data-i18n="diagnose.local_label">--local path (optional):</label>
	<input type="text" id="diag-local" placeholder="/path/to/local/weights">
	</div>

	<button id="diag-build-btn" data-i18n="diagnose.build_btn">📋 Build command</button>

	<div id="diag-output" style="display:none; margin-top:1em;">
	<h3 data-i18n="diagnose.cmd_title">Generated command:</h3>
	<pre id="diag-cmd" class="diag-cmd-box"></pre>
	<button id="diag-copy-btn" data-i18n="diagnose.copy_btn">📋 Copy to clipboard</button>
	<p class="recipe-desc" data-i18n="diagnose.next_steps">
	<strong>Next steps</strong>:
	(1) <code>git clone https://github.com/karlesmarin/tafagent</code>
	(2) <code>cd tafagent && pip install torch transformers numpy</code>
	(3) Run the command above.
	(4) Result JSON lands in <code>./diagnose_results/</code> — upload it
	to the <strong>📋 Pick recipe</strong> mode (or paste in <strong>🔍 Inspect config</strong>) for full TAF analysis.
	</p>
	</div>
	</section>

	<!-- Phase diagram mode: live scatter of measured γ vs θ -->
	<section id="phase-section" style="display:none;">
	<h2><span data-i18n="phase.title">📊 Phase diagram (γ × θ)</span>
	<span class="info"><span class="tooltip" data-i18n="phase.tip">
	Each dot is one model from the paper's empirical panel
	(data/master_gamma_results.json). The x-axis is RoPE base θ
	on log scale; y-axis is measured γ.
	The Hagedorn line γ=1 separates Phase A (γ<1, global) from
	Phase B (γ>1, local-collapsed).
	Hover dots for details; click to populate the recipe form.
	</span></span>
	</h2>
	<p class="recipe-desc" data-i18n="phase.desc">
	23 models in the panel; the Padé curve (line) is
	γ_pred(θ) = (2θ−T√2)/(2θ+T√2) at T=2000.
	</p>
	<canvas id="phase-canvas" width="900" height="500" style="max-width:100%; background: var(--card-bg); border-radius: 6px;"></canvas>
	<div id="phase-info" class="recipe-desc" style="margin-top:0.6em;"></div>
	</section>

	<!-- Recipe selector (mode=recipe) -->
	<section id="recipe-section" style="display:none;">
	<h2 data-i18n="recipe.title">📋 Recipe</h2>
	<select id="recipe-select" disabled>
	<option value="" data-i18n="recipe.default">— select a recipe —</option>
	</select>
	<p id="recipe-desc-display" class="recipe-desc"></p>
	</section>

	<!-- Form (mode=recipe) -->
	<section id="form-section" style="display:none;">
	<h2 data-i18n="recipe.input_title">🎯 Inputs</h2>

	<div class="form-row">
	<label for="preset" data-i18n="profile.preset_label">Preset model:</label>
	<select id="preset" disabled>
	<option value="" data-i18n="profile.preset_default">— select to autofill —</option>
	</select>
	</div>

	<div class="form-row">
	<label for="hf-id" data-i18n="profile.hf_label">Or any HF model:</label>
	<input type="text" id="hf-id"
	data-i18n-placeholder="profile.hf_placeholder"
	placeholder="e.g. Qwen/Qwen2.5-32B-Instruct" style="flex:1;" />
	<button id="hf-fetch-btn" type="button" class="secondary" data-i18n="profile.fetch_btn">📥 Fetch</button>
	</div>
	<div id="hf-status" class="subtle" style="margin: -0.5rem 0 1rem; min-height:1.2em;"></div>

	<div id="dynamic-form" class="form-grid"></div>

	<button id="run-btn" disabled data-i18n="ask.btn">🚀 Analyze</button>
	</section>

	<!-- Output (single-recipe verdict + chain) -->
	<section id="output-section" style="display:none;">
	<h2 data-i18n="verdict.title">📊 Verdict</h2>
	<div id="verdict-box"></div>

	<div class="share-bar">
	<button id="share-btn" class="secondary" type="button" data-i18n="share.btn">🔗 Copy share link</button>
	<button id="recipe-download-btn" class="secondary" type="button" data-i18n="share.download">💾 Download JSON</button>
	<button id="recipe-submit-btn" class="secondary" type="button" data-i18n="share.submit">📤 Submit to registry</button>
	<span id="share-status" class="subtle"></span>
	</div>

	<h2 data-i18n="chain.title">🔍 Computation Chain</h2>
	<p class="subtle" data-i18n="chain.desc">Every number below is deterministic Python. Click a step to expand.</p>
	<div id="chain-box"></div>

	<h2 id="answer-header" style="display:none;" data-i18n="answer.title">💬 Plain-English Answer</h2>
	<div id="answer-box" style="display:none;"></div>
	</section>

	<!-- Profile output -->
	<section id="profile-output" style="display:none;">
	<h2 data-i18n="tafcard.title">📇 TAF Card — full model profile</h2>
	<div id="profile-box"></div>
	</section>

	<!-- Compare output -->
	<section id="compare-output" style="display:none;">
	<h2 data-i18n="compare.title_out">🆚 Comparison Table</h2>
	<div id="compare-box"></div>
	</section>

	<!-- Hidden file input for JSON upload (shared by all import buttons) -->
	<input type="file" id="import-file" accept=".json,application/json" style="display:none;" />

	<!-- Floating import bar (always visible) -->
	<section id="import-section">
	<h2 data-i18n="share.import_title">📂 Import a shared TAF result</h2>
	<p class="recipe-desc" data-i18n="share.import_desc">
	Got a JSON file from someone else's TAF analysis? Load it here to see the verdict + chain locally.
	Same view as if you'd run it yourself.
	</p>
	<button id="import-btn" class="secondary" type="button" data-i18n="share.import_btn">📂 Load shared JSON</button>
	<span id="import-status" class="subtle" style="margin-left:0.75rem;"></span>
	</section>

	<!-- Browse community submissions (live from GitHub Issues) -->
	<section id="community-section">
	<h2 data-i18n="community.title">🌐 Recent community submissions</h2>
	<p class="recipe-desc" data-i18n="community.desc">
	Live feed from the public registry. Click any submission to view full analysis.
	<a href="https://github.com/karlesmarin/tafagent-registry/issues" target="_blank" data-i18n="community.browse_all">Browse all →</a>
	</p>
	<div id="community-feed" class="subtle"><span data-i18n="community.loading">Loading...</span></div>
	</section>

	<!-- FALSIFICATION dashboard (paper predictions status) -->
	<section id="falsification-section">
	<h2 data-i18n="falsification.title">🔬 Paper predictions — falsification status</h2>
	<p class="recipe-desc" data-i18n="falsification.desc">
	The TAF framework rests on falsifiable predictions (F1-F23). Each is empirically tested.
	Here's the live status of every prediction in the paper.
	</p>
	<div id="falsification-table"></div>
	</section>
	</main>

	<footer>
	<p data-i18n="footer.text">
	© 2026 Carles Marin · Apache-2.0 · independent research · the tool that closes the loop of the paper.
	</p>
	<p>
	<a href="https://github.com/karlesmarin/tafagent" target="_blank">Source on GitHub</a>
	·
	<a href="https://github.com/karlesmarin/NeurIPS" target="_blank">Paper repo</a>
	</p>
	<p class="subtle">
	Computation: Pyodide · Synthesis: WebLLM (Qwen2.5-0.5B local) · Hosting: GitHub Pages · Cost: $0
	</p>
	</footer>

	<script type="module" src="js/main.js"></script>
	</body>
	</html>