# 🌐 TAF Agent: Public Registry
> Community-curated archive of TAF (Thermodynamic Attention Framework) analyses
> for transformer LLMs. Submitted by users of [TAF Agent](https://karlesmarin.github.io/tafagent).
This repository **stores no code**. It exists purely as a **public Issues board**
where users of the TAF Agent web tool submit their model analyses for the
community to verify, refute, comment on, or reuse.
---
## How it works
1. A user runs the [TAF Agent](https://karlesmarin.github.io/tafagent) on a model
2. They click **📤 Submit to registry**
3. A new GitHub Issue opens with the analysis pre-filled in this repo
4. The user reviews, optionally adds a comment, and clicks Submit
5. The analysis becomes a permanent public record
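Step 3 can be pictured with GitHub's prefilled new-issue links, which accept `title` and `body` query parameters. The sketch below is an assumption about how such a link could be built; this README does not say how TAF Agent constructs it, and `build_issue_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Repository named in this README.
REGISTRY_REPO = "karlesmarin/tafagent-registry"


def build_issue_url(title: str, body: str, repo: str = REGISTRY_REPO) -> str:
    """Return a GitHub new-issue URL with the title and body pre-filled.

    Hypothetical helper: the real tool may build its link differently.
    """
    query = urlencode({"title": title, "body": body})
    return f"https://github.com/{repo}/issues/new?{query}"


url = build_issue_url(
    "[TAF Profile] Meta-Llama-3-8B @ T=32000 #8d29feb8",
    "Verdict and key numbers go here.",
)
print(url)
```

Opening the printed link shows the normal issue form, so the user still reviews the content and clicks Submit as described in step 4.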
---
## Browsing
- 📂 [All issues](https://github.com/karlesmarin/tafagent-registry/issues) —
every submission ever made
- 🟢 [Verified](https://github.com/karlesmarin/tafagent-registry/issues?q=label%3Averified) —
marked as independently verified
- 🔴 [Refuted](https://github.com/karlesmarin/tafagent-registry/issues?q=label%3Arefuted) —
empirical measurement contradicts the prediction
- 🔍 Search by **input hash** to find existing analyses for the same config:
e.g. `#8d29feb8` finds all analyses for the same model+T_eval+arch params
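A hash search can also be expressed as a link. The following is a minimal sketch, assuming that a title search for the bare hash behaves like searching `#8d29feb8`; `hash_search_url` is a hypothetical helper, and `in:title` is a standard GitHub search qualifier.

```python
from urllib.parse import quote

ISSUES_URL = "https://github.com/karlesmarin/tafagent-registry/issues"


def hash_search_url(input_hash: str) -> str:
    """Return an Issues search URL for an 8-character input hash."""
    bare = input_hash.lstrip("#").lower()
    if len(bare) != 8 or any(c not in "0123456789abcdef" for c in bare):
        raise ValueError(f"not an 8-character hex hash: {input_hash!r}")
    # Assumption: searching issue titles for the bare hash finds the same
    # issues as searching for "#<hash>", as this README suggests.
    query = f"{bare} in:title"
    return f"{ISSUES_URL}?q={quote(query)}"


print(hash_search_url("#8d29feb8"))
```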
---
## The hash system (deduplication)
Every TAF analysis is hashed from its **canonical inputs**. Identical inputs
(same model, same T_eval, same flags) always produce the same 8-character
hex hash. Different inputs produce different hashes except in the rare event of
a collision, since 8 hex characters encode only 32 bits.
This means:
- **Searching `#a1b2c3d4`** finds all submissions for the exact same config
- **Independent verification** of an existing analysis = comment on the
existing issue (not a new one)
- **Refutation** = reply with empirical evidence; a maintainer will then add
the `refuted` label
- **No duplicate spam**: contributors are nudged to search before submitting
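The idea of hashing canonical inputs can be sketched as follows. This is not the tool's actual algorithm, which this README does not specify: the canonical form (JSON with sorted keys), the use of SHA-256, and the field names `model`, `T_eval`, and `flags` are all assumptions made for illustration.

```python
import hashlib
import json


def input_hash(inputs: dict) -> str:
    """Return an 8-character hex digest of canonicalized inputs.

    Assumption: canonical form is JSON with sorted keys and no extra
    whitespace, hashed with SHA-256. The real tool may differ.
    """
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:8]


# Hypothetical field names; the same config written in two key orders.
a = input_hash({"model": "Meta-Llama-3-8B", "T_eval": 32000, "flags": []})
b = input_hash({"flags": [], "T_eval": 32000, "model": "Meta-Llama-3-8B"})
# A different T_eval is a different config.
c = input_hash({"model": "Meta-Llama-3-8B", "T_eval": 64000, "flags": []})

print(a == b)  # key order does not affect the canonical form
print(a, c)
```

Sorting keys before hashing is what makes the result independent of how the inputs happen to be ordered, which is the property deduplication relies on.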
---
## What submissions look like
Each issue follows the title pattern:
```
[TAF Profile] Meta-Llama-3-8B @ T=32000 #8d29feb8
[TAF X-2] Meta-Llama-3-8B → YES #a1b2c3d4
[TAF Compare] X-2 × 3 models #c5d6e7f8
```
The body contains the verdict, key numbers, and a collapsible JSON of the full
analysis chain. See any [recent issue](https://github.com/karlesmarin/tafagent-registry/issues)
for examples.
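The title pattern is regular enough to parse mechanically. The sketch below infers a regular expression from the three example titles only; the group names `kind`, `subject`, and `hash` are hypothetical, and real titles may include variants it does not cover.

```python
import re

# Inferred from the example titles: "[TAF <kind>] <subject> #<8 hex chars>".
TITLE_RE = re.compile(
    r"^\[TAF (?P<kind>[^\]]+)\]\s+(?P<subject>.+?)\s+#(?P<hash>[0-9a-f]{8})$"
)


def parse_title(title: str) -> dict:
    """Split a registry issue title into kind, subject, and input hash."""
    match = TITLE_RE.match(title.strip())
    if match is None:
        raise ValueError(f"title does not follow the registry pattern: {title!r}")
    return match.groupdict()


for example in (
    "[TAF Profile] Meta-Llama-3-8B @ T=32000 #8d29feb8",
    "[TAF X-2] Meta-Llama-3-8B → YES #a1b2c3d4",
    "[TAF Compare] X-2 × 3 models #c5d6e7f8",
):
    print(parse_title(example))
```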
---
## Contributing
### To submit an analysis
Just run the [TAF Agent](https://karlesmarin.github.io/tafagent) and click
**📤 Submit to registry**. The form pre-fills everything.
### To verify an existing analysis
1. Find an issue (search by hash if you know one, or browse)
2. Run the same analysis yourself
3. If your result matches → comment "✅ Verified — [evidence link / setup details]"
4. A maintainer will add the `verified` label
### To refute a prediction
1. Find an issue with a verdict you disagree with
2. Run the **actual measurement**, not just the TAF prediction. For example, for
   Long-Context (X-2), run a NIAH (needle-in-a-haystack) evaluation on a real GPU
3. Comment with:
   - Your measurement value + std
   - Hardware + software setup (vLLM version, GPU, etc.)
   - Repro recipe (script or command)
4. A maintainer will add the `refuted` label and link to your evidence
Refutations are first-class citizens here. The TAF framework is designed to
be falsifiable — if a prediction is wrong, we want to know.
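Reporting a measurement "value + std" usually means repeating the run and summarizing the results. The following is one possible way to do that, assuming the sample standard deviation is what is wanted; `summarize_runs` is a hypothetical helper and the input numbers are placeholders, not real measurements.

```python
import statistics


def summarize_runs(values: list[float]) -> str:
    """Format repeated measurements as 'mean ± std (n=N)'."""
    if len(values) < 2:
        raise ValueError("need at least two runs to report a std")
    mean = statistics.mean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    return f"{mean:.3f} ± {std:.3f} (n={len(values)})"


# Placeholder scores from three hypothetical runs.
print(summarize_runs([0.5, 0.7, 0.6]))  # 0.600 ± 0.100 (n=3)
```

Including the number of runs alongside the spread helps readers judge how much weight the refutation can bear.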
### To propose a new recipe
Open an issue with title `[Proposal] X-NN — <name>` describing:
- The practical question the recipe answers
- The chain of formulas it would use
- An example use case
If the recipe is feasible, a maintainer adds it to the
[TAF Agent codebase](https://github.com/karlesmarin/tafagent) and labels
your issue `recipe-proposed`.
### To add a model preset
Open an issue with title `[Preset] <model-id>` listing:
- `rope_theta`, `max_position_embeddings`, `num_attention_heads`,
`num_key_value_heads`, `head_dim`, `num_hidden_layers`, `n_params`,
`has_SWA`
- A link to the model's HuggingFace page
These get bundled into the next release of TAF Agent.
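Before opening a preset issue, it can help to check that every listed field is present. The sketch below uses the field names from the list above; the expected Python types, the helper `preset_problems`, and the example values are assumptions for illustration and do not describe a real model.

```python
# Field names come from this README; the expected types are assumptions.
REQUIRED_FIELDS = {
    "rope_theta": (int, float),
    "max_position_embeddings": int,
    "num_attention_heads": int,
    "num_key_value_heads": int,
    "head_dim": int,
    "num_hidden_layers": int,
    "n_params": (int, float),
    "has_SWA": bool,
}


def preset_problems(preset: dict) -> list[str]:
    """Return a list of missing or mistyped preset fields (empty if fine)."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in preset:
            problems.append(f"missing: {name}")
        elif not isinstance(preset[name], expected):
            problems.append(f"wrong type: {name}")
    return problems


# Placeholder values for a made-up model; copy real ones from the model's config.
example = {
    "rope_theta": 10000.0,
    "max_position_embeddings": 4096,
    "num_attention_heads": 16,
    "num_key_value_heads": 4,
    "head_dim": 64,
    "num_hidden_layers": 12,
    "n_params": 1.0e9,
    "has_SWA": False,
}
print(preset_problems(example))
```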
---
## Labels
- `verified` — analysis independently confirmed by another user
- `refuted` — empirical measurement contradicts TAF prediction
- `recipe-proposed` — request for a new TAF recipe
- `preset-proposed` — request for a new model preset
- `discussion` — ongoing community discussion (no consensus yet)
- `question` — clarification request
- `frontier` — recently published model (< 1 month old) being evaluated
---
## What we DON'T accept
- Closed/proprietary model analyses without permission to share publicly
- API keys, tokens, or credentials of any kind
- Commercial advertisements or unrelated content
- Submissions without an input hash in the title (this suggests they did not come from the official tool)
---
## Code of conduct
- Be technical and specific. Disagreements are about the math, not people.
- Refutations require evidence. Opinions don't count, measurements do.
- Cite your sources (paper sections, GitHub commits, vendor docs).
- Assume good faith. Most "wrong" submissions are misunderstandings,
not bad actors.
---
## License
Submissions are released under [CC0 (public domain dedication)](https://creativecommons.org/publicdomain/zero/1.0/)
unless otherwise noted by the contributor. The TAF Agent code itself is
[Apache-2.0](https://github.com/karlesmarin/tafagent/blob/main/LICENSE).
---
## Related
- 🔬 [TAF Agent web tool](https://karlesmarin.github.io/tafagent) — the diagnostic itself
- 📦 [TAF Agent source](https://github.com/karlesmarin/tafagent) — open source
- 📄 [Underlying paper](https://zenodo.org/records/19826343) — Marin 2026,
*Predicting How Transformers Attend*
---
*Maintained by Carles Marin and the TAF community.*