CSR: Conceptual Sparse Reconstruction (Gemma 3 4B)
Research artifact: an LLM extracted into a directed concept graph instead of weight matrices.
Built from google/gemma-3-4b-pt using google/gemma-scope-2-4b-pt SAEs.
What this is
- 157,781 concept nodes (across 34 layers, width-16k SAE features)
- 32,315,146 directed edges (co-activation, with excess-conditional-probability filter)
- 23,494 pre-compiled bundles (co-activating subgraphs for fast-path execution)
- A trained transition model + decoder that lets the graph reconstruct activations at inference time
This is not a drop-in HF model; it is the artifact half of a research pipeline. To run inference, pair this repo with csr/runtime/ from the source repo.
Files
| Path | Size | Description |
|---|---|---|
| meta.json | 349 B | Package metadata |
| graph/ | 640 MB | Nodes + edges + token index |
| bundles/ | 80 MB | 23k pre-compiled subgraphs |
| decoder/ | 222 MB | Trained reconstruction decoder |
| transition_model.pt | 154 MB | Trained layer-transition model |
| residuals/ | 184 KB | Residual store |
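Before wiring a local copy into a runtime, it can help to confirm that the download is complete. The sketch below checks a directory against the paths listed in the table. The helper names `missing_entries` and `load_meta` are hypothetical and not part of csr/runtime/, and the file-versus-directory distinction is inferred only from the trailing slashes in the table.

```python
import json
from pathlib import Path

# Entries copied from the Files table. "dir" vs "file" is an assumption
# inferred from the trailing slash in the Path column.
EXPECTED = {
    "meta.json": "file",
    "graph": "dir",
    "bundles": "dir",
    "decoder": "dir",
    "transition_model.pt": "file",
    "residuals": "dir",
}


def missing_entries(root):
    """Return the expected entries that are absent under `root`, in table order."""
    root = Path(root)
    missing = []
    for name, kind in EXPECTED.items():
        path = root / name
        present = path.is_file() if kind == "file" else path.is_dir()
        if not present:
            missing.append(name)
    return missing


def load_meta(root):
    """Parse meta.json. Its keys are not documented here, so none are assumed."""
    with open(Path(root) / "meta.json", encoding="utf-8") as fh:
        return json.load(fh)
```

An empty list from `missing_entries(path_to_local_copy)` means every entry in the table is present; it says nothing about whether the contents are valid.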
Not included (re-downloadable from upstream):
- Gemma input/output embeddings: google/gemma-3-4b-pt
- Per-layer SAE weights: google/gemma-scope-2-4b-pt
- Tokenizer: google/gemma-3-4b-pt
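One way to fetch the upstream pieces is `snapshot_download` from the `huggingface_hub` package. This is a minimal sketch, not the pipeline's own loader: `fetch_upstream` and the `UPSTREAM` mapping are hypothetical names, the Gemma repository may require an access token and license acceptance, and the code pulls whole repositories because the file layout inside them is not described here.

```python
# Component -> upstream repository, as listed above.
UPSTREAM = {
    "embeddings": "google/gemma-3-4b-pt",
    "sae_weights": "google/gemma-scope-2-4b-pt",
    "tokenizer": "google/gemma-3-4b-pt",
}


def fetch_upstream(cache_dir=None, token=None):
    """Download each distinct upstream repo once; return component -> local path.

    Assumption: full snapshots are acceptable. Narrowing the download with
    `allow_patterns` would need knowledge of the repos' file names, which
    this sketch does not assume.
    """
    # Imported lazily so the mapping above is usable without the package.
    from huggingface_hub import snapshot_download

    local = {}
    for repo_id in sorted(set(UPSTREAM.values())):
        local[repo_id] = snapshot_download(
            repo_id=repo_id, cache_dir=cache_dir, token=token
        )
    return {component: local[repo] for component, repo in UPSTREAM.items()}
```

Full snapshots of these repositories can be large, so check available disk space before calling `fetch_upstream()`.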
Construction
- 5 M tokens of FineWeb-Edu + The Stack v2-dedup + open-web-math
- A100 80 GB (RunPod), ~36 hours wall clock
- Filters: node freq ≥ 1%, edge freq ≥ 2%, |excess| ≥ 0.25, max 300k edges per layer pair
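The filter thresholds can be illustrated with a small sketch for a single layer pair. The definitions used here are assumptions, since the card does not spell them out: node frequency is the fraction of tokens on which a feature fires, edge frequency is the fraction of tokens on which both endpoints fire, and excess is P(dst | src) − P(dst). The function name `build_edges` and the tie-breaking rule for the per-pair cap are also hypothetical.

```python
from collections import Counter
from itertools import product


def build_edges(tokens, node_min=0.01, edge_min=0.02,
                excess_min=0.25, max_edges=300_000):
    """Filter co-activation edges for one layer pair.

    tokens: list of (src_active, dst_active) pairs, one per token, where each
    element is an iterable of feature ids active in the source/target layer.
    Returns (src, dst, excess) tuples, largest |excess| first.
    """
    tokens = [(set(s), set(d)) for s, d in tokens]
    n = len(tokens)
    src_count, dst_count = Counter(), Counter()
    for src_active, dst_active in tokens:
        src_count.update(src_active)
        dst_count.update(dst_active)

    # Node filter: keep features active on at least node_min of the tokens.
    src_keep = {f for f, c in src_count.items() if c / n >= node_min}
    dst_keep = {f for f, c in dst_count.items() if c / n >= node_min}

    # Count co-activations between surviving nodes only.
    pair_count = Counter()
    for src_active, dst_active in tokens:
        pair_count.update(product(src_active & src_keep, dst_active & dst_keep))

    edges = []
    for (a, b), c in pair_count.items():
        if c / n < edge_min:  # edge filter on joint frequency
            continue
        # Assumed definition: excess = P(b | a) - P(b).
        excess = c / src_count[a] - dst_count[b] / n
        if abs(excess) >= excess_min:
            edges.append((a, b, excess))

    # Cap per layer pair, keeping the strongest edges (assumed policy).
    edges.sort(key=lambda e: (-abs(e[2]), e[0], e[1]))
    return edges[:max_edges]
```

Under these definitions, a source feature that fires on every token has zero excess toward any target and is dropped, which is the intended effect of filtering on excess rather than on raw co-activation.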
- Skipped at this scale: RelP attribution, full LM-Eval validation
Model tree for Zabo4203/csr-gemma-3-4b-pt
Base model
google/gemma-3-4b-pt