CSR — Conceptual Sparse Reconstruction (Gemma 3 4B)

Research artifact: an LLM extracted into a directed concept graph instead of weight matrices. Built from google/gemma-3-4b-pt using google/gemma-scope-2-4b-pt SAEs.

What this is

157,781 concept nodes (across 34 layers, width-16k SAE features)
32,315,146 directed edges (co-activation, with excess-conditional-probability filter)
23,494 pre-compiled bundles (co-activating subgraphs for fast-path execution)
A trained transition model + decoder that lets the graph reconstruct activations at inference time

This is not a drop-in HF model — it's the artifact half of a research pipeline. To run inference, pair this repo with csr/runtime/ from the source repo.

Files

Path	Size	Description
`meta.json`	349 B	Package metadata
`graph/`	640 MB	Nodes + edges + token index
`bundles/`	80 MB	23k pre-compiled subgraphs
`decoder/`	222 MB	Trained reconstruction decoder
`transition_model.pt`	154 MB	Trained layer-transition model
`residuals/`	184 KB	Residual store
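
A quick way to confirm that a local copy is complete is to check for the top-level entries listed above. This is a minimal sketch: the `missing_entries` helper is hypothetical and not part of this repo, and it only checks that each path exists, not that its contents are valid.

```python
from pathlib import Path

# Top-level entries taken from the Files table above.
EXPECTED = [
    "meta.json",
    "graph",
    "bundles",
    "decoder",
    "transition_model.pt",
    "residuals",
]


def missing_entries(root):
    """Return the expected top-level entries that are absent under `root`."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]
```

An empty list from `missing_entries("path/to/local/copy")` means all six entries are present.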

Not included (re-downloadable from upstream):

Gemma input/output embeddings → google/gemma-3-4b-pt
Per-layer SAE weights → google/gemma-scope-2-4b-pt
Tokenizer → google/gemma-3-4b-pt
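
The upstream pieces can be fetched with `huggingface_hub`. The sketch below assumes the package is installed and that your account has access to the Gemma repos, which may require accepting the license on the Hub. The `fetch_upstream` helper is hypothetical. It downloads whole repos because the exact file names needed by the runtime are not listed here; `snapshot_download` also accepts `allow_patterns` if you want to narrow the download.

```python
# Upstream repos named in this README and what each one supplies.
UPSTREAM = {
    "google/gemma-3-4b-pt": "input/output embeddings and tokenizer",
    "google/gemma-scope-2-4b-pt": "per-layer SAE weights",
}


def fetch_upstream(cache_dir=None, download=False):
    """Return the upstream plan, or local snapshot paths when download=True."""
    if not download:
        # Dry run: report what would be fetched without touching the network.
        return dict(UPSTREAM)
    # Imported lazily so the dry run works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return {
        repo: snapshot_download(repo_id=repo, cache_dir=cache_dir)
        for repo in UPSTREAM
    }
```

Calling `fetch_upstream(download=True)` returns a mapping from repo ID to the local snapshot directory.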

Construction

5 M tokens of FineWeb-Edu + The Stack v2-dedup + open-web-math
A100 80 GB (RunPod), ~36 hours wall clock
Filters: node freq ≥ 1 %, edge freq ≥ 2 %, |excess| ≥ 0.25, max 300 k edges per layer-pair
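
The filter line above can be read as the following sketch. Several details are assumptions rather than the pipeline's actual definitions: frequencies are taken as fractions of tokens, edge frequency as the joint co-activation rate, "excess" as P(dst | src) − P(dst), and the per-layer-pair cap as keeping the edges with the largest |excess|. The `filter_edges` function and its arguments are hypothetical names.

```python
from collections import defaultdict


def filter_edges(node_count, pair_count, n_tokens, layer_of,
                 node_min=0.01, edge_min=0.02, excess_min=0.25, cap=300_000):
    """Apply the node, edge, excess, and per-layer-pair cap filters.

    node_count: {node: number of tokens on which the node fired}
    pair_count: {(src, dst): number of tokens on which both fired}
    layer_of:   {node: layer index}
    """
    # Node filter: keep nodes active on at least node_min of tokens.
    kept = {n for n, c in node_count.items() if c / n_tokens >= node_min}

    per_layer_pair = defaultdict(list)
    for (src, dst), c in pair_count.items():
        if src not in kept or dst not in kept:
            continue
        # Edge filter (assumed): joint co-activation frequency.
        if c / n_tokens < edge_min:
            continue
        # Excess conditional probability (assumed): P(dst | src) - P(dst).
        excess = c / node_count[src] - node_count[dst] / n_tokens
        if abs(excess) < excess_min:
            continue
        per_layer_pair[(layer_of[src], layer_of[dst])].append((src, dst, excess))

    edges = []
    for group in per_layer_pair.values():
        # Cap (assumed ranking): keep the strongest edges per layer pair.
        group.sort(key=lambda e: abs(e[2]), reverse=True)
        edges.extend(group[:cap])
    return edges
```

With 1,000 tokens, a source node firing on 500 and a destination firing on 400, a pair that co-fires on 400 tokens has excess 0.8 − 0.4 = 0.4 and passes, while a pair whose conditional rate equals the destination's base rate has excess 0 and is dropped.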
Skipped at this scale: RelP attribution, full LM-Eval validation
