CSR: Conceptual Sparse Reconstruction (Gemma 3 4B)

Research artifact: an LLM extracted into a directed concept graph instead of weight matrices. Built from google/gemma-3-4b-pt using google/gemma-scope-2-4b-pt SAEs.

What this is

  • 157,781 concept nodes (across 34 layers, width-16k SAE features)
  • 32,315,146 directed edges (co-activation, with excess-conditional-probability filter)
  • 23,494 pre-compiled bundles (co-activating subgraphs for fast-path execution)
  • A trained transition model + decoder that lets the graph reconstruct activations at inference time
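The counts above imply a few derived figures that help size the graph. The following is a rough arithmetic sketch, not part of the package; it assumes that "width-16k" means 16,384 features per layer, which the list does not state explicitly.

```python
# Back-of-the-envelope statistics derived from the counts listed above.
NUM_LAYERS = 34
SAE_WIDTH = 16_384        # ASSUMPTION: "width-16k" = 16,384 features per layer
NUM_NODES = 157_781
NUM_EDGES = 32_315_146
NUM_BUNDLES = 23_494

# Total SAE features that could have become concept nodes.
candidate_features = NUM_LAYERS * SAE_WIDTH

# Share of candidate features that survived into the graph.
kept_fraction = NUM_NODES / candidate_features

# Average number of outgoing edges per node, and nodes per layer.
mean_out_degree = NUM_EDGES / NUM_NODES
mean_nodes_per_layer = NUM_NODES / NUM_LAYERS

print(f"candidate features : {candidate_features:,}")
print(f"kept fraction      : {kept_fraction:.1%}")
print(f"mean out-degree    : {mean_out_degree:.1f}")
print(f"mean nodes / layer : {mean_nodes_per_layer:.0f}")
```

Under that assumption, roughly 28% of candidate features become nodes, and each node carries about 205 outgoing edges on average.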

This is not a drop-in HF model; it is the artifact half of a research pipeline. To run inference, pair this repo with csr/runtime/ from the source repo.

Files

Path                  Size     Description
meta.json             349 B    Package metadata
graph/                640 MB   Nodes + edges + token index
bundles/              80 MB    23k pre-compiled subgraphs
decoder/              222 MB   Trained reconstruction decoder
transition_model.pt   154 MB   Trained layer-transition model
residuals/            184 KB   Residual store
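After downloading, it can help to confirm that the package layout matches the file listing. The sketch below is a hypothetical helper, not part of csr/runtime/; it only checks that each listed path exists and infers "directory" from the trailing slash.

```python
from pathlib import Path

# Entries copied from the file listing; a trailing slash marks a directory.
EXPECTED_ENTRIES = [
    "meta.json",
    "graph/",
    "bundles/",
    "decoder/",
    "transition_model.pt",
    "residuals/",
]

def missing_entries(root):
    """Return the listed entries that are absent under `root`."""
    root = Path(root)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

An empty return value means every listed file and directory is present; it says nothing about the contents.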

Not included (re-downloadable from upstream):

  • Gemma input/output embeddings → google/gemma-3-4b-pt
  • Per-layer SAE weights → google/gemma-scope-2-4b-pt
  • Tokenizer → google/gemma-3-4b-pt
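One way to fetch the upstream pieces listed above is `snapshot_download` from the `huggingface_hub` library. This is a minimal sketch under assumptions: the helper names and local directory layout are hypothetical, the file patterns inside each upstream repo are not specified here (so `allow_patterns` is left to the caller), and the Gemma repos may require accepting a license and providing an access token.

```python
# Upstream repos named in the list above, keyed by what they provide.
UPSTREAM = {
    "embeddings": "google/gemma-3-4b-pt",
    "tokenizer": "google/gemma-3-4b-pt",
    "sae_weights": "google/gemma-scope-2-4b-pt",
}

def repos_to_fetch():
    """Unique upstream repo ids, in first-seen order."""
    return list(dict.fromkeys(UPSTREAM.values()))

def fetch_upstream(local_root, allow_patterns=None):
    """Download each upstream repo under `local_root` (hypothetical layout).

    `allow_patterns` can restrict the download to specific files; which
    patterns are needed is not documented here, so none are assumed.
    """
    # Imported lazily so the planning helper above works without the library.
    from huggingface_hub import snapshot_download

    paths = {}
    for repo_id in repos_to_fetch():
        paths[repo_id] = snapshot_download(
            repo_id=repo_id,
            local_dir=f"{local_root}/{repo_id.replace('/', '__')}",
            allow_patterns=allow_patterns,
        )
    return paths
```

Without `allow_patterns`, this pulls each repo in full, which may be far more than the runtime needs, particularly for the SAE weights.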

Construction

  • 5 M tokens of FineWeb-Edu + The Stack v2-dedup + open-web-math
  • A100 80 GB (RunPod), ~36 hours wall clock
  • Filters: node freq ≥ 1 %, edge freq ≥ 2 %, |excess| ≥ 0.25, max 300 k edges per layer-pair
  • Skipped at this scale: RelP attribution, full LM-Eval validation
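The filter thresholds above can be illustrated with a small NumPy sketch. It is not the pipeline's actual code and rests on several assumptions: "excess" is taken to be P(B|A) − P(B), node frequency is the fraction of tokens on which a feature is active, edge frequency is the fraction of tokens on which both features are active, and the per-layer-pair cap keeps the edges with the largest |excess|. The function name and signature are hypothetical.

```python
import numpy as np

def filter_edges(src_active, dst_active, node_min=0.01, edge_min=0.02,
                 excess_min=0.25, max_edges=300_000):
    """Select directed edges src -> dst for one layer pair.

    src_active, dst_active: boolean arrays of shape [tokens, features],
    marking which SAE features fired on each token.
    Returns a list of (src_index, dst_index, excess) tuples.
    """
    src = np.asarray(src_active, dtype=np.float64)
    dst = np.asarray(dst_active, dtype=np.float64)
    n_tokens = src.shape[0]

    p_src = src.mean(axis=0)               # P(A): node frequency, source layer
    p_dst = dst.mean(axis=0)               # P(B): node frequency, target layer
    joint = (src.T @ dst) / n_tokens       # P(A and B): assumed edge frequency

    # P(B|A), defined as 0 where the source feature never fires.
    with np.errstate(divide="ignore", invalid="ignore"):
        cond = np.where(p_src[:, None] > 0, joint / p_src[:, None], 0.0)
    excess = cond - p_dst[None, :]          # ASSUMED definition of "excess"

    keep = (
        (p_src[:, None] >= node_min)
        & (p_dst[None, :] >= node_min)
        & (joint >= edge_min)
        & (np.abs(excess) >= excess_min)
    )
    i, j = np.nonzero(keep)

    # Apply the cap, keeping the strongest edges by |excess|.
    order = np.argsort(-np.abs(excess[i, j]))[:max_edges]
    return [(int(i[k]), int(j[k]), float(excess[i[k], j[k]])) for k in order]
```

Because the threshold is on |excess|, this sketch also keeps strongly anti-correlated pairs, provided they still co-occur often enough to pass the edge-frequency floor.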