mikeumus-divincian committed on
Commit
6641eb5
·
verified ·
1 Parent(s): 9529975

Add files using upload-large-folder tool

Files changed (10)
  1. README.md +109 -0
  2. SHA256SUMS +9 -0
  3. down_features.bin +3 -0
  4. down_meta.bin +3 -0
  5. embeddings.bin +3 -0
  6. gate_vectors.bin +3 -0
  7. index.json +327 -0
  8. manifest.json +16 -0
  9. norms.bin +3 -0
  10. router_weights.bin +3 -0
README.md ADDED
@@ -0,0 +1,109 @@
+ ---
+ license: cc-by-nc-4.0
+ tags:
+ - larql
+ - vindex
+ - mechanistic-interpretability
+ - feature-extraction
+ model_name: MedGemma 1.5-4B-IT
+ base_model: google/medgemma-1.5-4b-it
+ ---
+
+ # MedGemma 1.5-4B-IT — LarQL Vindex
+
+ **Source model**: [google/medgemma-1.5-4b-it](https://huggingface.co/google/medgemma-1.5-4b-it)
+ **Vindex short ID**: `75eeb232`
+ **Layers**: 34 · **Hidden size**: 2560 · **Features per layer**: 128
+
+ ## What This Is
+
+ A **LarQL vindex** (vector index) is a compact binary representation of the feature geometry of `google/medgemma-1.5-4b-it`. It contains the top-128 SVD directions of every MLP gate_proj and down_proj matrix in the network, plus token embeddings, layer norms, and vocabulary projection metadata.
+
+ ## What This Is NOT
+
+ This is **not** a model you can run for inference: it does not contain enough weights to generate text. It is a mechanistic interpretability artifact, a feature database for probing, editing, and comparing what `google/medgemma-1.5-4b-it` has learned.
+
+ ## Universal Constants (Phase 2 Measurements)
+
+ Measured via forward-pass hooks on a 256-token factual probe text.
+
+ | Constant | Symbol | Value | Interpretation |
+ |----------|--------|-------|----------------|
+ | FFN Sparsity | C1 | 0.181 | Fraction of near-zero SwiGLU activations |
+ | Top-8 Prob Mass | C2 | 0.795 | Probability mass on top-8 output tokens |
+ | Gate Coherence | C3 | 0.800 | Mean cosine sim of adjacent gate_proj directions |
+ | Layer Temperature | C4 | 1.898 | Mean per-neuron SwiGLU activation variance |
+ | Circuit Stages | C5 | 2 | CKA transition count + 1 |
+
+ **Notes**: The architecture is multimodal (PaliGemma 1.5), so text-only probing may explain the anomalous C4 = 1.898 (50× above Gemma4-E2B). C3 = 0.800 matches the Llama-3.1-8B gate coherence pattern rather than the Qwen3 collapse. Treat the Phase 2 C4 value as provisional until vision+text probing is implemented.
+
+ ## Gate 3 Status (DELETE Patch Test)
+
+ PENDING (~7 days).
+
+ Gate 3 tests whether a rank-1 ΔW patch to `gate_proj.weight` at the top Paris→capital feature layer suppresses P(Paris) by ≥70% with ≤30% Berlin collateral damage.
+
+ ## Files
+
+ | File | Description |
+ |------|-------------|
+ | `gate_vectors.bin` | Top-128 SVD directions of gate_proj per layer [L×F×H, f16] |
+ | `down_features.bin` | Top-128 SVD directions of down_proj per layer [L×F×H, f16] |
+ | `embeddings.bin` | Token embedding matrix [V×H, f16] |
+ | `norms.bin` | Layer norm weight vectors |
+ | `down_meta.bin` | Per-feature top-k vocabulary projections |
+ | `index.json` | Vindex metadata (num_layers, hidden_size, per-layer num_features, etc.) |
+ | `manifest.json` | Build provenance (source SHA, extraction timestamp) |
+ | `SHA256SUMS` | File integrity checksums |
+
+ ## How to Use
+
+ ```python
+ import json
+ import numpy as np
+
+ vindex_dir = "path/to/downloaded/vindex"
+
+ with open(f"{vindex_dir}/index.json") as f:
+     idx = json.load(f)
+
+ # index.json has no top-level "num_feats" key; the count is stored per layer
+ L, H = idx["num_layers"], idx["hidden_size"]
+ F = idx["layers"][0]["num_features"]
+
+ # Load gate feature directions [L, F, H]
+ gate = np.fromfile(
+     f"{vindex_dir}/gate_vectors.bin", dtype=np.float16
+ ).reshape(L, F, H).astype(np.float32)
+
+ # Load embeddings [V, H]. V is inferred from the file size: embeddings.bin is
+ # 1342504960 bytes = 262208 × 2560 × 2, which is more rows than "vocab_size": 256000
+ emb = np.fromfile(
+     f"{vindex_dir}/embeddings.bin", dtype=np.float16
+ ).reshape(-1, H).astype(np.float32)
+
+ # Score a token against all features (cosine similarity)
+ emb_n = emb / (np.linalg.norm(emb, axis=1, keepdims=True) + 1e-8)
+ gate_n = gate / (np.linalg.norm(gate, axis=2, keepdims=True) + 1e-8)
+
+ token_id = 12379 # e.g., " Paris"
+ scores = gate_n @ emb_n[token_id] # [L, F]
+ l_max, f_max = np.unravel_index(scores.argmax(), scores.shape)
+ print(f"Top feature: layer={l_max}, feature={f_max}, score={scores[l_max, f_max]:.4f}")
+ ```
+
+ ## License
+
+ CC-BY-NC 4.0, the same terms as the source model. Research use only.
+
+ ## Citation
+
+ If you use this vindex in published work, please cite:
+
+ ```
+ @misc{divinci2026larql,
+   title = {LarQL Vindex: MedGemma 1.5-4B-IT},
+   author = {Divinci AI},
+   year = {2026},
+   url = {https://huggingface.co/Divinci-AI/medgemma-1.5-4b-vindex}
+ }
+ ```
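The Gate 3 section of the README refers to a rank-1 ΔW patch to `gate_proj.weight`. As a rough algebraic illustration of what a rank-1 "delete" update can look like, the sketch below removes a matrix's response to one direction `v` by adding ΔW = -(W v̂) v̂ᵀ, where v̂ is `v` normalised to unit length. The random matrix, its small shape, and the helper name `rank1_delete_patch` are assumptions made for this example; this is not the LarQL Gate 3 procedure, and it says nothing about the resulting change in token probabilities.

```python
import numpy as np

def rank1_delete_patch(W, v):
    """Hypothetical helper: rank-1 update dW such that (W + dW) @ v == 0."""
    v_hat = v / np.linalg.norm(v)
    return -np.outer(W @ v_hat, v_hat)

rng = np.random.default_rng(0)
W = rng.standard_normal((16, 8))   # stand-in for an [intermediate, hidden] weight matrix
v = rng.standard_normal(8)         # stand-in for one feature direction in hidden space

dW = rank1_delete_patch(W, v)
W_patched = W + dW

# Build a direction orthogonal to v; the patch should leave its response unchanged.
u = rng.standard_normal(8)
u -= (u @ v) / (v @ v) * v

residual = np.linalg.norm(W_patched @ v)          # response to the deleted direction
drift = np.linalg.norm(W_patched @ u - W @ u)     # collateral change on an orthogonal input
print(np.linalg.matrix_rank(dW), residual, drift)
```

The update zeroes the response along `v` exactly and leaves orthogonal inputs untouched, which is why a rank-1 patch is a natural candidate for a targeted edit; inputs that overlap `v` only partially are affected in proportion to that overlap.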
SHA256SUMS ADDED
@@ -0,0 +1,9 @@
+ 24c6e21919c8e5f94125d18fbda6eb3f187cf77040fe5773f6013d03225be05b README.md
+ 6f16a515fa02c81a5dad2210bdd6d9ccf84840b6d7c7fd84a7d6ea3bed001143 down_features.bin
+ fbbbf54f41beb5f0450b99a8df5516a56a3d5f94be10625b70ba56a5f491d5ac down_meta.bin
+ 278ab70f2bbd880bd6300d73bb2f8c161dd531fc6df3b3490348c72dacf991de embeddings.bin
+ 6e4c8dc7eb32239504ecdc90455e979e193655d2aaef298fe39362da20e7b449 gate_vectors.bin
+ 9d0d16b4ac917a7a7d54fa4406bed3fe689dcc059278e351006c3ac22db868ed index.json
+ 5450d33c20f16a92c13353e8a841c5a5268412b72411fb99f36714c5053bed8a manifest.json
+ 17e5161c2a98abe0a66748aa9d256972fec6f049cf8b9db1c3962fedcd942835 norms.bin
+ 4f669d46559ca1da48b4f2414927d85ad19af0054791a506d84a90271614db87 router_weights.bin
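A downloaded copy can be checked against a `SHA256SUMS` file of this shape (one `<hex digest> <file name>` pair per line) with the standard library alone. The following is a minimal sketch: the helper names `sha256_of` and `verify_sha256sums` are made up for this example, and the demo runs on small throwaway files in a temporary directory rather than on the real vindex.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte files need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_sha256sums(directory, sums_name="SHA256SUMS"):
    """Return {file name: True/False} for every entry listed in the sums file."""
    directory = Path(directory)
    results = {}
    for line in (directory / sums_name).read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(None, 1)   # tolerates one or two separating spaces
        name = name.strip().lstrip("*")        # sha256sum marks binary mode with '*'
        results[name] = sha256_of(directory / name) == expected.lower()
    return results

# Demo on synthetic files: one correct digest, one deliberately wrong.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    (tmp / "norms.bin").write_bytes(b"\x00" * 1024)
    (tmp / "index.json").write_text("{}")
    good = hashlib.sha256(b"\x00" * 1024).hexdigest()
    bad = "0" * 64
    (tmp / "SHA256SUMS").write_text(f"{good}  norms.bin\n{bad}  index.json\n")
    report = verify_sha256sums(tmp)

print(report)
```

Pointing `verify_sha256sums` at the real download directory should report `True` for each listed file if the transfer was intact.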
down_features.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6f16a515fa02c81a5dad2210bdd6d9ccf84840b6d7c7fd84a7d6ea3bed001143
+ size 22282240
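The three added lines for each `.bin` file are a Git LFS pointer (spec version, content hash, byte size), not the binary payload itself. A small sketch of reading such a pointer, and of checking its size against the shape the README gives for `down_features.bin` (L×F×H in f16), might look as follows. The function name `parse_lfs_pointer` is an illustrative assumption, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:6f16a515fa02c81a5dad2210bdd6d9ccf84840b6d7c7fd84a7d6ea3bed001143
size 22282240
"""

info = parse_lfs_pointer(pointer)

# 34 layers × 128 features × 2560 hidden dims × 2 bytes per f16 value
expected_bytes = 34 * 128 * 2560 * 2
print(info["algo"], info["size"], info["size"] == expected_bytes)
```

The pointer's digest is the same value recorded for `down_features.bin` in `SHA256SUMS`, so either source can be used to verify the downloaded file.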
down_meta.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fbbbf54f41beb5f0450b99a8df5516a56a3d5f94be10625b70ba56a5f491d5ac
+ size 383128
embeddings.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:278ab70f2bbd880bd6300d73bb2f8c161dd531fc6df3b3490348c72dacf991de
+ size 1342504960
gate_vectors.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e4c8dc7eb32239504ecdc90455e979e193655d2aaef298fe39362da20e7b449
+ size 22282240
index.json ADDED
@@ -0,0 +1,327 @@
+ {
+   "version": 1,
+   "model": "google/medgemma-1.5-4b-it",
+   "family": "gemma3",
+   "source": {
+     "huggingface_repo": "/home/ubuntu/hf/hub/models--google--medgemma-1.5-4b-it/snapshots/91850547d9f0b2fdd21aa7c5f4f3d1a8a52c243b",
+     "huggingface_revision": null,
+     "safetensors_sha256": null,
+     "extracted_at": "2026-04-21T09:08:17.402243Z",
+     "larql_version": "0.2.0-python"
+   },
+   "num_layers": 34,
+   "hidden_size": 2560,
+   "intermediate_size": 2560,
+   "vocab_size": 256000,
+   "embed_scale": 1.0,
+   "extract_level": "features",
+   "dtype": "f16",
+   "has_model_weights": false,
+   "down_top_k": 10,
+   "layer_bands": {
+     "syntax": [
+       0,
+       8
+     ],
+     "knowledge": [
+       8,
+       25
+     ],
+     "output": [
+       25,
+       33
+     ]
+   },
+   "layers": [
+     {
+       "layer": 0,
+       "num_features": 128,
+       "offset": 0,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 1,
+       "num_features": 128,
+       "offset": 655360,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 2,
+       "num_features": 128,
+       "offset": 1310720,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 3,
+       "num_features": 128,
+       "offset": 1966080,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 4,
+       "num_features": 128,
+       "offset": 2621440,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 5,
+       "num_features": 128,
+       "offset": 3276800,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 6,
+       "num_features": 128,
+       "offset": 3932160,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 7,
+       "num_features": 128,
+       "offset": 4587520,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 8,
+       "num_features": 128,
+       "offset": 5242880,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 9,
+       "num_features": 128,
+       "offset": 5898240,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 10,
+       "num_features": 128,
+       "offset": 6553600,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 11,
+       "num_features": 128,
+       "offset": 7208960,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 12,
+       "num_features": 128,
+       "offset": 7864320,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 13,
+       "num_features": 128,
+       "offset": 8519680,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 14,
+       "num_features": 128,
+       "offset": 9175040,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 15,
+       "num_features": 128,
+       "offset": 9830400,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 16,
+       "num_features": 128,
+       "offset": 10485760,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 17,
+       "num_features": 128,
+       "offset": 11141120,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 18,
+       "num_features": 128,
+       "offset": 11796480,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 19,
+       "num_features": 128,
+       "offset": 12451840,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 20,
+       "num_features": 128,
+       "offset": 13107200,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 21,
+       "num_features": 128,
+       "offset": 13762560,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 22,
+       "num_features": 128,
+       "offset": 14417920,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 23,
+       "num_features": 128,
+       "offset": 15073280,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 24,
+       "num_features": 128,
+       "offset": 15728640,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 25,
+       "num_features": 128,
+       "offset": 16384000,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 26,
+       "num_features": 128,
+       "offset": 17039360,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 27,
+       "num_features": 128,
+       "offset": 17694720,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 28,
+       "num_features": 128,
+       "offset": 18350080,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 29,
+       "num_features": 128,
+       "offset": 19005440,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 30,
+       "num_features": 128,
+       "offset": 19660800,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 31,
+       "num_features": 128,
+       "offset": 20316160,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 32,
+       "num_features": 128,
+       "offset": 20971520,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     },
+     {
+       "layer": 33,
+       "num_features": 128,
+       "offset": 21626880,
+       "length": 655360,
+       "num_experts": null,
+       "num_features_per_expert": null
+     }
+   ],
+   "model_config": {
+     "model_type": "gemma3",
+     "hidden_size": 2560,
+     "moe": {
+       "num_experts": 1,
+       "top_k": 1,
+       "moe_intermediate_size": 2560,
+       "aggregated_features": 128,
+       "aggregation": "router_weighted_svd"
+     }
+   },
+   "checksums": {
+     "gate_vectors.bin": "6e4c8dc7eb32239504ecdc90455e979e193655d2aaef298fe39362da20e7b449",
+     "embeddings.bin": "278ab70f2bbd880bd6300d73bb2f8c161dd531fc6df3b3490348c72dacf991de",
+     "norms.bin": "17e5161c2a98abe0a66748aa9d256972fec6f049cf8b9db1c3962fedcd942835",
+     "down_features.bin": "6f16a515fa02c81a5dad2210bdd6d9ccf84840b6d7c7fd84a7d6ea3bed001143",
+     "down_meta.bin": "fbbbf54f41beb5f0450b99a8df5516a56a3d5f94be10625b70ba56a5f491d5ac"
+   }
+ }
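Each entry in `layers` carries an `offset` and a `length`. The numbers are consistent with byte offsets into a per-layer feature file: 128 × 2560 × 2 = 655360 bytes per layer, and 34 × 655360 = 22282240, the size of `gate_vectors.bin`. Assuming that reading is right, a single layer can be loaded without reading the whole file. The sketch below demonstrates the idea on a tiny synthetic file; `load_layer` is a hypothetical helper, not a LarQL function.

```python
import tempfile
from pathlib import Path

import numpy as np

def load_layer(bin_path, layer_entry, hidden_size, dtype=np.float16):
    """Read one layer's [num_features, hidden_size] block, treating offset/length as bytes."""
    itemsize = np.dtype(dtype).itemsize
    count = layer_entry["length"] // itemsize
    block = np.fromfile(bin_path, dtype=dtype, count=count, offset=layer_entry["offset"])
    return block.reshape(layer_entry["num_features"], hidden_size)

# Arithmetic from index.json and the gate_vectors.bin pointer in this commit.
bytes_per_layer = 128 * 2560 * 2
total_bytes = 34 * bytes_per_layer
print(bytes_per_layer, total_bytes)

# Synthetic stand-in: 3 layers, 4 features, hidden size 8, laid out the same way.
L, F, H = 3, 4, 8
data = np.arange(L * F * H, dtype=np.float16).reshape(L, F, H)
layers = [
    {"layer": i, "num_features": F, "offset": i * F * H * 2, "length": F * H * 2}
    for i in range(L)
]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "gate_vectors.bin"
    data.tofile(path)
    layer1 = load_layer(path, layers[1], H)

print(layer1.shape, np.array_equal(layer1, data[1]))
```

For the real file, passing `idx["layers"][k]` and `idx["hidden_size"]` would return a 128 × 2560 block for layer `k`, under the same byte-offset assumption.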
manifest.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "vindexSha256": "75eeb232eacc960fd77a2889ae9534511ad37858955bc40d656b1a69790cf9eb",
+   "shortId": "75eeb232",
+   "baseModel": "google/medgemma-1.5-4b-it",
+   "extractLevel": "features",
+   "f16": true,
+   "totalBytes": 1388328156,
+   "files": {
+     "gate_vectors.bin": "6e4c8dc7eb32239504ecdc90455e979e193655d2aaef298fe39362da20e7b449",
+     "embeddings.bin": "278ab70f2bbd880bd6300d73bb2f8c161dd531fc6df3b3490348c72dacf991de",
+     "norms.bin": "17e5161c2a98abe0a66748aa9d256972fec6f049cf8b9db1c3962fedcd942835",
+     "down_features.bin": "6f16a515fa02c81a5dad2210bdd6d9ccf84840b6d7c7fd84a7d6ea3bed001143",
+     "down_meta.bin": "fbbbf54f41beb5f0450b99a8df5516a56a3d5f94be10625b70ba56a5f491d5ac",
+     "router_weights.bin": "4f669d46559ca1da48b4f2414927d85ad19af0054791a506d84a90271614db87"
+   }
+ }
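The `totalBytes` field can be cross-checked against the sizes recorded in the Git LFS pointers of this commit. The short sketch below sums the six `.bin` sizes and also derives the number of embedding rows implied by the size of `embeddings.bin`, assuming each row is 2560 f16 values as the README's `[V×H, f16]` shape suggests.

```python
# Byte sizes copied from the Git LFS pointers in this commit.
sizes = {
    "gate_vectors.bin": 22282240,
    "down_features.bin": 22282240,
    "down_meta.bin": 383128,
    "embeddings.bin": 1342504960,
    "norms.bin": 701440,
    "router_weights.bin": 174148,
}

total = sum(sizes.values())
matches_manifest = total == 1388328156   # "totalBytes" in manifest.json

# Rows in embeddings.bin if every row is hidden_size (2560) f16 values (2 bytes each).
row_bytes = 2560 * 2
embedding_rows, remainder = divmod(sizes["embeddings.bin"], row_bytes)

print(total, matches_manifest, embedding_rows, remainder)
```

The sum equals `totalBytes`, so that field covers the six binary files and excludes the README and JSON files. Under the row-width assumption the embedding matrix has 262208 rows, which differs from `"vocab_size": 256000` in `index.json`; code that reshapes `embeddings.bin` is safer deriving the row count from the file size.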
norms.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:17e5161c2a98abe0a66748aa9d256972fec6f049cf8b9db1c3962fedcd942835
+ size 701440
router_weights.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4f669d46559ca1da48b4f2414927d85ad19af0054791a506d84a90271614db87
+ size 174148