gary-neuron-emergent
The dispatcher cortex behind neuron-db β now a ~6.97M-parameter, pure-NumPy GPT that reads its context window and decides what to do with each turn. It's the optional "thinking" tier on top of the neuron-db store: the store retrieves a small working set, this cortex routes over it.
π Live site & in-browser demos: https://gary23w.github.io/neuron-db/
v5 β a larger, better-grounded model than the original 1.13M emergent checkpoint. Same pure-NumPy format; new weights, tokenizer, and config.
What it does β one route per turn
| route | meaning |
|---|---|
ANSWER <value> |
the working set has it β copy the value straight out |
ESCALATE |
the working set can't answer β hand up to a larger host model |
FETCH <topic> |
a live-world question β go to the web |
STORE <fact> |
a declarative β remember it |
It also chains multi-fact lookups (e.g. "who is the manager of Aurora's owner?").
What it is
- arch: gpt-numpy β 8 layers, 8 heads, dim 256, 512-token context, vocab 2048
- params: 6,973,952
- trained: ~23,784 steps (12 epochs), final train loss 0.006
Held-out results
Fresh values (a different random stream than training), greedy decode. ANSWER accuracy vs. the size of the working set β the hard axis, since more facts in the window means more distractors:
| working set | 1 | 6 | 12 | 18 |
|---|---|---|---|---|
| ANSWER correct | 100% | 96% | 98% | 76% |
Route-action accuracy: ESCALATE 96% Β· FETCH 100% Β· STORE 100% Β· multi-hop ANSWER 100%.
Files
cortex.npz float32 weights (P/ keys; gpt-numpy layout)
petite_vocab.json byte-level BPE vocab (2048)
petite_merges.txt BPE merges
config.json E/H/L/BLK, vocab, param count, trained step
Use
from huggingface_hub import snapshot_download
import numpy as np
d = snapshot_download("gary23w/gary-neuron-emergent")
z = np.load(f"{d}/cortex.npz")
P = {k[2:]: z[k] for k in z.files if k.startswith("P/")} # tensor name -> float32 array
CFG = dict(E=256, H=8, L=8, BLK=512, vocab=2048)
# pre-LN transformer, learned positions, tanh-GELU, weight-tied output;
# greedy-decode over a "U: <fact>\nG: noted.\n... U: <question>\nG:" working set.
The same weights also ship int8-quantized and baked into the neuron-db Rust crate (and its WASM
build), where route() does recall + dispatch in one call β see the repo.
Honest notes
- Trained on a ~2k everyday-token vocabulary in a
U:/G:fact format. It excels at routing and copying normalized facts out of a bounded window; it is not a general chatbot. - Large working sets are the hard axis: ANSWER accuracy stays high to ~12 facts and tapers by ~18.
- For exact recall you don't need the model at all β neuron-db's store returns the value deterministically. The cortex is for the route decision and generation/association over the set.
MIT. Part of the gary-neuron family by gary23w. Site: https://gary23w.github.io/neuron-db/
- Downloads last month
- 19