ToricBLM: Thought Alcoves, Tropical Attention, and Small Models With Serious Geometry

Community Article
Published June 19, 2026

BPB


TLDR: ToricBLM proposes an extension of the vanilla transformer built on two changes: a tokenization scheme that tokenizes the edges of a graph along with its vertices, and orthonormal positional encodings that expose adjacency and other graph structure. Together these aim to turn the transformer into a general graph-to-graph function approximator that can operate natively on arbitrary data structures, multimodal inputs, boolean circuits, and graph grammars. The framework also targets a limitation of current scalar alignment methods by generalizing Anthropic’s one-dimensional "assistant axis" into a multidimensional, geometrically bounded latent space, so that competing scientific constraints can be disentangled within isolated "Thought Alcoves."

By embedding algebraic and topological priors into a compact 100M-parameter biomolecular model (tropical attention for sparse, auditable inference, toric sidecars for algebraic certificates, and persistent homology for invariant structural memory), ToricBLM is designed to mitigate the representation collapse associated with standard softmax attention and the brittleness of linear, single-path chain-of-thought. In place of a single chain, it uses a GFlowNet-guided "Forest-of-Thought" to explore large solution spaces, with the goal of improving both model expressiveness and out-of-distribution (OOD) generalization. For researchers skeptical of the diminishing returns of brute-force parameter scaling, this blog post and its accompanying preprint offer a blueprint for structured reasoning in targeted scientific discovery.


What would it take for a small biomolecular model to reason like a careful protein engineer, a medicinal chemist, a structural biologist, and a mathematically trained search agent at the same time?

That is the motivating question behind ToricBLM, a proposed 100M-parameter biomolecular language model for de novo design, scientific reasoning, dynamics-aware generation, and compact edge deployment. The short version is simple:

  • ToricBLM is a small transformer.
  • Its inputs are not only strings, but graphs.
  • Its reasoning is not only a chain, but a forest.
  • Its steering is not only a vector, but a geometry.
  • Its memory is not only retrieval, but structured analogy.

The longer version is the fun part.

ToricBLM takes inspiration from UniGenX-style multimodal scientific generation, TokenGT-style graph tokenization, GraphCG-style disentangled latent bases, GFlowNets, flow matching, tropical geometry, toric geometry, persistent homology, sheaves, and algebraic/combinatorial tools that are usually not invited into machine learning architecture meetings nearly often enough. It is designed to be deceptively small: a model that can run in medical or research settings with modest hardware, while hiding a substantial amount of geometry and training discipline under the hood.

The goal is not to build a giant model that memorizes all of biology. The goal is to build a compact biomolecular reasoning agent that knows how to propose, critique, refine, validate, and remember.

This post explains the main ideas behind ToricBLM in a didactic way. It is meant for ML researchers, Hugging Face builders, computational biologists, medicinal chemists, protein designers, mathematicians, and anyone curious about what happens when we stop treating geometry as decorative and start treating it as an operating system for reasoning.

1. What Is ToricBLM?

ToricBLM is a proposed biomolecular language model with the following core ingredients:

  • a decoder-only transformer backbone near 100M parameters,
  • TokenGT-style vertex and edge tokens for scientific graphs,
  • orthonormal endpoint encodings for graph incidence,
  • tropical attention heads for hard support decisions,
  • toric sidecars for finite algebraic/geometric certificates,
  • GraphCG-style disentangled basis training,
  • embedding-space Forest-of-Thought reasoning with GFlowNets,
  • analogical memory retrieval over vectors, graphs, topology, and certificates,
  • flow-matching heads for continuous structures and dynamics,
  • optional UMA/PITA-inspired oracle calibration for force and energy feedback,
  • a validation ladder using modern structure, affinity, dynamics, and atomistic models.

The model is called ToricBLM because it is a biomolecular language model whose internal reasoning geometry is organized using tropical and toric ideas. Its intended modalities include:

  • proteins,
  • antibodies,
  • minibinders,
  • enzymes,
  • RNA,
  • DNA,
  • small molecules,
  • materials,
  • protein-ligand complexes,
  • functional annotations,
  • property annotations,
  • assay facts,
  • reasoning trajectories,
  • structural and dynamical states.

The simplest input record looks like this:

$$r=(\mu,S,X,A,G,M,T,Y,\Pi).$$

Here $\mu$ is the modality, $S$ is a symbolic stream, $X$ is optional continuous data such as coordinates or forces, $A$ is an annotation dictionary, $G$ is a typed graph, $M$ is memory, $T$ is an optional trajectory, $Y$ is the training target, and $\Pi$ records provenance, licensing, leakage policy, and safety constraints.

The important point is that ToricBLM does not pretend biomolecules are just text.

A protein is a sequence, yes. But it is also a contact graph, a motif graph, a domain graph, an evolutionary family, an assay object, a structure, a set of dynamical states, and a collection of scientific claims with evidence. A molecule is a SMILES string, yes. But it is also an atom-bond graph, a conformer ensemble, a scaffold, a pharmacophore, a synthesis object, and a property vector. A material is a formula, but also a lattice, a periodic graph, a relaxation problem, and a phase-stability claim.

So ToricBLM uses a unified stream, but augments it with graph tokens. A vertex token might represent a residue, atom, nucleotide, material site, claim, memory item, or reasoning state. An edge token might represent a bond, contact, base pair, coevolution link, retrieval link, edit step, evidence relation, periodic neighbor, or contradiction.

The TokenGT-style representation is:

$$z_{v_i}=e_{\tau(i)}+\phi(x_i)+2p_i+u_i,$$

for a vertex, and

$$z_{e_{ij}^{(r)}}=e_{\tau(r)}+\psi(x_i,x_j,r)+p_i+p_j+\omega_r(p_j-p_i)+u_{ij}^{(r)}$$

for an edge. The vectors $p_i$ are endpoint identifiers. If they are orthonormal, then incidence is visible by inner product:

$$\langle 2p_k,p_i+p_j\rangle=2(\mathbf{1}_{k=i}+\mathbf{1}_{k=j}).$$

That little equation is a quiet but important design choice. It means the transformer does not need to infer from scratch which vertices are attached to which edges. The geometry tells it.
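The incidence identity can be checked numerically. The sketch below is only an illustration under simple assumptions: it uses standard basis vectors as the orthonormal identifiers $p_i$, keeps only the positional parts of the tokens ($2p_i$ for a vertex, $p_i+p_j$ for an edge), and drops the type, feature, and $\omega_r$ terms. The helper names are hypothetical.

```python
def basis(n, i):
    """Standard basis vector e_i in R^n, a trivially orthonormal endpoint id."""
    return [1.0 if k == i else 0.0 for k in range(n)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def vertex_code(n, i):
    # Positional part of a vertex token: 2 * p_i
    return [2.0 * x for x in basis(n, i)]

def edge_code(n, i, j):
    # Symmetric positional part of an edge token: p_i + p_j
    return [a + b for a, b in zip(basis(n, i), basis(n, j))]

n = 4
edge = edge_code(n, 1, 3)  # edge between vertices 1 and 3
incidence = [dot(vertex_code(n, k), edge) for k in range(n)]
print(incidence)  # [0.0, 2.0, 0.0, 2.0]
```

The inner product is 2 exactly at the two endpoints and 0 elsewhere, so a single dot product between a vertex token and an edge token reads off incidence.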

Inspirations

ToricBLM borrows from several families of models and methods:

  • UniGenX: unified scientific generation over proteins, molecules, materials, structure, and function.
  • NatureLM / scientific foundation models: cross-domain scientific tokenization and instruction mixtures.
  • TokenGT: pure transformer graph learning by tokenizing both vertices and edges.
  • GraphCG: disentangling steerable graph-generation factors.
  • GFlowNets: sampling diverse high-reward objects through trajectory-balance training.
  • Flow matching: fast continuous generation for coordinates, conformers, dynamics, and structures.
  • UMA: a universal model for atoms, useful as a force/energy oracle in late-stage calibration.
  • PITA: temperature-annealed force/energy-only sampling ideas that suggest how to use expensive atomistic oracles judiciously.
  • Tropical and toric geometry: hard support, polyhedral cells, fans, divisors, cones, and algebraic certificates.
  • Persistent homology, sheaves, and combinatorial commutative algebra: compact signatures of shape, overlap consistency, and constraint structure.

The result is not just “a protein model with RAG.” It is a small model trained to move through structured scientific design spaces.

2. Tropical Geometry in One Friendly Page

Tropical geometry begins with a strange but useful arithmetic. In the max-plus tropical semiring, addition is replaced by maximum and multiplication is replaced by ordinary addition:

$$a\oplus b=\max(a,b),\qquad a\odot b=a+b.$$

At first this looks like a mathematical parlor trick. Then one notices that many neural operations already behave this way at important moments.

When a model chooses the strongest retrieved document, the most likely contact edge, the best reasoning branch, the lowest edit cost, the nearest analog, or the dominant attention support, it is doing something tropical in spirit. It is making a hard support decision.

Softmax attention is diffuse:

$$\operatorname{softmax}(z)_i=\frac{\exp(z_i)}{\sum_j\exp(z_j)}.$$

Tropical attention is decisive:

$$y_i=\max_j\{s_{ij}+v_j\}.$$

Softmax says, “blend many pieces of evidence.” Tropical attention says, “tell me which piece of evidence is active.”

Both are useful. ToricBLM uses both.
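To make the contrast concrete, here is a minimal sketch for a single query with scalar values. It is not ToricBLM's actual head implementation; the function names and the toy numbers are assumptions chosen for illustration.

```python
import math

def softmax_attention(scores, values):
    """Diffuse read: convex combination of all values."""
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return sum(w / total * v for w, v in zip(weights, values))

def tropical_attention(scores, values):
    """Max-plus read: returns (max_j s_j + v_j, index of the active support)."""
    totals = [s + v for s, v in zip(scores, values)]
    j_star = max(range(len(totals)), key=totals.__getitem__)
    return totals[j_star], j_star

scores = [2.0, 0.5, 1.0]
values = [1.0, 3.0, -1.0]

y_soft = softmax_attention(scores, values)            # a blend of all three values
y_trop, support = tropical_attention(scores, values)  # (3.5, 1): candidate 1 is active
```

The softmax output cannot say which input mattered; the tropical output comes with an explicit support index that can be logged and audited.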

The tropical view becomes especially valuable when the model must explain why it believes something. If an enzyme design is scored highly because of a catalytic triad, a substrate pocket, a transition-state analog, and a retrieved homolog, we want to know which supports are active. If a minibinder looks good because of a particular interface patch, we want that patch to be auditable. If a molecule is optimized for affinity but quietly violates ADMET constraints, we want the hard contradiction to be visible.

Tropical heads provide margins. If the active candidate is $j^\star$, define:

$$\gamma=z_{j^\star}-\max_{j\neq j^\star}z_j.$$

If quantization, retrieval approximation, or numerical noise perturbs all scores by at most $\epsilon$, then the active support stays fixed whenever:

$$\gamma>2\epsilon.$$

This gives a practical engineering metric. A tropical margin is not merely pretty geometry. It tells us when a support decision is stable under compression, pruning, low precision, or approximate retrieval.
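The reasoning behind the bound is short: a perturbation of size at most $\epsilon$ can lower the winner by $\epsilon$ and raise the runner-up by $\epsilon$, so the gap shrinks by at most $2\epsilon$. A minimal sketch of the check, with hypothetical helper names and toy scores:

```python
def tropical_margin(scores):
    """Return (index of the active candidate, gap to the runner-up)."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    j_star, runner_up = ranked[0], ranked[1]
    return j_star, scores[j_star] - scores[runner_up]

def support_is_stable(scores, eps):
    """Sufficient condition: the argmax cannot flip under noise bounded by eps."""
    _, gamma = tropical_margin(scores)
    return gamma > 2 * eps

scores = [4.0, 3.1, 1.0]
j_star, gamma = tropical_margin(scores)  # candidate 0, gap of about 0.9

# Worst-case perturbation of size eps: push the winner down, the runner-up up.
eps = 0.4
worst_case = [scores[0] - eps, scores[1] + eps, scores[2] + eps]
still_active = max(range(3), key=worst_case.__getitem__) == j_star
```

With `eps = 0.4` the check passes and the worst-case perturbation leaves candidate 0 active; with `eps = 0.5` the check no longer certifies stability.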

3. Why Embed Tropical Geometry Into Toric Geometry?

Tropical geometry gives piecewise-linear structure: maxima, cones, fans, cells, hard supports, shortest paths, and polyhedral decompositions.

Toric geometry gives algebraic and combinatorial tools for working with that structure: toric varieties, divisors, Cox coordinates, one-dimensional cones, monomial maps, fans, initial ideals, semigroups, and compactifications.

The practical reason to embed tropical structure into toric structure is that tropical geometry tells us which support is active, while toric geometry gives us a richer language for organizing, comparing, transporting, and auditing those supports.

For a finite max-linear probe, suppose we have candidates:

$$\ell_r(u)=\langle a_r,u\rangle+b_r.$$

We can form a Laurent polynomial over a non-Archimedean field:

$$f=\sum_{r=1}^R \tau^{-b_r}\chi^{a_r}.$$

With the right sign convention, the active tropical candidate corresponds to an initial term of $f$. That gives a bridge:

  • neural max support,
  • tropical initial form,
  • toric/algebraic certificate.

ToricBLM uses finite toric sidecars on small audited windows:

  • mutation semigroups,
  • reaction-route semigroups,
  • residue-contact supports,
  • molecular scaffold edits,
  • base-pair motif complexes,
  • lattice-site moves,
  • reasoning-action traces.

The student model receives compact labels:

  • cone identifiers,
  • Cox-degree summaries,
  • active support ids,
  • forbidden nonfaces,
  • Hilbert-basis move ids,
  • Betti vectors,
  • rank-defect warnings,
  • local cohomology warnings.

This is the central compromise: exact geometry in the teacher stack, compressed geometry in the student.

4. Anthropic’s Assistant Axis: Useful, But Not Enough

Anthropic’s “Assistant Axis” work studies a direction in activation space corresponding to whether a model is operating in its default helpful assistant persona. Their paper, The Assistant Axis: Situating and Stabilizing the Default Persona of Language Models, argues that models have a leading persona direction that can help monitor and stabilize default assistant behavior.

This is important. It says there is structure in activation space. It also says model behavior can drift, and that white-box directions can help detect or constrain that drift.

But one axis is a very small instrument.

A single assistant direction might help answer questions like:

  • Is the model acting like its default assistant persona?
  • Is it drifting toward unusual roleplay?
  • Is it becoming less helpful, less harmless, or more theatrical?

Those are useful questions. They are not enough for scientific design.

ToricBLM needs to ask much harder questions:

  • Is this enzyme design preserving its catalytic geometry?
  • Is the apparent activity improvement actually coming from a stability artifact?
  • Did the binder improve affinity while losing specificity?
  • Is this antibody moving into an aggregation-prone alcove?
  • Is the model hallucinating an EC annotation because retrieval support is stale?
  • Did a low-precision cache flip the active evidence edge?
  • Is a reasoning branch improving because of a real analogy or because of superficial token overlap?
  • Which local mechanism changed when the design jumped from one solution family to another?

These are not one-axis questions. They are questions about cells, cones, supports, graphs, analogies, topological signatures, and trajectories.

That is why ToricBLM uses a richer geometry. The Assistant Axis is a useful coordinate. It is not a coordinate system.

5. Thought and Reasoning Alcoves

This is the heart of ToricBLM.

A thought alcove is a stable region in the model’s embedding space where the same scientific and reasoning supports remain active.

Informally, a thought alcove is a place where the model is “thinking in the same mode.” Inside one alcove, the same catalytic residue, the same retrieved assay, the same binder hotspot, the same base-pair register, the same material coordination shell, or the same reasoning analogy remains active. Crossing an alcove wall changes the mechanism.

This is different from saying the model has “a direction” for helpfulness or truthfulness. Directions are useful, but they are too poor for the combinatorial richness of scientific reasoning. A model designing an enzyme is not merely moving toward “helpful.” It is moving through a structured design landscape with discrete supports and continuous refinements.

Definition: Thought Alcove

Let $z\in\mathbb{R}^d$ be a pooled embedding state after GraphCG projection. Let $\mathcal{H}$ be a hyperplane arrangement induced by:

  • active tropical comparisons,
  • oracle thresholds,
  • GraphCG basis signs,
  • toric cone labels,
  • validation inequalities,
  • safety and provenance constraints.

Write:

$$\mathcal{H}=\{H_\ell:\ell=1,\dots,L\},\qquad H_\ell=\{z\in\mathbb{R}^d:\langle \alpha_\ell,z\rangle+\beta_\ell=0\}.$$

The thought alcoves are connected components of:

$$\mathbb{R}^d\setminus \bigcup_{\ell=1}^L H_\ell.$$

In ToricBLM, these cells are refined by a toric fan $\Sigma$, GraphCG basis chambers, and validator states.

A design thought alcove is a tuple:

$$A=(\operatorname{cell},\sigma_A,\rho_A,\eta_A,\chi_A).$$

Here:

  • $\sigma_A\in\Sigma$ is a toric cone,
  • $\rho_A$ is the vector of active tropical supports,
  • $\eta_A$ is the GraphCG basis sign pattern,
  • $\chi_A$ is the validator state.

The alcove is admissible only if chemistry, structure, safety, and provenance constraints are satisfied.
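For a plain hyperplane arrangement, two points lie in the same cell exactly when they have the same sign pattern against every hyperplane, so the cell part of an alcove can be computed as a sign vector. The sketch below covers only that part; the toric cone, GraphCG, and validator refinements are omitted, and the function name and toy arrangement are assumptions.

```python
def sign_pattern(z, hyperplanes):
    """Sign vector of z against hyperplanes [(alpha, beta), ...].

    Returns None when z lies exactly on a wall, since walls belong to no cell.
    """
    signs = []
    for alpha, beta in hyperplanes:
        value = sum(a * x for a, x in zip(alpha, z)) + beta
        if value == 0:
            return None
        signs.append(1 if value > 0 else -1)
    return tuple(signs)

# Toy arrangement in R^2: x = 0, y = 0, and x + y = 1.
arrangement = [((1.0, 0.0), 0.0), ((0.0, 1.0), 0.0), ((1.0, 1.0), -1.0)]

cell_a = sign_pattern((0.2, 0.3), arrangement)  # (1, 1, -1)
cell_b = sign_pattern((0.3, 0.1), arrangement)  # (1, 1, -1): same cell as cell_a
cell_c = sign_pattern((0.8, 0.9), arrangement)  # (1, 1, 1): across the x + y = 1 wall
```

Comparing sign vectors before and after an edit is a cheap way to detect a wall crossing.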

Why Alcoves Matter

Suppose ToricBLM is designing an enzyme. It has a candidate sequence, a predicted fold, a candidate active site, a substrate, a retrieved family, and a catalytic reward. The model may improve the design by changing residues around the pocket. Some mutations preserve the catalytic story. Others break it and create a different mechanism.

Inside one thought alcove, the model can do local optimization:

  • improve stability,
  • reduce aggregation,
  • tune pocket volume,
  • improve expression,
  • reduce off-target motifs,
  • preserve catalytic residues.

Crossing an alcove wall is different. It might switch:

  • which residue acts as the acid/base,
  • which loop controls specificity,
  • which homolog is the main analogy,
  • which ligand pose is active,
  • which interface patch anchors a minibinder,
  • which base-pair register defines an RNA motif.

This gives the model a language for a distinction expert scientists already make intuitively:

Are we improving the same idea, or did we change the idea?

Thought alcoves make that distinction trainable.

Theorem: Alcove-Safe Steering

Let $A$ be a thought alcove defined by signed inequalities:

$$s_\ell(\langle\alpha_\ell,z\rangle+\beta_\ell)>0.$$

Assume every inequality has margin at least $\gamma>0$ at $z$. If an update $\Delta z$ satisfies:

$$\max_\ell |\langle\alpha_\ell,\Delta z\rangle|\leq \gamma/2,$$

then $z+\Delta z$ remains in the same alcove.

Proof. Each signed inequality starts with value at least $\gamma$. The update changes each signed value by at most $\gamma/2$. Therefore every signed value remains positive. Since tropical support changes occur only on equality walls, the active supports remain unchanged. $\square$

This theorem is small but powerful. It lets us steer the model while preserving mechanism.
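The theorem translates directly into a runtime check. The sketch below assumes each wall is stored as a hypothetical triple `(sign, alpha, beta)`; it certifies an update when the sufficient condition holds and makes no claim when it fails.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def signed_values(z, walls):
    """Signed wall values s * (<alpha, z> + beta); all positive inside the alcove."""
    return [s * (dot(alpha, z) + beta) for s, alpha, beta in walls]

def is_alcove_safe(z, dz, walls):
    """Sufficient test: max_l |<alpha_l, dz>| <= gamma / 2."""
    gamma = min(signed_values(z, walls))
    if gamma <= 0:
        return False  # z is not strictly inside an alcove
    worst = max(abs(dot(alpha, dz)) for _, alpha, _ in walls)
    return worst <= gamma / 2

# Toy alcove in R^2: x > 0, y > 0, x + y < 4.
walls = [(1, (1.0, 0.0), 0.0), (1, (0.0, 1.0), 0.0), (-1, (1.0, 1.0), -4.0)]
z = (1.0, 1.0)            # signed values (1, 1, 2), so gamma = 1
small_step = (0.3, -0.2)  # worst wall change 0.3 <= 0.5: certified
large_step = (-0.8, 0.0)  # worst wall change 0.8 > 0.5: not certified

z_new = tuple(a + b for a, b in zip(z, small_step))
```

Note that `large_step` happens to stay inside this toy alcove even though the test rejects it. The condition is sufficient, not necessary, so a rejected step should be re-examined rather than assumed to cross a wall.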

If a latent edit stays inside the alcove, we can say:

  • the same evidence edge stayed active,
  • the same catalytic support stayed active,
  • the same analogy stayed active,
  • the same validator state stayed active.

If it crosses a wall, we know something discrete changed. That wall crossing can be allowed, but it should be logged, scored, and justified.

Forest-of-Thought Through Alcoves

ToricBLM does not use only one chain of reasoning. It uses a forest:

$$\mathfrak{F}=(\mathcal{T}_1,\ldots,\mathcal{T}_B).$$

Each tree can specialize:

  • catalysis,
  • affinity,
  • specificity,
  • foldability,
  • dynamics,
  • developability,
  • safety,
  • evidence.

A node in the forest is:

$$q_{b,t}=(h_{b,t},G_{b,t},M_{b,t},C_{b,t},a_{b,t},\xi_{b,t}).$$

That means the reasoning state includes an embedding, a graph, memory, certificate state, the latest action, and a role.

The GFlowNet reward for a forest trajectory $\Tau$ can be written:

$$
\begin{aligned}
\log R_F(\Tau)=& \lambda_{\text{cat}}\Delta\log(k_{\text{cat}}/K_M)
+\lambda_{\text{bind}}(-\Delta G_{\text{bind}})
+\lambda_{\text{spec}}S_{\text{specificity}}\\
&+\lambda_{\text{fold}}S_{\text{fold}}
+\lambda_{\text{dyn}}S_{\text{ensemble}}
+\lambda_{\text{dev}}S_{\text{developability}}\\
&+\lambda_{\text{nov}}S_{\text{novelty}}
+\lambda_{\text{val}}S_{\text{validator}}
-\lambda_{\text{risk}}S_{\text{risk}}
-\lambda_{\text{cost}}C(\Tau).
\end{aligned}
$$

The model is not merely rewarded for sounding plausible. It is rewarded for producing designs that survive structure, dynamics, affinity, specificity, novelty, and safety checks.

Theorem: Forest Trajectory Balance

Assume every activated tree and merge edge belongs to an acyclic augmented state graph. If forward and backward policies satisfy trajectory balance:

$$\log Z_\theta+\sum_{e\in\Tau}\log P_F(e)-\sum_{e\in\Tau}\log P_B(e)-\log R_F(\Tau)=0,$$

then the terminal probability of a design $y$ is proportional to the backward-weighted reward of the forest trajectories ending at $y$:

$$P(y)=\frac{\bar R(y)}{\sum_{y'}\bar R(y')},\qquad \bar R(y)=\sum_{\Tau:\operatorname{term}(\Tau)=y}P_B(\Tau\mid y)\,R_F(\Tau).$$

In particular, when the reward depends only on the terminal design, $R_F(\Tau)=R(y)$, this reduces to $P(y)\propto R(y)$.

Proof. Trajectory balance gives:

$$Z_\theta P_F(\Tau)=R_F(\Tau)P_B(\Tau\mid y).$$

Sum over all complete forest trajectories ending at $y$. The left side sums to $Z_\theta P(y)$. The right side is $\bar R(y)$, which is a weighted average of trajectory rewards because the backward probabilities over histories ending at $y$ sum to one. Therefore:

$$Z_\theta P(y)=\bar R(y).$$

Normalizing over terminal designs gives the result. $\square$
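Trajectory balance is easiest to trust after checking it on a graph small enough to enumerate. The sketch below uses a hypothetical three-trajectory DAG and assumes the special case where the reward depends only on the terminal design, in which trajectory balance implies $P(y)\propto R(y)$. The forward policy is derived from flows rather than learned, and the backward policy is taken to be uniform over parents.

```python
import math

# Hypothetical toy DAG: s0 -> a -> y1, s0 -> b -> y1, s0 -> b -> y2.
trajectories = [("s0", "a", "y1"), ("s0", "b", "y1"), ("s0", "b", "y2")]
reward = {"y1": 3.0, "y2": 1.0}  # terminal-only reward (assumption)
parents = {"a": ["s0"], "b": ["s0"], "y1": ["a", "b"], "y2": ["b"]}

def backward_prob(traj):
    """P_B(trajectory | terminal) under a uniform backward policy."""
    p = 1.0
    for child in traj[1:]:
        p *= 1.0 / len(parents[child])
    return p

# Trajectory flow F(T) = R(y) * P_B(T | y); the partition function is the total flow.
flow = {t: reward[t[-1]] * backward_prob(t) for t in trajectories}
Z = sum(flow.values())

def edge_flow(u, v):
    return sum(f for t, f in flow.items() if (u, v) in zip(t, t[1:]))

def state_flow(u):
    return sum(f for t, f in flow.items() if u in t[:-1])

def forward_prob(traj):
    """Forward policy induced by the flows: P_F(v | u) = F(u -> v) / F(u)."""
    p = 1.0
    for u, v in zip(traj, traj[1:]):
        p *= edge_flow(u, v) / state_flow(u)
    return p

# Trajectory-balance residual for every trajectory (should vanish).
residuals = [
    math.log(Z) + math.log(forward_prob(t))
    - math.log(backward_prob(t)) - math.log(reward[t[-1]])
    for t in trajectories
]

# Terminal distribution obtained by summing forward probabilities.
terminal = {}
for t in trajectories:
    terminal[t[-1]] = terminal.get(t[-1], 0.0) + forward_prob(t)
```

Here `terminal` comes out as 3/4 for `y1` and 1/4 for `y2`, matching the reward ratio 3:1 even though `y1` is reachable by two trajectories.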

GraphCG and Steering

GraphCG-style training gives ToricBLM a disentangled basis:

$$D=\begin{bmatrix} d_1\\ \vdots\\ d_k \end{bmatrix}.$$

Each direction can correspond to a factor:

  • stability,
  • solubility,
  • catalytic geometry,
  • pocket volume,
  • affinity,
  • specificity,
  • expression,
  • aggregation,
  • immunogenicity,
  • RNA secondary structure,
  • molecular polarity,
  • material stability.

The basic losses are:

$$\mathcal{L}_{\text{ortho}}=\|DD^\top-I\|_F^2,$$

$$\mathcal{L}_{\text{sparse}}=\|D\|_1,$$

$$\mathcal{L}_{\text{rank}}=\frac{1}{r}\sum_{i=1}^r[\eta-\sigma_i(D)]_+^2.$$

Inside an alcove, steering can be:

$$z'=\Pi_A^\psi(z+\eta D^\top w).$$

Here $w$ encodes the desired property shift and $\Pi_A^\psi$ projects the update back into the alcove.
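A rough sketch of the orthogonality and sparsity losses and of one steering step follows. Two simplifications are assumptions of this sketch, not part of the method as described: the rank loss is omitted because it needs singular values, and the projection $\Pi_A^\psi$ is replaced by simple step halving until a user-supplied `inside` predicate accepts the candidate.

```python
def gram(D):
    """Gram matrix D D^T for a row-major k x d basis."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in D] for ri in D]

def ortho_loss(D):
    """Squared Frobenius norm of D D^T - I."""
    G = gram(D)
    k = len(D)
    return sum((G[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(k) for j in range(k))

def sparse_loss(D):
    """Entrywise L1 norm of D."""
    return sum(abs(x) for row in D for x in row)

def steer(z, D, w, eta, inside, max_halvings=20):
    """Try z + t * eta * D^T w, halving t until inside(candidate) holds.

    Step halving is a stand-in for the projection back into the alcove.
    """
    step = [eta * sum(w[i] * D[i][c] for i in range(len(D))) for c in range(len(z))]
    t = 1.0
    for _ in range(max_halvings):
        candidate = [zi + t * si for zi, si in zip(z, step)]
        if inside(candidate):
            return candidate
        t /= 2
    return list(z)  # no admissible step found: stay put

D = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # two orthonormal directions in R^3
z = [0.5, 0.5, 0.0]
in_unit_box = lambda p: 0 < p[0] < 1 and 0 < p[1] < 1  # toy alcove

z_new = steer(z, D, w=[1.0, 0.0], eta=1.0, inside=in_unit_box)  # [0.75, 0.5, 0.0]
```

Step halving keeps the direction of the update and only shortens it, which is a conservative choice when the goal is to preserve the active supports.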

This is where ToricBLM differs sharply from simple activation steering. We are not saying “push the model in the assistant direction.” We are saying:

  • choose a cell,
  • know its active supports,
  • choose disentangled directions,
  • steer within margins,
  • log wall crossings,
  • use validators,
  • distill successful trajectories.

That is a much more serious control theory for neural behavior.

A Friendly but Firm Complaint

The current interpretability discourse is still too fond of one-dimensional stories.

One axis for refusal. One vector for truthfulness. One direction for helpfulness. One persona coordinate. These are not useless. They are useful first instruments. But if we are going to train models that reason about medicine, enzymes, molecular interactions, dynamics, assays, and design under uncertainty, one-dimensional steering is not enough.

The field needs more mathematicians in the room.

Not just people borrowing the word “manifold” as a vibe. Actual tropical geometers. Actual toric geometers. People who know what a fan is, what a divisor is, what a Cox ring is, what a Schubert problem feels like, why Young tableaux are combinatorial machines rather than antique notation, and how intersection-theoretic thinking can become a training signal. People who can look at a model’s branching reasoning and ask: what are the cells, what are the walls, what are the multiplicities, what are the degenerations, what are the invariants?

This is not academic ornamentation. It is engineering.

If a model crosses from one reasoning alcove into another, that is a discrete event. If two design paths are equivalent under a semigroup relation, that should be a training constraint. If a family of supports degenerates, that should be measured. If a retrieval graph has persistent holes, that is not just a retrieval bug; it is a topological signal. If a set of constraints has syzygies, those relations should not be thrown away because they look too algebraic for a leaderboard.

Modern ML does not need less abstraction. It needs better abstraction, disciplined enough to compile into losses, features, metrics, audits, and cheap student-side surrogates.

ToricBLM is one attempt at that.

6. Analogical Memory Beyond RAG

The second major idea in ToricBLM is analogical memory.

RAG was a necessary stage in model development. It taught us that models should not rely only on weights. They should retrieve. They should cite. They should ground. But ordinary RAG is a blunt instrument:

  • retrieve chunks,
  • stuff tokens into context,
  • hope the model reads them correctly.

That works for many tasks. It is not enough for deep scientific reasoning.

Scientific analogy is not just “find a nearby paragraph.” It is more like:

  • find a protein family with similar active-site geometry,
  • find a previous mutation path with the same failure mode,
  • find a molecule scaffold whose ADMET tradeoff resembles this one,
  • find an antibody CDR loop with analogous interface topology,
  • find a material coordination pattern with a matching local lattice environment,
  • find a reasoning trajectory whose graph has the same shape.

So ToricBLM memory records are structured packets:

$$M=(e,\Gamma,\beta,\rho,\partial,\mu,\pi).$$

Here:

  • $e$ is an embedding,
  • $\Gamma$ is a graph signature,
  • $\beta$ is topological data, such as persistent homology,
  • $\rho$ is provenance,
  • $\partial$ is a chain or differential summary,
  • $\mu$ is multiplicative or cochain data,
  • $\pi$ is policy and licensing metadata.

Retrieval becomes:

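$$S(q,M)= \alpha\cos(e_q,e_M) +\beta\cos(\Gamma_q,\Gamma_M) +\gamma\cos(\beta_q,\beta_M) +\delta S_{\text{prov}} -\lambda S_{\text{stale}}.$$

The leading coefficients $\alpha,\beta,\gamma,\delta,\lambda$ are scalar weights. The weight $\beta$ should not be confused with the topological summaries $\beta_q$ and $\beta_M$ inside the third cosine.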

This is a different philosophy from dumping markdown files into a vector database. The model stores high-value reasoning trajectories, graph signatures, topological summaries, and provenance-linked analogies.
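A composite score of this kind might be sketched as follows. The dictionary keys, the weights, and the convention that provenance and staleness are already scalar scores in $[0,1]$ are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def retrieval_score(query, record, weights):
    """Weighted sum of embedding, graph, and topology similarity,
    plus a provenance bonus and minus a staleness penalty."""
    return (weights["embed"] * cosine(query["e"], record["e"])
            + weights["graph"] * cosine(query["gamma"], record["gamma"])
            + weights["topo"] * cosine(query["beta"], record["beta"])
            + weights["prov"] * record["prov"]
            - weights["stale"] * record["stale"])

weights = {"embed": 1.0, "graph": 1.0, "topo": 1.0, "prov": 0.5, "stale": 0.5}
query = {"e": [1.0, 0.0], "gamma": [1.0, 1.0, 0.0], "beta": [2.0, 1.0]}

# Record that matches in structure and topology, with good provenance.
fresh = {"e": [0.9, 0.1], "gamma": [1.0, 1.0, 0.0], "beta": [2.0, 1.0],
         "prov": 1.0, "stale": 0.0}
# Record with a perfect embedding match but mismatched structure, and stale.
stale = {"e": [1.0, 0.0], "gamma": [0.0, 0.0, 1.0], "beta": [0.0, 3.0],
         "prov": 0.2, "stale": 1.0}

ranked = sorted([("fresh", fresh), ("stale", stale)],
                key=lambda item: retrieval_score(query, item[1], weights),
                reverse=True)
```

In this toy case the structurally matching record ranks first even though the stale record has the closer embedding, which is the behavior a pure vector search would miss.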

Successful Trajectories as Optimization Templates

Analogical memory becomes especially powerful during iterative de novo design.

Suppose ToricBLM is trying to improve a new candidate $x_{\text{new}}$ for a collection of properties:

  • higher catalytic efficiency,
  • better binding affinity,
  • improved specificity,
  • lower aggregation,
  • better solubility,
  • lower strain energy,
  • improved thermal stability,
  • safer off-target profile.

The model may not need to solve the optimization from scratch. It can retrieve a previous Forest-of-Thought trajectory that successfully optimized a different candidate $x_{\text{old}}$ for a similar property bundle.

That retrieved trajectory is not copied blindly. It is used as a template, initial bias, or search prior.

Let a successful stored trajectory be:

$$\tau^\star=(q_0,a_0,q_1,a_1,\ldots,q_T),$$

where each $q_t$ is a reasoning/design state and each $a_t$ is an action such as mutate, retrieve, fold, dock, relax, verify, reject, or stop. The trajectory also stores:

  • which thought alcoves were visited,
  • which walls were crossed,
  • which GraphCG directions were useful,
  • which validators improved,
  • which branches failed,
  • which evidence sources mattered,
  • which final design properties improved.

For a new candidate, ToricBLM retrieves $\tau^\star$ when a similarity score is high:

$$\operatorname{sim}(x_{\text{new}},\tau^\star) = \alpha S_{\text{embed}} +\beta S_{\text{graph}} +\gamma S_{\text{alcove}} +\delta S_{\text{property}} +\eta S_{\text{validator}}.$$

The retrieved trajectory then biases the new GFlowNet policy:

$$\log P_F(a\mid q,x_{\text{new}}) = \log P_F^{0}(a\mid q,x_{\text{new}}) +\lambda_{\tau}\,B_{\tau^\star}(a,q).$$

Here $P_F^{0}$ is the model’s base forward policy, while $B_{\tau^\star}$ is a trajectory-template bias. It can encourage actions that were useful in the old optimization, discourage actions that failed, and suggest an order of operations.

For example:

  • if an old enzyme optimization succeeded by first repairing a pocket wall and only then tuning active-site electrostatics, the new enzyme search can try that order earlier;
  • if a minibinder optimization succeeded by rejecting high-affinity but low-specificity branches, the new binder search can inherit that skepticism;
  • if an antibody optimization improved developability by staying inside one CDR-shape alcove while steering solubility, the new antibody search can reuse that pattern;
  • if a molecule optimization learned that increasing affinity and lowering logP conflicted in one scaffold family, the new molecule search can test whether the same conflict appears in the new scaffold.

This is a large improvement over ordinary RAG. The model is not retrieving an answer. It is retrieving an optimization strategy.

Non-Orthogonal Property Bundles

The desired property list may or may not be orthogonal in GraphCG space.

If $D$ is the GraphCG basis and $w$ is the desired property vector, a steering update might look like:

$$\Delta z=D^\top w.$$

If the relevant directions are orthogonal, life is pleasant:

$$DD^\top\approx I.$$

But biology is rarely that courteous. Catalytic activity may trade off with stability. Affinity may trade off with specificity. Solubility may trade off with membrane permeability. Lower energy may trade off with novelty. A desired property bundle can contain entangled directions:

$$G_D=DD^\top\neq I.$$

ToricBLM should therefore retrieve not only trajectories that optimized one property, but trajectories that navigated similar property conflicts.
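A two-direction example shows what entanglement costs. The sketch assumes, purely for illustration, that each property is read out linearly as $\langle d_i,\Delta z\rangle$. Under that assumption a naive update $D^\top w$ realizes the property change $G_D w$ rather than $w$, and solving $G_D c=w$ gives coefficients that compensate. The basis values are made up.

```python
def gram(D):
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in D] for ri in D]

def steer(D, c):
    """Latent update dz = D^T c for a row-major k x d basis D."""
    return [sum(c[i] * D[i][col] for i in range(len(D))) for col in range(len(D[0]))]

def readout(D, dz):
    """Assumed linear property readout: one inner product per basis direction."""
    return [sum(a * b for a, b in zip(row, dz)) for row in D]

def solve_2x2(G, w):
    """Cramer's rule for G c = w with a 2 x 2 matrix G."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(G[1][1] * w[0] - G[0][1] * w[1]) / det,
            (G[0][0] * w[1] - G[1][0] * w[0]) / det]

D = [[1.0, 0.0], [0.6, 0.8]]  # unit-norm directions with overlap 0.6
w = [1.0, 0.0]                # want property 1 to move, property 2 to stay

naive = readout(D, steer(D, w))      # about [1.0, 0.6]: property 2 moves too
c = solve_2x2(gram(D), w)            # compensated coefficients
corrected = readout(D, steer(D, c))  # about [1.0, 0.0]
```

The size of the off-diagonal entries of $G_D$ is a quick diagnostic for how much side effect a single-property edit will carry.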

The transferable object is not merely a direction $d_k$. It is a local policy:

$$\pi_{\tau^\star}: (\text{alcove},\text{property conflict},\text{validator state}) \longrightarrow \text{next useful action}.$$

This matters. If the old trajectory discovered that improving $p_1$ damages $p_2$ unless the model first crosses a particular alcove wall, that is precious information. The new candidate may live in a different sequence, scaffold, fold, or modality, but the shape of the optimization problem can still transfer.

A Template Is Not a Command

The retrieved Forest-of-Thought (FoT) trajectories should bias the search, not dominate it.

ToricBLM uses the memory trajectory as a prior, then lets validators and GFlowNet rewards decide whether it actually transfers. The policy can be written:

$$P_F^{\text{adapted}}(a\mid q) \propto P_F^{0}(a\mid q) \exp(\lambda_{\tau}B_{\tau^\star}(a,q)) \exp(\lambda_R R_{\text{local}}(a,q)).$$

If the template helps, it will generate better branches and receive higher reward. If it misleads, oracle scores, support flips, failed validators, and low local reward will suppress it.
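In log space the adapted policy is a softmax over the base log-probabilities plus two additive terms. The sketch below assumes a small discrete action set; the action names, bias values, and local rewards are invented for illustration.

```python
import math

def adapted_policy(base_probs, template_bias, local_reward, lam_tau, lam_r):
    """Normalize P0(a) * exp(lam_tau * B(a)) * exp(lam_r * R_local(a)) over actions."""
    logits = {
        a: math.log(p)
        + lam_tau * template_bias.get(a, 0.0)
        + lam_r * local_reward.get(a, 0.0)
        for a, p in base_probs.items()
    }
    m = max(logits.values())  # stabilize the exponentials
    unnorm = {a: math.exp(v - m) for a, v in logits.items()}
    total = sum(unnorm.values())
    return {a: v / total for a, v in unnorm.items()}

base = {"repair_pocket": 0.25, "tune_electrostatics": 0.25, "dock": 0.25, "stop": 0.25}
template = {"repair_pocket": 1.0, "stop": -1.0}  # the old trajectory repaired the pocket first

# Template alone: the suggested action gains probability mass.
helped = adapted_policy(base, template, {}, lam_tau=1.0, lam_r=1.0)

# Validators disagree: a strongly negative local reward overrides the template.
misled = adapted_policy(base, template, {"repair_pocket": -3.0}, lam_tau=1.0, lam_r=1.0)
```

Because the template enters only as an additive bias, setting `lam_tau` to zero recovers the base policy exactly, which makes the memory contribution easy to ablate.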

That is the right epistemic attitude for scientific memory:

  • analogies are useful,
  • analogies are dangerous,
  • good models should exploit them,
  • good models should also distrust them.

This gives ToricBLM a natural optimization loop:

  1. propose a de novo candidate,
  2. retrieve successful FoT trajectories with similar property geometry,
  3. adapt their action patterns as search priors,
  4. steer inside thought alcoves when mechanisms should be preserved,
  5. cross walls deliberately when a mechanism change is justified,
  6. score with structure, dynamics, energy, evidence, and safety validators,
  7. store the new successful or failed trajectory back into memory.

Over time, the memory becomes less like a document store and more like a library of design moves.

From Knowledge Graphs to Graphs of Vectors

Knowledge graphs are useful, but many are still trapped in token-space or symbolic-space thinking. They encode entities and relations, but they often do not encode the geometry of reasoning.

ToricBLM suggests a different object: a graph-of-vectors database.

In a graph-of-vectors database:

  • nodes have embeddings,
  • edges have embeddings,
  • subgraphs have embeddings,
  • trajectories have embeddings,
  • persistence summaries are attached,
  • toric cone codes are attached,
  • provenance and license metadata are first-class,
  • retrieval can happen at the level of objects, paths, motifs, or reasoning episodes.
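As a data-structure sketch only, a graph-of-vectors store could start from something like the following. The class and field names are hypothetical, only node and edge embeddings are modeled, and a real system would add subgraph and trajectory embeddings, persistence summaries, cone codes, and an approximate nearest-neighbor index.

```python
import math
from dataclasses import dataclass, field

def cosine(a, b):
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 0.0 if na == 0 or nb == 0 else sum(x * y for x, y in zip(a, b)) / (na * nb)

@dataclass
class Edge:
    source: str
    target: str
    relation: str
    embedding: list
    provenance: str = "unknown"  # provenance travels with the edge

@dataclass
class GraphOfVectors:
    nodes: dict = field(default_factory=dict)  # node id -> embedding
    edges: list = field(default_factory=list)

    def add_node(self, node_id, embedding):
        self.nodes[node_id] = embedding

    def add_edge(self, edge):
        self.edges.append(edge)

    def nearest_edges(self, query, k=1):
        """Retrieve relations, not just entities, by edge-embedding similarity."""
        ranked = sorted(self.edges, key=lambda e: cosine(query, e.embedding), reverse=True)
        return ranked[:k]

store = GraphOfVectors()
store.add_node("enzyme_A", [1.0, 0.0, 0.0])
store.add_node("reaction_B", [0.0, 1.0, 0.0])
store.add_node("assay_7", [0.0, 0.0, 1.0])
store.add_edge(Edge("enzyme_A", "reaction_B", "catalyzes", [1.0, 0.0, 0.0], "curated"))
store.add_edge(Edge("enzyme_A", "assay_7", "supported_by", [0.0, 1.0, 0.0], "DMS"))

hit = store.nearest_edges([0.9, 0.1, 0.0], k=1)[0]  # the "catalyzes" edge
```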

A knowledge graph says:

Enzyme A catalyzes reaction B.

A graph-of-vectors memory can say:

This enzyme-reaction-support trajectory has a catalytic geometry analogous to that earlier family, but its pocket-volume alcove differs, and its DMS evidence suggests the stabilizing mutation should come from a different branch.

That is a much richer retrieval object.
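As a rough sketch of what one record in such a database could look like, the snippet below defines a memory packet and a retrieval function that mixes embedding similarity with a topological distance. The field names, the scoring rule, and the weight `w_topo` are all assumptions made for illustration; the article does not specify a schema or a ranking function.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class MemoryPacket:
    key: str
    kind: str                  # "node", "edge", "subgraph", or "trajectory"
    embedding: np.ndarray      # dense vector for the object
    persistence: np.ndarray    # persistence summary vector
    cone_code: tuple           # toric cone code, treated as opaque here
    provenance: dict = field(default_factory=dict)  # source, license, and so on

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def retrieve(store, query_emb, query_pers, kind=None, w_topo=0.5, k=2):
    """Rank packets by embedding similarity minus a penalty on topological distance."""
    scored = []
    for p in store:
        if kind is not None and p.kind != kind:
            continue  # retrieval can be restricted to one level of object
        topo_dist = float(np.linalg.norm(p.persistence - query_pers))
        scored.append((cosine(p.embedding, query_emb) - w_topo * topo_dist, p.key))
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]

# Hypothetical store with two trajectories and one node.
store = [
    MemoryPacket("traj_a", "trajectory", np.array([1.0, 0.0]), np.array([0.2, 0.1]), (0, 1)),
    MemoryPacket("traj_b", "trajectory", np.array([0.9, 0.1]), np.array([0.9, 0.8]), (1, 1)),
    MemoryPacket("node_c", "node", np.array([1.0, 0.0]), np.array([0.2, 0.1]), (0, 1)),
]
hits = retrieve(store, np.array([1.0, 0.0]), np.array([0.2, 0.1]), kind="trajectory")
print(hits)
```

Here `traj_b` is nearly as close as `traj_a` in embedding space but ranks lower because its persistence summary differs, which is the kind of distinction a purely vector-based lookup would miss.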

Category Theory View

Analogical reasoning can be described categorically without becoming mystical.

Let $\mathcal{C}$ be a category of scientific objects: proteins, molecules, RNA motifs, materials, assays, or reasoning states. A morphism $f:X\to Y$ is a structure-preserving transformation: mutation, alignment, scaffold edit, fold mapping, reaction mapping, analogy, or evidence transfer.

An analogy is not just a nearest neighbor. It is a partially structure-preserving map:

$$F:\mathcal{C}_{\text{source}}\to\mathcal{C}_{\text{target}}.$$

Good analogies preserve what matters:

  • contact topology,
  • catalytic roles,
  • binding geometry,
  • property tradeoffs,
  • evidence structure,
  • uncertainty,
  • failure modes.

Bad analogies preserve only superficial text.

Persistent homology adds shape summaries. Sheaves add local-to-global consistency. Toric sidecars add combinatorial certificates. Semigroup relations add equivalence between edit paths. Together, these structures help the model distinguish deep analogy from mere resemblance.

Persistent Homology in Memory

Persistent homology summarizes how connected components, loops, and higher-dimensional holes appear and disappear across a filtration.

For a protein contact map, a filtration might be built from distance thresholds. For a molecule, it might use bond distance, conformer energy, or pharmacophore proximity. For a reasoning graph, it might use confidence, relevance, or time.

The barcode is not the full object. It is a compressed shape signature.

That compression is the point.
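To illustrate the simplest case, the sketch below computes the degree-0 barcode (connected components only) of a distance-threshold filtration on a point cloud, using union-find, and then compresses it into a fixed-length vector. This is a generic textbook construction, not ToricBLM's pipeline; the function names and the choice to keep the top-k lifetimes are assumptions. Loops and higher-dimensional holes need a dedicated library and are left out.

```python
import numpy as np

def h0_barcode(points):
    """Degree-0 barcode of a distance-threshold (Vietoris-Rips) filtration.

    Every point is born at threshold 0. A component dies when an edge first
    merges it into another component. One component never dies.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing length.
    edges = sorted(
        (float(np.linalg.norm(pts[i] - pts[j])), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            deaths.append(dist)
    return [(0.0, d) for d in deaths] + [(0.0, float("inf"))]

def persistence_vector(barcode, k=3):
    """Keep the k longest finite lifetimes, zero-padded, as a compact signature."""
    lifetimes = sorted((d - b for b, d in barcode if np.isfinite(d)), reverse=True)
    lifetimes = lifetimes[:k] + [0.0] * max(0, k - len(lifetimes))
    return np.array(lifetimes)

# Two well-separated pairs of points on a line.
cloud = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
barcode = h0_barcode(cloud)
signature = persistence_vector(barcode)
print(barcode, signature)
```

The long bar that dies at distance 9 records that the cloud has two clusters, while the two short bars record the tight pairs. Four points become a three-number signature.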

ToricBLM does not store every detail of every previous trajectory. It stores the parts likely to matter later:

  • embeddings,
  • graph signatures,
  • persistence vectors,
  • cone codes,
  • support margins,
  • validator states,
  • provenance.

Storing these summaries instead of raw trajectories keeps the memory compact and retrieval cheap. It also gives the model mathematical priors: shape, relation, support, and trajectory matter.

Why This Beats Ordinary RAG

Ordinary RAG is good when the answer is in a document. ToricBLM-style analogical memory is better when the answer requires remembering a pattern.

Compare:

  1. Token RAG

    • efficient for factual lookup,
    • weak at structural analogy,
    • often bloats context,
    • depends heavily on chunk quality.
  2. Knowledge graphs

    • good for explicit relations,
    • brittle when relations are incomplete,
    • often symbolic but not geometric,
    • weak at continuous similarity and trajectory memory.
  3. Long-context markdown piles

    • easy to build,
    • expensive to read,
    • hard to audit,
    • poor compression of reasoning structure.
  4. ToricBLM analogical memory

    • stores high-value trajectories,
    • retrieves by vector, graph, topology, and provenance,
    • uses mathematical priors,
    • supports branch scoring and GFlowNet rewards,
    • compresses evidence into structured memory packets.

The point is not that RAG is obsolete everywhere. The point is that scientific design needs more than retrieval. It needs structured remembrance.

7. Flow Matching, UMA, and PITA-Inspired Calibration

ToricBLM uses flow matching for structures and dynamics. A conditional path can be:

$$x_t=(1-t)x_0+tx_1.$$

The target velocity is:

$$v^\star(x_t,t,c)=x_1-x_0.$$

The model trains:

$$\mathcal{L}_{\text{FM}} = \mathbb{E}_{t,x_0,x_1} \left\|v_\theta(x_t,t,c)-v^\star(x_t,t,c)\right\|_2^2.$$

This is useful for:

  • protein conformers,
  • antibody loops,
  • minibinder interfaces,
  • molecule conformers,
  • RNA folds,
  • crystal structures,
  • local pocket dynamics.
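The linear path, target velocity, and squared-error loss from this section can be sketched in a few lines of NumPy. This is a toy under stated assumptions: the conditioning context c is omitted, the "data" is just a fixed shift of the noise, and the two "models" are constant predictions rather than a trained network. The helper names `fm_batch` and `fm_loss` are hypothetical.

```python
import numpy as np

def fm_batch(x0, x1, t):
    """Build x_t = (1 - t) x0 + t x1 and the target velocity x1 - x0."""
    t = t[:, None]                      # broadcast one time per sample over coordinates
    x_t = (1.0 - t) * x0 + t * x1
    v_star = x1 - x0
    return x_t, v_star

def fm_loss(v_pred, v_star):
    """Mean over the batch of the squared L2 error between velocities."""
    return float(np.mean(np.sum((v_pred - v_star) ** 2, axis=-1)))

rng = np.random.default_rng(0)
shift = np.array([1.0, -2.0, 0.5])
x0 = rng.normal(size=(256, 3))          # source samples (noise)
x1 = x0 + shift                         # toy targets: every point moves by the same shift
t = rng.uniform(size=256)               # one time in [0, 1) per sample

x_t, v_star = fm_batch(x0, x1, t)

# A "model" that predicts zero velocity pays the full squared norm of the shift.
loss_zero = fm_loss(np.zeros_like(v_star), v_star)
# A "model" that predicts the true constant velocity has essentially zero loss.
loss_exact = fm_loss(np.broadcast_to(shift, v_star.shape), v_star)
print(loss_zero, loss_exact)
```

In a real run the constant predictions would be replaced by a network evaluated at `x_t`, `t`, and the context, and the same loss would be minimized by gradient descent.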

UMA enters later. It is too expensive to call inside every training step, but very useful as an oracle for forces and energies. A candidate state $x$ under context $c$ can be scored by:

$$E_U(x,c),\qquad F_U(x,c)=-\nabla_x E_U(x,c).$$
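The relation between energy and force can be checked numerically on any differentiable energy. The sketch below uses a harmonic well as a hypothetical stand-in for the oracle energy; it is not a UMA call and does not reflect UMA's interface. Forces are estimated by central finite differences and compared with the analytic answer.

```python
import numpy as np

def toy_energy(x, k=2.0):
    """Hypothetical stand-in for the oracle energy: E(x) = 0.5 * k * |x|^2."""
    return 0.5 * k * float(np.sum(np.asarray(x, dtype=float) ** 2))

def numerical_force(energy_fn, x, eps=1e-5):
    """Estimate F = -grad_x E with central finite differences."""
    x = np.asarray(x, dtype=float)
    force = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step.flat[i] = eps
        force.flat[i] = -(energy_fn(x + step) - energy_fn(x - step)) / (2 * eps)
    return force

x = np.array([1.0, -0.5, 0.25])
f_num = numerical_force(toy_energy, x)
f_exact = -2.0 * x                      # analytic force for the harmonic well, -k * x
print(f_num, f_exact)
```

A distilled force head can be sanity-checked the same way: its output should agree with the negative gradient of the distilled energy head.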

Inspired by PITA, we can use a temperature ladder, written in terms of inverse temperatures $\beta_k$:

$$0<\beta_0<\beta_1<\cdots<\beta_K.$$

Define an annealed target:

$$\pi_\beta(x\mid c) \propto \exp\{-\beta E_U(x,c)+\beta\lambda U_\theta(x,c)-\beta C(x,c)\}.$$

At high temperature (small $\beta$), the model explores. At lower temperature (large $\beta$), it concentrates on low-energy, high-utility candidates. SMC-style reweighting and cached UMA calls let us spend expensive oracle compute only where it matters.
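A small sketch of the reweighting step, assuming the candidates start as equally weighted samples at the first rung of the ladder. The function names, the three toy candidates, and the specific ladder values are illustrative assumptions; this shows generic sequential importance reweighting against the annealed target, not the PITA algorithm itself. Resampling, which an SMC scheme would trigger when the effective sample size gets small, is omitted.

```python
import numpy as np

def annealed_log_target(beta, energy, utility, cost, lam=1.0):
    """Unnormalized log pi_beta = -beta * E + beta * lam * U - beta * C."""
    return -beta * energy + beta * lam * utility - beta * cost

def smc_reweight(log_w, beta_prev, beta_next, energy, utility, cost, lam=1.0):
    """Update normalized log-weights when moving from beta_prev to beta_next."""
    log_w = log_w + (annealed_log_target(beta_next, energy, utility, cost, lam)
                     - annealed_log_target(beta_prev, energy, utility, cost, lam))
    m = log_w.max()
    log_w = log_w - (m + np.log(np.sum(np.exp(log_w - m))))  # log-sum-exp normalization
    w = np.exp(log_w)
    ess = 1.0 / np.sum(w ** 2)          # effective sample size
    return log_w, w, ess

# Three hypothetical candidates with cached oracle energies (arbitrary units).
energy = np.array([0.0, 1.0, 4.0])
utility = np.zeros(3)
cost = np.zeros(3)
betas = [0.1, 1.0, 5.0]                 # increasing beta means decreasing temperature

log_w = np.full(3, -np.log(3.0))        # start with equal weights at betas[0]
ess_history = []
for b_prev, b_next in zip(betas[:-1], betas[1:]):
    log_w, w, ess = smc_reweight(log_w, b_prev, b_next, energy, utility, cost)
    ess_history.append(ess)

print(w.round(4), [round(e, 3) for e in ess_history])
```

Because the energies are computed once and reused at every rung, the oracle is called only once per candidate. The shrinking effective sample size shows the weight concentrating on the lowest-energy candidate as the ladder cools.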

This gives a late-stage training phase:

  • sample high-value Forest-of-Thought candidates,
  • score or relax them with UMA,
  • reweight with a PITA-style annealing schedule,
  • distill energy and force heads,
  • distill flow endpoints,
  • update GFlowNet rewards.

The model does not need UMA at inference. It learns from UMA, then carries a compressed approximation.

8. Datasets and Training

ToricBLM’s training mixture is intentionally curated rather than maximal.

Important sources include:

  • AmelieSchreiber/toricgt-curated-splits for graph-reasoning mechanics,
  • reviewed UniProt and Swiss-Prot records with rich annotations,
  • PDB experimental structures,
  • AFDB and selected predicted structures,
  • ReactZyme for enzyme-reaction prediction,
  • ProteinGym, MaveDB, and BindingGYM for deep mutational scans,
  • PDBbind and MISATO-style complex data,
  • RNAcentral and Rfam for RNA,
  • PubChem, ChEMBL, and GEOM for molecules,
  • Materials Project, NOMAD, and OQMD for materials.

The toricgt-curated-splits dataset is used carefully. It is valuable because it stores reasoning as graph JSON with provenance, hashes, split assignments, and quality flags. But it is not treated as a biomedical truth source. Reasoning traces are verifier-gated. Unverified traces can teach graph syntax, but not scientific reward.

The reference run is small:

  • 100M dense parameters,
  • below 25B training tokens,
  • preferred first run around 12-15B packed tokens,
  • RTX 4090-class feasibility target,
  • larger Soft-MoE and UMA/PITA phases only after the dense model proves itself.
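Some back-of-the-envelope arithmetic helps show why this budget is plausible on a single consumer GPU. The sketch below uses two common rules of thumb that are assumptions here, not figures from the article: roughly 6 x parameters x tokens training FLOPs for a dense transformer, and 16 bytes per parameter for fp32 weights, gradients, and the two Adam moment buffers. Activation memory is not counted.

```python
PARAMS = 100e6  # 100M dense parameters, from the reference run

def tokens_per_param(tokens, params=PARAMS):
    return tokens / params

def train_flops(tokens, params=PARAMS):
    """Rule-of-thumb estimate: about 6 FLOPs per parameter per token."""
    return 6.0 * params * tokens

def optimizer_state_gb(params=PARAMS, bytes_per_param=16):
    """fp32 weights + gradients + two Adam moments, in gigabytes (1e9 bytes)."""
    return params * bytes_per_param / 1e9

for tokens in (12e9, 15e9, 25e9):
    print(f"{tokens / 1e9:.0f}B tokens: "
          f"{tokens_per_param(tokens):.0f} tokens/param, "
          f"{train_flops(tokens):.1e} FLOPs")

print(f"weights + grads + Adam state: {optimizer_state_gb():.1f} GB")
```

Under these assumptions the preferred run sees 120 to 150 tokens per parameter, and the weights plus optimizer state take about 1.6 GB, which leaves most of a 24 GB RTX 4090 for activations and batch size.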

9. Why Small Matters

Small models are not just cheaper. In medicine and biology, small models are often more usable.

A useful biomedical model should be able to run:

  • near private patient or lab data,
  • in constrained medical settings,
  • on modest hardware,
  • with auditable outputs,
  • with explicit validation ladders,
  • without requiring a datacenter for every idea.

ToricBLM aims for a compact model that can still reason over:

  • sequence,
  • structure,
  • dynamics,
  • evidence,
  • analogy,
  • topology,
  • geometry,
  • force and energy feedback,
  • wet-lab validation.

The guiding belief is that quality does not come only from parameter count. It also comes from choosing the right structure.

10. Closing

ToricBLM is a proposal for a different kind of scientific model: small, graph-native, geometry-aware, memory-rich, and trained to reason through design spaces rather than merely complete strings.

The most important idea is the thought alcove.

Thought alcoves give us a language for stable reasoning modes. They let a model improve an idea without silently changing the idea. They let us distinguish local optimization from mechanism switching. They connect activation geometry to tropical supports, toric fans, GraphCG bases, GFlowNet trajectories, and validator states.

That is a lot of machinery. But biology is a lot of machinery too.

If we want models that design enzymes, antibodies, minibinders, molecules, RNA, DNA, and materials with real scientific judgment, we should stop pretending that one axis, one retrieval chunk, or one giant context window is enough.

The future of small scientific models may look less like a bigger autocomplete engine and more like a compact geometric laboratory:

  • part transformer,
  • part graph learner,
  • part dynamical generator,
  • part analogical memory,
  • part tropical fan,
  • part toric audit system,
  • part experimental design assistant.

That is the wager of ToricBLM.

And if the model is only 100M parameters, all the better. The elegance is in making the small thing know where it is.

Further Reading
