File size: 6,796 Bytes
f28a48c
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
---
license: mit
tags:
  - quantum-computing
  - molecular-simulation
  - shadow-spectroscopy
  - regression
  - pytorch
library_name: pytorch
---

# molecular-shadows-h4

**The H4 direct-observable regressor** for fermionic shadow spectroscopy on
linear $\mathrm{H_4}$ / STO-3G. Given an equal nearest-neighbor bond length $R$
and a propagation time $t$, it predicts the length-120 vector of time-evolved
Majorana expectation values

$$ D_\mu(R, t) = \langle \psi_0(R) \,|\, \Gamma_\mu(t) \,|\, \psi_0(R) \rangle,
\qquad \Gamma_\mu(t) = e^{iH(R)t}\,\Gamma_\mu\,e^{-iH(R)t}, $$

which feed the downstream matchgate-shadow spectroscopy post-processing
(FFT $\to$ peaks $\to$ energy gaps). This is the four-hydrogen model; a
single-reference $\mathrm{H_2}$ companion exists separately.

> **What's new vs the earlier H4 release.** This model uses an
> **explicit-amplitude composition** head and **adaptive Fourier bandwidth**,
> with **chemically-informed orbital-energy inputs**. Short-bond accuracy — the
> structural weak spot of the earlier release ($R < 1.0$ Å, $r \approx 0.40$) —
> is now strong ($r \approx 0.90$). Overall held-out Pearson $0.83 \to 0.93$.

## Architecture

A linear-in-time Fourier head with geometry-conditioned frequencies and an
explicit amplitude/phase factorization (no nonlinear trunk):

$$ \hat{y}_\mu(R, t) = \sum_{k=1}^{256} a_{k\mu}(R)\cos(\omega_k(R)\, t)
   + b_{k\mu}(R)\sin(\omega_k(R)\, t) + \mathrm{dc}_\mu(R), $$

with geometry-conditioned, bandwidth-limited frequencies

$$ \omega_k(R) = \omega_{\mathrm{op}}(R)\cdot
   \sigma\!\big(\mathrm{freq\_net}(\varepsilon(R))\big)_k, $$

where $\varepsilon(R)$ are the HF orbital energies and $\omega_{\mathrm{op}}(R)$
is the per-geometry operational frequency ceiling (soft-floored at 8.0). The
amplitude tensors $a_{k\mu}, b_{k\mu}$ are low-rank factorized (rank 16).

```
(R, t) + HF orbital energies ε(R)

      ├── freq_net(ε(R)) ─ sigmoid ─ × ω_op(R)  → 256 frequencies ω_k(R)   (adaptive bandwidth)

      ├── amp_net(ε(R)) → low-rank a_kμ(R), b_kμ(R), dc_μ(R)   (rank 16)

      └── y_μ(R,t) = Σ_k a_kμ cos(ω_k t) + b_kμ sin(ω_k t) + dc_μ   (explicit-amplitude head)
```

| Hyperparameter | Value |
|---|---|
| n_observables | 120 (degree-1 Majorana monomials, 8 spin-orbitals) |
| n_fourier | 256 |
| explicit_amplitude / amp_rank | True / 16 |
| adaptive_bandwidth | True ($\omega_{\mathrm{op}}$ floor 8.0, soft) |
| conditioned_frequencies | True |
| trunk depth × width | 6 × 768 |
| freq_net depth × width | 3 × 128 |
| residual trunk | None |
| n_orb_features | 4 (HF spatial-orbital energies of H4/STO-3G) |
| Parameter count | ~12.3 M |
| Training | 150k steps, AdamW, cosine LR $10^{-3}\to10^{-7}$, grad-clip 1.0, seed 42 |

## Chemically-informed inputs

The model conditions on **HF orbital energies $\varepsilon(R)$** rather than the
bare scalar $R$. On a paired held-out comparison (identical geometries, same
architecture, same data), orbital-energy inputs beat geometry-only inputs
decisively where multi-reference character lives, and converge with it at
dissociation:

| $R$ bin (Å) | orbital-energy input | scalar-$R$ input |
|---|---|---|
| $<0.74$ | **0.89** | 0.57 |
| $[0.74, 1.0)$ | **0.98** | 0.71 |
| $[1.0, 1.5)$ | 0.96 | 0.88 |
| $[1.5, 2.0)$ | 0.94 | 0.89 |
| $\geq 2.0$ | 0.93 | 0.94 |

This repo ships the orbital-energy model.

## Held-out evaluation

50 held-out geometries on the dense $R \in [0.5, 3.0]$ Å grid (251 total),
per-observable temporal Pearson $r$:

| $R$ bin (Å) | pearson_mean |
|---|---|
| $<0.74$ | 0.90 |
| $[0.74, 1.0)$ | 0.97 |
| $[1.0, 1.5)$ | 0.96 |
| $[1.5, 2.0)$ | 0.93 |
| $\geq 2.0$ | 0.92 |

**Aggregate:** mean Pearson **0.928** across all 50 held-out geometries; top-1
spectral-peak match 44/50. The result is seed-robust — pooling a second seed
($n=100$) gives overall 0.94 with borderline-bin seed spread $|\Delta| < 0.02$.
See `eval_results.json` for per-$R$ numbers.

## Inputs / outputs

- **Input.** $(R, t)$ — equal nearest-neighbor bond length in Å (linear chain:
  H atoms at $0, R, 2R, 3R$) and propagation time in a.u.
- **Output.** Length-120 vector of expectation values $D_\mu(R,t)$ for degree-1
  Majorana observables on H4/STO-3G's 8 spin-orbital JW encoding.
- **Valid range.** $R \in [0.5, 3.0]$ Å, $t \in [0, 300]$ a.u. Accuracy is now
  reasonably uniform across the curve; the only mild residual deficit is in the
  long-$R$ time-domain fit, which the spectral post-processing absorbs.

## Quickstart

```python
from inference import MolecularShadowsRegressor
import numpy as np

m = MolecularShadowsRegressor.from_hub(
    "aniketdesh/molecular-shadows-h4",
    revision="v18-orb",    # immutable architecture pin
    token="hf_...",
)

t_grid = np.linspace(0, 300, 1500)
y = m.predict_trajectory(R=0.8, t_grid=t_grid)   # (1500, 120) — short bond now reliable
```

## Notes & limitations

Earlier H4 releases were structurally weak at short bond ($R < 1.0$ Å): linear
H4 has a multi-reference singlet manifold whose eigenvectors rotate
near-discontinuously through avoided crossings as the chain compresses, and the
old GELU trunk could not encode that rapid amplitude rotation. The
explicit-amplitude head (which factorizes the prediction into per-geometry
amplitudes and frequencies) plus orbital-energy conditioning resolve most of
this — short-bond Pearson is now $\approx 0.90$. The remaining limitation is a
small loss of time-domain fidelity at large $R$ (slowly-varying non-sinusoidal
structure the linear-in-time basis cannot represent); because the downstream
Chan-style spectral analysis reads off frequencies, not waveforms, this does
not degrade recovered energy gaps.

## Files in this repo

| File | Purpose |
|---|---|
| `regressor.pt` | torch payload (state_dict + config + R/t grids) |
| `observable_regressor.py` | architecture (single file) |
| `inference.py` | loader (`MolecularShadowsRegressor`) |
| `orbital_energies.npz` | R-grid + HF orbital-energy table (+ $\omega_{\mathrm{op}}$) |
| `eval_results.json` | per-R held-out metrics (50 geoms) |
| `eval_summary.json` | aggregate |
| `history.json` | training curves |
| `README.md` | this file |

## Versioning

- `v18-orb` (current): explicit-amplitude composition + adaptive bandwidth +
  orbital-energy conditioning, grad-clip 1.0. Mean Pearson 0.928, short-bond
  recovered. Pin via `revision="v18-orb"`.
- Future architectures push as new commits with new tags; existing pins keep
  loading the exact committed version.

## Citation

Method: matchgate-shadow spectroscopy following
[arXiv:2212.11036](https://arxiv.org/abs/2212.11036) and
[matchgate-shadow theory](https://link.springer.com/article/10.1007/s00220-023-04844-0).

## License

MIT.