# Franken-B All-Reports Ship Audit

Created: `2026-05-22T05:45:57`

## Decision

**Ship Franken-B as the flagship.** I do not see a completed single-model candidate that beats it.

The reports show a hard tradeoff: correctors can repair small strict-suite regressions, but they destroy or severely weaken the vision/fusion PnL edge.

The best way to beat Franken-B in production is not another merged adapter. It is Franken-B weights plus deterministic app-side guardrails for the narrow failure modes.

## Flagship Scorecard

| Candidate | Role | Strict | Suite Avg | Adversarial | Hard Boundary | Text | Vision | Fused | Today BUY | Verdict |
|---|---|---:|---:|---:|---:|---:|---:|---:|---:|---|
| Franken-B flagship | production control | 197/273 (72.2%) | 88.5% | 25/30 (83.3%) | n/a | 64.86% | 106.76% | 103.32% | 96.0% | SHIP |
| V37s | pre-Franken text/fusion control | n/a | 94.8% | 27/30 (90.0%) | n/a | 78.19% | 43.99% | 68.06% | n/a | not a replacement |
| V50 | large base adapter | n/a | 95.4% | 30/30 (100.0%) | n/a | n/a | n/a | n/a | 6.0% | not a replacement: over-abstains/no fusion proof |
| V6 governor | safety/governance control | n/a | 92.6% | 30/30 (100.0%) | n/a | 0.24% | 49.22% | 25.76% | 2.0% | governor only |
| V5 corrector | strict-suite leader | 264/323 (81.7%) | 87.1% | 23/30 (76.7%) | 47/50 (94.0%) | 72.05% | -21.90% | 30.91% | n/a | reject: vision/PnL loss |
| V58R corrector | failed hardened v58 repair | n/a | 87.8% | 22/30 (73.3%) | 48/50 (96.0%) | 64.86% | 2.55% | 40.66% | n/a | reject: vision/PnL loss |
| V3 corrector | least-damaging old corrector | 201/273 (73.6%) | n/a | n/a | n/a | 46.59% | 10.52% | 37.27% | n/a | reject: vision/PnL loss |
| V4 corrector | intermediate corrector | 200/273 (73.3%) | n/a | n/a | n/a | n/a | n/a | n/a | n/a | not a replacement |
| V56 | discarded successor branch | n/a | 90.6% | 28/30 (93.3%) | n/a | -4.21% | -64.89% | -31.47% | 100.0% | reject: vision/PnL loss |
| V58 standalone | last3 adapter branch | n/a | 58.3% | n/a | n/a | n/a | n/a | n/a | 56.0% | not a replacement |
| V59 | fusion branch | n/a | 54.9% | n/a | n/a | 110.31% | -61.25% | 7.67% | 100.0% | reject: vision/PnL loss |
| Franken-A | alternate Franken branch | n/a | 86.9% | 23/30 (76.7%) | n/a | 76.12% | 50.97% | 76.37% | 92.0% | not a replacement |

## PnL Evidence

| Model | 2yr Return | 2yr PnL | Penny Return | Penny PnL | 180d Return | 180d PnL |
|---|---:|---:|---:|---:|---:|---:|
| Franken-B flagship | 8244.9% | $5,771,431 | 35155.0% | $24,608,465 | 4711.0% | $3,297,696 |
| V5 corrector | 160.6% | $112,441 | 211.0% | $105.52 | -23.8% | n/a |

The V5 line is the strongest warning: it improves strict behavior but loses the large-stream economics. V58R did not get far enough to justify 2yr/penny replay after its fusion and adversarial gates failed.

## What Actually Beat Parts Of Franken-B

- `V5 corrector`: best strict suite at 81.7%, but vision-only fell to -21.90% and fused fell to +30.91%.
- `V58R corrector`: kept vision technically positive at +2.55%, but lost 104.21 percentage points versus Franken-B vision and regressed adversarial to 73.3%.
- `V6 governor`: strongest standalone safety/adversarial behavior, but it is a filter/governor, not the flagship cognition model.
- `V37s`: better text-only fusion than Franken-B in the 200 fusion run, but much weaker vision/fused edge.
- `V50`: strong broad corpus evidence, but CPU tournament showed over-abstention and it has no current fusion proof as a flagship replacement.

## Why Franken-B Is Hard To Beat

Franken-B's edge is not a normal corpus-score edge. It is the V37s + V50 + V58 composition producing last-layer arbitration over chart-derived tokens.

The audit pattern is consistent across v3, v5, v6, and v58R: any repair adapter that touches the behavior enough to fix strict regressions shifts action balance and damages chart/fusion PnL.

V58R is the clearest mechanism test.
It used the right `lm_last3` scope and still changed vision distribution from BUY 135 / SELL 65 to BUY 148 / SELL 52. That small SELL-to-BUY shift collapsed vision return from +106.76% to +2.55%.

## Best Path To Beat It Without Retraining

Keep Franken-B weights and add deterministic app-side guardrails. These should repair the known strict-suite failures without touching the learned fusion behavior:

1. Force `NO_TRADE` when market is closed, quote is stale, no live price exists, or event-risk/no-data flags are present.
2. Force `NO_TRADE` when 5m direction conflicts with 1h/daily and there is no open position.
3. In open-position mode, make `HOLD` legal only for intact winners; invalidated calls/puts and target-supply exits should become `SELL`.
4. Block add-to-winner `BUY` when the position is already open and the prompt says hold/no-add/profit protection.
5. Clip the known bull skew: if text is bullish but chart/structure is bearish or conflicted, downgrade to `NO_TRADE` unless higher-timeframe alignment and relative volume confirm.

That is the only realistic near-term route to beat Franken-B: same model, fewer bad executions.

## Future Training Only If We Must

A future `v58R2` should be treated as low-probability. If attempted, use a much smaller repair: balance image BUY/SELL anchors exactly, hold out the entire fusion-200 stream, train 25-50 steps, lower rank, and stop on fusion before strict-suite gains are allowed to count.

Do not ship it unless it beats Franken-B on vision/fused PnL first.

## Files Audited

- V58R report: `D:\vfai-x-model-backups\frankenB_corrector_loop_20260521\V58R_REPORT.md`
- Final decision draft: `D:\vfai-x-model-backups\frankenB_corrector_loop_20260521\FINAL_DECISION_FRANKEN_B_FLAGSHIP.md`
- Latest matrix: `D:\vfai-x-model-backups\frankenB_corrector_loop_20260521\tournament\latest_adapter_matrix_tournament.json`
- JSON summary: `D:\vfai-x-model-backups\frankenB_corrector_loop_20260521\frankenB_all_reports_ship_audit_20260522.json`
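
## Appendix: Guardrail Sketch

The five guardrails listed under "Best Path To Beat It Without Retraining" could be wired as a pure post-processing step over the model's proposed action. The following is a minimal sketch only. The `Context` fields, the `apply_guardrails` name, the rule ordering, the choice of `HOLD` as the replacement for a blocked add-on `BUY`, and the definition of "conflict" as strictly opposite directions are all assumptions made for illustration; none of them come from the audited reports or the production app.

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical app-side state; every field name here is an assumption."""
    market_open: bool = True
    quote_stale: bool = False
    has_live_price: bool = True
    event_risk: bool = False
    no_data: bool = False
    dir_5m: str = "up"             # "up" | "down" | "flat"
    dir_htf: str = "up"            # 1h/daily direction
    position_open: bool = False
    thesis_invalidated: bool = False
    target_supply_hit: bool = False
    no_add_requested: bool = False  # prompt says hold / no-add / profit protection
    text_bias: str = "neutral"      # "bullish" | "bearish" | "neutral"
    chart_bias: str = "neutral"     # "bullish" | "bearish" | "conflicted" | "neutral"
    htf_aligned: bool = False
    rel_volume_confirms: bool = False


def apply_guardrails(action: str, ctx: Context) -> str:
    """Return the model's action, overridden where a deterministic rule applies."""
    # Rule 1: no tradable market state.
    if (not ctx.market_open or ctx.quote_stale or not ctx.has_live_price
            or ctx.event_risk or ctx.no_data):
        return "NO_TRADE"

    if ctx.position_open:
        # Rule 3: invalidated positions and target-supply exits become SELL.
        if ctx.thesis_invalidated or ctx.target_supply_hit:
            return "SELL"
        # Rule 4: block add-to-winner BUY (falling back to HOLD is an assumption).
        if action == "BUY" and ctx.no_add_requested:
            return "HOLD"
    else:
        # Rule 2: 5m direction opposes 1h/daily with no open position.
        if {ctx.dir_5m, ctx.dir_htf} == {"up", "down"}:
            return "NO_TRADE"

    # Rule 5: clip bull skew unless higher-timeframe and volume both confirm.
    if (action == "BUY" and ctx.text_bias == "bullish"
            and ctx.chart_bias in ("bearish", "conflicted")
            and not (ctx.htf_aligned and ctx.rel_volume_confirms)):
        return "NO_TRADE"

    return action
```

Because the function never calls the model and only narrows its output, it leaves the learned fusion behavior untouched, which is the property the audit asks for. For example, `apply_guardrails("BUY", Context(market_open=False))` yields `NO_TRADE`, while a default `Context()` passes the model's action through unchanged.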