# Codex V58 Regression Audit - Franken-B

Created: 2026-05-22

## Short Answer

Opus is mostly right that Adapter 58 is the likely source of Franken-B's major regressions, but the exact mechanism matters.

Adapter 58 is **not a vision-tower adapter**. It is a LoRA over the **last three language-model layers** only:

- layers: `29`, `30`, `31`
- target family: `model.language_model`
- adapter config target: `lm_last3`
- no `vision`, `visual`, or image encoder tensors in the safetensors keys

So v58 did not directly train the visual encoder. Its effect is language-side arbitration over already-embedded chart/image evidence. In Franken-B, that arbitration appears to unlock the best vision/fused behavior, but also weakens veto, HOLD, and adversarial boundaries.

## What Adapter 58 Was Trained To Address

Primary artifact:

- `E:\vfai-x_3.5_9b\v5.8\V58_TRAINING_REPORT_20260520.md`
- adapter repo: `tjarvis91/vfai-v58-adapter-v58_profit_direction_ondemand_lmlast3_300s_retry2_20260520`
- base: `tjarvis91/vfai-v50-merged-base`
- dataset: `E:\vfai-x_3.5_9b\v5.1\v51_profit_direction_focused_train.jsonl`
- train rows: `7914`
- LoRA: `r=64`, `alpha=128`, `lr=5e-6`, last-3 LM layers, 300 steps

The v51 focused dataset report says v58's source data targeted:

- winning long puts on falling underlying: `HOLD`, not `SELL`
- opening long puts: `BUY` the option contract even when underlying thesis is bearish
- clean aligned VPA shorts: `SELL`, not over-abstain
- clean aligned VPA longs: `BUY`, not over-abstain
- multi-timeframe conflict: `NO_TRADE`
- all-timeframe alignment: directional action
- profit protection: `SELL` on target/trail break, `HOLD` when winner trail is intact
- production overlay: 8 max positions and 12% position size

Dataset action mix from the train file:

- `SELL`: 2488
- `BUY`: 2028
- `NO_TRADE`: 1976
- `HOLD`: 1422

Source mix:

- `generated_v51_profit_direction_fix`: 4914
- `repaired_original_84k`: 1291
- `fundamental_second_pass_patched`: 589
- `generated_v52_plan_a`: 464
- `generated_hold_management_fix`: 259
- `generated_eval_failure_fix`: 166
- `v35_dataset`: 119
- `generated_profit_protection_fix`: 95
- `v32_dataset`: 17

Important: these rows are text/message rows. No actual image/chart file keys were present in the v51 train/val JSONL. The train set mentions chart-like context text, but it is not true multimodal vision supervision.

## CPU Tournament Read

Prior report:

- `E:\vfai-x_3.5_9b\evals\vpa_learning\reports\FRANKEN_CPU_TOURNAMENT_20260520.md`

Relevant with/without-v58 candidates:

| Candidate | Recipe | CPU Result | Action Mix | Take |
|---|---|---:|---|---:|
| `tour_v37s_baseline` | v37s | PASS, 8/8 valid, 7/8 strict | BUY 4 / SELL 2 / NO_TRADE 2 | 75% |
| `tour_v37s_plus_v50` | v37s -> v50 | PASS, 8/8 strict | BUY 4 / SELL 2 / NO_TRADE 2 | 75% |
| `tour_v37s_plus_v58` | v37s -> v58 | PASS, 7/8 strict | BUY 5 / SELL 2 / NO_TRADE 1 | 87.5% |
| `tour_v37s_v50_v58` | v37s -> v50 -> v58 | PASS, 8/8 strict | BUY 4 / SELL 2 / NO_TRADE 2 | 75% |
| `promote_v37s_v50_v58` | saved Franken-B recipe | PASS, 8/8 strict | BUY 4 / SELL 2 / NO_TRADE 2 | 75% |

The CPU tournament did test v37s/v50 with and without v58, but it was only an 8-row smoke test. It was useful for catching schema errors and obvious action-collapse failures, not for adversarial, HOLD, or fusion regressions.

The one signal it did catch: `v37s -> v58` ran hotter than the baseline, with BUY up and NO_TRADE down. Adding v50 before v58 stabilized the tiny CPU smoke test, which is why Franken-B promoted cleanly.
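The Take column above can be reproduced from the Action Mix column. The sketch below assumes that "take" means the share of rows with a directional action (`BUY` or `SELL`); the tournament report does not define the term here, but this reading matches every row in the table. The function and dictionary names are illustrative only.

```python
def take_rate(action_mix: dict[str, int]) -> float:
    """Share of rows whose action is directional (BUY or SELL).

    Assumption: this is how the Take column is defined; it reproduces
    the 75% and 87.5% figures in the CPU tournament table.
    """
    total = sum(action_mix.values())
    if total == 0:
        raise ValueError("action mix is empty")
    directional = action_mix.get("BUY", 0) + action_mix.get("SELL", 0)
    return directional / total


# Action mixes copied from the CPU tournament table.
cpu_action_mixes = {
    "tour_v37s_baseline": {"BUY": 4, "SELL": 2, "NO_TRADE": 2},
    "tour_v37s_plus_v58": {"BUY": 5, "SELL": 2, "NO_TRADE": 1},
    "tour_v37s_v50_v58": {"BUY": 4, "SELL": 2, "NO_TRADE": 2},
}

for candidate, mix in cpu_action_mixes.items():
    print(f"{candidate}: take = {take_rate(mix):.1%}")
```

With only 8 rows, one flipped action moves the take rate by 12.5 points, so this number is best read as a coarse smoke signal.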
## GPU/Full Eval Read

Franken-B report:

- `E:\vfai-x_3.5_9b\evals\vfai_dev_docs\FRANKEN_B_FLAGSHIP_REPORT_20260520.md`

Recipe:

- `v3.7s base -> v50 LoRA -> v58 last-3 LoRA`

Positive signal:

- `vision_only`: `+106.76%`
- `fused_agree_only`: `+103.32%`
- `text_only`: `+64.86%`

Comparison:

- v37s vision: `+43.99%`
- v6 vision: `+49.22%`
- v3.5 vision: `+43.31%`
- Franken-B vision: `+106.76%`

This strongly implies v58 is the differentiating ingredient for Franken-B's chart/vision fusion. It is not the vision encoder itself; it is the last-layer decision/arbitration behavior after visual evidence reaches the LM.

Negative signal:

- adversarial: Franken-B `83.3%` vs v6 `100%`
- today_holdout: `96% BUY`
- report note: "v58 erodes v6's perfect adversarial defense"
- v37s final note: "v58 last-3 broke HOLD semantics"

Additional observed failures in Franken-B corpus outputs:

- execution veto misses
- contradiction/no-demand prompts sometimes turn into BUY
- repeated/malformed text in some corpus responses
- bull/action pressure leakage
- HOLD boundary erosion

## Why v58 Is Probably Responsible

Evidence for responsibility:

1. The reports explicitly call out v58 as causing HOLD/adversarial erosion.
2. The adapter was trained to increase profit/direction capture after v50 over-abstention.
3. The dataset has large directional pressure: BUY + SELL rows outnumber HOLD + NO_TRADE.
4. CPU `v37s -> v58` immediately showed a hotter take rate: 87.5%.
5. Franken-B's strongest unique gain appears only in the stack containing v58 after v37s/v50.

Evidence against a simple "v58 alone is magic" explanation:

1. Standalone v58/v58c fusion results were poor:
   - text: `-59.23%`
   - vision: `-10.32%`
   - fused: `-29.90%`
2. v58 did not train the actual vision tower.
3. v58's best behavior appears interactional: v37s/v50 formatting and base behavior plus v58 last-layer arbitration.
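The directional-pressure claim (item 3 of the evidence for responsibility) can be quantified from the train-file action mix reported earlier. This is a minimal sketch: the counts are copied from this audit, while the split into "directional" (`BUY`, `SELL`) versus "non-directional" (`HOLD`, `NO_TRADE`) and all names are illustrative assumptions.

```python
# Action counts from the v51 focused train file, as reported in this audit.
TRAIN_ACTION_MIX = {"SELL": 2488, "BUY": 2028, "NO_TRADE": 1976, "HOLD": 1422}

DIRECTIONAL_ACTIONS = ("BUY", "SELL")          # assumed "pressure" actions
NON_DIRECTIONAL_ACTIONS = ("HOLD", "NO_TRADE")  # assumed restraint actions


def pressure_split(mix: dict[str, int]) -> tuple[int, int]:
    """Return (directional_rows, non_directional_rows) for an action mix."""
    directional = sum(mix.get(a, 0) for a in DIRECTIONAL_ACTIONS)
    non_directional = sum(mix.get(a, 0) for a in NON_DIRECTIONAL_ACTIONS)
    return directional, non_directional


directional, non_directional = pressure_split(TRAIN_ACTION_MIX)
total = directional + non_directional

print(f"directional rows:     {directional}")
print(f"non-directional rows: {non_directional}")
print(f"directional share:    {directional / total:.1%}")
```

The four counts sum to the 7914 train rows listed for the adapter, so the mix covers the whole train file. The skew is a majority toward directional actions, not an overwhelming one.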
Conclusion: v58 is very likely responsible for both the unique Franken-B vision/fused gain and the safety/semantic regressions, but the gain is composition-dependent. Do not delete v58 blindly. Distill and rebalance it.

## Current Corrector Warning

The later Franken-B correctors repaired many boundary failures but damaged the v58 vision edge.

Examples:

Franken-B original fusion:

- text: `+64.86%`
- vision: `+106.76%`
- fused: `+103.32%`

`v5_frankenB_20260521`:

- text: `+72.05%`
- vision: `-21.90%`
- fused: `+30.91%`
- corpus total: `480/550`, `87.27%`
- hard boundary: `47/50`, `94%`

`v6_frankenB_20260522`:

- text: `+42.03%`
- vision: `-55.92%`
- fused: `-5.04%` / app-gated `-11.68%`
- hard boundary: `47/50`, `94%`
- adversarial: `26/30`, `86.67%`

This means the corrector data is fixing part of the regression but wiping the multimodal behavior that made Franken-B special. Any next adapter must include explicit preservation anchors for Franken-B fusion behavior.

## Recommended Repair Path

Do not reapply the raw v58 adapter on top again.

Build a `v58R` / `Franken-B hardening` dataset from the data that made v58, but with counterweights and fusion-preservation anchors.
Keep these v58 source categories as capability anchors:

- `multi_timeframe_aligned_bull`
- `multi_timeframe_aligned_bear`
- `vpa_long_clean_entry`
- `vpa_short_clean_entry`
- `options_contract_open_put`
- `options_direction_hold_winning_call`
- `options_direction_hold_winning_put`
- `options_direction_close_invalid_call`
- `options_direction_close_invalid_put`
- `profit_protection_hold_intact`
- `profit_protection_exit`
- `fundamentals_with_vpa_buy`

Add counterweight rows for each v58 pressure point:

- adversarial prompt injection -> `NO_TRADE`
- contradictory volume/spread -> `NO_TRADE`
- no-demand / no-follow-through -> `NO_TRADE`
- event/liquidity/execution veto -> `NO_TRADE`
- low relative volume despite attractive pattern -> `NO_TRADE`
- max positions reached -> `NO_TRADE`
- winning call/put with trail intact -> `HOLD`
- winning put on falling underlying -> `HOLD`
- winning call on rising underlying -> `HOLD`
- target/trail broken -> `SELL`
- open put on bearish breakdown -> `BUY` the put contract
- invalidated open put/call -> `SELL`

Add fusion preservation anchors:

- Use Franken-B original fusion stream cases where vision/fused were profitable as "must preserve" anchors.
- Preserve the action/direction of the original Franken-B vision/fused winner rows.
- Add paired conflict rows where text and image/chart context disagree and the correct output is `NO_TRADE`.
- Do not let a boundary-only corrector train without these fusion anchors; v5/v6 already showed that this kills the vision edge.

Training recommendation:

- Base from the original Franken-B recipe/model, not from v5/v6 corrector that already lost vision.
- Train a small repair adapter on top of Franken-B.
- Keep target modules to last-3 LM initially to preserve the mechanism that produced the vision gain.
- Use lower pressure than v58: lower LR and/or fewer steps than the original 300-step v58 run.
- Mix should be close to balanced across `BUY`, `SELL`, `HOLD`, `NO_TRADE`.
- Require at least 1:1 guardrail/counterweight rows against directional rows.

Release gate:

1. Must preserve original Franken-B fusion:
   - vision target should stay positive and close to original; reject if it goes negative.
   - fused target should remain strongly positive; reject if it collapses like v5/v6 correctors.
2. Must improve adversarial over original Franken-B:
   - target > `86.67%` (the `v6_frankenB_20260522` corrector's score; original Franken-B scored `83.3%`), ideally back toward v6's `100%`.
3. Must pass hard boundary:
   - target >= `47/50`.
4. Must not show today-holdout bull collapse:
   - reject if BUY rate approaches the prior `96%`.
5. Must preserve balanced stream behavior:
   - no single action should dominate unless the stream distribution justifies it.

## Direct Answer To The Question

Adapter 58 addressed profit/direction repair and over-abstention from v50-era behavior. It taught the model to take clean longs/shorts, open puts correctly, manage winning options, exit broken trades, obey production overlay, and convert aligned multi-timeframe setups into action.

It is not directly responsible for the vision tower because it did not train visual layers. It is, however, very likely responsible for Franken-B's best vision/fused behavior because it changes the last-layer decision logic that consumes chart/image-derived tokens.

The next move is not "remove 58." The next move is "retrain the v58 idea with guardrails": preserve the v58 fusion/action-capture rows, add adversarial/HOLD/NO_TRADE counterweights, and gate hard against losing the original Franken-B vision/fused returns.
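The release gate listed under the repair path could be automated along the following lines. This is a hedged sketch, not an existing tool: the `EvalResult` fields, the function name, and the `MAX_HOLDOUT_BUY_RATE_PCT` cutoff are hypothetical. The audit only says to reject a BUY rate that "approaches" 96%, and it gives no numeric floor for "close to original" or "strongly positive", so only the sign checks are encoded. Gate 5 (balanced stream behavior) is omitted because it depends on the stream distribution.

```python
from dataclasses import dataclass
from typing import Optional

# Thresholds taken from the release gate in this audit.
ADVERSARIAL_FLOOR_PCT = 86.67   # candidate must score strictly above this
HARD_BOUNDARY_MIN_PASSED = 47   # out of 50

# Hypothetical cutoff: the audit gives no exact number for "approaches 96%".
MAX_HOLDOUT_BUY_RATE_PCT = 80.0


@dataclass
class EvalResult:
    vision_return_pct: float
    fused_return_pct: float
    adversarial_pct: float
    hard_boundary_passed: int
    holdout_buy_rate_pct: Optional[float] = None  # None if not measured


def release_gate_failures(result: EvalResult) -> list[str]:
    """Return the list of failed gates; an empty list means the gate passes."""
    failures = []
    if result.vision_return_pct <= 0:
        failures.append("vision return is not positive")
    if result.fused_return_pct <= 0:
        failures.append("fused return is not positive")
    if result.adversarial_pct <= ADVERSARIAL_FLOOR_PCT:
        failures.append("adversarial score does not exceed 86.67%")
    if result.hard_boundary_passed < HARD_BOUNDARY_MIN_PASSED:
        failures.append("hard boundary below 47/50")
    if (result.holdout_buy_rate_pct is not None
            and result.holdout_buy_rate_pct >= MAX_HOLDOUT_BUY_RATE_PCT):
        failures.append("today-holdout BUY rate too high")
    return failures


# Figures reported in this audit for the v6_frankenB_20260522 corrector.
v6_corrector = EvalResult(
    vision_return_pct=-55.92,
    fused_return_pct=-5.04,
    adversarial_pct=86.67,
    hard_boundary_passed=47,
)

for failure in release_gate_failures(v6_corrector):
    print("FAIL:", failure)
```

Run against the `v6_frankenB_20260522` figures, the sketch flags the vision, fused, and adversarial gates, which matches the audit's view that the correctors held the hard boundary while losing the fusion edge.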