v45: SFT+chained GRPO with ITN - 95.9% number accuracy, 97.0% filler-free, deletion behavior matches v36 81578ee verified juanquivilla committed on May 4
v36: full-FT GRPO with substantive-deletion-aware reward - filler-free 96.9%, sub-del-15-long 0.64% 7278227 verified juanquivilla committed on May 3
v15: ROUGE-L 0.960, 70% exact match - LR 2.5e-5 breakthrough 27c16fc verified juanquivilla committed on Apr 2