---
base_model: Rakushaking/Qwen4b-SFT-d9-merged-after-dpo-d2
language: [en]
license: apache-2.0
tags: [dpo, structured-data, toml, yaml, xml]
---

# Final DPO: TOML + YAML + XML + codeblock fix

- TOML: 750 pairs (section vs inline)
- YAML: 300 pairs (correct vs broken indent)
- XML: 300 pairs (valid vs broken tags)
- JSON: 225 pairs (clean vs codeblock-wrapped)
- CSV: 150 pairs
- LR=1e-05, Beta=0.1, Epochs=1
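
The pair counts and hyperparameters above can be sketched in Python as follows. This is a rough illustration, not the training script used for this model: the card does not name the training framework or the loss variant, so `dpo_loss` assumes the standard DPO objective. The names `PAIR_COUNTS`, `mixture_shares`, `dpo_loss`, and `example_pair` are hypothetical, as are the prompt and completions in the example pair.

```python
import math

# Pair counts and hyperparameters as listed in this card.
PAIR_COUNTS = {"toml": 750, "yaml": 300, "xml": 300, "json": 225, "csv": 150}
LEARNING_RATE = 1e-5
BETA = 0.1
EPOCHS = 1


def mixture_shares(counts):
    """Return each format's fraction of the total number of pairs."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=BETA):
    """Standard DPO loss for a single pair (assumed; not confirmed by the card)."""
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    # log1p(exp(-x)) equals -log(sigmoid(x)); adequate for small |x|.
    return math.log1p(math.exp(-beta * margin))


# Hypothetical JSON pair illustrating "clean vs codeblock-wrapped".
FENCE = "`" * 3
example_pair = {
    "prompt": "Return the user as JSON.",
    "chosen": '{"name": "Ada"}',
    "rejected": FENCE + 'json\n{"name": "Ada"}\n' + FENCE,
}

if __name__ == "__main__":
    print("total pairs:", sum(PAIR_COUNTS.values()))
    for name, share in mixture_shares(PAIR_COUNTS).items():
        print(f"{name}: {share:.1%}")
```

With these counts, TOML makes up the largest share of the mix. The loss falls as the policy raises the chosen completion's log-probability relative to the reference model and lowers the rejected one's.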