| | model_name | final_score | task_coverage@1 | relative_quality@1 | hallucination@1 |
|---|---|---|---|---|---|
| 0 | qwen3-4b-dw-lr-dpo-offline-energy-GRPO_step_200 | 1.7964 | 0.970873 | 0.883589 | 0.0580614 |
| 1 | qwen3-4b-dw-lr-GRPO-mix-preference_step_100 | 1.77586 | 0.974856 | 0.875816 | 0.0748081 |
| 9 | energyv2-dpo-offline-GRPO_step_100 | 1.75576 | 0.957438 | 0.865499 | 0.0671785 |
| 4 | Qwen3-4B | 1.73133 | 0.979511 | 0.872361 | 0.120537 |
| 10 | energyv2-dpo-offline-GRPO_step_180 | 1.66418 | 0.930302 | 0.826008 | 0.0921305 |
| 12 | energyv2-dpo-offline-GRPO_step_180_no_think | 1.58354 | 0.946665 | 0.800216 | 0.16334 |
| 11 | energyv2-dpo-offline-GRPO_step_100_no_think | 1.5715 | 0.939347 | 0.79333 | 0.16118 |
| 3 | qwen3-4b-dw-lr | 1.52447 | 0.944386 | 0.782582 | 0.202495 |
| 2 | qwen3-4b-dw-lr-dpo-offline | 1.29638 | 0.785869 | 0.660269 | 0.14976 |
| 7 | energyv2-dpo-offline_think_off_ | 0.940547 | 0.655758 | 0.515067 | 0.230278 |
| 8 | enregy-gpt-regulatorio-v2_think_off_ | 0.929175 | 0.826536 | 0.538196 | 0.435557 |
| 6 | energyv2-dpo-offline | -0.0952015 | 0.0823417 | 0.0571017 | 0.234645 |
| 5 | enregy-gpt-regulatorio-v2 | -0.303599 | 0.303887 | 0.149664 | 0.75715 |
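
The table does not say how `final_score` is computed. In the rows shown, though, it appears to equal `task_coverage@1 + relative_quality@1 - hallucination@1`, up to rounding. The sketch below checks that relation on a few rows copied from the table. The formula is an inference from the numbers, not a documented definition, and the helper name `recomputed_score` is hypothetical.

```python
# Rows copied from the table above:
# (model_name, final_score, task_coverage@1, relative_quality@1, hallucination@1)
rows = [
    ("qwen3-4b-dw-lr-dpo-offline-energy-GRPO_step_200", 1.7964, 0.970873, 0.883589, 0.0580614),
    ("Qwen3-4B", 1.73133, 0.979511, 0.872361, 0.120537),
    ("qwen3-4b-dw-lr-dpo-offline", 1.29638, 0.785869, 0.660269, 0.14976),
    ("energyv2-dpo-offline", -0.0952015, 0.0823417, 0.0571017, 0.234645),
    ("enregy-gpt-regulatorio-v2", -0.303599, 0.303887, 0.149664, 0.75715),
]


def recomputed_score(coverage: float, quality: float, hallucination: float) -> float:
    """Assumed relation (inferred from the table, not documented):
    coverage + quality - hallucination."""
    return coverage + quality - hallucination


# Largest absolute gap between the reported and recomputed scores.
max_gap = max(
    abs(final - recomputed_score(cov, qual, hall))
    for _, final, cov, qual, hall in rows
)

for name, final, cov, qual, hall in rows:
    print(f"{name}: reported={final:.5f} recomputed={recomputed_score(cov, qual, hall):.5f}")
```

If the relation holds, a negative `final_score` means the hallucination rate exceeds the sum of the coverage and quality terms, as in the last two rows.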

Model tree for CEIA-RL/energyv2-dpo-offline-GRPO: base model cemig-nlp-releases/enregy-gpt-regulatorio-v2, finetuned as CEIA-RL/energyv2-dpo-offline.

