**Unison-Judge** is a fine-tuned **Qwen3-VL-8B** vision-language model that serves as the local automatic judge for the Unison benchmark. It scores UMMs' outputs across all four unified tasks (IC, UGG, GGU and ME) without requiring a hosted API.

## Judge Consistency Data

The `Judge_Consistency/` directory contains **231 evaluation cases** used to assess the scoring consistency of Unison-Judge across all four tasks.

| Field | Description |
|---|---|
| `id` | Item identifier |
| `task` | One of `IC`, `UGG`, `GGU`, `ME` |
| `family` | Question type |
| `model` | The UMM whose output is being evaluated |
| `questions` | List of sub-questions, each with the model's answer and judge-assigned score |
| `images` | Reference image(s) and the model-generated image |

**Task distribution:** IC (57), ME (62), GGU (56), UGG (56)

**Models covered:** BAGEL-7B-MoT, OmniGen2, SEED-X-17B, UniWorld-V1
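
The field list above can be checked mechanically once the cases are loaded. The following is a minimal sketch, assuming each case is available as a Python dict keyed by the field names in the table. The on-disk file format, the helper names `validate_case` and `task_distribution`, and the placeholder records are assumptions for illustration and are not taken from the dataset.

```python
from collections import Counter

# Field names and task codes as listed in the table above.
REQUIRED_FIELDS = {"id", "task", "family", "model", "questions", "images"}
VALID_TASKS = {"IC", "UGG", "GGU", "ME"}


def validate_case(case):
    """Return a list of problems found in one evaluation case (empty if none)."""
    problems = []
    missing = REQUIRED_FIELDS - set(case)
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if case.get("task") not in VALID_TASKS:
        problems.append(f"unknown task: {case.get('task')!r}")
    if not isinstance(case.get("questions"), list):
        problems.append("questions is not a list")
    return problems


def task_distribution(cases):
    """Count cases per task, for comparison with the published distribution."""
    return Counter(case["task"] for case in cases)


# Hypothetical placeholder records; real values come from Judge_Consistency/.
sample_cases = [
    {"id": "example-0", "task": "IC", "family": "placeholder",
     "model": "OmniGen2", "questions": [], "images": []},
    {"id": "example-1", "task": "ME", "family": "placeholder",
     "model": "UniWorld-V1", "questions": [], "images": []},
]

for case in sample_cases:
    print(case["id"], validate_case(case))
print(dict(task_distribution(sample_cases)))
```

Run over the full set, `task_distribution` should match the per-task counts listed above, which sum to 231.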