**Unison-Judge** is a fine-tuned **Qwen3-VL-8B** vision-language model
that serves as the local automatic judge for the Unison benchmark. It scores the outputs of unified multimodal models (UMMs) across
all four unified tasks (IC, UGG, GGU, and ME) without requiring a hosted API.

## Judge Consistency Data

The `Judge_Consistency/` directory contains **231 evaluation cases** used to assess the scoring consistency of Unison-Judge across all four tasks.

| Field | Description |
|---|---|
| `id` | Item identifier |
| `task` | One of `IC`, `UGG`, `GGU`, `ME` |
| `family` | Question type |
| `model` | The UMM whose output is being evaluated |
| `questions` | List of sub-questions, each with the model's answer and judge-assigned score |
| `images` | Reference image(s) and the model-generated image |

**Task distribution:** IC (57), UGG (56), GGU (56), ME (62)

**Models covered:** BAGEL-7B-MoT, OmniGen2, SEED-X-17B, UniWorld-V1
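
The field table above can be turned into a quick sanity check. The following is a minimal sketch, not part of the released tooling: it assumes the cases are stored as `*.json` files directly under `Judge_Consistency/` (the on-disk layout is not documented here), and the helper names `load_cases`, `validate_case`, and `count_by` are hypothetical.

```python
import json
from collections import Counter
from pathlib import Path

# Field names and task labels are taken from the table above.
VALID_TASKS = {"IC", "UGG", "GGU", "ME"}
REQUIRED_FIELDS = {"id", "task", "family", "model", "questions", "images"}


def load_cases(directory):
    """Read every *.json file under `directory` into a flat list of case dicts.

    ASSUMPTION: one JSON file per case or per list of cases; adjust the glob
    pattern if the directory is organized differently.
    """
    cases = []
    for path in sorted(Path(directory).glob("*.json")):
        with open(path, encoding="utf-8") as f:
            data = json.load(f)
        # Accept either a single case (dict) or several cases (list).
        cases.extend(data if isinstance(data, list) else [data])
    return cases


def validate_case(case):
    """Return a list of problems for one case; an empty list means none were found."""
    missing = sorted(REQUIRED_FIELDS - set(case))
    problems = [f"missing field: {name}" for name in missing]
    if "task" in case and case["task"] not in VALID_TASKS:
        problems.append(f"unknown task: {case['task']}")
    return problems


def count_by(cases, field):
    """Tally cases by a top-level field such as `task` or `model`."""
    return Counter(case[field] for case in cases)
```

If the layout assumption holds, `count_by(load_cases("Judge_Consistency"), "task")` should reproduce the task distribution listed above, and the counts should sum to 231.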