17.8 GB
484 files
Updated 3 days ago

Unison-Judge is a fine-tuned Qwen3-VL-8B vision-language model that serves as the local automatic judge for the Unison benchmark. It scores the outputs of unified multimodal models (UMMs) across all four unified tasks (IC, UGG, GGU, and ME) without requiring a hosted API.

Judge Consistency Data

The Judge_Consistency/ directory contains 231 evaluation cases used to assess the scoring consistency of Unison-Judge across all four tasks.

| Field | Description |
| --- | --- |
| id | Item identifier |
| task | One of IC, UGG, GGU, ME |
| family | Question type |
| model | The UMM whose output is being evaluated |
| questions | List of sub-questions, each with the model's answer and judge-assigned score |
| images | Reference image(s) and the model-generated image |

Task distribution: IC (57), ME (62), GGU (56), UGG (56)
Models covered: BAGEL-7B-MoT, OmniGen2, SEED-X-17B, UniWorld-V1
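
The fields and task counts above can be checked with a short script. The following is a minimal sketch, not the official loader: it assumes each case is stored as its own JSON file somewhere under Judge_Consistency/ and that the top-level keys match the field names listed above. The on-disk layout and the helper names (`validate_case`, `load_cases`, `task_distribution`) are illustrative assumptions, so adjust them to the actual files.

```python
import json
from collections import Counter
from pathlib import Path

# Task labels and field names as documented in this README.
VALID_TASKS = {"IC", "UGG", "GGU", "ME"}
REQUIRED_FIELDS = ("id", "task", "family", "model", "questions", "images")


def validate_case(case: dict) -> None:
    """Raise ValueError if a case lacks a documented field or has an unknown task."""
    missing = [field for field in REQUIRED_FIELDS if field not in case]
    if missing:
        raise ValueError(f"case {case.get('id')!r} is missing fields: {missing}")
    if case["task"] not in VALID_TASKS:
        raise ValueError(f"case {case['id']!r} has unknown task {case['task']!r}")


def load_cases(root: str) -> list[dict]:
    """Load every *.json file under root.

    Assumption: one evaluation case per JSON file. The real directory
    layout may differ.
    """
    cases = []
    for path in sorted(Path(root).rglob("*.json")):
        with path.open(encoding="utf-8") as fh:
            case = json.load(fh)
        validate_case(case)
        cases.append(case)
    return cases


def task_distribution(cases: list[dict]) -> Counter:
    """Count how many cases belong to each task."""
    return Counter(case["task"] for case in cases)
```

If the layout assumption holds, `task_distribution(load_cases("Judge_Consistency"))` should reproduce the task distribution listed above, with the counts summing to 231.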

Last updated: Jun 30