Robust-U1
How to use Jiaqi-hkust/Robust-U1-SFT with Transformers:
# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Jiaqi-hkust/Robust-U1-SFT", trust_remote_code=True, dtype="auto")

Robust-U1 is a unified MLLM that self-recovers corrupted visual content and reasons over it, enabling robust visual understanding under real-world image degradations.
| Checkpoint | Link | Note |
|---|---|---|
| BAGEL-7B-MoT | ByteDance-Seed/BAGEL-7B-MoT | Used as initial weights for training. |
| Robust-U1 | Jiaqi-hkust/Robust-U1 | Final model for visual self-recovery and multimodal reasoning. |
| Robust-U1-RL | Jiaqi-hkust/Robust-U1-RL | Fine-tuned with reinforcement learning. |
| Robust-U1-SFT | Jiaqi-hkust/Robust-U1-SFT | Fine-tuned with supervised learning. |
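The checkpoints in the table can be selected by name with a small lookup. The sketch below is only illustrative: `CHECKPOINTS`, `repo_id`, `hub_url`, and `load_checkpoint` are hypothetical helpers, not part of this repository. The URL layout `https://huggingface.co/<repo_id>` is the standard Hub convention. Reusing the Robust-U1-SFT loading call for the other repositories is an assumption that has not been verified here.

```python
# Checkpoint names and Hub repository IDs, copied from the table above.
CHECKPOINTS = {
    "BAGEL-7B-MoT": "ByteDance-Seed/BAGEL-7B-MoT",
    "Robust-U1": "Jiaqi-hkust/Robust-U1",
    "Robust-U1-RL": "Jiaqi-hkust/Robust-U1-RL",
    "Robust-U1-SFT": "Jiaqi-hkust/Robust-U1-SFT",
}


def repo_id(name: str) -> str:
    """Return the Hub repository ID for a checkpoint name (hypothetical helper)."""
    try:
        return CHECKPOINTS[name]
    except KeyError:
        raise ValueError(
            f"Unknown checkpoint {name!r}; choose from {sorted(CHECKPOINTS)}"
        ) from None


def hub_url(name: str) -> str:
    """Build the model page URL, assuming the standard huggingface.co/<repo_id> layout."""
    return f"https://huggingface.co/{repo_id(name)}"


def load_checkpoint(name: str):
    """Load a checkpoint with the same call this card shows for Robust-U1-SFT.

    Assumption: that call also works for the other repositories in the table.
    """
    # Imported lazily so the lookup helpers work without transformers installed;
    # the actual load needs network access and enough memory for the weights.
    from transformers import AutoModelForCausalLM

    return AutoModelForCausalLM.from_pretrained(
        repo_id(name), trust_remote_code=True, dtype="auto"
    )


if __name__ == "__main__":
    # Print the model page for every checkpoint without downloading anything.
    for name in CHECKPOINTS:
        print(f"{name}: {hub_url(name)}")
```

Calling `load_checkpoint("Robust-U1-SFT")` would then reproduce the usage snippet at the top of this card. An unknown name raises `ValueError` before any download starts.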
If you find this repository useful, please cite our paper:
@inproceedings{
2026robustu,
title={Robust-U1: Can {MLLM}s Self-Recover Corrupted Visual Content for Robust Understanding?},
author={Jiaqi Tang and Jianmin Chen and Youyang Zhai and Wei Wei and Runtao Liu and Mengjie Zhao and Xiangyu Wu and Qingfa Xiao and Qifeng Chen},
booktitle={Forty-third International Conference on Machine Learning},
year={2026},
url={https://openreview.net/forum?id=I6W6cxVVts}
}