Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?

Robust-U1 is a unified MLLM that self-recovers corrupted visual content and reasons over it, enabling robust visual understanding under real-world image degradations.

Paper: Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding?
Repository: jqtangust/Robust-U1
Project Page: Hugging Face Space

🏰 Pretrained checkpoints (reference)

Checkpoint	Link	Note
BAGEL-7B-MoT	ByteDance-Seed/BAGEL-7B-MoT	Used as initial weights for training.
Robust-U1	Jiaqi-hkust/Robust-U1	Final model for visual self-recovery and multimodal reasoning.
Robust-U1-RL	Jiaqi-hkust/Robust-U1-RL	Fine-tuned with reinforcement learning.
Robust-U1-SFT	Jiaqi-hkust/Robust-U1-SFT	Fine-tuned with supervised learning.
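
The checkpoints listed above can be fetched from the Hugging Face Hub. The following is a minimal sketch, not an official loading script: it assumes the `huggingface_hub` package is installed, and the helper names (`CHECKPOINTS`, `local_dir_for`, `download_checkpoint`) and the `checkpoints/` target directory are hypothetical choices made for illustration. Only the repository IDs come from the table.

```python
from pathlib import Path

# Repository IDs copied from the checkpoint table above.
CHECKPOINTS = {
    "BAGEL-7B-MoT": "ByteDance-Seed/BAGEL-7B-MoT",
    "Robust-U1": "Jiaqi-hkust/Robust-U1",
    "Robust-U1-RL": "Jiaqi-hkust/Robust-U1-RL",
    "Robust-U1-SFT": "Jiaqi-hkust/Robust-U1-SFT",
}


def local_dir_for(name: str, root: str = "checkpoints") -> Path:
    """Return the (hypothetical) local folder used for a named checkpoint."""
    if name not in CHECKPOINTS:
        raise KeyError(f"Unknown checkpoint: {name!r}")
    return Path(root) / name


def download_checkpoint(name: str, root: str = "checkpoints") -> str:
    """Download every file of the named checkpoint and return its local path."""
    # Imported lazily so the helpers above work without the package installed.
    from huggingface_hub import snapshot_download

    target = local_dir_for(name, root)
    return snapshot_download(repo_id=CHECKPOINTS[name], local_dir=str(target))
```

Calling `download_checkpoint("Robust-U1-SFT")` would place the weights under `checkpoints/Robust-U1-SFT`. How the weights are then loaded for inference depends on the code in the repository, which this sketch does not cover.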

⭐️ Citation

If you find this repository useful, please cite our paper:

@inproceedings{
2026robustu,
title={Robust-U1: Can {MLLM}s Self-Recover Corrupted Visual Content for Robust Understanding?},
author={Jiaqi Tang and Jianmin Chen and Youyang Zhai and Wei Wei and Runtao Liu and Mengjie Zhao and Xiangyu Wu and Qingfa Xiao and Qifeng Chen},
booktitle={Forty-third International Conference on Machine Learning},
year={2026},
url={https://openreview.net/forum?id=I6W6cxVVts}
}

Model size: 15B params (Safetensors, tensor type BF16)

Paper ID: 2606.08063