---
license: apache-2.0
pipeline_tag: any-to-any
tags:
- Diffusion Large Language Model
- Multi-Modal Generation and Understanding
---

# 💡 OmniGen2-MICo • OmniGen2-Variant finetuned on Multi-Image Composition Dataset (MICo-150K)

[![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://www.arxiv.org/pdf/2512.07348)
[![ArXiv](https://img.shields.io/badge/arXiv-A42C25?style=for-the-badge&logo=arxiv&logoColor=white&color=blue)](https://www.arxiv.org/abs/2512.07348)
[![Github](https://img.shields.io/badge/MICo150K-000000?style=for-the-badge&logo=github&logoColor=000&logoColor=white)](https://github.com/A113N-W3I/MICo-150K)
[![Project Page](https://img.shields.io/badge/Project_Page-00CED1?style=for-the-badge&logo=web&logoColor=white)](https://mico-150k.github.io/)

## 🎨 Demo

Since [OmniGen2](https://huggingface.co/OmniGen2/OmniGen2) naturally supports multi-reference generation, here we give some comparisons against [OmniGen2](https://huggingface.co/OmniGen2/OmniGen2).

![image](https://cdn-uploads.huggingface.co/production/uploads/655db6a58c2d4379a70837c0/cM9IcdPVO5k67U8MBNEz4.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/655db6a58c2d4379a70837c0/2uZWMesZDQOtRoRLw40_D.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/655db6a58c2d4379a70837c0/hZ0xjXXAf0Tn_iS4YczYP.png)

![image](https://cdn-uploads.huggingface.co/production/uploads/655db6a58c2d4379a70837c0/KDbG4yAHMmzWyDXLKV_le.png)

## ✍️ Citation

If you find this work useful, please cite:

~~~
@article{wei2025mico,
  title={MICo-150K: A Comprehensive Dataset Advancing Multi-Image Composition},
  author={Wei, Xinyu and Cen, Kangrui and Wei, Hongyang and Guo, Zhen and Li, Bairui and Wang, Zeqing and Zhang, Jinrui and Zhang, Lei},
  journal={arXiv preprint arXiv:2512.07348},
  year={2025}
}
~~~

## License

OmniGen2-MICo is licensed under Apache License 2.0.