Image-Text-to-Text
Transformers
GGUF
Cosmos
English
qwen2_5_vl
nvidia
unsloth
conversational