Feature Suggestion: Enable Multimodal / Vision Capabilities

#3
by MoQh - opened

Hi, thank you for your amazing work on qwen3.5-122b-a10b-opus-reasoning-gguf!
I really appreciate the reasoning performance of this model so far.
I wanted to ask if you could consider adding visual input (image understanding) support to this version or a future release.
Having multimodal capabilities would make it much more powerful for tasks that involve both text and visual reasoning.
Thanks again for your great contributions and for sharing this model with the community!

You can use any mmproj extracted from the base model. Vision works, just not very well, because this model was trained on text only.
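
For anyone who wants to try this, here is a minimal sketch of how the mmproj might be wired in, assuming you run the GGUF with llama.cpp and its `llama-mtmd-cli` tool. The file names below are hypothetical placeholders, not files shipped in this repo, and the flags should be checked against your llama.cpp build.

```python
import shlex
import subprocess


def build_vision_command(model, mmproj, image, prompt, binary="llama-mtmd-cli"):
    """Build an argument list that pairs a text GGUF with a separate mmproj file.

    Assumes llama.cpp's multimodal CLI; all paths are supplied by the caller.
    """
    return [
        binary,
        "-m", model,          # the reasoning model's GGUF
        "--mmproj", mmproj,   # vision projector extracted from the base model
        "--image", image,     # image to describe
        "-p", prompt,         # text prompt that accompanies the image
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_vision_command(
        model="model-Q4_K_M.gguf",
        mmproj="mmproj-base-f16.gguf",
        image="photo.png",
        prompt="Describe this image.",
    )
    print(shlex.join(cmd))
    # Uncomment to actually run it once the files and the binary are in place:
    # subprocess.run(cmd, check=True)
```

The script only prints the command by default, so you can check the paths before running anything.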
