Removing the model's vision feature?

#2
by NovaYear - opened

Hi friend,
I've been following your work for a while now, and the models you've created are truly remarkable examples of successful optimizations achieved through hard work.
I have an idea: since we're creating a model based on Claude Opus, could you perhaps develop it further with more coding-oriented features? Removing the Vision feature from the general-purpose model could reduce the space it occupies in VRAM. For someone using autonomous coding, the Vision feature isn't really necessary. Wouldn't removing the Vision feature and fine-tuning for coding result in a much better model?
Perhaps, if you have the time and resources, you could release a different fork as TeichAI/Qwen3.6-27B-Claude-Opus-CODER-Reasoning-Distill-v2-GGUF.
Finally, thank you so much for your efforts; you're guiding us.

TeichAI org

It would result in a much better coding model, but then it wouldn't be qwen3.6 architecture and would also require much more compute to occupy the space the vision encoder used. 🤷

You can run without the mmproj to save some VRAM: just download the GGUF manually, or use --no-mmproj in llama.cpp.
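
For reference, here is a minimal sketch of the two options above, written as a small Python helper that only builds the command line and does not run anything. It assumes the llama-server binary from llama.cpp is on your PATH and that -m (local file) and -hf (Hugging Face repo) are how the model is selected; the function name, the file name, and the repo placeholder are hypothetical.

```python
import shlex
from typing import List, Optional


def build_llama_cmd(model_path: Optional[str] = None,
                    hf_repo: Optional[str] = None) -> List[str]:
    """Build a llama-server command line that runs text-only."""
    # Exactly one source must be given: a local GGUF or a Hub repo.
    if (model_path is None) == (hf_repo is None):
        raise ValueError("pass exactly one of model_path or hf_repo")
    if model_path is not None:
        # Option 1: manually downloaded GGUF. No --mmproj file is passed,
        # so no vision projector is loaded.
        return ["llama-server", "-m", model_path]
    # Option 2: let llama.cpp fetch the model, but skip the projector.
    return ["llama-server", "-hf", hf_repo, "--no-mmproj"]


if __name__ == "__main__":
    # "model.gguf" and "ORG/REPO-GGUF" are placeholders, not real files/repos.
    print(shlex.join(build_llama_cmd(model_path="model.gguf")))
    print(shlex.join(build_llama_cmd(hf_repo="ORG/REPO-GGUF")))
```

Either way the language model weights are unchanged; only the separate projector file is left out of memory.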

Well, the vision encoder is a separate model that, if images are present, encodes them and projects them into Qwen’s hidden state. It could be removed with no hit to non-vision performance, but removing it would not improve performance without significant retraining, and the encoder is only a couple hundred million parameters anyway, so the VRAM savings would be marginal.

A dedicated agentic coding model trained on frontier coding traces would be great, though. But coding quality mainly depends on the large amount of code a model sees in pretraining, not on the few thousand examples it sees in a frontier finetune (which mainly influences how it structures its agentic workflow and reasoning). So I would recommend Devstral 2 small 24B as a good model, since it is a purely agentic, coding-focused model from the ground up.
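
To put a rough number on "marginal", here is a back-of-the-envelope sketch. The parameter counts come from this thread ("a couple hundred million" for the encoder, 27B from the model name); the bits-per-weight values are assumptions for illustration (an f16 projector and a roughly 4-bit quant of the language model), and the estimate ignores KV cache and activations.

```python
def weight_bytes(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of a set of weights in bytes."""
    return n_params * bits_per_weight / 8


# Figures taken from the discussion.
ENCODER_PARAMS = 200e6   # "a couple hundred million"
LLM_PARAMS = 27e9        # "27B" in the model name

# Assumptions, not measured values.
ENCODER_BITS = 16.0      # projector stored as f16
LLM_BITS = 4.5           # roughly a 4-bit quant including overhead


def encoder_share(encoder_params: float = ENCODER_PARAMS,
                  llm_params: float = LLM_PARAMS,
                  encoder_bits: float = ENCODER_BITS,
                  llm_bits: float = LLM_BITS) -> float:
    """Fraction of total weight memory taken by the vision encoder."""
    enc = weight_bytes(encoder_params, encoder_bits)
    llm = weight_bytes(llm_params, llm_bits)
    return enc / (enc + llm)


if __name__ == "__main__":
    gib = 1024 ** 3
    enc_gib = weight_bytes(ENCODER_PARAMS, ENCODER_BITS) / gib
    llm_gib = weight_bytes(LLM_PARAMS, LLM_BITS) / gib
    print(f"encoder: {enc_gib:.2f} GiB, language model: {llm_gib:.2f} GiB")
    print(f"encoder share of weights: {encoder_share():.1%}")
```

Under these assumptions the encoder is only a few percent of the weight memory, which is why dropping it frees little VRAM compared with choosing a smaller quant.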

I'm not sure I understand what you are suggesting since qwen 3.6 coding quality is miles ahead of devstral 2 small already?

TeichAI org
edited Apr 30

He has said in other discussions that he is interested in getting traces of something like Opus not just creating new projects from scratch, but also being dropped into an already populated git repo and tasked with exploring, auditing, fixing, or implementing things.

TeichAI org

I'm working on a new version of agentic datagen that could make this nice and easy to do. We'll be back soon with some new data, and a new model will follow soon after :)

Personally, I think removing vision from the model is a terrible idea. Some of us use these models with vision to iterate on UI/UX alongside coding, with automated screenshot generation and analysis as UI/UX development happens.
