Vision👍

by dpe1 - opened 10 days ago

This model has better vision than Qwen 3.6 35BA3B and was able to get further when playing Pokemon Crystal using just vision.

sonali-kumari11

Google org 6 days ago

Hi @dpe1 -

Thanks for sharing your feedback! It's great to hear that you were impressed by the vision capabilities of the Gemma 4 12B variant.

nixudos

5 days ago

The default resolution of input images is set way too low. The 12B model is not able to pick up text on a paper bag in a photo, which all Qwen models did without a hitch in all my tests.
There should be a way to change it for the other Gemma 4 models in llama.cpp, but llama.cpp currently doesn't support local MCP servers, so I'm stuck between using local MCP servers with myopic vision in LM Studio and using llama.cpp with proper vision but no local MCP.
I wish the resolution parameters had been exposed in the chat template.
