no mmproj to run vision?

#1
by alx-d - opened

I am trying to analyze an image file with medgemma 4q and I get this response:
"I am unable to access local files or specific URLs, including the one you
provided (/home/alex/img/path/diab.jpg). Therefore, I cannot
analyze the retinal fundus image and provide a clinical grading for
diabetic retinopathy or other pathologies.
To get an analysis, you would need to:

  1. Upload the image: Share the image file directly with me (if
    possible through the platform) or describe its features
    in detail^C>>> /exit"

With the help of Claude, I found out that an mmproj file seems to be missing?

Claude:
Bartowski hasn't included the mmproj in his 27b GGUF repo. The 27B multimodal variant provides all the capabilities of the 4B multimodal variant but with significantly improved language capabilities as well as improved EHR understanding and anatomy localization. (arXiv)
So the situation is:

✅ Google released medgemma-27b multimodal (vision capable)
✅ Bartowski quantized it to GGUF
❌ Bartowski did not include the mmproj file in the repo, so you can't use its vision capabilities via llama.cpp/Ollama

Thank you! I have found them.
Sorry, I didn't click on the files; I only looked on the right side.
Perhaps it would help to list them in the description so that they show up in a search for "mmproj". That is the search I did, and probably what Claude and Gemini did too, which is why they advised wrongly.
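
For anyone landing here with the same question, here is a minimal sketch of checking a repo's file list for projector files instead of relying on a description search. It assumes the projector files are GGUF files with "mmproj" in their name, which is a common convention but not guaranteed. The function names and the example file names are hypothetical; `list_repo_files` is from the `huggingface_hub` package and needs network access.

```python
from typing import Iterable, List


def find_mmproj_files(filenames: Iterable[str]) -> List[str]:
    """Return the names that look like multimodal projector files.

    Assumption: projector files are GGUF files with "mmproj" in the name.
    """
    return sorted(
        name
        for name in filenames
        if "mmproj" in name.lower() and name.lower().endswith(".gguf")
    )


def list_repo_mmproj(repo_id: str) -> List[str]:
    """Fetch a Hugging Face repo's file list and filter it for mmproj files."""
    # Imported lazily so the filter above also works without the package.
    from huggingface_hub import list_repo_files

    return find_mmproj_files(list_repo_files(repo_id))


if __name__ == "__main__":
    # Hypothetical file list for illustration; real names differ per repo.
    example = ["README.md", "model-Q4_K_M.gguf", "mmproj-model-f16.gguf"]
    print(find_mmproj_files(example))
```

To check a real repo, call `list_repo_mmproj` with its repo id. If the list comes back empty, the projector may be named differently or may genuinely be absent, so it is still worth opening the files list by hand.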
