Vision seems broken

#3
by Duxon - opened

I've downloaded medgemma-27b-it-UD-Q8_K_XL.gguf, and it is not able to deal with images when served via Ollama. For reference, Gemma27b works.

Here is is docker compose I used to start it on my system:

  • RTX 4090D with 48GB VRAM:
  • Intel Xeon W5-3525
services:
  llama-small:
    image: ghcr.io/ggml-org/llama.cpp:full-cuda
    container_name: medgemma
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
    ports:
      - 0.0.0.0:37000:37000
    volumes:
      - /home/slavik/.cache:/root/.cache
    entrypoint: ["./llama-server"]
    #  https://github.com/ggml-org/llama.cpp/tree/master/tools/server
    command: >
      --hf-repo unsloth/medgemma-27b-it-GGUF:UD-Q8_K_XL --hf-token hf_l****
      --alias "medgemma-27b"
      --ctx-size 65536
      --n-gpu-layers 999
      --jinja
      --flash-attn
      --host 0.0.0.0 --port 37000

Everything works for me. Vision works.
I see these files downloaded:

ls -hl .cache/llama.cpp/
total 31G
-rw-r--r-- 1 root root 1.4K Aug 10 02:56 'manifest=unsloth_medgemma-27b-it-GGUF=UD-Q8_K_XL.json'
-rw-r--r-- 1 root root  30G Aug 10 03:01  unsloth_medgemma-27b-it-GGUF_medgemma-27b-it-UD-Q8_K_XL.gguf
-rw-r--r-- 1 root root  221 Aug 10 03:01  unsloth_medgemma-27b-it-GGUF_medgemma-27b-it-UD-Q8_K_XL.gguf.json
-rw-r--r-- 1 root root 819M Aug 10 03:02  unsloth_medgemma-27b-it-GGUF_mmproj-F16.gguf
-rw-r--r-- 1 root root  205 Aug 10 03:02  unsloth_medgemma-27b-it-GGUF_mmproj-F16.gguf.json

I see 39GB of VRAM used.

medgemma-4090D  | prompt eval time =     138.12 ms /   122 tokens (    1.13 ms per token,   883.26 tokens per second)
medgemma-4090D  |        eval time =   11226.62 ms /   295 tokens (   38.06 ms per token,    26.28 tokens per second)

I also wasn't able to get vision working properly. Yes, it loads, yes gemma "see something". But something is clearly wrong. Uploaded few xray photos - 4B medgemma had no problems describing issues, 122B qwen had no problems, this one completely hallucinated (recognized which part of body it but that's all).

Sign up or log in to comment