can't load the model for some reason

#1
by klam97xd426 - opened

I can't load your GGUFs of this model in LM Studio. I keep getting the error: missing tensor 'blk.24.attn_k.weight'. I tried both Q5_K_M and Q4_K_M, in both the static and imatrix quants.

I've been using the same settings with the other E4B GGUFs you made, and they load fine. Thanks again for your work. I really appreciate it, as always.

Hi, try updating your llama.cpp; it may be too outdated. The latest llama.cpp works:

llama-cli.exe --model gemma-4-E4B-it-OBLITERATED.i1-Q5_K_M.gguf --reasoning on
load_backend: loaded RPC backend from C:\Users\Administrator\Downloads\llama-b8827-bin-win-cuda-13.1-x64\ggml-rpc.dll
load_backend: loaded CPU backend from C:\Users\Administrator\Downloads\llama-b8827-bin-win-cuda-13.1-x64\ggml-cpu-alderlake.dll

Loading model...


▄▄ ▄▄
██ ██
██ ██  ▀▀█▄ ███▄███▄  ▀▀█▄    ▄████ ████▄ ████▄
██ ██ ▄█▀██ ██ ██ ██ ▄█▀██    ██    ██ ██ ██ ██
██ ██ ▀█▄██ ██ ██ ██ ▀█▄██ ██ ▀████ ████▀ ████▀
                                    ██    ██
                                    ▀▀    ▀▀

build      : b8827-5e6c0e18b
model      : gemma-4-E4B-it-OBLITERATED.i1-Q5_K_M.gguf
modalities : text

available commands:
  /exit or Ctrl+C     stop or exit
  /regen              regenerate the last response
  /clear              clear the chat history
  /read <file>        add a text file
  /glob <pattern>     add text files using globbing pattern


> hey

[Start thinking]
1.  **Analyze the user's input:** The user said "hey." It's a very casual, generic greeting.
2.  **Determine the desired response type:** The response should be friendly, welcoming, and conversational but doesn't need to be huge, allowing the user to easily take the next turn.
3.  **Brainstorm options:**
    *   "Hey!" (Too short, maybe not enough personality)
    *   "What's up?" (Very casual, good)
    *   "Hello there. How can I help you?" (Polite, slightly formal)
    *   "Hi! What's on your mind?") (Friendly, inviting)
4.  **Select the best option:** "Hey! How can I help you today?" is a good all-rounder: it matches the tone, is easygoing, and tells the user that I'm available to chat/be helpful).

[End thinking]

Hey! What's on your mind?

[ Prompt: 24.1 t/s | Generation: 4.1 t/s ]

That's probably why. The llama.cpp bundled in LM Studio's Linux AppImage is still on

'llama.cpp release b8733 (commit d6f3030)'

I'll try to load it manually when I get home. Thanks for the help.
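
For anyone else hitting the same error, here is a rough way to compare the two build tags mentioned in this thread. It is only a sketch: it assumes llama.cpp release tags have the form `b<number>`, the helper names `build_number` and `is_older_than` are made up for illustration, and b8827 is simply the build shown working above, not a confirmed minimum version for this model.

```python
import re


def build_number(text: str) -> int:
    """Pull the numeric part out of a llama.cpp build tag.

    Assumes the tag looks like 'b8733', possibly embedded in a longer
    string such as 'b8827-5e6c0e18b' or a release description.
    """
    match = re.search(r"\bb(\d+)\b", text)
    if match is None:
        raise ValueError(f"no build tag of the form b<number> in {text!r}")
    return int(match.group(1))


def is_older_than(installed: str, known_working: str) -> bool:
    """Return True if the installed build predates the known-working one."""
    return build_number(installed) < build_number(known_working)


# Strings copied from this thread.
installed = "llama.cpp release b8733 (commit d6f3030)"
known_working = "b8827-5e6c0e18b"

if is_older_than(installed, known_working):
    print("Bundled llama.cpp is older than the build that loaded the model.")
```

An older build number only suggests that an outdated runtime is the likely cause; loading the file with a newer llama.cpp is still the real check.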

klam97xd426 changed discussion status to closed
