Issue with the chat template in opencode

#6
by Milor123 - opened

I tried loading the model normally without chat template file options in llama.cpp, and I also included its .ninja files, but I get the same error with all of them.

main: server is listening on http://127.0.0.1:10000
main: starting the main loop...
srv  update_slots: all slots are idle
srv   operator (): got exception: {"error":{"code":500,"message":"\n------------\nWhile executing CallExpression at line 85, column 32 in source:\n...first %}↵            {{- raise_exception('System message must be at the beginnin...\n                                           ^\nError: Jinja Exception: System message must be at the beginning.","type":"server_error"}}
srv  log_server_r: done request: POST /v1/chat/completions 127.0.0.1 500
srv   operator (): operator (): cleaning up before exit...

I had to resort to using my 3.6_chat_template-v10.jinja file, which did work, but I'm not sure if it breaks anything; for now, it seems to be working normally, it calls the tools correctly, but I don't know if it's the right way to do it.

Same in claude code, useing https://huggingface.co/froggeric/Qwen-Fixed-Chat-Templates to fix it.
However, after the context exceeds 100k(rough observation), tool calls frequently malfunction—for instance, a single command may be executed repeatedly multiple times, or even get stuck in an infinite loop. This issue was not encountered in Qwen3.6-35B-A3B-UD-Q4_K_M.gguf(unsloth) or Qwen3.6-35B-A3B-APEX-I-Compact.gguf. I'm not sure whether this is an issue with the model itself , command parameter for llama-server or with the third-fixed chat template.

Aside from the context-length-related issues, the model's tool-calling accuracy and proactiveness are excellent and justify its rating. It would be great to see a solution (whether model-side or chat-template-side) for the problems.

llama-server -m "ornith-1.0-35b-Q4_K_M.gguf" `
    --alias Qwen3.6-35B-A3B `
    -c 261244 `
    -cram 1024 `
    --no-mmap `
    -np 1 `
    -ngl 99 `
    -ncmoe 12 `
    -fa on `
    -ctk q8_0 -ctv q8_0 `
    -b 1024 -ub 1024 `
    -t 4 -tb 2 --threads-http 4 `
    --prio 2 `
    --temp 0.6 `
    --top-p 0.95 `
    --top-k 20 `
    --min-p 0.0 `
    --presence-penalty 0.0 `
    --repeat-penalty 1.0 `
    --log-file server.log

Sign up or log in to comment