Cannot use 35B with Claude Code and Codex.

#19
by beginor - opened

I have downloaded ornith-1.0-35b-q6_k.gguf and am running it with the latest llama.cpp.

The llama.cpp version is:

llama-server --version
version: 9837 (b3fed31b9)
built with AppleClang 17.0.0.17000013 for Darwin arm64

My start script is:

llama-server --port 10800 --flash-attn on --n-gpu-layers 999 --fit off --parallel 1 \
  --ctx-size 131072 --jinja \
  --model ./models/ornith-1.0-35b-q6_k.gguf \
  --mmproj ./models/qwen3.6-35b-a3b-uncensored-heretic-mtp-mmproj-bf16.gguf

The Claude Code version is v2.1.195. Starting Claude Code and just typing hello produces the following error:

⏺ API Error: 400 Unable to generate parser for this template. Automatic parser generation failed:
  ------------
  While executing CallExpression at line 85, column 32 in source:
  ...first %}↵            {{- raise_exception('System message must be at the beginnin...
                                             ^
  Error: Jinja Exception: System message must be at the beginning.

The Codex version is v0.141.0, and it fails with the same error:

■ {"error":{"code":400,"message":"Unable to generate parser for this template.
Automatic parser generation failed: \n------------\nWhile executing CallExpression
at line 85, column 32 in source:\n...first %}↵            {{-
raise_exception('System message must be at the beginnin...\n
^\nError: Jinja Exception: System message must be at the
beginning.","type":"invalid_request_error"}}

Is there any advice about how to fix this?

Following the discussion at https://huggingface.co/deepreinforce-ai/Ornith-1.0-35B/discussions/10 , use the chat template from unsloth: https://huggingface.co/unsloth/Qwen3.5-35B-A3B/blob/main/chat_template.jinja

Save it locally and add the parameter --chat-template-file ./ornith-1.0-35b-gguf/chat_template.jinja to the llama-server command.
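Put together, the workaround might look like the sketch below. It reuses the flags from the start script above unchanged and only appends --chat-template-file. The curl step is an assumption: it takes the raw-file ("resolve") form of the unsloth link, and the target directory ./ornith-1.0-35b-gguf is simply the path used in the parameter above. Adjust both to wherever the template is actually saved.

```shell
# Assumption: fetch the raw template via the "resolve" form of the unsloth link.
mkdir -p ./ornith-1.0-35b-gguf
curl -L -o ./ornith-1.0-35b-gguf/chat_template.jinja \
  https://huggingface.co/unsloth/Qwen3.5-35B-A3B/resolve/main/chat_template.jinja

# Same start script as before, with the template override as the last flag.
llama-server --port 10800 --flash-attn on --n-gpu-layers 999 --fit off --parallel 1 \
  --ctx-size 131072 --jinja \
  --model ./models/ornith-1.0-35b-q6_k.gguf \
  --mmproj ./models/qwen3.6-35b-a3b-uncensored-heretic-mtp-mmproj-bf16.gguf \
  --chat-template-file ./ornith-1.0-35b-gguf/chat_template.jinja
```

With the override in place, llama-server should use the file's template instead of the one embedded in the GGUF, which is the template that raised "System message must be at the beginning."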
