lucianommartins commited on
Commit
4236362
·
verified ·
1 Parent(s): b1e0061

fix: revert add_generation_prompt regression + preserve_thinking default

Browse files

Two fixes based on reviewer feedback from vllm-project/vllm#45553:

1. Restore original add_generation_prompt guard: suppress <|turn>model
when prev_message_type is 'tool_response' or 'tool_call'. The model
continues the same turn after tool responses — a new <|turn>model
breaks multi-step tool chains (assistant->tool->assistant->tool).
Keep the <|channel>thought\n cue for thinking-enabled after tool
responses.

2. Change preserve_thinking default from true to false. Per Gemma4 docs
(https://ai.google.dev/gemma/docs/core/prompt-formatting-gemma4
#managing-thought-context): 'You must remove (strip) the model's
generated thoughts from the previous turn.' Thoughts within a
single tool call chain are still preserved (they're after
last_user_idx).

Ref: https://github.com/vllm-project/vllm/pull/45553
Ref: https://github.com/vllm-project/vllm/pull/42776

Files changed (1) hide show
  1. chat_template.jinja +5 -10
chat_template.jinja CHANGED
@@ -178,7 +178,7 @@
178
  {%- set ns = namespace(prev_message_type=None, prev_non_tool_role=None) -%}
179
  {%- set loop_messages = messages -%}
180
  {%- set enable_thinking = enable_thinking | default(false) -%}
181
- {%- set preserve_thinking = preserve_thinking | default(true) -%}
182
  {{- bos_token -}}
183
  {#- Handle System/Tool Definitions Block -#}
184
  {%- if enable_thinking or tools or (messages and messages[0]['role'] in ['system', 'developer']) -%}
@@ -376,17 +376,12 @@
376
  {%- endfor -%}
377
 
378
  {%- if add_generation_prompt -%}
379
- {%- if ns.prev_message_type != 'tool_call' -%}
380
  {{- '<|turn>model\n' -}}
381
- {%- if enable_thinking and ns.prev_message_type == 'tool_response' -%}
382
- {{- '<|channel>thought\n' -}}
383
- {%- endif -%}
384
- {%- endif -%}
385
-
386
- {%- if not enable_thinking -%}
387
- {#- Suppress thinking — but not when awaiting tool responses -#}
388
- {%- if ns.prev_message_type != 'tool_call' -%}
389
  {{- '<|channel>thought\n<channel|>' -}}
390
  {%- endif -%}
 
 
391
  {%- endif -%}
392
  {%- endif -%}
 
178
  {%- set ns = namespace(prev_message_type=None, prev_non_tool_role=None) -%}
179
  {%- set loop_messages = messages -%}
180
  {%- set enable_thinking = enable_thinking | default(false) -%}
181
+ {%- set preserve_thinking = preserve_thinking | default(false) -%}
182
  {{- bos_token -}}
183
  {#- Handle System/Tool Definitions Block -#}
184
  {%- if enable_thinking or tools or (messages and messages[0]['role'] in ['system', 'developer']) -%}
 
376
  {%- endfor -%}
377
 
378
  {%- if add_generation_prompt -%}
379
+ {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
380
  {{- '<|turn>model\n' -}}
381
+ {%- if not enable_thinking -%}
 
 
 
 
 
 
 
382
  {{- '<|channel>thought\n<channel|>' -}}
383
  {%- endif -%}
384
+ {%- elif ns.prev_message_type == 'tool_response' and enable_thinking -%}
385
+ {{- '<|channel>thought\n' -}}
386
  {%- endif -%}
387
  {%- endif -%}