fix: restore model turn + thinking cue after tool responses

The generation prompt suppresses <|turn>model after tool responses, which prevents the model from re-entering its thinking state in multi-turn tool-calling flows.

Fix:
- Remove 'tool_response' from the suppression condition so the model gets its turn marker after tool responses.
- Inject <|channel>thought when enable_thinking=true AND the previous message was a tool_response, nudging the model back into its reasoning state machine.

Normal (non-tool) turns are unchanged — the model self-generates the thinking channel cue as usual.
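
The before/after behaviour can be sketched as a small truth table in plain Python. This is an illustration only, not the template itself: the function names are hypothetical, the "text" label for a normal turn is an assumed value of ns.prev_message_type, and whatever the template emits later in its `not enable_thinking` branch is left out.

```python
# Minimal sketch of the generation-prompt prefix before and after this change.
# Function names and the "text" message type are hypothetical; only the
# conditions and the emitted markers come from the diff.

def prefix_before(prev_message_type: str) -> str:
    """Old condition: no turn marker after a tool_call or a tool_response."""
    if prev_message_type != "tool_response" and prev_message_type != "tool_call":
        return "<|turn>model\n"
    return ""


def prefix_after(prev_message_type: str, enable_thinking: bool) -> str:
    """Patched condition: turn marker after tool responses, plus a thinking cue."""
    out = ""
    if prev_message_type != "tool_call":
        out += "<|turn>model\n"
        # The cue is injected only when thinking is enabled AND the previous
        # message was a tool response; other turns get the turn marker alone.
        if enable_thinking and prev_message_type == "tool_response":
            out += "<|channel>thought\n"
    return out


if __name__ == "__main__":
    for prev in ("text", "tool_call", "tool_response"):
        for thinking in (False, True):
            print(prev, thinking, repr(prefix_before(prev)),
                  repr(prefix_after(prev, thinking)))
```

Only the tool_response rows differ between the two functions; tool_call and normal turns produce the same prefix as before.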

Ref: https://github.com/vllm-project/vllm/issues/45039

chat_template.jinja +4 -1

@@ -374,8 +374,11 @@
 {%- endfor -%}
 {%- if add_generation_prompt -%}
-    {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
+    {%- if ns.prev_message_type != 'tool_call' -%}
         {{- '<|turn>model\n' -}}
+        {%- if enable_thinking and ns.prev_message_type == 'tool_response' -%}
+            {{- '<|channel>thought\n' -}}
+        {%- endif -%}
     {%- endif -%}
     {%- if not enable_thinking -%}