Instructions to use google/gemma-4-12B-it with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use google/gemma-4-12B-it with Transformers:
# Load model directly from transformers import AutoProcessor, AutoModelForMultimodalLM processor = AutoProcessor.from_pretrained("google/gemma-4-12B-it") model = AutoModelForMultimodalLM.from_pretrained("google/gemma-4-12B-it") - Notebooks
- Google Colab
- Kaggle
fix: render thinking channel regardless of tool_calls presence
Browse filesRemove message.get('tool_calls') guard from thinking channel
rendering. Reasoning on assistant messages without tool_calls
(e.g. final answer after a tool chain) was silently dropped
from conversation history.
Ref: https://github.com/vllm-project/vllm/pull/45553
- chat_template.jinja +1 -1
chat_template.jinja
CHANGED
|
@@ -234,7 +234,7 @@
|
|
| 234 |
{#- Render reasoning/reasoning_content as thinking channel (tool-call turns only) -#}
|
| 235 |
{%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
|
| 236 |
{%- set thinking_gate = (loop.index0 > ns_turn.last_user_idx) or preserve_thinking -%}
|
| 237 |
-
{%- if thinking_text and thinking_gate
|
| 238 |
{{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
|
| 239 |
{%- endif -%}
|
| 240 |
|
|
|
|
| 234 |
{#- Render reasoning/reasoning_content as thinking channel (tool-call turns only) -#}
|
| 235 |
{%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
|
| 236 |
{%- set thinking_gate = (loop.index0 > ns_turn.last_user_idx) or preserve_thinking -%}
|
| 237 |
+
{%- if thinking_text and thinking_gate -%}
|
| 238 |
{{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
|
| 239 |
{%- endif -%}
|
| 240 |
|