
RohithMidigudla committed on May 16

Commit e28bc94 (verified) · 1 Parent(s): 9f52bec

Upload folder using huggingface_hub


This view is limited to 50 files because it contains too many changes. See raw diff

Files changed (50)

.gitattributes +56 -0
README.md +63 -0
adapter_config.json +52 -0
adapter_model.safetensors +3 -0
chat_template.jinja +351 -0
checkpoint-100/README.md +210 -0
checkpoint-100/adapter_config.json +52 -0
checkpoint-100/adapter_model.safetensors +3 -0
checkpoint-100/chat_template.jinja +351 -0
checkpoint-100/optimizer.pt +3 -0
checkpoint-100/processor_config.json +75 -0
checkpoint-100/rng_state.pth +3 -0
checkpoint-100/scheduler.pt +3 -0
checkpoint-100/tokenizer.json +3 -0
checkpoint-100/tokenizer_config.json +289 -0
checkpoint-100/trainer_state.json +182 -0
checkpoint-100/training_args.bin +3 -0
checkpoint-1000/README.md +210 -0
checkpoint-1000/adapter_config.json +52 -0
checkpoint-1000/adapter_model.safetensors +3 -0
checkpoint-1000/chat_template.jinja +351 -0
checkpoint-1000/optimizer.pt +3 -0
checkpoint-1000/processor_config.json +75 -0
checkpoint-1000/rng_state.pth +3 -0
checkpoint-1000/scheduler.pt +3 -0
checkpoint-1000/tokenizer.json +3 -0
checkpoint-1000/tokenizer_config.json +289 -0
checkpoint-1000/trainer_state.json +1442 -0
checkpoint-1000/training_args.bin +3 -0
checkpoint-1100/README.md +210 -0
checkpoint-1100/adapter_config.json +52 -0
checkpoint-1100/adapter_model.safetensors +3 -0
checkpoint-1100/chat_template.jinja +351 -0
checkpoint-1100/optimizer.pt +3 -0
checkpoint-1100/processor_config.json +75 -0
checkpoint-1100/rng_state.pth +3 -0
checkpoint-1100/scheduler.pt +3 -0
checkpoint-1100/tokenizer.json +3 -0
checkpoint-1100/tokenizer_config.json +289 -0
checkpoint-1100/trainer_state.json +1582 -0
checkpoint-1100/training_args.bin +3 -0
checkpoint-1200/README.md +210 -0
checkpoint-1200/adapter_config.json +52 -0
checkpoint-1200/adapter_model.safetensors +3 -0
checkpoint-1200/chat_template.jinja +351 -0
checkpoint-1200/optimizer.pt +3 -0
checkpoint-1200/processor_config.json +75 -0
checkpoint-1200/rng_state.pth +3 -0
checkpoint-1200/scheduler.pt +3 -0
checkpoint-1200/tokenizer.json +3 -0

.gitattributes CHANGED

@@ -34,3 +34,59 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 last-checkpoint/tokenizer.json filter=lfs diff=lfs merge=lfs -text

+checkpoint-100/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1000/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1100/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1200/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1300/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1400/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1600/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1700/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1800/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-1900/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-200/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2000/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2100/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2200/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2300/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2400/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2600/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2700/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2800/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-2900/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-300/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3000/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3100/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3200/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3300/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3400/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3600/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3700/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3800/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-3900/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-400/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4000/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4100/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4200/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4300/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4400/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4600/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4700/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4800/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-4900/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-5000/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-5100/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-5200/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-5300/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-5400/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-5500/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-600/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-700/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-800/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+checkpoint-900/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text
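The 56 added lines follow a regular pattern: one LFS rule per checkpoint's tokenizer.json, plus one for the top-level tokenizer.json, listed in plain string order (which is why checkpoint-1000 comes before checkpoint-200). The sketch below regenerates that list. It assumes checkpoints were saved every 100 steps from 100 to 5500, which is inferred from the entries above rather than stated by the commit; the function name is hypothetical.

```python
# Sketch: rebuild the .gitattributes rules added in this hunk.
# Assumption: checkpoints exist at steps 100, 200, ..., 5500 (inferred from the diff).
LFS_ATTRS = "filter=lfs diff=lfs merge=lfs -text"


def tokenizer_lfs_rules(first_step=100, last_step=5500, interval=100):
    """Return one LFS rule per tokenizer.json path, in plain string order."""
    paths = [
        f"checkpoint-{step}/tokenizer.json"
        for step in range(first_step, last_step + 1, interval)
    ]
    paths.append("tokenizer.json")  # top-level tokenizer
    # Plain string sorting puts "checkpoint-100/" before "checkpoint-1000/"
    # and "checkpoint-1900/" before "checkpoint-200/".
    return [f"{path} {LFS_ATTRS}" for path in sorted(paths)]


rules = tokenizer_lfs_rules()
print(len(rules))
print("\n".join(rules[:3]))
```

The rule count matches the "+56 -0" reported for .gitattributes in the file list.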

README.md ADDED

	@@ -0,0 +1,63 @@

+---
+base_model: unsloth/gemma-4-E4B-it
+library_name: peft
+model_name: telugu
+tags:
+- base_model:adapter:unsloth/gemma-4-E4B-it
+- lora
+- sft
+- transformers
+- trl
+- unsloth
+licence: license
+pipeline_tag: text-generation
+---
+# Model Card for telugu
+This model is a fine-tuned version of [unsloth/gemma-4-E4B-it](https://huggingface.co/unsloth/gemma-4-E4B-it).
+It has been trained using [TRL](https://github.com/huggingface/trl).
+## Quick start
+```python
+from transformers import pipeline
+question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+generator = pipeline("text-generation", model="None", device="cuda")
+output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+print(output["generated_text"])
+```
+## Training procedure
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/rohithsaimidigudla-omnisynkai/gemma-health-adapters/runs/mwc9jt0z)
+This model was trained with SFT.
+### Framework versions
+- PEFT 0.19.1
+- TRL: 0.19.1
+- Transformers: 5.5.0
+- Pytorch: 2.7.0+cu128
+- Datasets: 3.6.0
+- Tokenizers: 0.22.2
+## Citations
+Cite TRL as:
+```bibtex
+@misc{vonwerra2022trl,
+	title        = {{TRL: Transformer Reinforcement Learning}},
+	author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
+	year         = 2020,
+	journal      = {GitHub repository},
+	publisher    = {GitHub},
+	howpublished = {\url{https://github.com/huggingface/trl}}
+}
+```

adapter_config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "Gemma4ForConditionalGeneration",
+    "parent_library": "transformers.models.gemma4.modeling_gemma4",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "unsloth/gemma-4-E4B-it",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.0,
+  "lora_ga_config": null,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.19.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "v_proj",
+    "o_proj",
+    "k_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_bdlora": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
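To make the key LoRA settings in this config concrete, here is a small sketch that computes the scaling factor applied to the low-rank update and the number of trainable parameters one adapted linear layer contributes. The scaling rule (lora_alpha / r, or lora_alpha / sqrt(r) when use_rslora is true) is the standard LoRA convention as used by PEFT. The layer dimensions in the usage example are hypothetical and are not taken from the base model.

```python
import json
import math

# Subset of the adapter_config.json fields shown in the diff above.
ADAPTER_CONFIG = json.loads("""
{
  "r": 16,
  "lora_alpha": 16,
  "lora_dropout": 0.0,
  "use_rslora": false,
  "target_modules": ["gate_proj", "v_proj", "o_proj", "k_proj",
                     "up_proj", "down_proj", "q_proj"]
}
""")


def lora_scaling(config):
    """Factor multiplied onto the low-rank update B @ A."""
    r = config["r"]
    alpha = config["lora_alpha"]
    if config.get("use_rslora", False):
        return alpha / math.sqrt(r)  # rank-stabilized variant
    return alpha / r


def lora_param_count(d_in, d_out, r):
    """Trainable parameters for one adapted linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


print(lora_scaling(ADAPTER_CONFIG))
# Hypothetical layer shape, for illustration only.
print(lora_param_count(d_in=1024, d_out=4096, r=ADAPTER_CONFIG["r"]))
```

With r and lora_alpha both set to 16 and use_rslora disabled, the update is applied at its natural scale.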

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:023fcb9c596c99c5e8d74320f9720621834918ec3bcd5d877b44b0fe0907ce2e
+size 169741912
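The three lines above are a Git LFS pointer, not the weights themselves: the repository stores only the spec version, a SHA-256 object ID, and the byte size, while the actual file lives in LFS storage. A minimal sketch of reading such a pointer follows; the function name and the returned dictionary layout are illustrative choices, not part of any library.

```python
# Sketch: parse the Git LFS pointer shown above (diff "+" prefixes removed).
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:023fcb9c596c99c5e8d74320f9720621834918ec3bcd5d877b44b0fe0907ce2e
size 169741912
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line and break the oid into algorithm and digest."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(POINTER_TEXT)
size_mib = pointer["size_bytes"] / (1024 * 1024)
print(f"{pointer['hash_algo']} {pointer['digest'][:12]}... {size_mib:.1f} MiB")
```

The size field gives the adapter's download size directly, at roughly 162 MiB.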

chat_template.jinja ADDED

	@@ -0,0 +1,351 @@

+{%- macro format_parameters(properties, required, filter_keys=false) -%}
+    {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
+    {%- set ns = namespace(found_first=false) -%}
+    {%- for key, value in properties | dictsort -%}
+        {%- set add_comma = false -%}
+        {%- if not filter_keys or key not in standard_keys -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {{ key }}:{
+            {%- if value['description'] -%}
+                description:<|"|>{{ value['description'] }}<|"|>
+                {%- set add_comma = true -%}
+            {%- endif -%}
+            {%- if value['type'] | upper == 'STRING' -%}
+                {%- if value['enum'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    enum:{{ format_argument(value['enum']) }}
+                {%- endif -%}
+            {%- elif value['type'] | upper == 'ARRAY' -%}
+                {%- if value['items'] is mapping and value['items'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    items:{
+                    {%- set ns_items = namespace(found_first=false) -%}
+                    {%- for item_key, item_value in value['items'] | dictsort -%}
+                        {%- if item_value is not none -%}
+                            {%- if ns_items.found_first %},{% endif -%}
+                            {%- set ns_items.found_first = true -%}
+                            {%- if item_key == 'properties' -%}
+                                properties:{
+                                {%- if item_value is mapping -%}
+                                    {{- format_parameters(item_value, value['items']['required'] | default([])) -}}
+                                {%- endif -%}
+                                }
+                            {%- elif item_key == 'required' -%}
+                                required:[
+                                {%- for req_item in item_value -%}
+                                    <|"|>{{- req_item -}}<|"|>
+                                    {%- if not loop.last %},{% endif -%}
+                                {%- endfor -%}
+                                ]
+                            {%- elif item_key == 'type' -%}
+                                {%- if item_value is string -%}
+                                    type:{{ format_argument(item_value | upper) }}
+                                {%- else -%}
+                                    type:{{ format_argument(item_value | map('upper') | list) }}
+                                {%- endif -%}
+                            {%- else -%}
+                                {{ item_key }}:{{ format_argument(item_value) }}
+                            {%- endif -%}
+                        {%- endif -%}
+                    {%- endfor -%}
+                    }
+                {%- endif -%}
+            {%- endif -%}
+            {%- if value['nullable'] %}
+                {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                nullable:true
+            {%- endif -%}
+            {%- if value['type'] | upper == 'OBJECT' -%}
+                {%- if value['properties'] is defined and value['properties'] is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value['properties'], value['required'] | default([])) -}}
+                    }
+                {%- elif value is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
+                    }
+                {%- endif -%}
+                {%- if value['required'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    required:[
+                    {%- for item in value['required'] | default([]) -%}
+                        <|"|>{{- item -}}<|"|>
+                        {%- if not loop.last %},{% endif -%}
+                    {%- endfor -%}
+                    ]
+                {%- endif -%}
+            {%- endif -%}
+            {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+            type:<|"|>{{ value['type'] | upper }}<|"|>}
+        {%- endif -%}
+    {%- endfor -%}
+{%- endmacro -%}
+{%- macro format_function_declaration(tool_data) -%}
+    declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|>
+    {%- set params = tool_data['function']['parameters'] -%}
+    {%- if params -%}
+        ,parameters:{
+        {%- if params['properties'] -%}
+            properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
+        {%- endif -%}
+        {%- if params['required'] -%}
+            required:[
+            {%- for item in params['required'] -%}
+                <|"|>{{- item -}}<|"|>
+                {{- ',' if not loop.last -}}
+            {%- endfor -%}
+            ],
+        {%- endif -%}
+        {%- if params['type'] -%}
+            type:<|"|>{{- params['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    {%- if 'response' in tool_data['function'] -%}
+        {%- set response_declaration = tool_data['function']['response'] -%}
+        ,response:{
+        {%- if response_declaration['description'] -%}
+            description:<|"|>{{- response_declaration['description'] -}}<|"|>,
+        {%- endif -%}
+        {%- if response_declaration['type'] | upper == 'OBJECT' -%}
+            type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    }
+{%- endmacro -%}
+{%- macro format_argument(argument, escape_keys=True) -%}
+    {%- if argument is string -%}
+        {{- '<|"|>' + argument + '<|"|>' -}}
+    {%- elif argument is boolean -%}
+        {{- 'true' if argument else 'false' -}}
+    {%- elif argument is mapping -%}
+        {{- '{' -}}
+        {%- set ns = namespace(found_first=false) -%}
+        {%- for key, value in argument | dictsort -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {%- if escape_keys -%}
+                {{- '<|"|>' + key + '<|"|>' -}}
+            {%- else -%}
+                {{- key -}}
+            {%- endif -%}
+            :{{- format_argument(value, escape_keys=escape_keys) -}}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- elif argument is sequence -%}
+        {{- '[' -}}
+        {%- for item in argument -%}
+            {{- format_argument(item, escape_keys=escape_keys) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- ']' -}}
+    {%- else -%}
+        {{- argument -}}
+    {%- endif -%}
+{%- endmacro -%}
+{%- macro strip_thinking(text) -%}
+    {%- set ns = namespace(result='') -%}
+    {%- for part in text.split('<channel|>') -%}
+        {%- if '<|channel>' in part -%}
+            {%- set ns.result = ns.result + part.split('<|channel>')[0] -%}
+        {%- else -%}
+            {%- set ns.result = ns.result + part -%}
+        {%- endif -%}
+    {%- endfor -%}
+    {{- ns.result | trim -}}
+{%- endmacro -%}
+{%- macro format_tool_response_block(tool_name, response) -%}
+    {{- '<|tool_response>' -}}
+    {%- if response is mapping -%}
+        {{- 'response:' + tool_name + '{' -}}
+        {%- for key, value in response | dictsort -%}
+            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- else -%}
+        {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}}
+    {%- endif -%}
+    {{- '<tool_response|>' -}}
+{%- endmacro -%}
+{%- set ns = namespace(prev_message_type=None) -%}
+{%- set loop_messages = messages -%}
+{{- bos_token -}}
+{#- Handle System/Tool Definitions Block -#}
+{%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%}
+    {{- '<|turn>system\n' -}}
+    {#- Inject Thinking token at the very top of the FIRST system turn -#}
+    {%- if enable_thinking is defined and enable_thinking -%}
+        {{- '<|think|>\n' -}}
+        {%- set ns.prev_message_type = 'think' -%}
+    {%- endif -%}
+    {%- if messages[0]['role'] in ['system', 'developer'] -%}
+        {%- if messages[0]['content'] is string -%}
+            {{- messages[0]['content'] | trim -}}
+        {%- elif messages[0]['content'] is sequence -%}
+            {%- for item in messages[0]['content'] -%}
+                {{- item['text'] | trim + ' '-}}
+            {%- endfor -%}
+        {%- endif -%}
+        {%- set loop_messages = messages[1:] -%}
+    {%- endif -%}
+    {%- if tools -%}
+        {%- for tool in tools %}
+            {{- '<|tool>' -}}
+            {{- format_function_declaration(tool) | trim -}}
+            {{- '<tool|>' -}}
+        {%- endfor %}
+        {%- set ns.prev_message_type = 'tool' -%}
+    {%- endif -%}
+    {{- '<turn|>\n' -}}
+{%- endif %}
+{#- Pre-scan: find last user message index for reasoning guard -#}
+{%- set ns_turn = namespace(last_user_idx=-1) -%}
+{%- for i in range(loop_messages | length) -%}
+    {%- if loop_messages[i]['role'] == 'user' -%}
+        {%- set ns_turn.last_user_idx = i -%}
+    {%- endif -%}
+{%- endfor -%}
+{#- Loop through messages -#}
+{%- for message in loop_messages -%}
+    {%- if message['role'] != 'tool' -%}
+    {%- set ns.prev_message_type = None -%}
+    {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%}
+    {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#}
+    {%- set prev_nt = namespace(role=None, found=false) -%}
+    {%- if loop.index0 > 0 -%}
+        {%- for j in range(loop.index0 - 1, -1, -1) -%}
+            {%- if not prev_nt.found -%}
+                {%- if loop_messages[j]['role'] != 'tool' -%}
+                    {%- set prev_nt.role = loop_messages[j]['role'] -%}
+                    {%- set prev_nt.found = true -%}
+                {%- endif -%}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- endif -%}
+    {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%}
+    {%- if not continue_same_model_turn -%}
+        {{- '<|turn>' + role + '\n' }}
+    {%- endif -%}
+    {#- Render reasoning/reasoning_content as thinking channel -#}
+    {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
+    {%- if thinking_text and loop.index0 > ns_turn.last_user_idx and message.get('tool_calls') -%}
+        {{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
+    {%- endif -%}
+            {%- if message['tool_calls'] -%}
+                {%- for tool_call in message['tool_calls'] -%}
+                    {%- set function = tool_call['function'] -%}
+                    {{- '<|tool_call>call:' + function['name'] + '{' -}}
+                    {%- if function['arguments'] is mapping -%}
+                        {%- set ns_args = namespace(found_first=false) -%}
+                        {%- for key, value in function['arguments'] | dictsort -%}
+                            {%- if ns_args.found_first %},{% endif -%}
+                            {%- set ns_args.found_first = true -%}
+                            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+                        {%- endfor -%}
+                    {%- elif function['arguments'] is string -%}
+                        {{- function['arguments'] -}}
+                    {%- endif -%}
+                    {{- '}<tool_call|>' -}}
+                {%- endfor -%}
+                {%- set ns.prev_message_type = 'tool_call' -%}
+            {%- endif -%}
+            {%- set ns_tr_out = namespace(flag=false) -%}
+            {%- if message.get('tool_responses') -%}
+                {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#}
+                {%- for tool_response in message['tool_responses'] -%}
+                    {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}}
+                    {%- set ns_tr_out.flag = true -%}
+                    {%- set ns.prev_message_type = 'tool_response' -%}
+                {%- endfor -%}
+            {%- elif message.get('tool_calls') -%}
+                {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#}
+                {%- set ns_tool_scan = namespace(stopped=false) -%}
+                {%- for k in range(loop.index0 + 1, loop_messages | length) -%}
+                    {%- if ns_tool_scan.stopped -%}
+                    {%- elif loop_messages[k]['role'] != 'tool' -%}
+                        {%- set ns_tool_scan.stopped = true -%}
+                    {%- else -%}
+                        {%- set follow = loop_messages[k] -%}
+                        {#- Resolve tool_call_id to function name -#}
+                        {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%}
+                        {%- for tc in message['tool_calls'] -%}
+                            {%- if tc.get('id') == follow.get('tool_call_id') -%}
+                                {%- set ns_tname.name = tc['function']['name'] -%}
+                            {%- endif -%}
+                        {%- endfor -%}
+                        {#- Handle content as string or content-parts array -#}
+                        {%- set tool_body = follow.get('content') -%}
+                        {%- if tool_body is string -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- elif tool_body is sequence and tool_body is not string -%}
+                            {%- set ns_txt = namespace(s='') -%}
+                            {%- for part in tool_body -%}
+                                {%- if part.get('type') == 'text' -%}
+                                    {%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%}
+                                {%- endif -%}
+                            {%- endfor -%}
+                            {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
+                        {%- else -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- endif -%}
+                        {%- set ns_tr_out.flag = true -%}
+                        {%- set ns.prev_message_type = 'tool_response' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- set captured_content -%}
+            {%- if message['content'] is string -%}
+                {%- if role == 'model' -%}
+                    {{- strip_thinking(message['content']) -}}
+                {%- else -%}
+                    {{- message['content'] | trim -}}
+                {%- endif -%}
+            {%- elif message['content'] is sequence -%}
+                {%- for item in message['content'] -%}
+                    {%- if item['type'] == 'text' -%}
+                        {%- if role == 'model' -%}
+                            {{- strip_thinking(item['text']) -}}
+                        {%- else -%}
+                            {{- item['text'] | trim -}}
+                        {%- endif -%}
+                    {%- elif item['type'] == 'image' -%}
+                        {{- '<|image|>' -}}
+                        {%- set ns.prev_message_type = 'image' -%}
+                    {%- elif item['type'] == 'audio' -%}
+                        {{- '<|audio|>' -}}
+                        {%- set ns.prev_message_type = 'audio' -%}
+                    {%- elif item['type'] == 'video' -%}
+                        {{- '<|video|>' -}}
+                        {%- set ns.prev_message_type = 'video' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- endset -%}
+            {{- captured_content -}}
+            {%- set has_content = captured_content | trim | length > 0 -%}
+        {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
+            {{- '<|tool_response>' -}}
+        {%- elif not (ns_tr_out.flag and not has_content) -%}
+            {{- '<turn|>\n' -}}
+        {%- endif -%}
+    {%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
+        {{- '<|turn>model\n' -}}
+    {%- endif -%}
+{%- endif -%}
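The template is long, so a plain-Python sketch of its simplest path may help: text-only, alternating user/assistant messages with no system prompt, no tools, and thinking disabled. It mirrors the strip_thinking macro and the turn markers from the template above. The value of bos_token is an assumption ("<bos>"), since the tokenizer supplies it and it does not appear in this diff; render_simple_chat is a hypothetical helper, not a replacement for the real template.

```python
BOS_TOKEN = "<bos>"  # assumption: the real value comes from the tokenizer


def strip_thinking(text):
    """Mirror of the template's strip_thinking macro: drop '<|channel>...<channel|>' spans."""
    result = ""
    for part in text.split("<channel|>"):
        if "<|channel>" in part:
            result += part.split("<|channel>")[0]
        else:
            result += part
    return result.strip()


def render_simple_chat(messages, add_generation_prompt=True):
    """Render text-only, alternating user/assistant messages.

    Not covered here: system or developer turns, tools, tool calls and
    responses, multimodal content parts, and consecutive assistant messages.
    """
    out = [BOS_TOKEN]
    for message in messages:
        # The template renames the 'assistant' role to 'model'.
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"]
        body = strip_thinking(content) if role == "model" else content.strip()
        out.append(f"<|turn>{role}\n{body}<turn|>\n")
    if add_generation_prompt:
        out.append("<|turn>model\n")
    return "".join(out)


prompt = render_simple_chat([{"role": "user", "content": " Hi "}])
print(repr(prompt))
```

Earlier assistant turns have any thinking channel removed before being replayed, so only the visible answer is fed back into the context.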

checkpoint-100/README.md ADDED

	@@ -0,0 +1,210 @@

+---
+base_model: unsloth/gemma-4-E4B-it
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:unsloth/gemma-4-E4B-it
+- lora
+- sft
+- transformers
+- trl
+- unsloth
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.19.1

checkpoint-100/adapter_config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "Gemma4ForConditionalGeneration",
+    "parent_library": "transformers.models.gemma4.modeling_gemma4",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "unsloth/gemma-4-E4B-it",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.0,
+  "lora_ga_config": null,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.19.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "v_proj",
+    "o_proj",
+    "k_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_bdlora": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
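
A rough reading of the configuration above: when `use_rslora` is false, PEFT scales the LoRA update by `lora_alpha / r`, so with `lora_alpha = 16` and `r = 16` the adapter is applied at scale 1.0. The sketch below recomputes that from a hand-copied subset of the fields. The `lora_scaling` helper is hypothetical and is not part of PEFT.

```python
import json
import math

# Subset of the adapter_config.json fields shown above, copied by hand.
ADAPTER_CONFIG = json.loads("""
{
  "lora_alpha": 16,
  "r": 16,
  "use_rslora": false,
  "target_modules": ["gate_proj", "v_proj", "o_proj", "k_proj",
                     "up_proj", "down_proj", "q_proj"]
}
""")


def lora_scaling(cfg):
    """Hypothetical helper: scale applied to the low-rank update B @ A."""
    if cfg["use_rslora"]:
        # Rank-stabilized LoRA divides by sqrt(r) instead of r.
        return cfg["lora_alpha"] / math.sqrt(cfg["r"])
    return cfg["lora_alpha"] / cfg["r"]


print("scaling:", lora_scaling(ADAPTER_CONFIG))
print("adapted projections:", sorted(ADAPTER_CONFIG["target_modules"]))
```

The seven target modules are the attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`) and the MLP projections (`gate_proj`, `up_proj`, `down_proj`).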

checkpoint-100/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2a0d76b0ebb45ec68a37d642d7342c66a7ebc9bc3239f3387972226f24509e56
+size 169741912
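
The three lines above are a Git LFS pointer, not the weights themselves: each line is a `key value` pair giving the spec version, the SHA-256 of the real file, and its size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper name, not a Git LFS or Hugging Face API.

```python
# Pointer text copied from the diff above (leading "+" diff markers removed).
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:2a0d76b0ebb45ec68a37d642d7342c66a7ebc9bc3239f3387972226f24509e56
size 169741912
"""


def parse_lfs_pointer(text):
    """Hypothetical helper: split each 'key value' line into a dict."""
    fields = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.partition(" ")
            fields[key] = value
    return fields


pointer = parse_lfs_pointer(POINTER)
size_mib = int(pointer["size"]) / (1024 * 1024)
print(pointer["oid"], f"{size_mib:.1f} MiB")
```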

checkpoint-100/chat_template.jinja ADDED

	@@ -0,0 +1,351 @@

+{%- macro format_parameters(properties, required, filter_keys=false) -%}
+    {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
+    {%- set ns = namespace(found_first=false) -%}
+    {%- for key, value in properties | dictsort -%}
+        {%- set add_comma = false -%}
+        {%- if not filter_keys or key not in standard_keys -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {{ key }}:{
+            {%- if value['description'] -%}
+                description:<|"|>{{ value['description'] }}<|"|>
+                {%- set add_comma = true -%}
+            {%- endif -%}
+            {%- if value['type'] | upper == 'STRING' -%}
+                {%- if value['enum'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    enum:{{ format_argument(value['enum']) }}
+                {%- endif -%}
+            {%- elif value['type'] | upper == 'ARRAY' -%}
+                {%- if value['items'] is mapping and value['items'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    items:{
+                    {%- set ns_items = namespace(found_first=false) -%}
+                    {%- for item_key, item_value in value['items'] | dictsort -%}
+                        {%- if item_value is not none -%}
+                            {%- if ns_items.found_first %},{% endif -%}
+                            {%- set ns_items.found_first = true -%}
+                            {%- if item_key == 'properties' -%}
+                                properties:{
+                                {%- if item_value is mapping -%}
+                                    {{- format_parameters(item_value, value['items']['required'] | default([])) -}}
+                                {%- endif -%}
+                                }
+                            {%- elif item_key == 'required' -%}
+                                required:[
+                                {%- for req_item in item_value -%}
+                                    <|"|>{{- req_item -}}<|"|>
+                                    {%- if not loop.last %},{% endif -%}
+                                {%- endfor -%}
+                                ]
+                            {%- elif item_key == 'type' -%}
+                                {%- if item_value is string -%}
+                                    type:{{ format_argument(item_value | upper) }}
+                                {%- else -%}
+                                    type:{{ format_argument(item_value | map('upper') | list) }}
+                                {%- endif -%}
+                            {%- else -%}
+                                {{ item_key }}:{{ format_argument(item_value) }}
+                            {%- endif -%}
+                        {%- endif -%}
+                    {%- endfor -%}
+                    }
+                {%- endif -%}
+            {%- endif -%}
+            {%- if value['nullable'] %}
+                {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                nullable:true
+            {%- endif -%}
+            {%- if value['type'] | upper == 'OBJECT' -%}
+                {%- if value['properties'] is defined and value['properties'] is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value['properties'], value['required'] | default([])) -}}
+                    }
+                {%- elif value is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
+                    }
+                {%- endif -%}
+                {%- if value['required'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    required:[
+                    {%- for item in value['required'] | default([]) -%}
+                        <|"|>{{- item -}}<|"|>
+                        {%- if not loop.last %},{% endif -%}
+                    {%- endfor -%}
+                    ]
+                {%- endif -%}
+            {%- endif -%}
+            {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+            type:<|"|>{{ value['type'] | upper }}<|"|>}
+        {%- endif -%}
+    {%- endfor -%}
+{%- endmacro -%}
+{%- macro format_function_declaration(tool_data) -%}
+    declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|>
+    {%- set params = tool_data['function']['parameters'] -%}
+    {%- if params -%}
+        ,parameters:{
+        {%- if params['properties'] -%}
+            properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
+        {%- endif -%}
+        {%- if params['required'] -%}
+            required:[
+            {%- for item in params['required'] -%}
+                <|"|>{{- item -}}<|"|>
+                {{- ',' if not loop.last -}}
+            {%- endfor -%}
+            ],
+        {%- endif -%}
+        {%- if params['type'] -%}
+            type:<|"|>{{- params['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    {%- if 'response' in tool_data['function'] -%}
+        {%- set response_declaration = tool_data['function']['response'] -%}
+        ,response:{
+        {%- if response_declaration['description'] -%}
+            description:<|"|>{{- response_declaration['description'] -}}<|"|>,
+        {%- endif -%}
+        {%- if response_declaration['type'] | upper == 'OBJECT' -%}
+            type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    }
+{%- endmacro -%}
+{%- macro format_argument(argument, escape_keys=True) -%}
+    {%- if argument is string -%}
+        {{- '<|"|>' + argument + '<|"|>' -}}
+    {%- elif argument is boolean -%}
+        {{- 'true' if argument else 'false' -}}
+    {%- elif argument is mapping -%}
+        {{- '{' -}}
+        {%- set ns = namespace(found_first=false) -%}
+        {%- for key, value in argument | dictsort -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {%- if escape_keys -%}
+                {{- '<|"|>' + key + '<|"|>' -}}
+            {%- else -%}
+                {{- key -}}
+            {%- endif -%}
+            :{{- format_argument(value, escape_keys=escape_keys) -}}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- elif argument is sequence -%}
+        {{- '[' -}}
+        {%- for item in argument -%}
+            {{- format_argument(item, escape_keys=escape_keys) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- ']' -}}
+    {%- else -%}
+        {{- argument -}}
+    {%- endif -%}
+{%- endmacro -%}
+{%- macro strip_thinking(text) -%}
+    {%- set ns = namespace(result='') -%}
+    {%- for part in text.split('<channel|>') -%}
+        {%- if '<|channel>' in part -%}
+            {%- set ns.result = ns.result + part.split('<|channel>')[0] -%}
+        {%- else -%}
+            {%- set ns.result = ns.result + part -%}
+        {%- endif -%}
+    {%- endfor -%}
+    {{- ns.result | trim -}}
+{%- endmacro -%}
+{%- macro format_tool_response_block(tool_name, response) -%}
+    {{- '<|tool_response>' -}}
+    {%- if response is mapping -%}
+        {{- 'response:' + tool_name + '{' -}}
+        {%- for key, value in response | dictsort -%}
+            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- else -%}
+        {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}}
+    {%- endif -%}
+    {{- '<tool_response|>' -}}
+{%- endmacro -%}
+{%- set ns = namespace(prev_message_type=None) -%}
+{%- set loop_messages = messages -%}
+{{- bos_token -}}
+{#- Handle System/Tool Definitions Block -#}
+{%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%}
+    {{- '<|turn>system\n' -}}
+    {#- Inject Thinking token at the very top of the FIRST system turn -#}
+    {%- if enable_thinking is defined and enable_thinking -%}
+        {{- '<|think|>\n' -}}
+        {%- set ns.prev_message_type = 'think' -%}
+    {%- endif -%}
+    {%- if messages[0]['role'] in ['system', 'developer'] -%}
+        {%- if messages[0]['content'] is string -%}
+            {{- messages[0]['content'] | trim -}}
+        {%- elif messages[0]['content'] is sequence -%}
+            {%- for item in messages[0]['content'] -%}
+                {{- item['text'] | trim + ' '-}}
+            {%- endfor -%}
+        {%- endif -%}
+        {%- set loop_messages = messages[1:] -%}
+    {%- endif -%}
+    {%- if tools -%}
+        {%- for tool in tools %}
+            {{- '<|tool>' -}}
+            {{- format_function_declaration(tool) | trim -}}
+            {{- '<tool|>' -}}
+        {%- endfor %}
+        {%- set ns.prev_message_type = 'tool' -%}
+    {%- endif -%}
+    {{- '<turn|>\n' -}}
+{%- endif %}
+{#- Pre-scan: find last user message index for reasoning guard -#}
+{%- set ns_turn = namespace(last_user_idx=-1) -%}
+{%- for i in range(loop_messages | length) -%}
+    {%- if loop_messages[i]['role'] == 'user' -%}
+        {%- set ns_turn.last_user_idx = i -%}
+    {%- endif -%}
+{%- endfor -%}
+{#- Loop through messages -#}
+{%- for message in loop_messages -%}
+    {%- if message['role'] != 'tool' -%}
+    {%- set ns.prev_message_type = None -%}
+    {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%}
+    {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#}
+    {%- set prev_nt = namespace(role=None, found=false) -%}
+    {%- if loop.index0 > 0 -%}
+        {%- for j in range(loop.index0 - 1, -1, -1) -%}
+            {%- if not prev_nt.found -%}
+                {%- if loop_messages[j]['role'] != 'tool' -%}
+                    {%- set prev_nt.role = loop_messages[j]['role'] -%}
+                    {%- set prev_nt.found = true -%}
+                {%- endif -%}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- endif -%}
+    {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%}
+    {%- if not continue_same_model_turn -%}
+        {{- '<|turn>' + role + '\n' }}
+    {%- endif -%}
+    {#- Render reasoning/reasoning_content as thinking channel -#}
+    {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
+    {%- if thinking_text and loop.index0 > ns_turn.last_user_idx and message.get('tool_calls') -%}
+        {{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
+    {%- endif -%}
+            {%- if message['tool_calls'] -%}
+                {%- for tool_call in message['tool_calls'] -%}
+                    {%- set function = tool_call['function'] -%}
+                    {{- '<|tool_call>call:' + function['name'] + '{' -}}
+                    {%- if function['arguments'] is mapping -%}
+                        {%- set ns_args = namespace(found_first=false) -%}
+                        {%- for key, value in function['arguments'] | dictsort -%}
+                            {%- if ns_args.found_first %},{% endif -%}
+                            {%- set ns_args.found_first = true -%}
+                            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+                        {%- endfor -%}
+                    {%- elif function['arguments'] is string -%}
+                        {{- function['arguments'] -}}
+                    {%- endif -%}
+                    {{- '}<tool_call|>' -}}
+                {%- endfor -%}
+                {%- set ns.prev_message_type = 'tool_call' -%}
+            {%- endif -%}
+            {%- set ns_tr_out = namespace(flag=false) -%}
+            {%- if message.get('tool_responses') -%}
+                {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#}
+                {%- for tool_response in message['tool_responses'] -%}
+                    {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}}
+                    {%- set ns_tr_out.flag = true -%}
+                    {%- set ns.prev_message_type = 'tool_response' -%}
+                {%- endfor -%}
+            {%- elif message.get('tool_calls') -%}
+                {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#}
+                {%- set ns_tool_scan = namespace(stopped=false) -%}
+                {%- for k in range(loop.index0 + 1, loop_messages | length) -%}
+                    {%- if ns_tool_scan.stopped -%}
+                    {%- elif loop_messages[k]['role'] != 'tool' -%}
+                        {%- set ns_tool_scan.stopped = true -%}
+                    {%- else -%}
+                        {%- set follow = loop_messages[k] -%}
+                        {#- Resolve tool_call_id to function name -#}
+                        {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%}
+                        {%- for tc in message['tool_calls'] -%}
+                            {%- if tc.get('id') == follow.get('tool_call_id') -%}
+                                {%- set ns_tname.name = tc['function']['name'] -%}
+                            {%- endif -%}
+                        {%- endfor -%}
+                        {#- Handle content as string or content-parts array -#}
+                        {%- set tool_body = follow.get('content') -%}
+                        {%- if tool_body is string -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- elif tool_body is sequence and tool_body is not string -%}
+                            {%- set ns_txt = namespace(s='') -%}
+                            {%- for part in tool_body -%}
+                                {%- if part.get('type') == 'text' -%}
+                                    {%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%}
+                                {%- endif -%}
+                            {%- endfor -%}
+                            {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
+                        {%- else -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- endif -%}
+                        {%- set ns_tr_out.flag = true -%}
+                        {%- set ns.prev_message_type = 'tool_response' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- set captured_content -%}
+            {%- if message['content'] is string -%}
+                {%- if role == 'model' -%}
+                    {{- strip_thinking(message['content']) -}}
+                {%- else -%}
+                    {{- message['content'] | trim -}}
+                {%- endif -%}
+            {%- elif message['content'] is sequence -%}
+                {%- for item in message['content'] -%}
+                    {%- if item['type'] == 'text' -%}
+                        {%- if role == 'model' -%}
+                            {{- strip_thinking(item['text']) -}}
+                        {%- else -%}
+                            {{- item['text'] | trim -}}
+                        {%- endif -%}
+                    {%- elif item['type'] == 'image' -%}
+                        {{- '<|image|>' -}}
+                        {%- set ns.prev_message_type = 'image' -%}
+                    {%- elif item['type'] == 'audio' -%}
+                        {{- '<|audio|>' -}}
+                        {%- set ns.prev_message_type = 'audio' -%}
+                    {%- elif item['type'] == 'video' -%}
+                        {{- '<|video|>' -}}
+                        {%- set ns.prev_message_type = 'video' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- endset -%}
+            {{- captured_content -}}
+            {%- set has_content = captured_content | trim | length > 0 -%}
+        {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
+            {{- '<|tool_response>' -}}
+        {%- elif not (ns_tr_out.flag and not has_content) -%}
+            {{- '<turn|>\n' -}}
+        {%- endif -%}
+    {%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
+        {{- '<|turn>model\n' -}}
+    {%- endif -%}
+{%- endif -%}
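
To make the turn layout of the template above easier to see, here is a simplified Python approximation of its plain-text path only. It is a sketch under stated assumptions: it ignores tools, thinking channels, multimodal parts and the merging of consecutive assistant messages, and `render_simple_chat` is a hypothetical name. In practice the template would be rendered with the tokenizer's `apply_chat_template` rather than re-implemented.

```python
BOS = "<bos>"  # bos_token as declared in tokenizer_config.json


def render_simple_chat(messages, add_generation_prompt=True):
    """Approximate the template for string-only system/user/assistant turns.

    Not covered (assumption): tools, enable_thinking, image/audio/video parts,
    tool responses, and back-to-back assistant messages.
    """
    out = [BOS]
    rest = messages
    # A leading system/developer message becomes its own 'system' turn.
    if messages and messages[0]["role"] in ("system", "developer"):
        out.append("<|turn>system\n" + messages[0]["content"].strip() + "<turn|>\n")
        rest = messages[1:]
    for message in rest:
        # The template renames 'assistant' to 'model'.
        role = "model" if message["role"] == "assistant" else message["role"]
        out.append("<|turn>" + role + "\n" + message["content"].strip() + "<turn|>\n")
    if add_generation_prompt:
        # Open a model turn for the next generation.
        out.append("<|turn>model\n")
    return "".join(out)


prompt = render_simple_chat([
    {"role": "system", "content": "Answer in Telugu."},
    {"role": "user", "content": "Hello"},
])
print(repr(prompt))
```

Each turn opens with `<|turn>` plus the role and closes with `<turn|>`, which is also the `eos_token` declared in the tokenizer configuration.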

checkpoint-100/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7ebe97922ef0bee5a2887cb2ee8f12595764d517de7176ed003caf71939844df
+size 71463733

checkpoint-100/processor_config.json ADDED

	@@ -0,0 +1,75 @@

+{
+  "audio_ms_per_token": 40,
+  "audio_seq_length": 750,
+  "feature_extractor": {
+    "dither": 0.0,
+    "feature_extractor_type": "Gemma4AudioFeatureExtractor",
+    "feature_size": 128,
+    "fft_length": 512,
+    "fft_overdrive": false,
+    "frame_length": 320,
+    "hop_length": 160,
+    "input_scale_factor": 1.0,
+    "max_frequency": 8000.0,
+    "mel_floor": 0.001,
+    "min_frequency": 0.0,
+    "padding_side": "left",
+    "padding_value": 0.0,
+    "per_bin_mean": null,
+    "per_bin_stddev": null,
+    "preemphasis": 0.0,
+    "preemphasis_htk_flavor": true,
+    "return_attention_mask": true,
+    "sampling_rate": 16000
+  },
+  "image_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": false,
+    "do_rescale": true,
+    "do_resize": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_processor_type": "Gemma4ImageProcessor",
+    "image_seq_length": 280,
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 280,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098
+  },
+  "image_seq_length": 280,
+  "processor_class": "Gemma4Processor",
+  "video_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": true,
+    "do_rescale": true,
+    "do_resize": true,
+    "do_sample_frames": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 70,
+    "num_frames": 32,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098,
+    "return_metadata": false,
+    "video_processor_type": "Gemma4VideoProcessor"
+  }
+}
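
Some of the numbers in the processor configuration above are related by simple arithmetic. The sketch below works a few of them out. It assumes that `audio_seq_length` is the maximum number of audio tokens per clip, which the file itself does not state, so the 30-second figure should be read as an inference rather than a documented limit.

```python
import math

# Values copied from processor_config.json above.
AUDIO_MS_PER_TOKEN = 40
AUDIO_SEQ_LENGTH = 750   # assumed: maximum audio tokens per clip
SAMPLING_RATE = 16000    # samples per second
HOP_LENGTH = 160         # samples between successive frames
FRAME_LENGTH = 320       # samples per analysis frame
RESCALE_FACTOR = 0.00392156862745098

# Audio span covered by a full token sequence, in seconds.
max_audio_seconds = AUDIO_MS_PER_TOKEN * AUDIO_SEQ_LENGTH / 1000

# Frame hop and frame width in milliseconds.
hop_ms = HOP_LENGTH * 1000 / SAMPLING_RATE
frame_ms = FRAME_LENGTH * 1000 / SAMPLING_RATE

# Number of feature frames that fall within one audio token.
frames_per_token = AUDIO_MS_PER_TOKEN / hop_ms

print(max_audio_seconds, hop_ms, frame_ms, frames_per_token)
# The image rescale factor is 1/255, mapping 8-bit pixels into [0, 1].
print(math.isclose(RESCALE_FACTOR, 1 / 255))
```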

checkpoint-100/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:098b29492211804ab324a36f37466821d948280bb74fce4ba895c03f13ecd878
+size 14645

checkpoint-100/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bfa39a08ca6ca0b25c44556fe7464362808ae67fd00d1432e1130777acac8674
+size 1465

checkpoint-100/tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cc8d3a0ce36466ccc1278bf987df5f71db1719b9ca6b4118264f45cb627bfe0f
+size 32169626

checkpoint-100/tokenizer_config.json ADDED

	@@ -0,0 +1,289 @@

+{
+  "audio_token": "<|audio|>",
+  "backend": "tokenizers",
+  "boa_token": "<|audio>",
+  "boi_token": "<|image>",
+  "bos_token": "<bos>",
+  "eoa_token": "<audio|>",
+  "eoc_token": "<channel|>",
+  "eoi_token": "<image|>",
+  "eos_token": "<turn|>",
+  "eot_token": "<turn|>",
+  "escape_token": "<|\"|>",
+  "etc_token": "<tool_call|>",
+  "etd_token": "<tool|>",
+  "etr_token": "<tool_response|>",
+  "extra_special_tokens": [
+    "<|video|>"
+  ],
+  "image_token": "<|image|>",
+  "is_local": false,
+  "mask_token": "<mask>",
+  "model_max_length": 131072,
+  "model_specific_special_tokens": {
+    "audio_token": "<|audio|>",
+    "boa_token": "<|audio>",
+    "boi_token": "<|image>",
+    "eoa_token": "<audio|>",
+    "eoc_token": "<channel|>",
+    "eoi_token": "<image|>",
+    "eot_token": "<turn|>",
+    "escape_token": "<|\"|>",
+    "etc_token": "<tool_call|>",
+    "etd_token": "<tool|>",
+    "etr_token": "<tool_response|>",
+    "image_token": "<|image|>",
+    "soc_token": "<|channel>",
+    "sot_token": "<|turn>",
+    "stc_token": "<|tool_call>",
+    "std_token": "<|tool>",
+    "str_token": "<|tool_response>",
+    "think_token": "<|think|>"
+  },
+  "pad_token": "<pad>",
+  "padding_side": "right",
+  "processor_class": "Gemma4Processor",
+  "response_schema": {
+    "properties": {
+      "content": {
+        "type": "string"
+      },
+      "role": {
+        "const": "assistant"
+      },
+      "thinking": {
+        "type": "string"
+      },
+      "tool_calls": {
+        "items": {
+          "properties": {
+            "function": {
+              "properties": {
+                "arguments": {
+                  "additionalProperties": {},
+                  "type": "object",
+                  "x-parser": "gemma4-tool-call"
+                },
+                "name": {
+                  "type": "string"
+                }
+              },
+              "type": "object",
+              "x-regex": "call\\:(?P<name>\\w+)(?P<arguments>\\{.*\\})"
+            },
+            "type": {
+              "const": "function"
+            }
+          },
+          "type": "object"
+        },
+        "type": "array",
+        "x-regex-iterator": "<\\|tool_call>(.*?)<tool_call\\|>"
+      }
+    },
+    "type": "object",
+    "x-regex": "(\\<\\|channel\\>thought\\n(?P<thinking>.*?)\\<channel\\|\\>)?(?P<tool_calls>\\<\\|tool_call\\>.*\\<tool_call\\|\\>)?(?P<content>(?:(?!\\<turn\\|\\>)(?!\\<\\|tool_response\\>).)+)?(?:\\<turn\\|\\>|\\<\\|tool_response\\>)?"
+  },
+  "soc_token": "<|channel>",
+  "sot_token": "<|turn>",
+  "stc_token": "<|tool_call>",
+  "std_token": "<|tool>",
+  "str_token": "<|tool_response>",
+  "think_token": "<|think|>",
+  "tokenizer_class": "GemmaTokenizer",
+  "unk_token": "<unk>",
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<pad>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "1": {
+      "content": "<eos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "2": {
+      "content": "<bos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "3": {
+      "content": "<unk>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "4": {
+      "content": "<mask>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "46": {
+      "content": "<|tool>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "47": {
+      "content": "<tool|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "48": {
+      "content": "<|tool_call>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "49": {
+      "content": "<tool_call|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "50": {
+      "content": "<|tool_response>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "51": {
+      "content": "<tool_response|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "52": {
+      "content": "<|\"|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "98": {
+      "content": "<|think|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "100": {
+      "content": "<|channel>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "101": {
+      "content": "<channel|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "105": {
+      "content": "<|turn>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "106": {
+      "content": "<turn|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "255999": {
+      "content": "<|image>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "256000": {
+      "content": "<|audio>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258880": {
+      "content": "<|image|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258881": {
+      "content": "<|audio|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258882": {
+      "content": "<image|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258883": {
+      "content": "<audio|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258884": {
+      "content": "<|video|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    }
+  }
+}

checkpoint-100/trainer_state.json ADDED

	@@ -0,0 +1,182 @@

+{
+  "best_global_step": null,
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.018195050946142648,
+  "eval_steps": 100,
+  "global_step": 100,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.0009097525473071324,
+      "grad_norm": 1.0602493286132812,
+      "learning_rate": 1.2121212121212122e-06,
+      "loss": 1.7156932830810547,
+      "step": 5
+    },
+    {
+      "epoch": 0.001819505094614265,
+      "grad_norm": 1.1577719449996948,
+      "learning_rate": 2.7272727272727272e-06,
+      "loss": 1.6629371643066406,
+      "step": 10
+    },
+    {
+      "epoch": 0.0027292576419213972,
+      "grad_norm": 1.0288419723510742,
+      "learning_rate": 4.242424242424243e-06,
+      "loss": 1.6706295013427734,
+      "step": 15
+    },
+    {
+      "epoch": 0.00363901018922853,
+      "grad_norm": 2.129403829574585,
+      "learning_rate": 5.7575757575757586e-06,
+      "loss": 1.7363752365112304,
+      "step": 20
+    },
+    {
+      "epoch": 0.004548762736535662,
+      "grad_norm": 1.9468326568603516,
+      "learning_rate": 7.272727272727272e-06,
+      "loss": 1.7111135482788087,
+      "step": 25
+    },
+    {
+      "epoch": 0.0054585152838427945,
+      "grad_norm": 1.1269357204437256,
+      "learning_rate": 8.787878787878788e-06,
+      "loss": 1.6924203872680663,
+      "step": 30
+    },
+    {
+      "epoch": 0.006368267831149927,
+      "grad_norm": 1.4021248817443848,
+      "learning_rate": 1.0303030303030304e-05,
+      "loss": 1.658310317993164,
+      "step": 35
+    },
+    {
+      "epoch": 0.00727802037845706,
+      "grad_norm": 1.313381314277649,
+      "learning_rate": 1.1818181818181819e-05,
+      "loss": 1.5383296012878418,
+      "step": 40
+    },
+    {
+      "epoch": 0.008187772925764192,
+      "grad_norm": 2.4359891414642334,
+      "learning_rate": 1.3333333333333333e-05,
+      "loss": 1.4302565574645996,
+      "step": 45
+    },
+    {
+      "epoch": 0.009097525473071324,
+      "grad_norm": 1.6459542512893677,
+      "learning_rate": 1.484848484848485e-05,
+      "loss": 1.2602953910827637,
+      "step": 50
+    },
+    {
+      "epoch": 0.010007278020378457,
+      "grad_norm": 0.7953159213066101,
+      "learning_rate": 1.6363636363636366e-05,
+      "loss": 1.204326343536377,
+      "step": 55
+    },
+    {
+      "epoch": 0.010917030567685589,
+      "grad_norm": 0.5824465155601501,
+      "learning_rate": 1.787878787878788e-05,
+      "loss": 1.068561840057373,
+      "step": 60
+    },
+    {
+      "epoch": 0.011826783114992722,
+      "grad_norm": 0.39265626668930054,
+      "learning_rate": 1.9393939393939395e-05,
+      "loss": 0.9570062637329102,
+      "step": 65
+    },
+    {
+      "epoch": 0.012736535662299854,
+      "grad_norm": 0.3387283384799957,
+      "learning_rate": 2.090909090909091e-05,
+      "loss": 0.9454713821411133,
+      "step": 70
+    },
+    {
+      "epoch": 0.013646288209606987,
+      "grad_norm": 0.3182811141014099,
+      "learning_rate": 2.2424242424242424e-05,
+      "loss": 0.8901592254638672,
+      "step": 75
+    },
+    {
+      "epoch": 0.01455604075691412,
+      "grad_norm": 0.2735312879085541,
+      "learning_rate": 2.393939393939394e-05,
+      "loss": 0.8491583824157715,
+      "step": 80
+    },
+    {
+      "epoch": 0.015465793304221253,
+      "grad_norm": 0.2376435250043869,
+      "learning_rate": 2.5454545454545454e-05,
+      "loss": 0.8109179496765136,
+      "step": 85
+    },
+    {
+      "epoch": 0.016375545851528384,
+      "grad_norm": 0.2161586880683899,
+      "learning_rate": 2.696969696969697e-05,
+      "loss": 0.76962308883667,
+      "step": 90
+    },
+    {
+      "epoch": 0.017285298398835518,
+      "grad_norm": 0.19587980210781097,
+      "learning_rate": 2.8484848484848486e-05,
+      "loss": 0.7301986694335938,
+      "step": 95
+    },
+    {
+      "epoch": 0.018195050946142648,
+      "grad_norm": 0.20971694588661194,
+      "learning_rate": 3e-05,
+      "loss": 0.7269618034362793,
+      "step": 100
+    },
+    {
+      "epoch": 0.018195050946142648,
+      "eval_loss": 2.605874538421631,
+      "eval_runtime": 1120.0905,
+      "eval_samples_per_second": 33.935,
+      "eval_steps_per_second": 8.484,
+      "step": 100
+    }
+  ],
+  "logging_steps": 5,
+  "max_steps": 5500,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 100,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 6.444622973392128e+16,
+  "train_batch_size": 8,
+  "trial_name": null,
+  "trial_params": null
+}
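
The state above can be sanity-checked with a few lines of Python. The sketch below is illustrative only: it embeds three training entries and the evaluation entry copied from `log_history`, and the helper names `split_log_history` and `steps_per_epoch` are hypothetical, not part of any library. It assumes that `epoch` is reported as `global_step` divided by the number of optimizer steps per epoch.

```python
import json

# Values copied from the trainer_state.json above (log_history abbreviated).
state = json.loads("""
{
  "global_step": 100,
  "epoch": 0.018195050946142648,
  "max_steps": 5500,
  "log_history": [
    {"epoch": 0.0009097525473071324, "learning_rate": 1.2121212121212122e-06,
     "loss": 1.7156932830810547, "step": 5},
    {"epoch": 0.009097525473071324, "learning_rate": 1.484848484848485e-05,
     "loss": 1.2602953910827637, "step": 50},
    {"epoch": 0.018195050946142648, "learning_rate": 3e-05,
     "loss": 0.7269618034362793, "step": 100},
    {"epoch": 0.018195050946142648, "eval_loss": 2.605874538421631, "step": 100}
  ]
}
""")


def split_log_history(log_history):
    """Separate training entries (key 'loss') from evaluation entries (key 'eval_loss')."""
    train = [entry for entry in log_history if "loss" in entry]
    evals = [entry for entry in log_history if "eval_loss" in entry]
    return train, evals


def steps_per_epoch(trainer_state):
    """Estimate optimizer steps per epoch, assuming epoch == global_step / steps_per_epoch."""
    return round(trainer_state["global_step"] / trainer_state["epoch"])


train_logs, eval_logs = split_log_history(state["log_history"])
spe = steps_per_epoch(state)

# Average learning-rate increase per step between the first and last logged entries.
first, last = train_logs[0], train_logs[-1]
lr_slope = (last["learning_rate"] - first["learning_rate"]) / (last["step"] - first["step"])

# Gap between the evaluation loss and the training loss logged at the same step.
gap = eval_logs[-1]["eval_loss"] - last["loss"]

print(f"steps per epoch: {spe}")
print(f"epochs covered by max_steps: {state['max_steps'] / spe:.4f}")
print(f"learning-rate increase per step: {lr_slope:.3e}")
print(f"eval loss minus train loss at step {last['step']}: {gap:.3f}")
```

Under that assumption one epoch is about 5496 optimizer steps, so `max_steps: 5500` ends the run shortly after the first epoch. The logged learning rate rises by about 3.03e-07 per step over these 100 steps, and the step-100 evaluation loss (2.61) is well above the running training loss (0.73).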

checkpoint-100/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:195f79601dec1ad668a414b5c045319cec84f48961f45b7d32762f86750cd8b1
+size 5777

checkpoint-1000/README.md ADDED

	@@ -0,0 +1,210 @@

+---
+base_model: unsloth/gemma-4-E4B-it
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:unsloth/gemma-4-E4B-it
+- lora
+- sft
+- transformers
+- trl
+- unsloth
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.19.1

checkpoint-1000/adapter_config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "Gemma4ForConditionalGeneration",
+    "parent_library": "transformers.models.gemma4.modeling_gemma4",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "unsloth/gemma-4-E4B-it",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.0,
+  "lora_ga_config": null,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.19.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "v_proj",
+    "o_proj",
+    "k_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_bdlora": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
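
For reference, the effective LoRA scale implied by this config can be computed directly. This is a minimal sketch, assuming the standard LoRA convention used by PEFT in which the low-rank update is multiplied by `lora_alpha / r` (or `lora_alpha / sqrt(r)` when `use_rslora` is true). `lora_scaling` is a hypothetical helper written for illustration, not a PEFT function.

```python
import math


def lora_scaling(lora_alpha, r, use_rslora=False):
    """Return the multiplier applied to the low-rank update B @ A (assumed convention)."""
    if use_rslora:
        # Rank-stabilized variant: scale by alpha / sqrt(r).
        return lora_alpha / math.sqrt(r)
    # Classic LoRA: scale by alpha / r.
    return lora_alpha / r


# Values copied from adapter_config.json above.
config = {"lora_alpha": 16, "r": 16, "use_rslora": False}

scale = lora_scaling(**config)
print(f"LoRA scaling factor: {scale}")
```

With `lora_alpha` equal to `r`, the adapter output is added to the base projection without extra amplification or damping.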

checkpoint-1000/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f94c7dd4d79ecdb435c295a616d4707c2bf0e734fbefe7d10ecfa59b195ee625
+size 169741912

checkpoint-1000/chat_template.jinja ADDED

	@@ -0,0 +1,351 @@

+{%- macro format_parameters(properties, required, filter_keys=false) -%}
+    {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
+    {%- set ns = namespace(found_first=false) -%}
+    {%- for key, value in properties | dictsort -%}
+        {%- set add_comma = false -%}
+        {%- if not filter_keys or key not in standard_keys -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {{ key }}:{
+            {%- if value['description'] -%}
+                description:<|"|>{{ value['description'] }}<|"|>
+                {%- set add_comma = true -%}
+            {%- endif -%}
+            {%- if value['type'] | upper == 'STRING' -%}
+                {%- if value['enum'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    enum:{{ format_argument(value['enum']) }}
+                {%- endif -%}
+            {%- elif value['type'] | upper == 'ARRAY' -%}
+                {%- if value['items'] is mapping and value['items'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    items:{
+                    {%- set ns_items = namespace(found_first=false) -%}
+                    {%- for item_key, item_value in value['items'] | dictsort -%}
+                        {%- if item_value is not none -%}
+                            {%- if ns_items.found_first %},{% endif -%}
+                            {%- set ns_items.found_first = true -%}
+                            {%- if item_key == 'properties' -%}
+                                properties:{
+                                {%- if item_value is mapping -%}
+                                    {{- format_parameters(item_value, value['items']['required'] | default([])) -}}
+                                {%- endif -%}
+                                }
+                            {%- elif item_key == 'required' -%}
+                                required:[
+                                {%- for req_item in item_value -%}
+                                    <|"|>{{- req_item -}}<|"|>
+                                    {%- if not loop.last %},{% endif -%}
+                                {%- endfor -%}
+                                ]
+                            {%- elif item_key == 'type' -%}
+                                {%- if item_value is string -%}
+                                    type:{{ format_argument(item_value | upper) }}
+                                {%- else -%}
+                                    type:{{ format_argument(item_value | map('upper') | list) }}
+                                {%- endif -%}
+                            {%- else -%}
+                                {{ item_key }}:{{ format_argument(item_value) }}
+                            {%- endif -%}
+                        {%- endif -%}
+                    {%- endfor -%}
+                    }
+                {%- endif -%}
+            {%- endif -%}
+            {%- if value['nullable'] %}
+                {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                nullable:true
+            {%- endif -%}
+            {%- if value['type'] | upper == 'OBJECT' -%}
+                {%- if value['properties'] is defined and value['properties'] is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value['properties'], value['required'] | default([])) -}}
+                    }
+                {%- elif value is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
+                    }
+                {%- endif -%}
+                {%- if value['required'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    required:[
+                    {%- for item in value['required'] | default([]) -%}
+                        <|"|>{{- item -}}<|"|>
+                        {%- if not loop.last %},{% endif -%}
+                    {%- endfor -%}
+                    ]
+                {%- endif -%}
+            {%- endif -%}
+            {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+            type:<|"|>{{ value['type'] | upper }}<|"|>}
+        {%- endif -%}
+    {%- endfor -%}
+{%- endmacro -%}
+{%- macro format_function_declaration(tool_data) -%}
+    declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|>
+    {%- set params = tool_data['function']['parameters'] -%}
+    {%- if params -%}
+        ,parameters:{
+        {%- if params['properties'] -%}
+            properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
+        {%- endif -%}
+        {%- if params['required'] -%}
+            required:[
+            {%- for item in params['required'] -%}
+                <|"|>{{- item -}}<|"|>
+                {{- ',' if not loop.last -}}
+            {%- endfor -%}
+            ],
+        {%- endif -%}
+        {%- if params['type'] -%}
+            type:<|"|>{{- params['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    {%- if 'response' in tool_data['function'] -%}
+        {%- set response_declaration = tool_data['function']['response'] -%}
+        ,response:{
+        {%- if response_declaration['description'] -%}
+            description:<|"|>{{- response_declaration['description'] -}}<|"|>,
+        {%- endif -%}
+        {%- if response_declaration['type'] | upper == 'OBJECT' -%}
+            type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    }
+{%- endmacro -%}
+{%- macro format_argument(argument, escape_keys=True) -%}
+    {%- if argument is string -%}
+        {{- '<|"|>' + argument + '<|"|>' -}}
+    {%- elif argument is boolean -%}
+        {{- 'true' if argument else 'false' -}}
+    {%- elif argument is mapping -%}
+        {{- '{' -}}
+        {%- set ns = namespace(found_first=false) -%}
+        {%- for key, value in argument | dictsort -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {%- if escape_keys -%}
+                {{- '<|"|>' + key + '<|"|>' -}}
+            {%- else -%}
+                {{- key -}}
+            {%- endif -%}
+            :{{- format_argument(value, escape_keys=escape_keys) -}}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- elif argument is sequence -%}
+        {{- '[' -}}
+        {%- for item in argument -%}
+            {{- format_argument(item, escape_keys=escape_keys) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- ']' -}}
+    {%- else -%}
+        {{- argument -}}
+    {%- endif -%}
+{%- endmacro -%}
+{%- macro strip_thinking(text) -%}
+    {%- set ns = namespace(result='') -%}
+    {%- for part in text.split('<channel|>') -%}
+        {%- if '<|channel>' in part -%}
+            {%- set ns.result = ns.result + part.split('<|channel>')[0] -%}
+        {%- else -%}
+            {%- set ns.result = ns.result + part -%}
+        {%- endif -%}
+    {%- endfor -%}
+    {{- ns.result | trim -}}
+{%- endmacro -%}
+{%- macro format_tool_response_block(tool_name, response) -%}
+    {{- '<|tool_response>' -}}
+    {%- if response is mapping -%}
+        {{- 'response:' + tool_name + '{' -}}
+        {%- for key, value in response | dictsort -%}
+            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- else -%}
+        {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}}
+    {%- endif -%}
+    {{- '<tool_response|>' -}}
+{%- endmacro -%}
+{%- set ns = namespace(prev_message_type=None) -%}
+{%- set loop_messages = messages -%}
+{{- bos_token -}}
+{#- Handle System/Tool Definitions Block -#}
+{%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%}
+    {{- '<|turn>system\n' -}}
+    {#- Inject Thinking token at the very top of the FIRST system turn -#}
+    {%- if enable_thinking is defined and enable_thinking -%}
+        {{- '<|think|>\n' -}}
+        {%- set ns.prev_message_type = 'think' -%}
+    {%- endif -%}
+    {%- if messages[0]['role'] in ['system', 'developer'] -%}
+        {%- if messages[0]['content'] is string -%}
+            {{- messages[0]['content'] | trim -}}
+        {%- elif messages[0]['content'] is sequence -%}
+            {%- for item in messages[0]['content'] -%}
+                {{- item['text'] | trim + ' '-}}
+            {%- endfor -%}
+        {%- endif -%}
+        {%- set loop_messages = messages[1:] -%}
+    {%- endif -%}
+    {%- if tools -%}
+        {%- for tool in tools %}
+            {{- '<|tool>' -}}
+            {{- format_function_declaration(tool) | trim -}}
+            {{- '<tool|>' -}}
+        {%- endfor %}
+        {%- set ns.prev_message_type = 'tool' -%}
+    {%- endif -%}
+    {{- '<turn|>\n' -}}
+{%- endif %}
+{#- Pre-scan: find last user message index for reasoning guard -#}
+{%- set ns_turn = namespace(last_user_idx=-1) -%}
+{%- for i in range(loop_messages | length) -%}
+    {%- if loop_messages[i]['role'] == 'user' -%}
+        {%- set ns_turn.last_user_idx = i -%}
+    {%- endif -%}
+{%- endfor -%}
+{#- Loop through messages -#}
+{%- for message in loop_messages -%}
+    {%- if message['role'] != 'tool' -%}
+    {%- set ns.prev_message_type = None -%}
+    {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%}
+    {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#}
+    {%- set prev_nt = namespace(role=None, found=false) -%}
+    {%- if loop.index0 > 0 -%}
+        {%- for j in range(loop.index0 - 1, -1, -1) -%}
+            {%- if not prev_nt.found -%}
+                {%- if loop_messages[j]['role'] != 'tool' -%}
+                    {%- set prev_nt.role = loop_messages[j]['role'] -%}
+                    {%- set prev_nt.found = true -%}
+                {%- endif -%}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- endif -%}
+    {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%}
+    {%- if not continue_same_model_turn -%}
+        {{- '<|turn>' + role + '\n' }}
+    {%- endif -%}
+    {#- Render reasoning/reasoning_content as thinking channel -#}
+    {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
+    {%- if thinking_text and loop.index0 > ns_turn.last_user_idx and message.get('tool_calls') -%}
+        {{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
+    {%- endif -%}
+            {%- if message['tool_calls'] -%}
+                {%- for tool_call in message['tool_calls'] -%}
+                    {%- set function = tool_call['function'] -%}
+                    {{- '<|tool_call>call:' + function['name'] + '{' -}}
+                    {%- if function['arguments'] is mapping -%}
+                        {%- set ns_args = namespace(found_first=false) -%}
+                        {%- for key, value in function['arguments'] | dictsort -%}
+                            {%- if ns_args.found_first %},{% endif -%}
+                            {%- set ns_args.found_first = true -%}
+                            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+                        {%- endfor -%}
+                    {%- elif function['arguments'] is string -%}
+                        {{- function['arguments'] -}}
+                    {%- endif -%}
+                    {{- '}<tool_call|>' -}}
+                {%- endfor -%}
+                {%- set ns.prev_message_type = 'tool_call' -%}
+            {%- endif -%}
+            {%- set ns_tr_out = namespace(flag=false) -%}
+            {%- if message.get('tool_responses') -%}
+                {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#}
+                {%- for tool_response in message['tool_responses'] -%}
+                    {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}}
+                    {%- set ns_tr_out.flag = true -%}
+                    {%- set ns.prev_message_type = 'tool_response' -%}
+                {%- endfor -%}
+            {%- elif message.get('tool_calls') -%}
+                {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#}
+                {%- set ns_tool_scan = namespace(stopped=false) -%}
+                {%- for k in range(loop.index0 + 1, loop_messages | length) -%}
+                    {%- if ns_tool_scan.stopped -%}
+                    {%- elif loop_messages[k]['role'] != 'tool' -%}
+                        {%- set ns_tool_scan.stopped = true -%}
+                    {%- else -%}
+                        {%- set follow = loop_messages[k] -%}
+                        {#- Resolve tool_call_id to function name -#}
+                        {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%}
+                        {%- for tc in message['tool_calls'] -%}
+                            {%- if tc.get('id') == follow.get('tool_call_id') -%}
+                                {%- set ns_tname.name = tc['function']['name'] -%}
+                            {%- endif -%}
+                        {%- endfor -%}
+                        {#- Handle content as string or content-parts array -#}
+                        {%- set tool_body = follow.get('content') -%}
+                        {%- if tool_body is string -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- elif tool_body is sequence and tool_body is not string -%}
+                            {%- set ns_txt = namespace(s='') -%}
+                            {%- for part in tool_body -%}
+                                {%- if part.get('type') == 'text' -%}
+                                    {%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%}
+                                {%- endif -%}
+                            {%- endfor -%}
+                            {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
+                        {%- else -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- endif -%}
+                        {%- set ns_tr_out.flag = true -%}
+                        {%- set ns.prev_message_type = 'tool_response' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- set captured_content -%}
+            {%- if message['content'] is string -%}
+                {%- if role == 'model' -%}
+                    {{- strip_thinking(message['content']) -}}
+                {%- else -%}
+                    {{- message['content'] | trim -}}
+                {%- endif -%}
+            {%- elif message['content'] is sequence -%}
+                {%- for item in message['content'] -%}
+                    {%- if item['type'] == 'text' -%}
+                        {%- if role == 'model' -%}
+                            {{- strip_thinking(item['text']) -}}
+                        {%- else -%}
+                            {{- item['text'] | trim -}}
+                        {%- endif -%}
+                    {%- elif item['type'] == 'image' -%}
+                        {{- '<|image|>' -}}
+                        {%- set ns.prev_message_type = 'image' -%}
+                    {%- elif item['type'] == 'audio' -%}
+                        {{- '<|audio|>' -}}
+                        {%- set ns.prev_message_type = 'audio' -%}
+                    {%- elif item['type'] == 'video' -%}
+                        {{- '<|video|>' -}}
+                        {%- set ns.prev_message_type = 'video' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- endset -%}
+            {{- captured_content -}}
+            {%- set has_content = captured_content | trim | length > 0 -%}
+        {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
+            {{- '<|tool_response>' -}}
+        {%- elif not (ns_tr_out.flag and not has_content) -%}
+            {{- '<turn|>\n' -}}
+        {%- endif -%}
+    {%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
+        {{- '<|turn>model\n' -}}
+    {%- endif -%}
+{%- endif -%}
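
To make the template's behavior easier to follow, here is a rough Python rendition of two pieces of it: the `strip_thinking` macro, and the output for the simplest case of plain-text user and assistant turns. It is a sketch, not a replacement for the template. `render_simple` is a hypothetical helper that deliberately ignores system turns, tools, tool responses, reasoning fields, multimodal parts and the consecutive-assistant continuation logic, and the `<bos>` default is taken from `tokenizer_config.json`.

```python
def strip_thinking(text: str) -> str:
    """Mirror of the strip_thinking macro: drop every <|channel>...<channel|> span."""
    result = ""
    for part in text.split("<channel|>"):
        if "<|channel>" in part:
            # Keep only the text that precedes the opening channel marker.
            result += part.split("<|channel>")[0]
        else:
            result += part
    return result.strip()


def render_simple(messages, bos_token="<bos>", add_generation_prompt=True):
    """Render string-only user/assistant messages the way the template does (simplified)."""
    out = bos_token
    for message in messages:
        # The template maps the 'assistant' role to 'model'.
        role = "model" if message["role"] == "assistant" else message["role"]
        if role == "model":
            body = strip_thinking(message["content"])
        else:
            body = message["content"].strip()
        out += f"<|turn>{role}\n{body}<turn|>\n"
    if add_generation_prompt:
        out += "<|turn>model\n"
    return out


history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "<|channel>thought\nplan the reply\n<channel|>Hi there."},
    {"role": "user", "content": "  How are you?  "},
]
prompt = render_simple(history)
print(prompt)
```

Earlier model turns are replayed without their thinking channel, so only the final answer text of each previous reply is fed back into the prompt.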

checkpoint-1000/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:795a63e9a73654a7dd8a4dac66a5a2b305d11f32784400415681ec19ef91f007
+size 72807355

checkpoint-1000/processor_config.json ADDED

	@@ -0,0 +1,75 @@

+{
+  "audio_ms_per_token": 40,
+  "audio_seq_length": 750,
+  "feature_extractor": {
+    "dither": 0.0,
+    "feature_extractor_type": "Gemma4AudioFeatureExtractor",
+    "feature_size": 128,
+    "fft_length": 512,
+    "fft_overdrive": false,
+    "frame_length": 320,
+    "hop_length": 160,
+    "input_scale_factor": 1.0,
+    "max_frequency": 8000.0,
+    "mel_floor": 0.001,
+    "min_frequency": 0.0,
+    "padding_side": "left",
+    "padding_value": 0.0,
+    "per_bin_mean": null,
+    "per_bin_stddev": null,
+    "preemphasis": 0.0,
+    "preemphasis_htk_flavor": true,
+    "return_attention_mask": true,
+    "sampling_rate": 16000
+  },
+  "image_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": false,
+    "do_rescale": true,
+    "do_resize": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_processor_type": "Gemma4ImageProcessor",
+    "image_seq_length": 280,
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 280,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098
+  },
+  "image_seq_length": 280,
+  "processor_class": "Gemma4Processor",
+  "video_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": true,
+    "do_rescale": true,
+    "do_resize": true,
+    "do_sample_frames": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 70,
+    "num_frames": 32,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098,
+    "return_metadata": false,
+    "video_processor_type": "Gemma4VideoProcessor"
+  }
+}
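
Reviewer note: the numeric fields in `processor_config.json` are easier to sanity-check once converted to time units. The sketch below only does arithmetic on values copied from the file. Reading `audio_seq_length` as the maximum number of audio tokens per clip, and `rescale_factor` as the usual 1/255 pixel scaling, are assumptions rather than anything the diff states.

```python
import math

# Values copied from the feature_extractor and top-level fields above.
sampling_rate = 16000      # samples per second
frame_length = 320         # samples per analysis frame
hop_length = 160           # samples between frame starts
audio_ms_per_token = 40
audio_seq_length = 750     # assumed: maximum audio tokens per clip

frame_ms = 1000 * frame_length / sampling_rate    # 20.0 ms per frame
hop_ms = 1000 * hop_length / sampling_rate        # 10.0 ms between frames
frames_per_token = audio_ms_per_token / hop_ms    # 4.0 frames per audio token
max_audio_seconds = audio_seq_length * audio_ms_per_token / 1000  # 30.0 s

# Image and video rescale factor from the file, compared against 1/255.
rescale_factor = 0.00392156862745098
is_one_over_255 = math.isclose(rescale_factor, 1 / 255)
```

Under these assumptions each audio token covers four 10 ms hops, and a full 750-token sequence corresponds to 30 seconds of 16 kHz audio.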

checkpoint-1000/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:098b29492211804ab324a36f37466821d948280bb74fce4ba895c03f13ecd878
+size 14645

checkpoint-1000/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:406994c2cf2acc1e48ce8857e7cbb9e95d4fab92a97bbe36f71721705be347d7
+size 1465

checkpoint-1000/tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cc8d3a0ce36466ccc1278bf987df5f71db1719b9ca6b4118264f45cb627bfe0f
+size 32169626

checkpoint-1000/tokenizer_config.json ADDED

	@@ -0,0 +1,289 @@

+{
+  "audio_token": "<|audio|>",
+  "backend": "tokenizers",
+  "boa_token": "<|audio>",
+  "boi_token": "<|image>",
+  "bos_token": "<bos>",
+  "eoa_token": "<audio|>",
+  "eoc_token": "<channel|>",
+  "eoi_token": "<image|>",
+  "eos_token": "<turn|>",
+  "eot_token": "<turn|>",
+  "escape_token": "<|\"|>",
+  "etc_token": "<tool_call|>",
+  "etd_token": "<tool|>",
+  "etr_token": "<tool_response|>",
+  "extra_special_tokens": [
+    "<|video|>"
+  ],
+  "image_token": "<|image|>",
+  "is_local": false,
+  "mask_token": "<mask>",
+  "model_max_length": 131072,
+  "model_specific_special_tokens": {
+    "audio_token": "<|audio|>",
+    "boa_token": "<|audio>",
+    "boi_token": "<|image>",
+    "eoa_token": "<audio|>",
+    "eoc_token": "<channel|>",
+    "eoi_token": "<image|>",
+    "eot_token": "<turn|>",
+    "escape_token": "<|\"|>",
+    "etc_token": "<tool_call|>",
+    "etd_token": "<tool|>",
+    "etr_token": "<tool_response|>",
+    "image_token": "<|image|>",
+    "soc_token": "<|channel>",
+    "sot_token": "<|turn>",
+    "stc_token": "<|tool_call>",
+    "std_token": "<|tool>",
+    "str_token": "<|tool_response>",
+    "think_token": "<|think|>"
+  },
+  "pad_token": "<pad>",
+  "padding_side": "right",
+  "processor_class": "Gemma4Processor",
+  "response_schema": {
+    "properties": {
+      "content": {
+        "type": "string"
+      },
+      "role": {
+        "const": "assistant"
+      },
+      "thinking": {
+        "type": "string"
+      },
+      "tool_calls": {
+        "items": {
+          "properties": {
+            "function": {
+              "properties": {
+                "arguments": {
+                  "additionalProperties": {},
+                  "type": "object",
+                  "x-parser": "gemma4-tool-call"
+                },
+                "name": {
+                  "type": "string"
+                }
+              },
+              "type": "object",
+              "x-regex": "call\\:(?P<name>\\w+)(?P<arguments>\\{.*\\})"
+            },
+            "type": {
+              "const": "function"
+            }
+          },
+          "type": "object"
+        },
+        "type": "array",
+        "x-regex-iterator": "<\\|tool_call>(.*?)<tool_call\\|>"
+      }
+    },
+    "type": "object",
+    "x-regex": "(\\<\\|channel\\>thought\\n(?P<thinking>.*?)\\<channel\\|\\>)?(?P<tool_calls>\\<\\|tool_call\\>.*\\<tool_call\\|\\>)?(?P<content>(?:(?!\\<turn\\|\\>)(?!\\<\\|tool_response\\>).)+)?(?:\\<turn\\|\\>|\\<\\|tool_response\\>)?"
+  },
+  "soc_token": "<|channel>",
+  "sot_token": "<|turn>",
+  "stc_token": "<|tool_call>",
+  "std_token": "<|tool>",
+  "str_token": "<|tool_response>",
+  "think_token": "<|think|>",
+  "tokenizer_class": "GemmaTokenizer",
+  "unk_token": "<unk>",
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<pad>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "1": {
+      "content": "<eos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "2": {
+      "content": "<bos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "3": {
+      "content": "<unk>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "4": {
+      "content": "<mask>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "46": {
+      "content": "<|tool>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "47": {
+      "content": "<tool|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "48": {
+      "content": "<|tool_call>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "49": {
+      "content": "<tool_call|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "50": {
+      "content": "<|tool_response>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "51": {
+      "content": "<tool_response|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "52": {
+      "content": "<|\"|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "98": {
+      "content": "<|think|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "100": {
+      "content": "<|channel>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "101": {
+      "content": "<channel|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "105": {
+      "content": "<|turn>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "106": {
+      "content": "<turn|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "255999": {
+      "content": "<|image>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "256000": {
+      "content": "<|audio>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258880": {
+      "content": "<|image|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258881": {
+      "content": "<|audio|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258882": {
+      "content": "<image|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258883": {
+      "content": "<audio|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258884": {
+      "content": "<|video|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    }
+  }
+}
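
Reviewer note: the `response_schema` in `tokenizer_config.json` extracts tool calls in two stages, first splitting on the `<|tool_call>` ... `<tool_call|>` delimiters (`x-regex-iterator`), then pulling the function name and raw argument text out of each piece (`x-regex`). A minimal sketch with Python's `re` module is below. The two patterns are copied from the file with JSON escaping removed; the sample string, the function names in it, and the choice of `re.DOTALL` are illustrative assumptions. Decoding the argument text itself (the `gemma4-tool-call` parser) is not attempted.

```python
import re

# Patterns copied from response_schema (JSON escaping removed).
TOOL_CALL_ITERATOR = r"<\|tool_call>(.*?)<tool_call\|>"
FUNCTION_PATTERN = r"call\:(?P<name>\w+)(?P<arguments>\{.*\})"

# Hypothetical model output containing two tool calls.
sample = (
    "<|tool_call>call:get_weather{city:Hyderabad}<tool_call|>"
    "<|tool_call>call:get_time{zone:IST}<tool_call|>"
)

calls = []
for body in re.findall(TOOL_CALL_ITERATOR, sample, flags=re.DOTALL):
    match = re.search(FUNCTION_PATTERN, body, flags=re.DOTALL)
    if match:
        # Keep the argument text raw; parsing it is left to the real parser.
        calls.append((match.group("name"), match.group("arguments")))
```

The non-greedy `(.*?)` in the iterator is what keeps adjacent calls separate; the greedy `\{.*\}` is then applied only within a single call body.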

checkpoint-1000/trainer_state.json ADDED

	@@ -0,0 +1,1442 @@

+{
+  "best_global_step": null,
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.1819505094614265,
+  "eval_steps": 100,
+  "global_step": 1000,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.0009097525473071324,
+      "grad_norm": 1.0602493286132812,
+      "learning_rate": 1.2121212121212122e-06,
+      "loss": 1.7156932830810547,
+      "step": 5
+    },
+    {
+      "epoch": 0.001819505094614265,
+      "grad_norm": 1.1577719449996948,
+      "learning_rate": 2.7272727272727272e-06,
+      "loss": 1.6629371643066406,
+      "step": 10
+    },
+    {
+      "epoch": 0.0027292576419213972,
+      "grad_norm": 1.0288419723510742,
+      "learning_rate": 4.242424242424243e-06,
+      "loss": 1.6706295013427734,
+      "step": 15
+    },
+    {
+      "epoch": 0.00363901018922853,
+      "grad_norm": 2.129403829574585,
+      "learning_rate": 5.7575757575757586e-06,
+      "loss": 1.7363752365112304,
+      "step": 20
+    },
+    {
+      "epoch": 0.004548762736535662,
+      "grad_norm": 1.9468326568603516,
+      "learning_rate": 7.272727272727272e-06,
+      "loss": 1.7111135482788087,
+      "step": 25
+    },
+    {
+      "epoch": 0.0054585152838427945,
+      "grad_norm": 1.1269357204437256,
+      "learning_rate": 8.787878787878788e-06,
+      "loss": 1.6924203872680663,
+      "step": 30
+    },
+    {
+      "epoch": 0.006368267831149927,
+      "grad_norm": 1.4021248817443848,
+      "learning_rate": 1.0303030303030304e-05,
+      "loss": 1.658310317993164,
+      "step": 35
+    },
+    {
+      "epoch": 0.00727802037845706,
+      "grad_norm": 1.313381314277649,
+      "learning_rate": 1.1818181818181819e-05,
+      "loss": 1.5383296012878418,
+      "step": 40
+    },
+    {
+      "epoch": 0.008187772925764192,
+      "grad_norm": 2.4359891414642334,
+      "learning_rate": 1.3333333333333333e-05,
+      "loss": 1.4302565574645996,
+      "step": 45
+    },
+    {
+      "epoch": 0.009097525473071324,
+      "grad_norm": 1.6459542512893677,
+      "learning_rate": 1.484848484848485e-05,
+      "loss": 1.2602953910827637,
+      "step": 50
+    },
+    {
+      "epoch": 0.010007278020378457,
+      "grad_norm": 0.7953159213066101,
+      "learning_rate": 1.6363636363636366e-05,
+      "loss": 1.204326343536377,
+      "step": 55
+    },
+    {
+      "epoch": 0.010917030567685589,
+      "grad_norm": 0.5824465155601501,
+      "learning_rate": 1.787878787878788e-05,
+      "loss": 1.068561840057373,
+      "step": 60
+    },
+    {
+      "epoch": 0.011826783114992722,
+      "grad_norm": 0.39265626668930054,
+      "learning_rate": 1.9393939393939395e-05,
+      "loss": 0.9570062637329102,
+      "step": 65
+    },
+    {
+      "epoch": 0.012736535662299854,
+      "grad_norm": 0.3387283384799957,
+      "learning_rate": 2.090909090909091e-05,
+      "loss": 0.9454713821411133,
+      "step": 70
+    },
+    {
+      "epoch": 0.013646288209606987,
+      "grad_norm": 0.3182811141014099,
+      "learning_rate": 2.2424242424242424e-05,
+      "loss": 0.8901592254638672,
+      "step": 75
+    },
+    {
+      "epoch": 0.01455604075691412,
+      "grad_norm": 0.2735312879085541,
+      "learning_rate": 2.393939393939394e-05,
+      "loss": 0.8491583824157715,
+      "step": 80
+    },
+    {
+      "epoch": 0.015465793304221253,
+      "grad_norm": 0.2376435250043869,
+      "learning_rate": 2.5454545454545454e-05,
+      "loss": 0.8109179496765136,
+      "step": 85
+    },
+    {
+      "epoch": 0.016375545851528384,
+      "grad_norm": 0.2161586880683899,
+      "learning_rate": 2.696969696969697e-05,
+      "loss": 0.76962308883667,
+      "step": 90
+    },
+    {
+      "epoch": 0.017285298398835518,
+      "grad_norm": 0.19587980210781097,
+      "learning_rate": 2.8484848484848486e-05,
+      "loss": 0.7301986694335938,
+      "step": 95
+    },
+    {
+      "epoch": 0.018195050946142648,
+      "grad_norm": 0.20971694588661194,
+      "learning_rate": 3e-05,
+      "loss": 0.7269618034362793,
+      "step": 100
+    },
+    {
+      "epoch": 0.018195050946142648,
+      "eval_loss": 2.605874538421631,
+      "eval_runtime": 1120.0905,
+      "eval_samples_per_second": 33.935,
+      "eval_steps_per_second": 8.484,
+      "step": 100
+    },
+    {
+      "epoch": 0.01910480349344978,
+      "grad_norm": 0.10413152724504471,
+      "learning_rate": 3.151515151515151e-05,
+      "loss": 0.3250573635101318,
+      "step": 105
+    },
+    {
+      "epoch": 0.020014556040756915,
+      "grad_norm": 0.09383206814527512,
+      "learning_rate": 3.303030303030303e-05,
+      "loss": 0.3277724742889404,
+      "step": 110
+    },
+    {
+      "epoch": 0.020924308588064048,
+      "grad_norm": 0.1195850670337677,
+      "learning_rate": 3.454545454545455e-05,
+      "loss": 0.3215961217880249,
+      "step": 115
+    },
+    {
+      "epoch": 0.021834061135371178,
+      "grad_norm": 0.0715397521853447,
+      "learning_rate": 3.606060606060606e-05,
+      "loss": 0.3120795965194702,
+      "step": 120
+    },
+    {
+      "epoch": 0.02274381368267831,
+      "grad_norm": 0.068007692694664,
+      "learning_rate": 3.757575757575758e-05,
+      "loss": 0.2964257955551147,
+      "step": 125
+    },
+    {
+      "epoch": 0.023653566229985445,
+      "grad_norm": 0.09345484524965286,
+      "learning_rate": 3.909090909090909e-05,
+      "loss": 0.30776252746582033,
+      "step": 130
+    },
+    {
+      "epoch": 0.024563318777292575,
+      "grad_norm": 0.05577846243977547,
+      "learning_rate": 4.0606060606060606e-05,
+      "loss": 0.3180255889892578,
+      "step": 135
+    },
+    {
+      "epoch": 0.025473071324599708,
+      "grad_norm": 0.05919989198446274,
+      "learning_rate": 4.212121212121212e-05,
+      "loss": 0.31608285903930666,
+      "step": 140
+    },
+    {
+      "epoch": 0.02638282387190684,
+      "grad_norm": 0.05644674599170685,
+      "learning_rate": 4.3636363636363636e-05,
+      "loss": 0.2993780136108398,
+      "step": 145
+    },
+    {
+      "epoch": 0.027292576419213975,
+      "grad_norm": 0.059986088424921036,
+      "learning_rate": 4.515151515151516e-05,
+      "loss": 0.2931638479232788,
+      "step": 150
+    },
+    {
+      "epoch": 0.028202328966521105,
+      "grad_norm": 0.05941484495997429,
+      "learning_rate": 4.666666666666667e-05,
+      "loss": 0.29284651279449464,
+      "step": 155
+    },
+    {
+      "epoch": 0.02911208151382824,
+      "grad_norm": 0.0579044483602047,
+      "learning_rate": 4.8181818181818186e-05,
+      "loss": 0.2927037000656128,
+      "step": 160
+    },
+    {
+      "epoch": 0.030021834061135372,
+      "grad_norm": 0.061985693871974945,
+      "learning_rate": 4.9696969696969694e-05,
+      "loss": 0.28671720027923586,
+      "step": 165
+    },
+    {
+      "epoch": 0.030931586608442505,
+      "grad_norm": 0.05715535953640938,
+      "learning_rate": 4.999993064772809e-05,
+      "loss": 0.2817929744720459,
+      "step": 170
+    },
+    {
+      "epoch": 0.03184133915574964,
+      "grad_norm": 0.06549780815839767,
+      "learning_rate": 4.999964890478288e-05,
+      "loss": 0.27853829860687257,
+      "step": 175
+    },
+    {
+      "epoch": 0.03275109170305677,
+      "grad_norm": 0.05948757752776146,
+      "learning_rate": 4.999915043908795e-05,
+      "loss": 0.27522289752960205,
+      "step": 180
+    },
+    {
+      "epoch": 0.0336608442503639,
+      "grad_norm": 0.06262889504432678,
+      "learning_rate": 4.9998435254964515e-05,
+      "loss": 0.270997428894043,
+      "step": 185
+    },
+    {
+      "epoch": 0.034570596797671035,
+      "grad_norm": 0.06916829943656921,
+      "learning_rate": 4.999750335861253e-05,
+      "loss": 0.2788438558578491,
+      "step": 190
+    },
+    {
+      "epoch": 0.035480349344978165,
+      "grad_norm": 0.06128217652440071,
+      "learning_rate": 4.9996354758110624e-05,
+      "loss": 0.25649352073669435,
+      "step": 195
+    },
+    {
+      "epoch": 0.036390101892285295,
+      "grad_norm": 0.06704027950763702,
+      "learning_rate": 4.999498946341606e-05,
+      "loss": 0.25619523525238036,
+      "step": 200
+    },
+    {
+      "epoch": 0.03729985443959243,
+      "grad_norm": 0.061678580939769745,
+      "learning_rate": 4.999340748636462e-05,
+      "loss": 0.24956226348876953,
+      "step": 205
+    },
+    {
+      "epoch": 0.03820960698689956,
+      "grad_norm": 0.07328873127698898,
+      "learning_rate": 4.999160884067051e-05,
+      "loss": 0.26169676780700685,
+      "step": 210
+    },
+    {
+      "epoch": 0.0391193595342067,
+      "grad_norm": 0.08287990838289261,
+      "learning_rate": 4.9989593541926246e-05,
+      "loss": 0.2574604034423828,
+      "step": 215
+    },
+    {
+      "epoch": 0.04002911208151383,
+      "grad_norm": 0.06787359714508057,
+      "learning_rate": 4.9987361607602525e-05,
+      "loss": 0.25351409912109374,
+      "step": 220
+    },
+    {
+      "epoch": 0.04093886462882096,
+      "grad_norm": 0.06695502996444702,
+      "learning_rate": 4.998491305704805e-05,
+      "loss": 0.24522039890289307,
+      "step": 225
+    },
+    {
+      "epoch": 0.041848617176128096,
+      "grad_norm": 0.08872214704751968,
+      "learning_rate": 4.9982247911489375e-05,
+      "loss": 0.2581867933273315,
+      "step": 230
+    },
+    {
+      "epoch": 0.042758369723435226,
+      "grad_norm": 0.07637131959199905,
+      "learning_rate": 4.9979366194030743e-05,
+      "loss": 0.25569658279418944,
+      "step": 235
+    },
+    {
+      "epoch": 0.043668122270742356,
+      "grad_norm": 0.08158119022846222,
+      "learning_rate": 4.997626792965385e-05,
+      "loss": 0.2529409646987915,
+      "step": 240
+    },
+    {
+      "epoch": 0.04457787481804949,
+      "grad_norm": 0.07529161125421524,
+      "learning_rate": 4.997295314521766e-05,
+      "loss": 0.24049024581909179,
+      "step": 245
+    },
+    {
+      "epoch": 0.04548762736535662,
+      "grad_norm": 0.08860139548778534,
+      "learning_rate": 4.996942186945813e-05,
+      "loss": 0.2490522861480713,
+      "step": 250
+    },
+    {
+      "epoch": 0.04639737991266375,
+      "grad_norm": 0.0850321501493454,
+      "learning_rate": 4.9965674132988005e-05,
+      "loss": 0.24180831909179687,
+      "step": 255
+    },
+    {
+      "epoch": 0.04730713245997089,
+      "grad_norm": 0.07556115090847015,
+      "learning_rate": 4.996170996829653e-05,
+      "loss": 0.2509631872177124,
+      "step": 260
+    },
+    {
+      "epoch": 0.04821688500727802,
+      "grad_norm": 0.07971206307411194,
+      "learning_rate": 4.995752940974918e-05,
+      "loss": 0.24398891925811766,
+      "step": 265
+    },
+    {
+      "epoch": 0.04912663755458515,
+      "grad_norm": 0.09149336814880371,
+      "learning_rate": 4.9953132493587344e-05,
+      "loss": 0.2300492286682129,
+      "step": 270
+    },
+    {
+      "epoch": 0.050036390101892286,
+      "grad_norm": 0.08265820890665054,
+      "learning_rate": 4.9948519257928034e-05,
+      "loss": 0.24246792793273925,
+      "step": 275
+    },
+    {
+      "epoch": 0.050946142649199416,
+      "grad_norm": 0.10328587144613266,
+      "learning_rate": 4.9943689742763534e-05,
+      "loss": 0.2367171049118042,
+      "step": 280
+    },
+    {
+      "epoch": 0.05185589519650655,
+      "grad_norm": 0.0836917981505394,
+      "learning_rate": 4.993864398996105e-05,
+      "loss": 0.23215813636779786,
+      "step": 285
+    },
+    {
+      "epoch": 0.05276564774381368,
+      "grad_norm": 0.09475161135196686,
+      "learning_rate": 4.99333820432624e-05,
+      "loss": 0.2350748062133789,
+      "step": 290
+    },
+    {
+      "epoch": 0.05367540029112081,
+      "grad_norm": 0.08040128648281097,
+      "learning_rate": 4.992790394828355e-05,
+      "loss": 0.23253886699676513,
+      "step": 295
+    },
+    {
+      "epoch": 0.05458515283842795,
+      "grad_norm": 0.08852150291204453,
+      "learning_rate": 4.992220975251428e-05,
+      "loss": 0.23856515884399415,
+      "step": 300
+    },
+    {
+      "epoch": 0.05549490538573508,
+      "grad_norm": 0.09565229713916779,
+      "learning_rate": 4.991629950531775e-05,
+      "loss": 0.23311660289764405,
+      "step": 305
+    },
+    {
+      "epoch": 0.05640465793304221,
+      "grad_norm": 0.08158160001039505,
+      "learning_rate": 4.991017325793009e-05,
+      "loss": 0.22467944622039795,
+      "step": 310
+    },
+    {
+      "epoch": 0.05731441048034935,
+      "grad_norm": 0.07746429741382599,
+      "learning_rate": 4.990383106345994e-05,
+      "loss": 0.229844069480896,
+      "step": 315
+    },
+    {
+      "epoch": 0.05822416302765648,
+      "grad_norm": 0.08564355969429016,
+      "learning_rate": 4.989727297688797e-05,
+      "loss": 0.22414517402648926,
+      "step": 320
+    },
+    {
+      "epoch": 0.05913391557496361,
+      "grad_norm": 0.07517435401678085,
+      "learning_rate": 4.9890499055066435e-05,
+      "loss": 0.2236532211303711,
+      "step": 325
+    },
+    {
+      "epoch": 0.060043668122270744,
+      "grad_norm": 0.111734539270401,
+      "learning_rate": 4.988350935671869e-05,
+      "loss": 0.21474847793579102,
+      "step": 330
+    },
+    {
+      "epoch": 0.060953420669577874,
+      "grad_norm": 0.09906989336013794,
+      "learning_rate": 4.987630394243866e-05,
+      "loss": 0.23321933746337892,
+      "step": 335
+    },
+    {
+      "epoch": 0.06186317321688501,
+      "grad_norm": 0.10131457448005676,
+      "learning_rate": 4.98688828746903e-05,
+      "loss": 0.2310662031173706,
+      "step": 340
+    },
+    {
+      "epoch": 0.06277292576419213,
+      "grad_norm": 0.09203507006168365,
+      "learning_rate": 4.986124621780708e-05,
+      "loss": 0.22021169662475587,
+      "step": 345
+    },
+    {
+      "epoch": 0.06368267831149928,
+      "grad_norm": 0.09505912661552429,
+      "learning_rate": 4.9853394037991416e-05,
+      "loss": 0.2197155237197876,
+      "step": 350
+    },
+    {
+      "epoch": 0.06459243085880641,
+      "grad_norm": 0.09038657695055008,
+      "learning_rate": 4.984532640331412e-05,
+      "loss": 0.22066287994384765,
+      "step": 355
+    },
+    {
+      "epoch": 0.06550218340611354,
+      "grad_norm": 0.09707064181566238,
+      "learning_rate": 4.9837043383713753e-05,
+      "loss": 0.22455451488494874,
+      "step": 360
+    },
+    {
+      "epoch": 0.06641193595342067,
+      "grad_norm": 0.10367228090763092,
+      "learning_rate": 4.98285450509961e-05,
+      "loss": 0.21993820667266845,
+      "step": 365
+    },
+    {
+      "epoch": 0.0673216885007278,
+      "grad_norm": 0.12229471653699875,
+      "learning_rate": 4.9819831478833456e-05,
+      "loss": 0.2168867588043213,
+      "step": 370
+    },
+    {
+      "epoch": 0.06823144104803494,
+      "grad_norm": 0.0964592918753624,
+      "learning_rate": 4.981090274276406e-05,
+      "loss": 0.21579203605651856,
+      "step": 375
+    },
+    {
+      "epoch": 0.06914119359534207,
+      "grad_norm": 0.09400496631860733,
+      "learning_rate": 4.980175892019141e-05,
+      "loss": 0.20972180366516113,
+      "step": 380
+    },
+    {
+      "epoch": 0.0700509461426492,
+      "grad_norm": 0.08158645778894424,
+      "learning_rate": 4.9792400090383594e-05,
+      "loss": 0.22148358821868896,
+      "step": 385
+    },
+    {
+      "epoch": 0.07096069868995633,
+      "grad_norm": 0.10916394740343094,
+      "learning_rate": 4.978282633447261e-05,
+      "loss": 0.2214418649673462,
+      "step": 390
+    },
+    {
+      "epoch": 0.07187045123726346,
+      "grad_norm": 0.11138810962438583,
+      "learning_rate": 4.9773037735453636e-05,
+      "loss": 0.21814754009246826,
+      "step": 395
+    },
+    {
+      "epoch": 0.07278020378457059,
+      "grad_norm": 0.10914396494626999,
+      "learning_rate": 4.9763034378184365e-05,
+      "loss": 0.21310818195343018,
+      "step": 400
+    },
+    {
+      "epoch": 0.07368995633187773,
+      "grad_norm": 0.1043366864323616,
+      "learning_rate": 4.975281634938421e-05,
+      "loss": 0.21266789436340333,
+      "step": 405
+    },
+    {
+      "epoch": 0.07459970887918486,
+      "grad_norm": 0.1036868542432785,
+      "learning_rate": 4.9742383737633594e-05,
+      "loss": 0.21606721878051757,
+      "step": 410
+    },
+    {
+      "epoch": 0.075509461426492,
+      "grad_norm": 0.11640442907810211,
+      "learning_rate": 4.9731736633373144e-05,
+      "loss": 0.21532948017120362,
+      "step": 415
+    },
+    {
+      "epoch": 0.07641921397379912,
+      "grad_norm": 0.11219926178455353,
+      "learning_rate": 4.9720875128902956e-05,
+      "loss": 0.2191627025604248,
+      "step": 420
+    },
+    {
+      "epoch": 0.07732896652110625,
+      "grad_norm": 0.12103637307882309,
+      "learning_rate": 4.970979931838176e-05,
+      "loss": 0.20938868522644044,
+      "step": 425
+    },
+    {
+      "epoch": 0.0782387190684134,
+      "grad_norm": 0.13274189829826355,
+      "learning_rate": 4.96985092978261e-05,
+      "loss": 0.21792960166931152,
+      "step": 430
+    },
+    {
+      "epoch": 0.07914847161572053,
+      "grad_norm": 0.11164513230323792,
+      "learning_rate": 4.968700516510954e-05,
+      "loss": 0.2022618055343628,
+      "step": 435
+    },
+    {
+      "epoch": 0.08005822416302766,
+      "grad_norm": 0.09532847255468369,
+      "learning_rate": 4.967528701996174e-05,
+      "loss": 0.21255812644958497,
+      "step": 440
+    },
+    {
+      "epoch": 0.08096797671033479,
+      "grad_norm": 0.10279258340597153,
+      "learning_rate": 4.96633549639677e-05,
+      "loss": 0.20683050155639648,
+      "step": 445
+    },
+    {
+      "epoch": 0.08187772925764192,
+      "grad_norm": 0.1257462352514267,
+      "learning_rate": 4.965120910056677e-05,
+      "loss": 0.21419920921325683,
+      "step": 450
+    },
+    {
+      "epoch": 0.08278748180494905,
+      "grad_norm": 0.11663137376308441,
+      "learning_rate": 4.963884953505186e-05,
+      "loss": 0.2072287082672119,
+      "step": 455
+    },
+    {
+      "epoch": 0.08369723435225619,
+      "grad_norm": 0.10488224029541016,
+      "learning_rate": 4.96262763745684e-05,
+      "loss": 0.1982678532600403,
+      "step": 460
+    },
+    {
+      "epoch": 0.08460698689956332,
+      "grad_norm": 0.11801692098379135,
+      "learning_rate": 4.961348972811354e-05,
+      "loss": 0.20662031173706055,
+      "step": 465
+    },
+    {
+      "epoch": 0.08551673944687045,
+      "grad_norm": 0.11318827420473099,
+      "learning_rate": 4.96004897065351e-05,
+      "loss": 0.20947303771972656,
+      "step": 470
+    },
+    {
+      "epoch": 0.08642649199417758,
+      "grad_norm": 0.13409486413002014,
+      "learning_rate": 4.95872764225307e-05,
+      "loss": 0.19670876264572143,
+      "step": 475
+    },
+    {
+      "epoch": 0.08733624454148471,
+      "grad_norm": 0.14440792798995972,
+      "learning_rate": 4.957384999064672e-05,
+      "loss": 0.19842848777770997,
+      "step": 480
+    },
+    {
+      "epoch": 0.08824599708879186,
+      "grad_norm": 0.12246996909379959,
+      "learning_rate": 4.956021052727731e-05,
+      "loss": 0.20318071842193602,
+      "step": 485
+    },
+    {
+      "epoch": 0.08915574963609899,
+      "grad_norm": 0.13437233865261078,
+      "learning_rate": 4.954635815066342e-05,
+      "loss": 0.21675212383270265,
+      "step": 490
+    },
+    {
+      "epoch": 0.09006550218340612,
+      "grad_norm": 0.11109672486782074,
+      "learning_rate": 4.9532292980891744e-05,
+      "loss": 0.2100757837295532,
+      "step": 495
+    },
+    {
+      "epoch": 0.09097525473071325,
+      "grad_norm": 0.1388893872499466,
+      "learning_rate": 4.9518015139893675e-05,
+      "loss": 0.20303285121917725,
+      "step": 500
+    },
+    {
+      "epoch": 0.09188500727802038,
+      "grad_norm": 0.13239721953868866,
+      "learning_rate": 4.950352475144427e-05,
+      "loss": 0.2152268409729004,
+      "step": 505
+    },
+    {
+      "epoch": 0.0927947598253275,
+      "grad_norm": 0.12834979593753815,
+      "learning_rate": 4.948882194116119e-05,
+      "loss": 0.20799248218536376,
+      "step": 510
+    },
+    {
+      "epoch": 0.09370451237263465,
+      "grad_norm": 0.11886704713106155,
+      "learning_rate": 4.947390683650354e-05,
+      "loss": 0.20394976139068605,
+      "step": 515
+    },
+    {
+      "epoch": 0.09461426491994178,
+      "grad_norm": 0.11398876458406448,
+      "learning_rate": 4.945877956677083e-05,
+      "loss": 0.2091092586517334,
+      "step": 520
+    },
+    {
+      "epoch": 0.09552401746724891,
+      "grad_norm": 0.1422540694475174,
+      "learning_rate": 4.944344026310186e-05,
+      "loss": 0.19564238786697388,
+      "step": 525
+    },
+    {
+      "epoch": 0.09643377001455604,
+      "grad_norm": 0.11359584331512451,
+      "learning_rate": 4.9427889058473535e-05,
+      "loss": 0.20493624210357667,
+      "step": 530
+    },
+    {
+      "epoch": 0.09734352256186317,
+      "grad_norm": 0.11703553050756454,
+      "learning_rate": 4.941212608769974e-05,
+      "loss": 0.2098615884780884,
+      "step": 535
+    },
+    {
+      "epoch": 0.0982532751091703,
+      "grad_norm": 0.14552047848701477,
+      "learning_rate": 4.939615148743017e-05,
+      "loss": 0.20382182598114013,
+      "step": 540
+    },
+    {
+      "epoch": 0.09916302765647744,
+      "grad_norm": 0.13178016245365143,
+      "learning_rate": 4.937996539614914e-05,
+      "loss": 0.19901862144470214,
+      "step": 545
+    },
+    {
+      "epoch": 0.10007278020378457,
+      "grad_norm": 0.635392427444458,
+      "learning_rate": 4.936356795417439e-05,
+      "loss": 0.20694944858551026,
+      "step": 550
+    },
+    {
+      "epoch": 0.1009825327510917,
+      "grad_norm": 0.15019077062606812,
+      "learning_rate": 4.934695930365586e-05,
+      "loss": 0.19313746690750122,
+      "step": 555
+    },
+    {
+      "epoch": 0.10189228529839883,
+      "grad_norm": 0.12941956520080566,
+      "learning_rate": 4.9330139588574474e-05,
+      "loss": 0.19671722650527954,
+      "step": 560
+    },
+    {
+      "epoch": 0.10280203784570596,
+      "grad_norm": 0.13818831741809845,
+      "learning_rate": 4.931310895474088e-05,
+      "loss": 0.20026786327362062,
+      "step": 565
+    },
+    {
+      "epoch": 0.1037117903930131,
+      "grad_norm": 0.12011194974184036,
+      "learning_rate": 4.929586754979417e-05,
+      "loss": 0.1932437539100647,
+      "step": 570
+    },
+    {
+      "epoch": 0.10462154294032024,
+      "grad_norm": 0.1345364898443222,
+      "learning_rate": 4.9278415523200644e-05,
+      "loss": 0.20245940685272218,
+      "step": 575
+    },
+    {
+      "epoch": 0.10553129548762737,
+      "grad_norm": 0.13281017541885376,
+      "learning_rate": 4.926075302625247e-05,
+      "loss": 0.19864981174468993,
+      "step": 580
+    },
+    {
+      "epoch": 0.1064410480349345,
+      "grad_norm": 0.13465586304664612,
+      "learning_rate": 4.924288021206639e-05,
+      "loss": 0.19573183059692384,
+      "step": 585
+    },
+    {
+      "epoch": 0.10735080058224163,
+      "grad_norm": 0.15225961804389954,
+      "learning_rate": 4.9224797235582396e-05,
+      "loss": 0.19946801662445068,
+      "step": 590
+    },
+    {
+      "epoch": 0.10826055312954876,
+      "grad_norm": 0.12816746532917023,
+      "learning_rate": 4.92065042535624e-05,
+      "loss": 0.19851526021957397,
+      "step": 595
+    },
+    {
+      "epoch": 0.1091703056768559,
+      "grad_norm": 0.13802853226661682,
+      "learning_rate": 4.9188001424588824e-05,
+      "loss": 0.19321763515472412,
+      "step": 600
+    },
+    {
+      "epoch": 0.11008005822416303,
+      "grad_norm": 0.17504797875881195,
+      "learning_rate": 4.9169288909063295e-05,
+      "loss": 0.2032616138458252,
+      "step": 605
+    },
+    {
+      "epoch": 0.11098981077147016,
+      "grad_norm": 0.13544194400310516,
+      "learning_rate": 4.91503668692052e-05,
+      "loss": 0.2011256456375122,
+      "step": 610
+    },
+    {
+      "epoch": 0.11189956331877729,
+      "grad_norm": 1.3976134061813354,
+      "learning_rate": 4.91312354690503e-05,
+      "loss": 0.19916868209838867,
+      "step": 615
+    },
+    {
+      "epoch": 0.11280931586608442,
+      "grad_norm": 0.1465059071779251,
+      "learning_rate": 4.91118948744493e-05,
+      "loss": 0.19487457275390624,
+      "step": 620
+    },
+    {
+      "epoch": 0.11371906841339156,
+      "grad_norm": 0.12103168666362762,
+      "learning_rate": 4.909234525306645e-05,
+      "loss": 0.1907251238822937,
+      "step": 625
+    },
+    {
+      "epoch": 0.1146288209606987,
+      "grad_norm": 0.12660574913024902,
+      "learning_rate": 4.907258677437802e-05,
+      "loss": 0.19327253103256226,
+      "step": 630
+    },
+    {
+      "epoch": 0.11553857350800582,
+      "grad_norm": 0.1347813606262207,
+      "learning_rate": 4.90526196096709e-05,
+      "loss": 0.19637736082077026,
+      "step": 635
+    },
+    {
+      "epoch": 0.11644832605531295,
+      "grad_norm": 0.14953652024269104,
+      "learning_rate": 4.903244393204107e-05,
+      "loss": 0.20325069427490233,
+      "step": 640
+    },
+    {
+      "epoch": 0.11735807860262008,
+      "grad_norm": 0.13936272263526917,
+      "learning_rate": 4.901205991639213e-05,
+      "loss": 0.1930275321006775,
+      "step": 645
+    },
+    {
+      "epoch": 0.11826783114992721,
+      "grad_norm": 0.1448420137166977,
+      "learning_rate": 4.899146773943374e-05,
+      "loss": 0.20026936531066894,
+      "step": 650
+    },
+    {
+      "epoch": 0.11917758369723436,
+      "grad_norm": 0.1312534064054489,
+      "learning_rate": 4.897066757968014e-05,
+      "loss": 0.19062033891677857,
+      "step": 655
+    },
+    {
+      "epoch": 0.12008733624454149,
+      "grad_norm": 0.13644742965698242,
+      "learning_rate": 4.894965961744859e-05,
+      "loss": 0.18719595670700073,
+      "step": 660
+    },
+    {
+      "epoch": 0.12099708879184862,
+      "grad_norm": 0.14276087284088135,
+      "learning_rate": 4.892844403485777e-05,
+      "loss": 0.19784307479858398,
+      "step": 665
+    },
+    {
+      "epoch": 0.12190684133915575,
+      "grad_norm": 0.14735399186611176,
+      "learning_rate": 4.890702101582623e-05,
+      "loss": 0.19163782596588136,
+      "step": 670
+    },
+    {
+      "epoch": 0.12281659388646288,
+      "grad_norm": 0.15742065012454987,
+      "learning_rate": 4.888539074607082e-05,
+      "loss": 0.19312986135482788,
+      "step": 675
+    },
+    {
+      "epoch": 0.12372634643377002,
+      "grad_norm": 0.12917031347751617,
+      "learning_rate": 4.8863553413105025e-05,
+      "loss": 0.20066320896148682,
+      "step": 680
+    },
+    {
+      "epoch": 0.12463609898107715,
+      "grad_norm": 0.1484801322221756,
+      "learning_rate": 4.884150920623737e-05,
+      "loss": 0.20096096992492676,
+      "step": 685
+    },
+    {
+      "epoch": 0.12554585152838427,
+      "grad_norm": 0.1455296128988266,
+      "learning_rate": 4.88192583165698e-05,
+      "loss": 0.20518505573272705,
+      "step": 690
+    },
+    {
+      "epoch": 0.12645560407569142,
+      "grad_norm": 0.14517490565776825,
+      "learning_rate": 4.879680093699598e-05,
+      "loss": 0.18859238624572755,
+      "step": 695
+    },
+    {
+      "epoch": 0.12736535662299855,
+      "grad_norm": 0.18778090178966522,
+      "learning_rate": 4.877413726219964e-05,
+      "loss": 0.197074818611145,
+      "step": 700
+    },
+    {
+      "epoch": 0.12827510917030568,
+      "grad_norm": 0.13497677445411682,
+      "learning_rate": 4.87512674886529e-05,
+      "loss": 0.18713107109069824,
+      "step": 705
+    },
+    {
+      "epoch": 0.12918486171761281,
+      "grad_norm": 0.12657155096530914,
+      "learning_rate": 4.872819181461455e-05,
+      "loss": 0.1858484387397766,
+      "step": 710
+    },
+    {
+      "epoch": 0.13009461426491994,
+      "grad_norm": 0.11458148807287216,
+      "learning_rate": 4.870491044012834e-05,
+      "loss": 0.18732179403305055,
+      "step": 715
+    },
+    {
+      "epoch": 0.13100436681222707,
+      "grad_norm": 0.13000249862670898,
+      "learning_rate": 4.8681423567021244e-05,
+      "loss": 0.1872936010360718,
+      "step": 720
+    },
+    {
+      "epoch": 0.1319141193595342,
+      "grad_norm": 0.14580890536308289,
+      "learning_rate": 4.865773139890172e-05,
+      "loss": 0.19280019998550416,
+      "step": 725
+    },
+    {
+      "epoch": 0.13282387190684133,
+      "grad_norm": 0.1507277935743332,
+      "learning_rate": 4.8633834141157913e-05,
+      "loss": 0.1898929238319397,
+      "step": 730
+    },
+    {
+      "epoch": 0.13373362445414846,
+      "grad_norm": 0.1418737769126892,
+      "learning_rate": 4.860973200095592e-05,
+      "loss": 0.17926375865936278,
+      "step": 735
+    },
+    {
+      "epoch": 0.1346433770014556,
+      "grad_norm": 0.17151866853237152,
+      "learning_rate": 4.858542518723794e-05,
+      "loss": 0.18963592052459716,
+      "step": 740
+    },
+    {
+      "epoch": 0.13555312954876272,
+      "grad_norm": 0.11162743717432022,
+      "learning_rate": 4.8560913910720535e-05,
+      "loss": 0.19466646909713745,
+      "step": 745
+    },
+    {
+      "epoch": 0.13646288209606988,
+      "grad_norm": 0.15628376603126526,
+      "learning_rate": 4.8536198383892725e-05,
+      "loss": 0.19494034051895143,
+      "step": 750
+    },
+    {
+      "epoch": 0.137372634643377,
+      "grad_norm": 0.18209289014339447,
+      "learning_rate": 4.851127882101421e-05,
+      "loss": 0.18747550249099731,
+      "step": 755
+    },
+    {
+      "epoch": 0.13828238719068414,
+      "grad_norm": 0.14559614658355713,
+      "learning_rate": 4.8486155438113454e-05,
+      "loss": 0.1897158980369568,
+      "step": 760
+    },
+    {
+      "epoch": 0.13919213973799127,
+      "grad_norm": 0.3198587894439697,
+      "learning_rate": 4.846082845298586e-05,
+      "loss": 0.18571001291275024,
+      "step": 765
+    },
+    {
+      "epoch": 0.1401018922852984,
+      "grad_norm": 0.1486678421497345,
+      "learning_rate": 4.843529808519189e-05,
+      "loss": 0.19561930894851684,
+      "step": 770
+    },
+    {
+      "epoch": 0.14101164483260553,
+      "grad_norm": 0.15318170189857483,
+      "learning_rate": 4.840956455605509e-05,
+      "loss": 0.187040114402771,
+      "step": 775
+    },
+    {
+      "epoch": 0.14192139737991266,
+      "grad_norm": 0.13754244148731232,
+      "learning_rate": 4.838362808866025e-05,
+      "loss": 0.18345539569854735,
+      "step": 780
+    },
+    {
+      "epoch": 0.1428311499272198,
+      "grad_norm": 0.12943248450756073,
+      "learning_rate": 4.835748890785143e-05,
+      "loss": 0.1921079397201538,
+      "step": 785
+    },
+    {
+      "epoch": 0.14374090247452692,
+      "grad_norm": 0.110458143055439,
+      "learning_rate": 4.833114724023001e-05,
+      "loss": 0.17927205562591553,
+      "step": 790
+    },
+    {
+      "epoch": 0.14465065502183405,
+      "grad_norm": 0.2421770840883255,
+      "learning_rate": 4.830460331415275e-05,
+      "loss": 0.18317567110061644,
+      "step": 795
+    },
+    {
+      "epoch": 0.14556040756914118,
+      "grad_norm": 0.14752762019634247,
+      "learning_rate": 4.8277857359729787e-05,
+      "loss": 0.1843916058540344,
+      "step": 800
+    },
+    {
+      "epoch": 0.14647016011644834,
+      "grad_norm": 0.15043556690216064,
+      "learning_rate": 4.8250909608822644e-05,
+      "loss": 0.18354393243789674,
+      "step": 805
+    },
+    {
+      "epoch": 0.14737991266375547,
+      "grad_norm": 0.1381794661283493,
+      "learning_rate": 4.822376029504223e-05,
+      "loss": 0.1789781332015991,
+      "step": 810
+    },
+    {
+      "epoch": 0.1482896652110626,
+      "grad_norm": 0.18386174738407135,
+      "learning_rate": 4.819640965374681e-05,
+      "loss": 0.19494292736053467,
+      "step": 815
+    },
+    {
+      "epoch": 0.14919941775836973,
+      "grad_norm": 0.13829593360424042,
+      "learning_rate": 4.816885792203996e-05,
+      "loss": 0.18486063480377196,
+      "step": 820
+    },
+    {
+      "epoch": 0.15010917030567686,
+      "grad_norm": 0.15033291280269623,
+      "learning_rate": 4.814110533876852e-05,
+      "loss": 0.18061509132385253,
+      "step": 825
+    },
+    {
+      "epoch": 0.151018922852984,
+      "grad_norm": 0.17150473594665527,
+      "learning_rate": 4.811315214452051e-05,
+      "loss": 0.18464866876602173,
+      "step": 830
+    },
+    {
+      "epoch": 0.15192867540029112,
+      "grad_norm": 0.15317125618457794,
+      "learning_rate": 4.808499858162307e-05,
+      "loss": 0.1837708592414856,
+      "step": 835
+    },
+    {
+      "epoch": 0.15283842794759825,
+      "grad_norm": 0.2671392560005188,
+      "learning_rate": 4.805664489414031e-05,
+      "loss": 0.19338636398315429,
+      "step": 840
+    },
+    {
+      "epoch": 0.15374818049490538,
+      "grad_norm": 0.14047028124332428,
+      "learning_rate": 4.802809132787125e-05,
+      "loss": 0.17069108486175538,
+      "step": 845
+    },
+    {
+      "epoch": 0.1546579330422125,
+      "grad_norm": 0.1520431935787201,
+      "learning_rate": 4.799933813034768e-05,
+      "loss": 0.18607735633850098,
+      "step": 850
+    },
+    {
+      "epoch": 0.15556768558951964,
+      "grad_norm": 0.17239463329315186,
+      "learning_rate": 4.797038555083197e-05,
+      "loss": 0.18069062232971192,
+      "step": 855
+    },
+    {
+      "epoch": 0.1564774381368268,
+      "grad_norm": 0.1377955675125122,
+      "learning_rate": 4.794123384031495e-05,
+      "loss": 0.18870222568511963,
+      "step": 860
+    },
+    {
+      "epoch": 0.15738719068413393,
+      "grad_norm": 0.15901461243629456,
+      "learning_rate": 4.791188325151373e-05,
+      "loss": 0.18128334283828734,
+      "step": 865
+    },
+    {
+      "epoch": 0.15829694323144106,
+      "grad_norm": 0.14634132385253906,
+      "learning_rate": 4.7882334038869495e-05,
+      "loss": 0.1866163969039917,
+      "step": 870
+    },
+    {
+      "epoch": 0.1592066957787482,
+      "grad_norm": 0.15361061692237854,
+      "learning_rate": 4.785258645854529e-05,
+      "loss": 0.17850807905197144,
+      "step": 875
+    },
+    {
+      "epoch": 0.16011644832605532,
+      "grad_norm": 0.13751649856567383,
+      "learning_rate": 4.782264076842385e-05,
+      "loss": 0.17731113433837892,
+      "step": 880
+    },
+    {
+      "epoch": 0.16102620087336245,
+      "grad_norm": 0.17909638583660126,
+      "learning_rate": 4.7792497228105314e-05,
+      "loss": 0.18344542980194092,
+      "step": 885
+    },
+    {
+      "epoch": 0.16193595342066958,
+      "grad_norm": 0.16038304567337036,
+      "learning_rate": 4.776215609890498e-05,
+      "loss": 0.18868647813796996,
+      "step": 890
+    },
+    {
+      "epoch": 0.1628457059679767,
+      "grad_norm": 0.1653951108455658,
+      "learning_rate": 4.773161764385107e-05,
+      "loss": 0.18614152669906617,
+      "step": 895
+    },
+    {
+      "epoch": 0.16375545851528384,
+      "grad_norm": 0.16193026304244995,
+      "learning_rate": 4.770088212768241e-05,
+      "loss": 0.18564575910568237,
+      "step": 900
+    },
+    {
+      "epoch": 0.16466521106259097,
+      "grad_norm": 0.16048531234264374,
+      "learning_rate": 4.7669949816846173e-05,
+      "loss": 0.18330031633377075,
+      "step": 905
+    },
+    {
+      "epoch": 0.1655749636098981,
+      "grad_norm": 0.1440177708864212,
+      "learning_rate": 4.7638820979495534e-05,
+      "loss": 0.17712442874908446,
+      "step": 910
+    },
+    {
+      "epoch": 0.16648471615720525,
+      "grad_norm": 0.19635969400405884,
+      "learning_rate": 4.760749588548738e-05,
+      "loss": 0.18679027557373046,
+      "step": 915
+    },
+    {
+      "epoch": 0.16739446870451238,
+      "grad_norm": 0.15576541423797607,
+      "learning_rate": 4.757597480637995e-05,
+      "loss": 0.19283764362335204,
+      "step": 920
+    },
+    {
+      "epoch": 0.1683042212518195,
+      "grad_norm": 0.1550331562757492,
+      "learning_rate": 4.7544258015430463e-05,
+      "loss": 0.18269542455673218,
+      "step": 925
+    },
+    {
+      "epoch": 0.16921397379912664,
+      "grad_norm": 0.18369626998901367,
+      "learning_rate": 4.75123457875928e-05,
+      "loss": 0.1697891116142273,
+      "step": 930
+    },
+    {
+      "epoch": 0.17012372634643377,
+      "grad_norm": 0.15266314148902893,
+      "learning_rate": 4.7480238399515074e-05,
+      "loss": 0.18523451089859008,
+      "step": 935
+    },
+    {
+      "epoch": 0.1710334788937409,
+      "grad_norm": 0.16709664463996887,
+      "learning_rate": 4.744793612953724e-05,
+      "loss": 0.1803238034248352,
+      "step": 940
+    },
+    {
+      "epoch": 0.17194323144104803,
+      "grad_norm": 0.14929179847240448,
+      "learning_rate": 4.741543925768872e-05,
+      "loss": 0.1861217737197876,
+      "step": 945
+    },
+    {
+      "epoch": 0.17285298398835516,
+      "grad_norm": 0.1362280696630478,
+      "learning_rate": 4.7382748065685915e-05,
+      "loss": 0.17896100282669067,
+      "step": 950
+    },
+    {
+      "epoch": 0.1737627365356623,
+      "grad_norm": 0.15290239453315735,
+      "learning_rate": 4.734986283692982e-05,
+      "loss": 0.18432788848876952,
+      "step": 955
+    },
+    {
+      "epoch": 0.17467248908296942,
+      "grad_norm": 0.1287035197019577,
+      "learning_rate": 4.73167838565035e-05,
+      "loss": 0.18485682010650634,
+      "step": 960
+    },
+    {
+      "epoch": 0.17558224163027655,
+      "grad_norm": 0.17969627678394318,
+      "learning_rate": 4.728351141116971e-05,
+      "loss": 0.17361557483673096,
+      "step": 965
+    },
+    {
+      "epoch": 0.1764919941775837,
+      "grad_norm": 0.13751201331615448,
+      "learning_rate": 4.7250045789368326e-05,
+      "loss": 0.1731679320335388,
+      "step": 970
+    },
+    {
+      "epoch": 0.17740174672489084,
+      "grad_norm": 0.1603265255689621,
+      "learning_rate": 4.721638728121388e-05,
+      "loss": 0.17308170795440675,
+      "step": 975
+    },
+    {
+      "epoch": 0.17831149927219797,
+      "grad_norm": 0.1592789888381958,
+      "learning_rate": 4.718253617849306e-05,
+      "loss": 0.17534757852554322,
+      "step": 980
+    },
+    {
+      "epoch": 0.1792212518195051,
+      "grad_norm": 0.12727224826812744,
+      "learning_rate": 4.714849277466214e-05,
+      "loss": 0.17817609310150145,
+      "step": 985
+    },
+    {
+      "epoch": 0.18013100436681223,
+      "grad_norm": 0.15401554107666016,
+      "learning_rate": 4.711425736484447e-05,
+      "loss": 0.1733405351638794,
+      "step": 990
+    },
+    {
+      "epoch": 0.18104075691411936,
+      "grad_norm": 0.13253968954086304,
+      "learning_rate": 4.7079830245827906e-05,
+      "loss": 0.17846795320510864,
+      "step": 995
+    },
+    {
+      "epoch": 0.1819505094614265,
+      "grad_norm": 0.21846213936805725,
+      "learning_rate": 4.7045211716062245e-05,
+      "loss": 0.18021599054336548,
+      "step": 1000
+    }
+  ],
+  "logging_steps": 5,
+  "max_steps": 5500,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 100,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 5.583006871819799e+17,
+  "train_batch_size": 8,
+  "trial_name": null,
+  "trial_params": null
+}
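
The log above records one entry every 5 steps, and most `grad_norm` values sit between roughly 0.11 and 0.2, with an outlier at step 615. The following is a minimal sketch of how such outliers could be flagged. The helper name `find_grad_norm_spikes` and the 5x-median threshold are illustrative assumptions, not part of the training setup; the four sample entries are copied from the log.

```python
from statistics import median

# Four entries copied from the log above. A real script would instead read the
# full "log_history" list from trainer_state.json with json.load().
log_history = [
    {"step": 610, "grad_norm": 0.13544194400310516, "loss": 0.2011256456375122},
    {"step": 615, "grad_norm": 1.3976134061813354, "loss": 0.19916868209838867},
    {"step": 620, "grad_norm": 0.1465059071779251, "loss": 0.19487457275390624},
    {"step": 625, "grad_norm": 0.12103168666362762, "loss": 0.1907251238822937},
]


def find_grad_norm_spikes(history, factor=5.0):
    """Return the steps whose grad_norm exceeds `factor` times the median.

    The threshold rule is a hypothetical heuristic chosen for illustration.
    """
    norms = [entry["grad_norm"] for entry in history if "grad_norm" in entry]
    if not norms:
        return []
    threshold = factor * median(norms)
    return [
        entry["step"]
        for entry in history
        if entry.get("grad_norm", 0.0) > threshold
    ]


spikes = find_grad_norm_spikes(log_history)
print(spikes)
```

A median-based threshold is used here because a single large value would inflate a mean-based one; the loss at the flagged step is in line with its neighbours, so an isolated spike like this does not by itself indicate divergence.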

checkpoint-1000/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c9abb54db33ad7a1865253346fbffbda40d7b72587ff8ccf0cc69c9680b59201
+size 5777

checkpoint-1100/README.md ADDED

	@@ -0,0 +1,210 @@

+---
+base_model: unsloth/gemma-4-E4B-it
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:unsloth/gemma-4-E4B-it
+- lora
+- sft
+- transformers
+- trl
+- unsloth
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.19.1

checkpoint-1100/adapter_config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "Gemma4ForConditionalGeneration",
+    "parent_library": "transformers.models.gemma4.modeling_gemma4",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "unsloth/gemma-4-E4B-it",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.0,
+  "lora_ga_config": null,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.19.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "v_proj",
+    "o_proj",
+    "k_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_bdlora": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
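
The config above sets `r` and `lora_alpha` both to 16 with `use_rslora` disabled. As a rough sketch of what that implies, the snippet below computes the scale factor applied to the low-rank update under the usual LoRA convention (`lora_alpha / r`, or `lora_alpha / sqrt(r)` for rank-stabilized LoRA). The helper `lora_scaling` is a hypothetical name written for this illustration and is not a PEFT function.

```python
import math

# Fields copied from the adapter_config.json above.
config = {"r": 16, "lora_alpha": 16, "use_rslora": False}


def lora_scaling(r, lora_alpha, use_rslora=False):
    """Scale factor multiplied onto the low-rank update B @ A.

    Standard LoRA uses alpha / r; rank-stabilized LoRA uses alpha / sqrt(r).
    """
    if use_rslora:
        return lora_alpha / math.sqrt(r)
    return lora_alpha / r


scaling = lora_scaling(config["r"], config["lora_alpha"], config["use_rslora"])
print(scaling)  # 1.0
```

With these values the update is applied unscaled; switching `use_rslora` to true at the same rank and alpha would give a factor of 4.0 instead.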

checkpoint-1100/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a4be3bea2ca3bd38e446c68a30717eb1a31d7d5b77955efe33bf656a8162068a
+size 169741912
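
The three lines above are a Git LFS pointer, not the adapter weights themselves: they record the spec version, the SHA-256 of the real file, and its size in bytes. A small sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this example, and the pointer text is copied from the diff.

```python
# Pointer contents copied from the diff above (without the leading "+").
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:a4be3bea2ca3bd38e446c68a30717eb1a31d7d5b77955efe33bf656a8162068a
size 169741912
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer into its version, hash algorithm, digest and size."""
    # Each line is "<key> <value>"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(pointer_text)
print(info["algo"], f"{info['size'] / 2**20:.1f} MiB")
```

The digest can be compared against a locally computed SHA-256 of the downloaded `adapter_model.safetensors` to confirm the file matches what the commit references.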

checkpoint-1100/chat_template.jinja ADDED

	@@ -0,0 +1,351 @@

+{%- macro format_parameters(properties, required, filter_keys=false) -%}
+    {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
+    {%- set ns = namespace(found_first=false) -%}
+    {%- for key, value in properties | dictsort -%}
+        {%- set add_comma = false -%}
+        {%- if not filter_keys or key not in standard_keys -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {{ key }}:{
+            {%- if value['description'] -%}
+                description:<|"|>{{ value['description'] }}<|"|>
+                {%- set add_comma = true -%}
+            {%- endif -%}
+            {%- if value['type'] | upper == 'STRING' -%}
+                {%- if value['enum'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    enum:{{ format_argument(value['enum']) }}
+                {%- endif -%}
+            {%- elif value['type'] | upper == 'ARRAY' -%}
+                {%- if value['items'] is mapping and value['items'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    items:{
+                    {%- set ns_items = namespace(found_first=false) -%}
+                    {%- for item_key, item_value in value['items'] | dictsort -%}
+                        {%- if item_value is not none -%}
+                            {%- if ns_items.found_first %},{% endif -%}
+                            {%- set ns_items.found_first = true -%}
+                            {%- if item_key == 'properties' -%}
+                                properties:{
+                                {%- if item_value is mapping -%}
+                                    {{- format_parameters(item_value, value['items']['required'] | default([])) -}}
+                                {%- endif -%}
+                                }
+                            {%- elif item_key == 'required' -%}
+                                required:[
+                                {%- for req_item in item_value -%}
+                                    <|"|>{{- req_item -}}<|"|>
+                                    {%- if not loop.last %},{% endif -%}
+                                {%- endfor -%}
+                                ]
+                            {%- elif item_key == 'type' -%}
+                                {%- if item_value is string -%}
+                                    type:{{ format_argument(item_value | upper) }}
+                                {%- else -%}
+                                    type:{{ format_argument(item_value | map('upper') | list) }}
+                                {%- endif -%}
+                            {%- else -%}
+                                {{ item_key }}:{{ format_argument(item_value) }}
+                            {%- endif -%}
+                        {%- endif -%}
+                    {%- endfor -%}
+                    }
+                {%- endif -%}
+            {%- endif -%}
+            {%- if value['nullable'] %}
+                {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                nullable:true
+            {%- endif -%}
+            {%- if value['type'] | upper == 'OBJECT' -%}
+                {%- if value['properties'] is defined and value['properties'] is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value['properties'], value['required'] | default([])) -}}
+                    }
+                {%- elif value is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
+                    }
+                {%- endif -%}
+                {%- if value['required'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    required:[
+                    {%- for item in value['required'] | default([]) -%}
+                        <|"|>{{- item -}}<|"|>
+                        {%- if not loop.last %},{% endif -%}
+                    {%- endfor -%}
+                    ]
+                {%- endif -%}
+            {%- endif -%}
+            {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+            type:<|"|>{{ value['type'] | upper }}<|"|>}
+        {%- endif -%}
+    {%- endfor -%}
+{%- endmacro -%}
+{%- macro format_function_declaration(tool_data) -%}
+    declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|>
+    {%- set params = tool_data['function']['parameters'] -%}
+    {%- if params -%}
+        ,parameters:{
+        {%- if params['properties'] -%}
+            properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
+        {%- endif -%}
+        {%- if params['required'] -%}
+            required:[
+            {%- for item in params['required'] -%}
+                <|"|>{{- item -}}<|"|>
+                {{- ',' if not loop.last -}}
+            {%- endfor -%}
+            ],
+        {%- endif -%}
+        {%- if params['type'] -%}
+            type:<|"|>{{- params['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    {%- if 'response' in tool_data['function'] -%}
+        {%- set response_declaration = tool_data['function']['response'] -%}
+        ,response:{
+        {%- if response_declaration['description'] -%}
+            description:<|"|>{{- response_declaration['description'] -}}<|"|>,
+        {%- endif -%}
+        {%- if response_declaration['type'] | upper == 'OBJECT' -%}
+            type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    }
+{%- endmacro -%}
+{%- macro format_argument(argument, escape_keys=True) -%}
+    {%- if argument is string -%}
+        {{- '<|"|>' + argument + '<|"|>' -}}
+    {%- elif argument is boolean -%}
+        {{- 'true' if argument else 'false' -}}
+    {%- elif argument is mapping -%}
+        {{- '{' -}}
+        {%- set ns = namespace(found_first=false) -%}
+        {%- for key, value in argument | dictsort -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {%- if escape_keys -%}
+                {{- '<|"|>' + key + '<|"|>' -}}
+            {%- else -%}
+                {{- key -}}
+            {%- endif -%}
+            :{{- format_argument(value, escape_keys=escape_keys) -}}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- elif argument is sequence -%}
+        {{- '[' -}}
+        {%- for item in argument -%}
+            {{- format_argument(item, escape_keys=escape_keys) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- ']' -}}
+    {%- else -%}
+        {{- argument -}}
+    {%- endif -%}
+{%- endmacro -%}
+{%- macro strip_thinking(text) -%}
+    {%- set ns = namespace(result='') -%}
+    {%- for part in text.split('<channel|>') -%}
+        {%- if '<|channel>' in part -%}
+            {%- set ns.result = ns.result + part.split('<|channel>')[0] -%}
+        {%- else -%}
+            {%- set ns.result = ns.result + part -%}
+        {%- endif -%}
+    {%- endfor -%}
+    {{- ns.result | trim -}}
+{%- endmacro -%}
+{%- macro format_tool_response_block(tool_name, response) -%}
+    {{- '<|tool_response>' -}}
+    {%- if response is mapping -%}
+        {{- 'response:' + tool_name + '{' -}}
+        {%- for key, value in response | dictsort -%}
+            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- else -%}
+        {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}}
+    {%- endif -%}
+    {{- '<tool_response|>' -}}
+{%- endmacro -%}
+{%- set ns = namespace(prev_message_type=None) -%}
+{%- set loop_messages = messages -%}
+{{- bos_token -}}
+{#- Handle System/Tool Definitions Block -#}
+{%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%}
+    {{- '<|turn>system\n' -}}
+    {#- Inject Thinking token at the very top of the FIRST system turn -#}
+    {%- if enable_thinking is defined and enable_thinking -%}
+        {{- '<|think|>\n' -}}
+        {%- set ns.prev_message_type = 'think' -%}
+    {%- endif -%}
+    {%- if messages[0]['role'] in ['system', 'developer'] -%}
+        {%- if messages[0]['content'] is string -%}
+            {{- messages[0]['content'] | trim -}}
+        {%- elif messages[0]['content'] is sequence -%}
+            {%- for item in messages[0]['content'] -%}
+                {{- item['text'] | trim + ' '-}}
+            {%- endfor -%}
+        {%- endif -%}
+        {%- set loop_messages = messages[1:] -%}
+    {%- endif -%}
+    {%- if tools -%}
+        {%- for tool in tools %}
+            {{- '<|tool>' -}}
+            {{- format_function_declaration(tool) | trim -}}
+            {{- '<tool|>' -}}
+        {%- endfor %}
+        {%- set ns.prev_message_type = 'tool' -%}
+    {%- endif -%}
+    {{- '<turn|>\n' -}}
+{%- endif %}
+{#- Pre-scan: find last user message index for reasoning guard -#}
+{%- set ns_turn = namespace(last_user_idx=-1) -%}
+{%- for i in range(loop_messages | length) -%}
+    {%- if loop_messages[i]['role'] == 'user' -%}
+        {%- set ns_turn.last_user_idx = i -%}
+    {%- endif -%}
+{%- endfor -%}
+{#- Loop through messages -#}
+{%- for message in loop_messages -%}
+    {%- if message['role'] != 'tool' -%}
+    {%- set ns.prev_message_type = None -%}
+    {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%}
+    {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#}
+    {%- set prev_nt = namespace(role=None, found=false) -%}
+    {%- if loop.index0 > 0 -%}
+        {%- for j in range(loop.index0 - 1, -1, -1) -%}
+            {%- if not prev_nt.found -%}
+                {%- if loop_messages[j]['role'] != 'tool' -%}
+                    {%- set prev_nt.role = loop_messages[j]['role'] -%}
+                    {%- set prev_nt.found = true -%}
+                {%- endif -%}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- endif -%}
+    {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%}
+    {%- if not continue_same_model_turn -%}
+        {{- '<|turn>' + role + '\n' }}
+    {%- endif -%}
+    {#- Render reasoning/reasoning_content as thinking channel -#}
+    {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
+    {%- if thinking_text and loop.index0 > ns_turn.last_user_idx and message.get('tool_calls') -%}
+        {{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
+    {%- endif -%}
+            {%- if message['tool_calls'] -%}
+                {%- for tool_call in message['tool_calls'] -%}
+                    {%- set function = tool_call['function'] -%}
+                    {{- '<|tool_call>call:' + function['name'] + '{' -}}
+                    {%- if function['arguments'] is mapping -%}
+                        {%- set ns_args = namespace(found_first=false) -%}
+                        {%- for key, value in function['arguments'] | dictsort -%}
+                            {%- if ns_args.found_first %},{% endif -%}
+                            {%- set ns_args.found_first = true -%}
+                            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+                        {%- endfor -%}
+                    {%- elif function['arguments'] is string -%}
+                        {{- function['arguments'] -}}
+                    {%- endif -%}
+                    {{- '}<tool_call|>' -}}
+                {%- endfor -%}
+                {%- set ns.prev_message_type = 'tool_call' -%}
+            {%- endif -%}
+            {%- set ns_tr_out = namespace(flag=false) -%}
+            {%- if message.get('tool_responses') -%}
+                {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#}
+                {%- for tool_response in message['tool_responses'] -%}
+                    {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}}
+                    {%- set ns_tr_out.flag = true -%}
+                    {%- set ns.prev_message_type = 'tool_response' -%}
+                {%- endfor -%}
+            {%- elif message.get('tool_calls') -%}
+                {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#}
+                {%- set ns_tool_scan = namespace(stopped=false) -%}
+                {%- for k in range(loop.index0 + 1, loop_messages | length) -%}
+                    {%- if ns_tool_scan.stopped -%}
+                    {%- elif loop_messages[k]['role'] != 'tool' -%}
+                        {%- set ns_tool_scan.stopped = true -%}
+                    {%- else -%}
+                        {%- set follow = loop_messages[k] -%}
+                        {#- Resolve tool_call_id to function name -#}
+                        {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%}
+                        {%- for tc in message['tool_calls'] -%}
+                            {%- if tc.get('id') == follow.get('tool_call_id') -%}
+                                {%- set ns_tname.name = tc['function']['name'] -%}
+                            {%- endif -%}
+                        {%- endfor -%}
+                        {#- Handle content as string or content-parts array -#}
+                        {%- set tool_body = follow.get('content') -%}
+                        {%- if tool_body is string -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- elif tool_body is sequence and tool_body is not string -%}
+                            {%- set ns_txt = namespace(s='') -%}
+                            {%- for part in tool_body -%}
+                                {%- if part.get('type') == 'text' -%}
+                                    {%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%}
+                                {%- endif -%}
+                            {%- endfor -%}
+                            {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
+                        {%- else -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- endif -%}
+                        {%- set ns_tr_out.flag = true -%}
+                        {%- set ns.prev_message_type = 'tool_response' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- set captured_content -%}
+            {%- if message['content'] is string -%}
+                {%- if role == 'model' -%}
+                    {{- strip_thinking(message['content']) -}}
+                {%- else -%}
+                    {{- message['content'] | trim -}}
+                {%- endif -%}
+            {%- elif message['content'] is sequence -%}
+                {%- for item in message['content'] -%}
+                    {%- if item['type'] == 'text' -%}
+                        {%- if role == 'model' -%}
+                            {{- strip_thinking(item['text']) -}}
+                        {%- else -%}
+                            {{- item['text'] | trim -}}
+                        {%- endif -%}
+                    {%- elif item['type'] == 'image' -%}
+                        {{- '<|image|>' -}}
+                        {%- set ns.prev_message_type = 'image' -%}
+                    {%- elif item['type'] == 'audio' -%}
+                        {{- '<|audio|>' -}}
+                        {%- set ns.prev_message_type = 'audio' -%}
+                    {%- elif item['type'] == 'video' -%}
+                        {{- '<|video|>' -}}
+                        {%- set ns.prev_message_type = 'video' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- endset -%}
+            {{- captured_content -}}
+            {%- set has_content = captured_content | trim | length > 0 -%}
+        {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
+            {{- '<|tool_response>' -}}
+        {%- elif not (ns_tr_out.flag and not has_content) -%}
+            {{- '<turn|>\n' -}}
+        {%- endif -%}
+    {%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
+        {{- '<|turn>model\n' -}}
+    {%- endif -%}
+{%- endif -%}
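For readers following the `strip_thinking` macro in the template above, here is a minimal Python sketch of the same logic. It is an illustration and not part of the commit: the function name simply mirrors the macro, and the sample string is made up.

```python
def strip_thinking(text: str) -> str:
    """Drop '<|channel>...<channel|>' spans, mirroring the Jinja macro."""
    result = ""
    for part in text.split("<channel|>"):
        if "<|channel>" in part:
            # Keep only the text that precedes the opening channel marker.
            result += part.split("<|channel>")[0]
        else:
            result += part
    # The macro applies the Jinja `trim` filter; str.strip() plays that role here.
    return result.strip()


# Hypothetical model output with a thinking span in front of the answer.
sample = "<|channel>thought\nplan the answer<channel|>Hello!"
print(strip_thinking(sample))
```

Note that, like the macro, this discards everything from an opening `<|channel>` up to the next `<channel|>`, so an unterminated thinking span removes the rest of that segment.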

checkpoint-1100/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:66120ce4d55186cce9be5cdf28e030e89994c81dac5711321d07d2b5ce8153e3
+size 72807355

checkpoint-1100/processor_config.json ADDED

	@@ -0,0 +1,75 @@

+{
+  "audio_ms_per_token": 40,
+  "audio_seq_length": 750,
+  "feature_extractor": {
+    "dither": 0.0,
+    "feature_extractor_type": "Gemma4AudioFeatureExtractor",
+    "feature_size": 128,
+    "fft_length": 512,
+    "fft_overdrive": false,
+    "frame_length": 320,
+    "hop_length": 160,
+    "input_scale_factor": 1.0,
+    "max_frequency": 8000.0,
+    "mel_floor": 0.001,
+    "min_frequency": 0.0,
+    "padding_side": "left",
+    "padding_value": 0.0,
+    "per_bin_mean": null,
+    "per_bin_stddev": null,
+    "preemphasis": 0.0,
+    "preemphasis_htk_flavor": true,
+    "return_attention_mask": true,
+    "sampling_rate": 16000
+  },
+  "image_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": false,
+    "do_rescale": true,
+    "do_resize": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_processor_type": "Gemma4ImageProcessor",
+    "image_seq_length": 280,
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 280,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098
+  },
+  "image_seq_length": 280,
+  "processor_class": "Gemma4Processor",
+  "video_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": true,
+    "do_rescale": true,
+    "do_resize": true,
+    "do_sample_frames": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 70,
+    "num_frames": 32,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098,
+    "return_metadata": false,
+    "video_processor_type": "Gemma4VideoProcessor"
+  }
+}
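As a quick sanity check on the audio and image settings in `processor_config.json`, the arithmetic below converts the sample-based values to milliseconds. The numbers are copied from the file; the reading of `audio_seq_length` as a count of 40 ms tokens is an assumption, not something the config states.

```python
# Values copied from processor_config.json above.
sampling_rate = 16000      # Hz
frame_length = 320         # samples per analysis window
hop_length = 160           # samples between window starts
audio_ms_per_token = 40
audio_seq_length = 750
rescale_factor = 0.00392156862745098

frame_ms = 1000 * frame_length / sampling_rate   # window duration in ms
hop_ms = 1000 * hop_length / sampling_rate       # stride between windows in ms

# Assumption: audio_seq_length counts tokens of audio_ms_per_token each.
max_audio_s = audio_ms_per_token * audio_seq_length / 1000

print(frame_ms, hop_ms, max_audio_s)
# The image rescale factor matches 1/255, i.e. mapping 8-bit pixels to [0, 1].
print(abs(rescale_factor - 1 / 255) < 1e-12)
```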

checkpoint-1100/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:098b29492211804ab324a36f37466821d948280bb74fce4ba895c03f13ecd878
+size 14645

checkpoint-1100/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:936724e73ecd7ecf26460f7aeb2b5af5460899f93c78695a46fc00c541454d94
+size 1465

checkpoint-1100/tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cc8d3a0ce36466ccc1278bf987df5f71db1719b9ca6b4118264f45cb627bfe0f
+size 32169626

checkpoint-1100/tokenizer_config.json ADDED

	@@ -0,0 +1,289 @@

+{
+  "audio_token": "<|audio|>",
+  "backend": "tokenizers",
+  "boa_token": "<|audio>",
+  "boi_token": "<|image>",
+  "bos_token": "<bos>",
+  "eoa_token": "<audio|>",
+  "eoc_token": "<channel|>",
+  "eoi_token": "<image|>",
+  "eos_token": "<turn|>",
+  "eot_token": "<turn|>",
+  "escape_token": "<|\"|>",
+  "etc_token": "<tool_call|>",
+  "etd_token": "<tool|>",
+  "etr_token": "<tool_response|>",
+  "extra_special_tokens": [
+    "<|video|>"
+  ],
+  "image_token": "<|image|>",
+  "is_local": false,
+  "mask_token": "<mask>",
+  "model_max_length": 131072,
+  "model_specific_special_tokens": {
+    "audio_token": "<|audio|>",
+    "boa_token": "<|audio>",
+    "boi_token": "<|image>",
+    "eoa_token": "<audio|>",
+    "eoc_token": "<channel|>",
+    "eoi_token": "<image|>",
+    "eot_token": "<turn|>",
+    "escape_token": "<|\"|>",
+    "etc_token": "<tool_call|>",
+    "etd_token": "<tool|>",
+    "etr_token": "<tool_response|>",
+    "image_token": "<|image|>",
+    "soc_token": "<|channel>",
+    "sot_token": "<|turn>",
+    "stc_token": "<|tool_call>",
+    "std_token": "<|tool>",
+    "str_token": "<|tool_response>",
+    "think_token": "<|think|>"
+  },
+  "pad_token": "<pad>",
+  "padding_side": "right",
+  "processor_class": "Gemma4Processor",
+  "response_schema": {
+    "properties": {
+      "content": {
+        "type": "string"
+      },
+      "role": {
+        "const": "assistant"
+      },
+      "thinking": {
+        "type": "string"
+      },
+      "tool_calls": {
+        "items": {
+          "properties": {
+            "function": {
+              "properties": {
+                "arguments": {
+                  "additionalProperties": {},
+                  "type": "object",
+                  "x-parser": "gemma4-tool-call"
+                },
+                "name": {
+                  "type": "string"
+                }
+              },
+              "type": "object",
+              "x-regex": "call\\:(?P<name>\\w+)(?P<arguments>\\{.*\\})"
+            },
+            "type": {
+              "const": "function"
+            }
+          },
+          "type": "object"
+        },
+        "type": "array",
+        "x-regex-iterator": "<\\|tool_call>(.*?)<tool_call\\|>"
+      }
+    },
+    "type": "object",
+    "x-regex": "(\\<\\|channel\\>thought\\n(?P<thinking>.*?)\\<channel\\|\\>)?(?P<tool_calls>\\<\\|tool_call\\>.*\\<tool_call\\|\\>)?(?P<content>(?:(?!\\<turn\\|\\>)(?!\\<\\|tool_response\\>).)+)?(?:\\<turn\\|\\>|\\<\\|tool_response\\>)?"
+  },
+  "soc_token": "<|channel>",
+  "sot_token": "<|turn>",
+  "stc_token": "<|tool_call>",
+  "std_token": "<|tool>",
+  "str_token": "<|tool_response>",
+  "think_token": "<|think|>",
+  "tokenizer_class": "GemmaTokenizer",
+  "unk_token": "<unk>",
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<pad>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "1": {
+      "content": "<eos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "2": {
+      "content": "<bos>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "3": {
+      "content": "<unk>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "4": {
+      "content": "<mask>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "46": {
+      "content": "<|tool>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "47": {
+      "content": "<tool|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "48": {
+      "content": "<|tool_call>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "49": {
+      "content": "<tool_call|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "50": {
+      "content": "<|tool_response>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "51": {
+      "content": "<tool_response|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "52": {
+      "content": "<|\"|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "98": {
+      "content": "<|think|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "100": {
+      "content": "<|channel>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "101": {
+      "content": "<channel|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "105": {
+      "content": "<|turn>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "106": {
+      "content": "<turn|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "255999": {
+      "content": "<|image>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "256000": {
+      "content": "<|audio>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258880": {
+      "content": "<|image|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258881": {
+      "content": "<|audio|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258882": {
+      "content": "<image|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258883": {
+      "content": "<audio|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    },
+    "258884": {
+      "content": "<|video|>",
+      "single_word": false,
+      "lstrip": false,
+      "rstrip": false,
+      "normalized": false,
+      "special": true
+    }
+  }
+}
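The `response_schema` in `tokenizer_config.json` carries two regular expressions for tool calls. The sketch below applies them with Python's standard `re` module to a made-up output string. How the library itself applies these patterns is not shown in this diff, so the helper `extract_tool_calls` and the sample text are hypothetical, and the arguments are left as a raw string because the `gemma4-tool-call` parser is not part of this file.

```python
import re

# Patterns copied from response_schema above, with the JSON string escaping removed.
CALL_ITER = re.compile(r"<\|tool_call>(.*?)<tool_call\|>")
CALL_BODY = re.compile(r"call\:(?P<name>\w+)(?P<arguments>\{.*\})")


def extract_tool_calls(text: str) -> list[dict]:
    """Return the name and raw argument string of each tool call in text."""
    calls = []
    for raw in CALL_ITER.findall(text):
        match = CALL_BODY.search(raw)
        if match:
            calls.append(
                {"name": match.group("name"), "arguments": match.group("arguments")}
            )
    return calls


# Hypothetical model output containing one tool call.
sample = "<|tool_call>call:get_weather{days:3}<tool_call|>"
print(extract_tool_calls(sample))
```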

checkpoint-1100/trainer_state.json ADDED

	@@ -0,0 +1,1582 @@

+{
+  "best_global_step": null,
+  "best_metric": null,
+  "best_model_checkpoint": null,
+  "epoch": 0.20014556040756915,
+  "eval_steps": 100,
+  "global_step": 1100,
+  "is_hyper_param_search": false,
+  "is_local_process_zero": true,
+  "is_world_process_zero": true,
+  "log_history": [
+    {
+      "epoch": 0.0009097525473071324,
+      "grad_norm": 1.0602493286132812,
+      "learning_rate": 1.2121212121212122e-06,
+      "loss": 1.7156932830810547,
+      "step": 5
+    },
+    {
+      "epoch": 0.001819505094614265,
+      "grad_norm": 1.1577719449996948,
+      "learning_rate": 2.7272727272727272e-06,
+      "loss": 1.6629371643066406,
+      "step": 10
+    },
+    {
+      "epoch": 0.0027292576419213972,
+      "grad_norm": 1.0288419723510742,
+      "learning_rate": 4.242424242424243e-06,
+      "loss": 1.6706295013427734,
+      "step": 15
+    },
+    {
+      "epoch": 0.00363901018922853,
+      "grad_norm": 2.129403829574585,
+      "learning_rate": 5.7575757575757586e-06,
+      "loss": 1.7363752365112304,
+      "step": 20
+    },
+    {
+      "epoch": 0.004548762736535662,
+      "grad_norm": 1.9468326568603516,
+      "learning_rate": 7.272727272727272e-06,
+      "loss": 1.7111135482788087,
+      "step": 25
+    },
+    {
+      "epoch": 0.0054585152838427945,
+      "grad_norm": 1.1269357204437256,
+      "learning_rate": 8.787878787878788e-06,
+      "loss": 1.6924203872680663,
+      "step": 30
+    },
+    {
+      "epoch": 0.006368267831149927,
+      "grad_norm": 1.4021248817443848,
+      "learning_rate": 1.0303030303030304e-05,
+      "loss": 1.658310317993164,
+      "step": 35
+    },
+    {
+      "epoch": 0.00727802037845706,
+      "grad_norm": 1.313381314277649,
+      "learning_rate": 1.1818181818181819e-05,
+      "loss": 1.5383296012878418,
+      "step": 40
+    },
+    {
+      "epoch": 0.008187772925764192,
+      "grad_norm": 2.4359891414642334,
+      "learning_rate": 1.3333333333333333e-05,
+      "loss": 1.4302565574645996,
+      "step": 45
+    },
+    {
+      "epoch": 0.009097525473071324,
+      "grad_norm": 1.6459542512893677,
+      "learning_rate": 1.484848484848485e-05,
+      "loss": 1.2602953910827637,
+      "step": 50
+    },
+    {
+      "epoch": 0.010007278020378457,
+      "grad_norm": 0.7953159213066101,
+      "learning_rate": 1.6363636363636366e-05,
+      "loss": 1.204326343536377,
+      "step": 55
+    },
+    {
+      "epoch": 0.010917030567685589,
+      "grad_norm": 0.5824465155601501,
+      "learning_rate": 1.787878787878788e-05,
+      "loss": 1.068561840057373,
+      "step": 60
+    },
+    {
+      "epoch": 0.011826783114992722,
+      "grad_norm": 0.39265626668930054,
+      "learning_rate": 1.9393939393939395e-05,
+      "loss": 0.9570062637329102,
+      "step": 65
+    },
+    {
+      "epoch": 0.012736535662299854,
+      "grad_norm": 0.3387283384799957,
+      "learning_rate": 2.090909090909091e-05,
+      "loss": 0.9454713821411133,
+      "step": 70
+    },
+    {
+      "epoch": 0.013646288209606987,
+      "grad_norm": 0.3182811141014099,
+      "learning_rate": 2.2424242424242424e-05,
+      "loss": 0.8901592254638672,
+      "step": 75
+    },
+    {
+      "epoch": 0.01455604075691412,
+      "grad_norm": 0.2735312879085541,
+      "learning_rate": 2.393939393939394e-05,
+      "loss": 0.8491583824157715,
+      "step": 80
+    },
+    {
+      "epoch": 0.015465793304221253,
+      "grad_norm": 0.2376435250043869,
+      "learning_rate": 2.5454545454545454e-05,
+      "loss": 0.8109179496765136,
+      "step": 85
+    },
+    {
+      "epoch": 0.016375545851528384,
+      "grad_norm": 0.2161586880683899,
+      "learning_rate": 2.696969696969697e-05,
+      "loss": 0.76962308883667,
+      "step": 90
+    },
+    {
+      "epoch": 0.017285298398835518,
+      "grad_norm": 0.19587980210781097,
+      "learning_rate": 2.8484848484848486e-05,
+      "loss": 0.7301986694335938,
+      "step": 95
+    },
+    {
+      "epoch": 0.018195050946142648,
+      "grad_norm": 0.20971694588661194,
+      "learning_rate": 3e-05,
+      "loss": 0.7269618034362793,
+      "step": 100
+    },
+    {
+      "epoch": 0.018195050946142648,
+      "eval_loss": 2.605874538421631,
+      "eval_runtime": 1120.0905,
+      "eval_samples_per_second": 33.935,
+      "eval_steps_per_second": 8.484,
+      "step": 100
+    },
+    {
+      "epoch": 0.01910480349344978,
+      "grad_norm": 0.10413152724504471,
+      "learning_rate": 3.151515151515151e-05,
+      "loss": 0.3250573635101318,
+      "step": 105
+    },
+    {
+      "epoch": 0.020014556040756915,
+      "grad_norm": 0.09383206814527512,
+      "learning_rate": 3.303030303030303e-05,
+      "loss": 0.3277724742889404,
+      "step": 110
+    },
+    {
+      "epoch": 0.020924308588064048,
+      "grad_norm": 0.1195850670337677,
+      "learning_rate": 3.454545454545455e-05,
+      "loss": 0.3215961217880249,
+      "step": 115
+    },
+    {
+      "epoch": 0.021834061135371178,
+      "grad_norm": 0.0715397521853447,
+      "learning_rate": 3.606060606060606e-05,
+      "loss": 0.3120795965194702,
+      "step": 120
+    },
+    {
+      "epoch": 0.02274381368267831,
+      "grad_norm": 0.068007692694664,
+      "learning_rate": 3.757575757575758e-05,
+      "loss": 0.2964257955551147,
+      "step": 125
+    },
+    {
+      "epoch": 0.023653566229985445,
+      "grad_norm": 0.09345484524965286,
+      "learning_rate": 3.909090909090909e-05,
+      "loss": 0.30776252746582033,
+      "step": 130
+    },
+    {
+      "epoch": 0.024563318777292575,
+      "grad_norm": 0.05577846243977547,
+      "learning_rate": 4.0606060606060606e-05,
+      "loss": 0.3180255889892578,
+      "step": 135
+    },
+    {
+      "epoch": 0.025473071324599708,
+      "grad_norm": 0.05919989198446274,
+      "learning_rate": 4.212121212121212e-05,
+      "loss": 0.31608285903930666,
+      "step": 140
+    },
+    {
+      "epoch": 0.02638282387190684,
+      "grad_norm": 0.05644674599170685,
+      "learning_rate": 4.3636363636363636e-05,
+      "loss": 0.2993780136108398,
+      "step": 145
+    },
+    {
+      "epoch": 0.027292576419213975,
+      "grad_norm": 0.059986088424921036,
+      "learning_rate": 4.515151515151516e-05,
+      "loss": 0.2931638479232788,
+      "step": 150
+    },
+    {
+      "epoch": 0.028202328966521105,
+      "grad_norm": 0.05941484495997429,
+      "learning_rate": 4.666666666666667e-05,
+      "loss": 0.29284651279449464,
+      "step": 155
+    },
+    {
+      "epoch": 0.02911208151382824,
+      "grad_norm": 0.0579044483602047,
+      "learning_rate": 4.8181818181818186e-05,
+      "loss": 0.2927037000656128,
+      "step": 160
+    },
+    {
+      "epoch": 0.030021834061135372,
+      "grad_norm": 0.061985693871974945,
+      "learning_rate": 4.9696969696969694e-05,
+      "loss": 0.28671720027923586,
+      "step": 165
+    },
+    {
+      "epoch": 0.030931586608442505,
+      "grad_norm": 0.05715535953640938,
+      "learning_rate": 4.999993064772809e-05,
+      "loss": 0.2817929744720459,
+      "step": 170
+    },
+    {
+      "epoch": 0.03184133915574964,
+      "grad_norm": 0.06549780815839767,
+      "learning_rate": 4.999964890478288e-05,
+      "loss": 0.27853829860687257,
+      "step": 175
+    },
+    {
+      "epoch": 0.03275109170305677,
+      "grad_norm": 0.05948757752776146,
+      "learning_rate": 4.999915043908795e-05,
+      "loss": 0.27522289752960205,
+      "step": 180
+    },
+    {
+      "epoch": 0.0336608442503639,
+      "grad_norm": 0.06262889504432678,
+      "learning_rate": 4.9998435254964515e-05,
+      "loss": 0.270997428894043,
+      "step": 185
+    },
+    {
+      "epoch": 0.034570596797671035,
+      "grad_norm": 0.06916829943656921,
+      "learning_rate": 4.999750335861253e-05,
+      "loss": 0.2788438558578491,
+      "step": 190
+    },
+    {
+      "epoch": 0.035480349344978165,
+      "grad_norm": 0.06128217652440071,
+      "learning_rate": 4.9996354758110624e-05,
+      "loss": 0.25649352073669435,
+      "step": 195
+    },
+    {
+      "epoch": 0.036390101892285295,
+      "grad_norm": 0.06704027950763702,
+      "learning_rate": 4.999498946341606e-05,
+      "loss": 0.25619523525238036,
+      "step": 200
+    },
+    {
+      "epoch": 0.03729985443959243,
+      "grad_norm": 0.061678580939769745,
+      "learning_rate": 4.999340748636462e-05,
+      "loss": 0.24956226348876953,
+      "step": 205
+    },
+    {
+      "epoch": 0.03820960698689956,
+      "grad_norm": 0.07328873127698898,
+      "learning_rate": 4.999160884067051e-05,
+      "loss": 0.26169676780700685,
+      "step": 210
+    },
+    {
+      "epoch": 0.0391193595342067,
+      "grad_norm": 0.08287990838289261,
+      "learning_rate": 4.9989593541926246e-05,
+      "loss": 0.2574604034423828,
+      "step": 215
+    },
+    {
+      "epoch": 0.04002911208151383,
+      "grad_norm": 0.06787359714508057,
+      "learning_rate": 4.9987361607602525e-05,
+      "loss": 0.25351409912109374,
+      "step": 220
+    },
+    {
+      "epoch": 0.04093886462882096,
+      "grad_norm": 0.06695502996444702,
+      "learning_rate": 4.998491305704805e-05,
+      "loss": 0.24522039890289307,
+      "step": 225
+    },
+    {
+      "epoch": 0.041848617176128096,
+      "grad_norm": 0.08872214704751968,
+      "learning_rate": 4.9982247911489375e-05,
+      "loss": 0.2581867933273315,
+      "step": 230
+    },
+    {
+      "epoch": 0.042758369723435226,
+      "grad_norm": 0.07637131959199905,
+      "learning_rate": 4.9979366194030743e-05,
+      "loss": 0.25569658279418944,
+      "step": 235
+    },
+    {
+      "epoch": 0.043668122270742356,
+      "grad_norm": 0.08158119022846222,
+      "learning_rate": 4.997626792965385e-05,
+      "loss": 0.2529409646987915,
+      "step": 240
+    },
+    {
+      "epoch": 0.04457787481804949,
+      "grad_norm": 0.07529161125421524,
+      "learning_rate": 4.997295314521766e-05,
+      "loss": 0.24049024581909179,
+      "step": 245
+    },
+    {
+      "epoch": 0.04548762736535662,
+      "grad_norm": 0.08860139548778534,
+      "learning_rate": 4.996942186945813e-05,
+      "loss": 0.2490522861480713,
+      "step": 250
+    },
+    {
+      "epoch": 0.04639737991266375,
+      "grad_norm": 0.0850321501493454,
+      "learning_rate": 4.9965674132988005e-05,
+      "loss": 0.24180831909179687,
+      "step": 255
+    },
+    {
+      "epoch": 0.04730713245997089,
+      "grad_norm": 0.07556115090847015,
+      "learning_rate": 4.996170996829653e-05,
+      "loss": 0.2509631872177124,
+      "step": 260
+    },
+    {
+      "epoch": 0.04821688500727802,
+      "grad_norm": 0.07971206307411194,
+      "learning_rate": 4.995752940974918e-05,
+      "loss": 0.24398891925811766,
+      "step": 265
+    },
+    {
+      "epoch": 0.04912663755458515,
+      "grad_norm": 0.09149336814880371,
+      "learning_rate": 4.9953132493587344e-05,
+      "loss": 0.2300492286682129,
+      "step": 270
+    },
+    {
+      "epoch": 0.050036390101892286,
+      "grad_norm": 0.08265820890665054,
+      "learning_rate": 4.9948519257928034e-05,
+      "loss": 0.24246792793273925,
+      "step": 275
+    },
+    {
+      "epoch": 0.050946142649199416,
+      "grad_norm": 0.10328587144613266,
+      "learning_rate": 4.9943689742763534e-05,
+      "loss": 0.2367171049118042,
+      "step": 280
+    },
+    {
+      "epoch": 0.05185589519650655,
+      "grad_norm": 0.0836917981505394,
+      "learning_rate": 4.993864398996105e-05,
+      "loss": 0.23215813636779786,
+      "step": 285
+    },
+    {
+      "epoch": 0.05276564774381368,
+      "grad_norm": 0.09475161135196686,
+      "learning_rate": 4.99333820432624e-05,
+      "loss": 0.2350748062133789,
+      "step": 290
+    },
+    {
+      "epoch": 0.05367540029112081,
+      "grad_norm": 0.08040128648281097,
+      "learning_rate": 4.992790394828355e-05,
+      "loss": 0.23253886699676513,
+      "step": 295
+    },
+    {
+      "epoch": 0.05458515283842795,
+      "grad_norm": 0.08852150291204453,
+      "learning_rate": 4.992220975251428e-05,
+      "loss": 0.23856515884399415,
+      "step": 300
+    },
+    {
+      "epoch": 0.05549490538573508,
+      "grad_norm": 0.09565229713916779,
+      "learning_rate": 4.991629950531775e-05,
+      "loss": 0.23311660289764405,
+      "step": 305
+    },
+    {
+      "epoch": 0.05640465793304221,
+      "grad_norm": 0.08158160001039505,
+      "learning_rate": 4.991017325793009e-05,
+      "loss": 0.22467944622039795,
+      "step": 310
+    },
+    {
+      "epoch": 0.05731441048034935,
+      "grad_norm": 0.07746429741382599,
+      "learning_rate": 4.990383106345994e-05,
+      "loss": 0.229844069480896,
+      "step": 315
+    },
+    {
+      "epoch": 0.05822416302765648,
+      "grad_norm": 0.08564355969429016,
+      "learning_rate": 4.989727297688797e-05,
+      "loss": 0.22414517402648926,
+      "step": 320
+    },
+    {
+      "epoch": 0.05913391557496361,
+      "grad_norm": 0.07517435401678085,
+      "learning_rate": 4.9890499055066435e-05,
+      "loss": 0.2236532211303711,
+      "step": 325
+    },
+    {
+      "epoch": 0.060043668122270744,
+      "grad_norm": 0.111734539270401,
+      "learning_rate": 4.988350935671869e-05,
+      "loss": 0.21474847793579102,
+      "step": 330
+    },
+    {
+      "epoch": 0.060953420669577874,
+      "grad_norm": 0.09906989336013794,
+      "learning_rate": 4.987630394243866e-05,
+      "loss": 0.23321933746337892,
+      "step": 335
+    },
+    {
+      "epoch": 0.06186317321688501,
+      "grad_norm": 0.10131457448005676,
+      "learning_rate": 4.98688828746903e-05,
+      "loss": 0.2310662031173706,
+      "step": 340
+    },
+    {
+      "epoch": 0.06277292576419213,
+      "grad_norm": 0.09203507006168365,
+      "learning_rate": 4.986124621780708e-05,
+      "loss": 0.22021169662475587,
+      "step": 345
+    },
+    {
+      "epoch": 0.06368267831149928,
+      "grad_norm": 0.09505912661552429,
+      "learning_rate": 4.9853394037991416e-05,
+      "loss": 0.2197155237197876,
+      "step": 350
+    },
+    {
+      "epoch": 0.06459243085880641,
+      "grad_norm": 0.09038657695055008,
+      "learning_rate": 4.984532640331412e-05,
+      "loss": 0.22066287994384765,
+      "step": 355
+    },
+    {
+      "epoch": 0.06550218340611354,
+      "grad_norm": 0.09707064181566238,
+      "learning_rate": 4.9837043383713753e-05,
+      "loss": 0.22455451488494874,
+      "step": 360
+    },
+    {
+      "epoch": 0.06641193595342067,
+      "grad_norm": 0.10367228090763092,
+      "learning_rate": 4.98285450509961e-05,
+      "loss": 0.21993820667266845,
+      "step": 365
+    },
+    {
+      "epoch": 0.0673216885007278,
+      "grad_norm": 0.12229471653699875,
+      "learning_rate": 4.9819831478833456e-05,
+      "loss": 0.2168867588043213,
+      "step": 370
+    },
+    {
+      "epoch": 0.06823144104803494,
+      "grad_norm": 0.0964592918753624,
+      "learning_rate": 4.981090274276406e-05,
+      "loss": 0.21579203605651856,
+      "step": 375
+    },
+    {
+      "epoch": 0.06914119359534207,
+      "grad_norm": 0.09400496631860733,
+      "learning_rate": 4.980175892019141e-05,
+      "loss": 0.20972180366516113,
+      "step": 380
+    },
+    {
+      "epoch": 0.0700509461426492,
+      "grad_norm": 0.08158645778894424,
+      "learning_rate": 4.9792400090383594e-05,
+      "loss": 0.22148358821868896,
+      "step": 385
+    },
+    {
+      "epoch": 0.07096069868995633,
+      "grad_norm": 0.10916394740343094,
+      "learning_rate": 4.978282633447261e-05,
+      "loss": 0.2214418649673462,
+      "step": 390
+    },
+    {
+      "epoch": 0.07187045123726346,
+      "grad_norm": 0.11138810962438583,
+      "learning_rate": 4.9773037735453636e-05,
+      "loss": 0.21814754009246826,
+      "step": 395
+    },
+    {
+      "epoch": 0.07278020378457059,
+      "grad_norm": 0.10914396494626999,
+      "learning_rate": 4.9763034378184365e-05,
+      "loss": 0.21310818195343018,
+      "step": 400
+    },
+    {
+      "epoch": 0.07368995633187773,
+      "grad_norm": 0.1043366864323616,
+      "learning_rate": 4.975281634938421e-05,
+      "loss": 0.21266789436340333,
+      "step": 405
+    },
+    {
+      "epoch": 0.07459970887918486,
+      "grad_norm": 0.1036868542432785,
+      "learning_rate": 4.9742383737633594e-05,
+      "loss": 0.21606721878051757,
+      "step": 410
+    },
+    {
+      "epoch": 0.075509461426492,
+      "grad_norm": 0.11640442907810211,
+      "learning_rate": 4.9731736633373144e-05,
+      "loss": 0.21532948017120362,
+      "step": 415
+    },
+    {
+      "epoch": 0.07641921397379912,
+      "grad_norm": 0.11219926178455353,
+      "learning_rate": 4.9720875128902956e-05,
+      "loss": 0.2191627025604248,
+      "step": 420
+    },
+    {
+      "epoch": 0.07732896652110625,
+      "grad_norm": 0.12103637307882309,
+      "learning_rate": 4.970979931838176e-05,
+      "loss": 0.20938868522644044,
+      "step": 425
+    },
+    {
+      "epoch": 0.0782387190684134,
+      "grad_norm": 0.13274189829826355,
+      "learning_rate": 4.96985092978261e-05,
+      "loss": 0.21792960166931152,
+      "step": 430
+    },
+    {
+      "epoch": 0.07914847161572053,
+      "grad_norm": 0.11164513230323792,
+      "learning_rate": 4.968700516510954e-05,
+      "loss": 0.2022618055343628,
+      "step": 435
+    },
+    {
+      "epoch": 0.08005822416302766,
+      "grad_norm": 0.09532847255468369,
+      "learning_rate": 4.967528701996174e-05,
+      "loss": 0.21255812644958497,
+      "step": 440
+    },
+    {
+      "epoch": 0.08096797671033479,
+      "grad_norm": 0.10279258340597153,
+      "learning_rate": 4.96633549639677e-05,
+      "loss": 0.20683050155639648,
+      "step": 445
+    },
+    {
+      "epoch": 0.08187772925764192,
+      "grad_norm": 0.1257462352514267,
+      "learning_rate": 4.965120910056677e-05,
+      "loss": 0.21419920921325683,
+      "step": 450
+    },
+    {
+      "epoch": 0.08278748180494905,
+      "grad_norm": 0.11663137376308441,
+      "learning_rate": 4.963884953505186e-05,
+      "loss": 0.2072287082672119,
+      "step": 455
+    },
+    {
+      "epoch": 0.08369723435225619,
+      "grad_norm": 0.10488224029541016,
+      "learning_rate": 4.96262763745684e-05,
+      "loss": 0.1982678532600403,
+      "step": 460
+    },
+    {
+      "epoch": 0.08460698689956332,
+      "grad_norm": 0.11801692098379135,
+      "learning_rate": 4.961348972811354e-05,
+      "loss": 0.20662031173706055,
+      "step": 465
+    },
+    {
+      "epoch": 0.08551673944687045,
+      "grad_norm": 0.11318827420473099,
+      "learning_rate": 4.96004897065351e-05,
+      "loss": 0.20947303771972656,
+      "step": 470
+    },
+    {
+      "epoch": 0.08642649199417758,
+      "grad_norm": 0.13409486413002014,
+      "learning_rate": 4.95872764225307e-05,
+      "loss": 0.19670876264572143,
+      "step": 475
+    },
+    {
+      "epoch": 0.08733624454148471,
+      "grad_norm": 0.14440792798995972,
+      "learning_rate": 4.957384999064672e-05,
+      "loss": 0.19842848777770997,
+      "step": 480
+    },
+    {
+      "epoch": 0.08824599708879186,
+      "grad_norm": 0.12246996909379959,
+      "learning_rate": 4.956021052727731e-05,
+      "loss": 0.20318071842193602,
+      "step": 485
+    },
+    {
+      "epoch": 0.08915574963609899,
+      "grad_norm": 0.13437233865261078,
+      "learning_rate": 4.954635815066342e-05,
+      "loss": 0.21675212383270265,
+      "step": 490
+    },
+    {
+      "epoch": 0.09006550218340612,
+      "grad_norm": 0.11109672486782074,
+      "learning_rate": 4.9532292980891744e-05,
+      "loss": 0.2100757837295532,
+      "step": 495
+    },
+    {
+      "epoch": 0.09097525473071325,
+      "grad_norm": 0.1388893872499466,
+      "learning_rate": 4.9518015139893675e-05,
+      "loss": 0.20303285121917725,
+      "step": 500
+    },
+    {
+      "epoch": 0.09188500727802038,
+      "grad_norm": 0.13239721953868866,
+      "learning_rate": 4.950352475144427e-05,
+      "loss": 0.2152268409729004,
+      "step": 505
+    },
+    {
+      "epoch": 0.0927947598253275,
+      "grad_norm": 0.12834979593753815,
+      "learning_rate": 4.948882194116119e-05,
+      "loss": 0.20799248218536376,
+      "step": 510
+    },
+    {
+      "epoch": 0.09370451237263465,
+      "grad_norm": 0.11886704713106155,
+      "learning_rate": 4.947390683650354e-05,
+      "loss": 0.20394976139068605,
+      "step": 515
+    },
+    {
+      "epoch": 0.09461426491994178,
+      "grad_norm": 0.11398876458406448,
+      "learning_rate": 4.945877956677083e-05,
+      "loss": 0.2091092586517334,
+      "step": 520
+    },
+    {
+      "epoch": 0.09552401746724891,
+      "grad_norm": 0.1422540694475174,
+      "learning_rate": 4.944344026310186e-05,
+      "loss": 0.19564238786697388,
+      "step": 525
+    },
+    {
+      "epoch": 0.09643377001455604,
+      "grad_norm": 0.11359584331512451,
+      "learning_rate": 4.9427889058473535e-05,
+      "loss": 0.20493624210357667,
+      "step": 530
+    },
+    {
+      "epoch": 0.09734352256186317,
+      "grad_norm": 0.11703553050756454,
+      "learning_rate": 4.941212608769974e-05,
+      "loss": 0.2098615884780884,
+      "step": 535
+    },
+    {
+      "epoch": 0.0982532751091703,
+      "grad_norm": 0.14552047848701477,
+      "learning_rate": 4.939615148743017e-05,
+      "loss": 0.20382182598114013,
+      "step": 540
+    },
+    {
+      "epoch": 0.09916302765647744,
+      "grad_norm": 0.13178016245365143,
+      "learning_rate": 4.937996539614914e-05,
+      "loss": 0.19901862144470214,
+      "step": 545
+    },
+    {
+      "epoch": 0.10007278020378457,
+      "grad_norm": 0.635392427444458,
+      "learning_rate": 4.936356795417439e-05,
+      "loss": 0.20694944858551026,
+      "step": 550
+    },
+    {
+      "epoch": 0.1009825327510917,
+      "grad_norm": 0.15019077062606812,
+      "learning_rate": 4.934695930365586e-05,
+      "loss": 0.19313746690750122,
+      "step": 555
+    },
+    {
+      "epoch": 0.10189228529839883,
+      "grad_norm": 0.12941956520080566,
+      "learning_rate": 4.9330139588574474e-05,
+      "loss": 0.19671722650527954,
+      "step": 560
+    },
+    {
+      "epoch": 0.10280203784570596,
+      "grad_norm": 0.13818831741809845,
+      "learning_rate": 4.931310895474088e-05,
+      "loss": 0.20026786327362062,
+      "step": 565
+    },
+    {
+      "epoch": 0.1037117903930131,
+      "grad_norm": 0.12011194974184036,
+      "learning_rate": 4.929586754979417e-05,
+      "loss": 0.1932437539100647,
+      "step": 570
+    },
+    {
+      "epoch": 0.10462154294032024,
+      "grad_norm": 0.1345364898443222,
+      "learning_rate": 4.9278415523200644e-05,
+      "loss": 0.20245940685272218,
+      "step": 575
+    },
+    {
+      "epoch": 0.10553129548762737,
+      "grad_norm": 0.13281017541885376,
+      "learning_rate": 4.926075302625247e-05,
+      "loss": 0.19864981174468993,
+      "step": 580
+    },
+    {
+      "epoch": 0.1064410480349345,
+      "grad_norm": 0.13465586304664612,
+      "learning_rate": 4.924288021206639e-05,
+      "loss": 0.19573183059692384,
+      "step": 585
+    },
+    {
+      "epoch": 0.10735080058224163,
+      "grad_norm": 0.15225961804389954,
+      "learning_rate": 4.9224797235582396e-05,
+      "loss": 0.19946801662445068,
+      "step": 590
+    },
+    {
+      "epoch": 0.10826055312954876,
+      "grad_norm": 0.12816746532917023,
+      "learning_rate": 4.92065042535624e-05,
+      "loss": 0.19851526021957397,
+      "step": 595
+    },
+    {
+      "epoch": 0.1091703056768559,
+      "grad_norm": 0.13802853226661682,
+      "learning_rate": 4.9188001424588824e-05,
+      "loss": 0.19321763515472412,
+      "step": 600
+    },
+    {
+      "epoch": 0.11008005822416303,
+      "grad_norm": 0.17504797875881195,
+      "learning_rate": 4.9169288909063295e-05,
+      "loss": 0.2032616138458252,
+      "step": 605
+    },
+    {
+      "epoch": 0.11098981077147016,
+      "grad_norm": 0.13544194400310516,
+      "learning_rate": 4.91503668692052e-05,
+      "loss": 0.2011256456375122,
+      "step": 610
+    },
+    {
+      "epoch": 0.11189956331877729,
+      "grad_norm": 1.3976134061813354,
+      "learning_rate": 4.91312354690503e-05,
+      "loss": 0.19916868209838867,
+      "step": 615
+    },
+    {
+      "epoch": 0.11280931586608442,
+      "grad_norm": 0.1465059071779251,
+      "learning_rate": 4.91118948744493e-05,
+      "loss": 0.19487457275390624,
+      "step": 620
+    },
+    {
+      "epoch": 0.11371906841339156,
+      "grad_norm": 0.12103168666362762,
+      "learning_rate": 4.909234525306645e-05,
+      "loss": 0.1907251238822937,
+      "step": 625
+    },
+    {
+      "epoch": 0.1146288209606987,
+      "grad_norm": 0.12660574913024902,
+      "learning_rate": 4.907258677437802e-05,
+      "loss": 0.19327253103256226,
+      "step": 630
+    },
+    {
+      "epoch": 0.11553857350800582,
+      "grad_norm": 0.1347813606262207,
+      "learning_rate": 4.90526196096709e-05,
+      "loss": 0.19637736082077026,
+      "step": 635
+    },
+    {
+      "epoch": 0.11644832605531295,
+      "grad_norm": 0.14953652024269104,
+      "learning_rate": 4.903244393204107e-05,
+      "loss": 0.20325069427490233,
+      "step": 640
+    },
+    {
+      "epoch": 0.11735807860262008,
+      "grad_norm": 0.13936272263526917,
+      "learning_rate": 4.901205991639213e-05,
+      "loss": 0.1930275321006775,
+      "step": 645
+    },
+    {
+      "epoch": 0.11826783114992721,
+      "grad_norm": 0.1448420137166977,
+      "learning_rate": 4.899146773943374e-05,
+      "loss": 0.20026936531066894,
+      "step": 650
+    },
+    {
+      "epoch": 0.11917758369723436,
+      "grad_norm": 0.1312534064054489,
+      "learning_rate": 4.897066757968014e-05,
+      "loss": 0.19062033891677857,
+      "step": 655
+    },
+    {
+      "epoch": 0.12008733624454149,
+      "grad_norm": 0.13644742965698242,
+      "learning_rate": 4.894965961744859e-05,
+      "loss": 0.18719595670700073,
+      "step": 660
+    },
+    {
+      "epoch": 0.12099708879184862,
+      "grad_norm": 0.14276087284088135,
+      "learning_rate": 4.892844403485777e-05,
+      "loss": 0.19784307479858398,
+      "step": 665
+    },
+    {
+      "epoch": 0.12190684133915575,
+      "grad_norm": 0.14735399186611176,
+      "learning_rate": 4.890702101582623e-05,
+      "loss": 0.19163782596588136,
+      "step": 670
+    },
+    {
+      "epoch": 0.12281659388646288,
+      "grad_norm": 0.15742065012454987,
+      "learning_rate": 4.888539074607082e-05,
+      "loss": 0.19312986135482788,
+      "step": 675
+    },
+    {
+      "epoch": 0.12372634643377002,
+      "grad_norm": 0.12917031347751617,
+      "learning_rate": 4.8863553413105025e-05,
+      "loss": 0.20066320896148682,
+      "step": 680
+    },
+    {
+      "epoch": 0.12463609898107715,
+      "grad_norm": 0.1484801322221756,
+      "learning_rate": 4.884150920623737e-05,
+      "loss": 0.20096096992492676,
+      "step": 685
+    },
+    {
+      "epoch": 0.12554585152838427,
+      "grad_norm": 0.1455296128988266,
+      "learning_rate": 4.88192583165698e-05,
+      "loss": 0.20518505573272705,
+      "step": 690
+    },
+    {
+      "epoch": 0.12645560407569142,
+      "grad_norm": 0.14517490565776825,
+      "learning_rate": 4.879680093699598e-05,
+      "loss": 0.18859238624572755,
+      "step": 695
+    },
+    {
+      "epoch": 0.12736535662299855,
+      "grad_norm": 0.18778090178966522,
+      "learning_rate": 4.877413726219964e-05,
+      "loss": 0.197074818611145,
+      "step": 700
+    },
+    {
+      "epoch": 0.12827510917030568,
+      "grad_norm": 0.13497677445411682,
+      "learning_rate": 4.87512674886529e-05,
+      "loss": 0.18713107109069824,
+      "step": 705
+    },
+    {
+      "epoch": 0.12918486171761281,
+      "grad_norm": 0.12657155096530914,
+      "learning_rate": 4.872819181461455e-05,
+      "loss": 0.1858484387397766,
+      "step": 710
+    },
+    {
+      "epoch": 0.13009461426491994,
+      "grad_norm": 0.11458148807287216,
+      "learning_rate": 4.870491044012834e-05,
+      "loss": 0.18732179403305055,
+      "step": 715
+    },
+    {
+      "epoch": 0.13100436681222707,
+      "grad_norm": 0.13000249862670898,
+      "learning_rate": 4.8681423567021244e-05,
+      "loss": 0.1872936010360718,
+      "step": 720
+    },
+    {
+      "epoch": 0.1319141193595342,
+      "grad_norm": 0.14580890536308289,
+      "learning_rate": 4.865773139890172e-05,
+      "loss": 0.19280019998550416,
+      "step": 725
+    },
+    {
+      "epoch": 0.13282387190684133,
+      "grad_norm": 0.1507277935743332,
+      "learning_rate": 4.8633834141157913e-05,
+      "loss": 0.1898929238319397,
+      "step": 730
+    },
+    {
+      "epoch": 0.13373362445414846,
+      "grad_norm": 0.1418737769126892,
+      "learning_rate": 4.860973200095592e-05,
+      "loss": 0.17926375865936278,
+      "step": 735
+    },
+    {
+      "epoch": 0.1346433770014556,
+      "grad_norm": 0.17151866853237152,
+      "learning_rate": 4.858542518723794e-05,
+      "loss": 0.18963592052459716,
+      "step": 740
+    },
+    {
+      "epoch": 0.13555312954876272,
+      "grad_norm": 0.11162743717432022,
+      "learning_rate": 4.8560913910720535e-05,
+      "loss": 0.19466646909713745,
+      "step": 745
+    },
+    {
+      "epoch": 0.13646288209606988,
+      "grad_norm": 0.15628376603126526,
+      "learning_rate": 4.8536198383892725e-05,
+      "loss": 0.19494034051895143,
+      "step": 750
+    },
+    {
+      "epoch": 0.137372634643377,
+      "grad_norm": 0.18209289014339447,
+      "learning_rate": 4.851127882101421e-05,
+      "loss": 0.18747550249099731,
+      "step": 755
+    },
+    {
+      "epoch": 0.13828238719068414,
+      "grad_norm": 0.14559614658355713,
+      "learning_rate": 4.8486155438113454e-05,
+      "loss": 0.1897158980369568,
+      "step": 760
+    },
+    {
+      "epoch": 0.13919213973799127,
+      "grad_norm": 0.3198587894439697,
+      "learning_rate": 4.846082845298586e-05,
+      "loss": 0.18571001291275024,
+      "step": 765
+    },
+    {
+      "epoch": 0.1401018922852984,
+      "grad_norm": 0.1486678421497345,
+      "learning_rate": 4.843529808519189e-05,
+      "loss": 0.19561930894851684,
+      "step": 770
+    },
+    {
+      "epoch": 0.14101164483260553,
+      "grad_norm": 0.15318170189857483,
+      "learning_rate": 4.840956455605509e-05,
+      "loss": 0.187040114402771,
+      "step": 775
+    },
+    {
+      "epoch": 0.14192139737991266,
+      "grad_norm": 0.13754244148731232,
+      "learning_rate": 4.838362808866025e-05,
+      "loss": 0.18345539569854735,
+      "step": 780
+    },
+    {
+      "epoch": 0.1428311499272198,
+      "grad_norm": 0.12943248450756073,
+      "learning_rate": 4.835748890785143e-05,
+      "loss": 0.1921079397201538,
+      "step": 785
+    },
+    {
+      "epoch": 0.14374090247452692,
+      "grad_norm": 0.110458143055439,
+      "learning_rate": 4.833114724023001e-05,
+      "loss": 0.17927205562591553,
+      "step": 790
+    },
+    {
+      "epoch": 0.14465065502183405,
+      "grad_norm": 0.2421770840883255,
+      "learning_rate": 4.830460331415275e-05,
+      "loss": 0.18317567110061644,
+      "step": 795
+    },
+    {
+      "epoch": 0.14556040756914118,
+      "grad_norm": 0.14752762019634247,
+      "learning_rate": 4.8277857359729787e-05,
+      "loss": 0.1843916058540344,
+      "step": 800
+    },
+    {
+      "epoch": 0.14647016011644834,
+      "grad_norm": 0.15043556690216064,
+      "learning_rate": 4.8250909608822644e-05,
+      "loss": 0.18354393243789674,
+      "step": 805
+    },
+    {
+      "epoch": 0.14737991266375547,
+      "grad_norm": 0.1381794661283493,
+      "learning_rate": 4.822376029504223e-05,
+      "loss": 0.1789781332015991,
+      "step": 810
+    },
+    {
+      "epoch": 0.1482896652110626,
+      "grad_norm": 0.18386174738407135,
+      "learning_rate": 4.819640965374681e-05,
+      "loss": 0.19494292736053467,
+      "step": 815
+    },
+    {
+      "epoch": 0.14919941775836973,
+      "grad_norm": 0.13829593360424042,
+      "learning_rate": 4.816885792203996e-05,
+      "loss": 0.18486063480377196,
+      "step": 820
+    },
+    {
+      "epoch": 0.15010917030567686,
+      "grad_norm": 0.15033291280269623,
+      "learning_rate": 4.814110533876852e-05,
+      "loss": 0.18061509132385253,
+      "step": 825
+    },
+    {
+      "epoch": 0.151018922852984,
+      "grad_norm": 0.17150473594665527,
+      "learning_rate": 4.811315214452051e-05,
+      "loss": 0.18464866876602173,
+      "step": 830
+    },
+    {
+      "epoch": 0.15192867540029112,
+      "grad_norm": 0.15317125618457794,
+      "learning_rate": 4.808499858162307e-05,
+      "loss": 0.1837708592414856,
+      "step": 835
+    },
+    {
+      "epoch": 0.15283842794759825,
+      "grad_norm": 0.2671392560005188,
+      "learning_rate": 4.805664489414031e-05,
+      "loss": 0.19338636398315429,
+      "step": 840
+    },
+    {
+      "epoch": 0.15374818049490538,
+      "grad_norm": 0.14047028124332428,
+      "learning_rate": 4.802809132787125e-05,
+      "loss": 0.17069108486175538,
+      "step": 845
+    },
+    {
+      "epoch": 0.1546579330422125,
+      "grad_norm": 0.1520431935787201,
+      "learning_rate": 4.799933813034768e-05,
+      "loss": 0.18607735633850098,
+      "step": 850
+    },
+    {
+      "epoch": 0.15556768558951964,
+      "grad_norm": 0.17239463329315186,
+      "learning_rate": 4.797038555083197e-05,
+      "loss": 0.18069062232971192,
+      "step": 855
+    },
+    {
+      "epoch": 0.1564774381368268,
+      "grad_norm": 0.1377955675125122,
+      "learning_rate": 4.794123384031495e-05,
+      "loss": 0.18870222568511963,
+      "step": 860
+    },
+    {
+      "epoch": 0.15738719068413393,
+      "grad_norm": 0.15901461243629456,
+      "learning_rate": 4.791188325151373e-05,
+      "loss": 0.18128334283828734,
+      "step": 865
+    },
+    {
+      "epoch": 0.15829694323144106,
+      "grad_norm": 0.14634132385253906,
+      "learning_rate": 4.7882334038869495e-05,
+      "loss": 0.1866163969039917,
+      "step": 870
+    },
+    {
+      "epoch": 0.1592066957787482,
+      "grad_norm": 0.15361061692237854,
+      "learning_rate": 4.785258645854529e-05,
+      "loss": 0.17850807905197144,
+      "step": 875
+    },
+    {
+      "epoch": 0.16011644832605532,
+      "grad_norm": 0.13751649856567383,
+      "learning_rate": 4.782264076842385e-05,
+      "loss": 0.17731113433837892,
+      "step": 880
+    },
+    {
+      "epoch": 0.16102620087336245,
+      "grad_norm": 0.17909638583660126,
+      "learning_rate": 4.7792497228105314e-05,
+      "loss": 0.18344542980194092,
+      "step": 885
+    },
+    {
+      "epoch": 0.16193595342066958,
+      "grad_norm": 0.16038304567337036,
+      "learning_rate": 4.776215609890498e-05,
+      "loss": 0.18868647813796996,
+      "step": 890
+    },
+    {
+      "epoch": 0.1628457059679767,
+      "grad_norm": 0.1653951108455658,
+      "learning_rate": 4.773161764385107e-05,
+      "loss": 0.18614152669906617,
+      "step": 895
+    },
+    {
+      "epoch": 0.16375545851528384,
+      "grad_norm": 0.16193026304244995,
+      "learning_rate": 4.770088212768241e-05,
+      "loss": 0.18564575910568237,
+      "step": 900
+    },
+    {
+      "epoch": 0.16466521106259097,
+      "grad_norm": 0.16048531234264374,
+      "learning_rate": 4.7669949816846173e-05,
+      "loss": 0.18330031633377075,
+      "step": 905
+    },
+    {
+      "epoch": 0.1655749636098981,
+      "grad_norm": 0.1440177708864212,
+      "learning_rate": 4.7638820979495534e-05,
+      "loss": 0.17712442874908446,
+      "step": 910
+    },
+    {
+      "epoch": 0.16648471615720525,
+      "grad_norm": 0.19635969400405884,
+      "learning_rate": 4.760749588548738e-05,
+      "loss": 0.18679027557373046,
+      "step": 915
+    },
+    {
+      "epoch": 0.16739446870451238,
+      "grad_norm": 0.15576541423797607,
+      "learning_rate": 4.757597480637995e-05,
+      "loss": 0.19283764362335204,
+      "step": 920
+    },
+    {
+      "epoch": 0.1683042212518195,
+      "grad_norm": 0.1550331562757492,
+      "learning_rate": 4.7544258015430463e-05,
+      "loss": 0.18269542455673218,
+      "step": 925
+    },
+    {
+      "epoch": 0.16921397379912664,
+      "grad_norm": 0.18369626998901367,
+      "learning_rate": 4.75123457875928e-05,
+      "loss": 0.1697891116142273,
+      "step": 930
+    },
+    {
+      "epoch": 0.17012372634643377,
+      "grad_norm": 0.15266314148902893,
+      "learning_rate": 4.7480238399515074e-05,
+      "loss": 0.18523451089859008,
+      "step": 935
+    },
+    {
+      "epoch": 0.1710334788937409,
+      "grad_norm": 0.16709664463996887,
+      "learning_rate": 4.744793612953724e-05,
+      "loss": 0.1803238034248352,
+      "step": 940
+    },
+    {
+      "epoch": 0.17194323144104803,
+      "grad_norm": 0.14929179847240448,
+      "learning_rate": 4.741543925768872e-05,
+      "loss": 0.1861217737197876,
+      "step": 945
+    },
+    {
+      "epoch": 0.17285298398835516,
+      "grad_norm": 0.1362280696630478,
+      "learning_rate": 4.7382748065685915e-05,
+      "loss": 0.17896100282669067,
+      "step": 950
+    },
+    {
+      "epoch": 0.1737627365356623,
+      "grad_norm": 0.15290239453315735,
+      "learning_rate": 4.734986283692982e-05,
+      "loss": 0.18432788848876952,
+      "step": 955
+    },
+    {
+      "epoch": 0.17467248908296942,
+      "grad_norm": 0.1287035197019577,
+      "learning_rate": 4.73167838565035e-05,
+      "loss": 0.18485682010650634,
+      "step": 960
+    },
+    {
+      "epoch": 0.17558224163027655,
+      "grad_norm": 0.17969627678394318,
+      "learning_rate": 4.728351141116971e-05,
+      "loss": 0.17361557483673096,
+      "step": 965
+    },
+    {
+      "epoch": 0.1764919941775837,
+      "grad_norm": 0.13751201331615448,
+      "learning_rate": 4.7250045789368326e-05,
+      "loss": 0.1731679320335388,
+      "step": 970
+    },
+    {
+      "epoch": 0.17740174672489084,
+      "grad_norm": 0.1603265255689621,
+      "learning_rate": 4.721638728121388e-05,
+      "loss": 0.17308170795440675,
+      "step": 975
+    },
+    {
+      "epoch": 0.17831149927219797,
+      "grad_norm": 0.1592789888381958,
+      "learning_rate": 4.718253617849306e-05,
+      "loss": 0.17534757852554322,
+      "step": 980
+    },
+    {
+      "epoch": 0.1792212518195051,
+      "grad_norm": 0.12727224826812744,
+      "learning_rate": 4.714849277466214e-05,
+      "loss": 0.17817609310150145,
+      "step": 985
+    },
+    {
+      "epoch": 0.18013100436681223,
+      "grad_norm": 0.15401554107666016,
+      "learning_rate": 4.711425736484447e-05,
+      "loss": 0.1733405351638794,
+      "step": 990
+    },
+    {
+      "epoch": 0.18104075691411936,
+      "grad_norm": 0.13253968954086304,
+      "learning_rate": 4.7079830245827906e-05,
+      "loss": 0.17846795320510864,
+      "step": 995
+    },
+    {
+      "epoch": 0.1819505094614265,
+      "grad_norm": 0.21846213936805725,
+      "learning_rate": 4.7045211716062245e-05,
+      "loss": 0.18021599054336548,
+      "step": 1000
+    },
+    {
+      "epoch": 0.18286026200873362,
+      "grad_norm": 0.16867990791797638,
+      "learning_rate": 4.7010402075656595e-05,
+      "loss": 0.18232386112213134,
+      "step": 1005
+    },
+    {
+      "epoch": 0.18377001455604075,
+      "grad_norm": 0.17180582880973816,
+      "learning_rate": 4.697540162637686e-05,
+      "loss": 0.1816317319869995,
+      "step": 1010
+    },
+    {
+      "epoch": 0.18467976710334788,
+      "grad_norm": 0.16480213403701782,
+      "learning_rate": 4.694021067164303e-05,
+      "loss": 0.17718446254730225,
+      "step": 1015
+    },
+    {
+      "epoch": 0.185589519650655,
+      "grad_norm": 0.15015918016433716,
+      "learning_rate": 4.6904829516526605e-05,
+      "loss": 0.17412011623382567,
+      "step": 1020
+    },
+    {
+      "epoch": 0.18649927219796217,
+      "grad_norm": 0.14445139467716217,
+      "learning_rate": 4.686925846774795e-05,
+      "loss": 0.1778018832206726,
+      "step": 1025
+    },
+    {
+      "epoch": 0.1874090247452693,
+      "grad_norm": 0.1701960265636444,
+      "learning_rate": 4.683349783367362e-05,
+      "loss": 0.16901081800460815,
+      "step": 1030
+    },
+    {
+      "epoch": 0.18831877729257643,
+      "grad_norm": 0.15894867479801178,
+      "learning_rate": 4.679754792431368e-05,
+      "loss": 0.17055928707122803,
+      "step": 1035
+    },
+    {
+      "epoch": 0.18922852983988356,
+      "grad_norm": 0.1511942446231842,
+      "learning_rate": 4.676140905131903e-05,
+      "loss": 0.17339680194854737,
+      "step": 1040
+    },
+    {
+      "epoch": 0.1901382823871907,
+      "grad_norm": 0.14735209941864014,
+      "learning_rate": 4.672508152797872e-05,
+      "loss": 0.17802717685699462,
+      "step": 1045
+    },
+    {
+      "epoch": 0.19104803493449782,
+      "grad_norm": 0.17367291450500488,
+      "learning_rate": 4.66885656692172e-05,
+      "loss": 0.1732744097709656,
+      "step": 1050
+    },
+    {
+      "epoch": 0.19195778748180495,
+      "grad_norm": 0.147227481007576,
+      "learning_rate": 4.665186179159159e-05,
+      "loss": 0.17040517330169677,
+      "step": 1055
+    },
+    {
+      "epoch": 0.19286754002911208,
+      "grad_norm": 0.1709655076265335,
+      "learning_rate": 4.6614970213289e-05,
+      "loss": 0.17794088125228882,
+      "step": 1060
+    },
+    {
+      "epoch": 0.1937772925764192,
+      "grad_norm": 0.1588088721036911,
+      "learning_rate": 4.657789125412366e-05,
+      "loss": 0.17180380821228028,
+      "step": 1065
+    },
+    {
+      "epoch": 0.19468704512372634,
+      "grad_norm": 0.14827021956443787,
+      "learning_rate": 4.654062523553428e-05,
+      "loss": 0.182997989654541,
+      "step": 1070
+    },
+    {
+      "epoch": 0.19559679767103347,
+      "grad_norm": 0.16230466961860657,
+      "learning_rate": 4.6503172480581126e-05,
+      "loss": 0.17346880435943604,
+      "step": 1075
+    },
+    {
+      "epoch": 0.1965065502183406,
+      "grad_norm": 0.1637624353170395,
+      "learning_rate": 4.646553331394333e-05,
+      "loss": 0.17263576984405518,
+      "step": 1080
+    },
+    {
+      "epoch": 0.19741630276564776,
+      "grad_norm": 0.15977843105793,
+      "learning_rate": 4.642770806191603e-05,
+      "loss": 0.17284308671951293,
+      "step": 1085
+    },
+    {
+      "epoch": 0.19832605531295489,
+      "grad_norm": 0.15394869446754456,
+      "learning_rate": 4.6389697052407534e-05,
+      "loss": 0.17797101736068727,
+      "step": 1090
+    },
+    {
+      "epoch": 0.19923580786026202,
+      "grad_norm": 0.15995225310325623,
+      "learning_rate": 4.6351500614936485e-05,
+      "loss": 0.18137198686599731,
+      "step": 1095
+    },
+    {
+      "epoch": 0.20014556040756915,
+      "grad_norm": 0.1779479682445526,
+      "learning_rate": 4.6313119080629006e-05,
+      "loss": 0.17998344898223878,
+      "step": 1100
+    }
+  ],
+  "logging_steps": 5,
+  "max_steps": 5500,
+  "num_input_tokens_seen": 0,
+  "num_train_epochs": 2,
+  "save_steps": 100,
+  "stateful_callbacks": {
+    "TrainerControl": {
+      "args": {
+        "should_epoch_stop": false,
+        "should_evaluate": false,
+        "should_log": false,
+        "should_save": true,
+        "should_training_stop": false
+      },
+      "attributes": {}
+    }
+  },
+  "total_flos": 6.127484770153037e+17,
+  "train_batch_size": 8,
+  "trial_name": null,
+  "trial_params": null
+}
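
Note on the log above: most `grad_norm` values sit between roughly 0.08 and 0.2, but a few entries (for example steps 550 and 615) are several times larger while `loss` stays in its usual range. The following is a minimal sketch of one way to flag such entries, not part of the commit. The function names `load_log_history` and `flag_grad_norm_spikes`, the median baseline, and the `factor=3.0` threshold are illustrative assumptions; only the `log_history`, `step`, and `grad_norm` keys come from the file itself.

```python
import json
from statistics import median


def load_log_history(path):
    """Read the "log_history" list from a trainer_state.json file."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["log_history"]


def flag_grad_norm_spikes(log_history, factor=3.0):
    """Return (step, grad_norm) pairs above `factor` times the median grad_norm.

    The median baseline and the default factor are assumptions chosen for
    illustration; tune them for your own run.
    """
    entries = [e for e in log_history if "grad_norm" in e]
    if not entries:
        return []
    baseline = median(e["grad_norm"] for e in entries)
    return [
        (e["step"], e["grad_norm"])
        for e in entries
        if e["grad_norm"] > factor * baseline
    ]


# Six entries copied from the log above, with values rounded to four decimals.
SAMPLE = [
    {"step": 545, "grad_norm": 0.1318, "loss": 0.1990},
    {"step": 550, "grad_norm": 0.6354, "loss": 0.2069},
    {"step": 555, "grad_norm": 0.1502, "loss": 0.1931},
    {"step": 610, "grad_norm": 0.1354, "loss": 0.2011},
    {"step": 615, "grad_norm": 1.3976, "loss": 0.1992},
    {"step": 620, "grad_norm": 0.1465, "loss": 0.1949},
]

if __name__ == "__main__":
    print(flag_grad_norm_spikes(SAMPLE))
```

To run it against the real file, pass `load_log_history("checkpoint-1100/trainer_state.json")` instead of `SAMPLE`. A median is used rather than a mean so that the outliers being searched for do not inflate the baseline.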

checkpoint-1100/training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c9abb54db33ad7a1865253346fbffbda40d7b72587ff8ccf0cc69c9680b59201
+size 5777
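
The three lines above are a Git LFS pointer, not the binary itself: the actual `training_args.bin` is stored in LFS and identified by the SHA-256 digest and byte size. As a hedged sketch (the helper name `parse_lfs_pointer` is hypothetical and not part of any library), a pointer can be read as space-separated key/value lines:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line has the form "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:c9abb54db33ad7a1865253346fbffbda40d7b72587ff8ccf0cc69c9680b59201
size 5777
"""

if __name__ == "__main__":
    info = parse_lfs_pointer(POINTER)
    print(info["algo"], info["size"])
```

After downloading the real file, its size and SHA-256 digest can be compared with these fields to confirm the download is intact.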

checkpoint-1200/README.md ADDED

	@@ -0,0 +1,210 @@

+---
+base_model: unsloth/gemma-4-E4B-it
+library_name: peft
+pipeline_tag: text-generation
+tags:
+- base_model:adapter:unsloth/gemma-4-E4B-it
+- lora
+- sft
+- transformers
+- trl
+- unsloth
+---
+# Model Card for Model ID
+<!-- Provide a quick summary of what the model is/does. -->
+## Model Details
+### Model Description
+<!-- Provide a longer summary of what this model is. -->
+- **Developed by:** [More Information Needed]
+- **Funded by [optional]:** [More Information Needed]
+- **Shared by [optional]:** [More Information Needed]
+- **Model type:** [More Information Needed]
+- **Language(s) (NLP):** [More Information Needed]
+- **License:** [More Information Needed]
+- **Finetuned from model [optional]:** [More Information Needed]
+### Model Sources [optional]
+<!-- Provide the basic links for the model. -->
+- **Repository:** [More Information Needed]
+- **Paper [optional]:** [More Information Needed]
+- **Demo [optional]:** [More Information Needed]
+## Uses
+<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+### Direct Use
+<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
+[More Information Needed]
+### Downstream Use [optional]
+<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
+[More Information Needed]
+### Out-of-Scope Use
+<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
+[More Information Needed]
+## Bias, Risks, and Limitations
+<!-- This section is meant to convey both technical and sociotechnical limitations. -->
+[More Information Needed]
+### Recommendations
+<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
+Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.
+## How to Get Started with the Model
+Use the code below to get started with the model.
+[More Information Needed]
+## Training Details
+### Training Data
+<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
+[More Information Needed]
+### Training Procedure
+<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
+#### Preprocessing [optional]
+[More Information Needed]
+#### Training Hyperparameters
+- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+#### Speeds, Sizes, Times [optional]
+<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
+[More Information Needed]
+## Evaluation
+<!-- This section describes the evaluation protocols and provides the results. -->
+### Testing Data, Factors & Metrics
+#### Testing Data
+<!-- This should link to a Dataset Card if possible. -->
+[More Information Needed]
+#### Factors
+<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
+[More Information Needed]
+#### Metrics
+<!-- These are the evaluation metrics being used, ideally with a description of why. -->
+[More Information Needed]
+### Results
+[More Information Needed]
+#### Summary
+## Model Examination [optional]
+<!-- Relevant interpretability work for the model goes here -->
+[More Information Needed]
+## Environmental Impact
+<!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
+Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+- **Hardware Type:** [More Information Needed]
+- **Hours used:** [More Information Needed]
+- **Cloud Provider:** [More Information Needed]
+- **Compute Region:** [More Information Needed]
+- **Carbon Emitted:** [More Information Needed]
+## Technical Specifications [optional]
+### Model Architecture and Objective
+[More Information Needed]
+### Compute Infrastructure
+[More Information Needed]
+#### Hardware
+[More Information Needed]
+#### Software
+[More Information Needed]
+## Citation [optional]
+<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
+**BibTeX:**
+[More Information Needed]
+**APA:**
+[More Information Needed]
+## Glossary [optional]
+<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
+[More Information Needed]
+## More Information [optional]
+[More Information Needed]
+## Model Card Authors [optional]
+[More Information Needed]
+## Model Card Contact
+[More Information Needed]
+### Framework versions
+- PEFT 0.19.1

checkpoint-1200/adapter_config.json ADDED

	@@ -0,0 +1,52 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": {
+    "base_model_class": "Gemma4ForConditionalGeneration",
+    "parent_library": "transformers.models.gemma4.modeling_gemma4",
+    "unsloth_fixed": true
+  },
+  "base_model_name_or_path": "unsloth/gemma-4-E4B-it",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 16,
+  "lora_bias": false,
+  "lora_dropout": 0.0,
+  "lora_ga_config": null,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.19.1",
+  "qalora_group_size": 16,
+  "r": 16,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "gate_proj",
+    "v_proj",
+    "o_proj",
+    "k_proj",
+    "up_proj",
+    "down_proj",
+    "q_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_bdlora": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}
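
For reference, the `r`, `lora_alpha`, and `use_rslora` values in this config determine how strongly the low-rank update is scaled. In LoRA the update is multiplied by `lora_alpha / r`, and with rank-stabilized LoRA by `lora_alpha / sqrt(r)`. The sketch below recomputes that factor from the values shown above; `lora_scaling` is a hypothetical helper written for illustration, not a PEFT function:

```python
import math

# Values copied from the adapter_config.json diff above.
adapter_config = {
    "r": 16,
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "use_rslora": False,
}


def lora_scaling(cfg: dict) -> float:
    """Scaling applied to the low-rank update B @ A (hypothetical helper)."""
    if cfg.get("use_rslora", False):
        # Rank-stabilized LoRA divides by sqrt(r) instead of r.
        return cfg["lora_alpha"] / math.sqrt(cfg["r"])
    return cfg["lora_alpha"] / cfg["r"]


scaling = lora_scaling(adapter_config)
print(f"LoRA scaling factor: {scaling}")
```

With `lora_alpha` equal to `r`, the update is applied at unit scale.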

checkpoint-1200/adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:758b7e5c64f7b3b9a2dfb7f9c3f402266b67013f70427ae941acb07350f0c694
+size 169741912

checkpoint-1200/chat_template.jinja ADDED

	@@ -0,0 +1,351 @@

+{%- macro format_parameters(properties, required, filter_keys=false) -%}
+    {%- set standard_keys = ['description', 'type', 'properties', 'required', 'nullable'] -%}
+    {%- set ns = namespace(found_first=false) -%}
+    {%- for key, value in properties | dictsort -%}
+        {%- set add_comma = false -%}
+        {%- if not filter_keys or key not in standard_keys -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {{ key }}:{
+            {%- if value['description'] -%}
+                description:<|"|>{{ value['description'] }}<|"|>
+                {%- set add_comma = true -%}
+            {%- endif -%}
+            {%- if value['type'] | upper == 'STRING' -%}
+                {%- if value['enum'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    enum:{{ format_argument(value['enum']) }}
+                {%- endif -%}
+            {%- elif value['type'] | upper == 'ARRAY' -%}
+                {%- if value['items'] is mapping and value['items'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    items:{
+                    {%- set ns_items = namespace(found_first=false) -%}
+                    {%- for item_key, item_value in value['items'] | dictsort -%}
+                        {%- if item_value is not none -%}
+                            {%- if ns_items.found_first %},{% endif -%}
+                            {%- set ns_items.found_first = true -%}
+                            {%- if item_key == 'properties' -%}
+                                properties:{
+                                {%- if item_value is mapping -%}
+                                    {{- format_parameters(item_value, value['items']['required'] | default([])) -}}
+                                {%- endif -%}
+                                }
+                            {%- elif item_key == 'required' -%}
+                                required:[
+                                {%- for req_item in item_value -%}
+                                    <|"|>{{- req_item -}}<|"|>
+                                    {%- if not loop.last %},{% endif -%}
+                                {%- endfor -%}
+                                ]
+                            {%- elif item_key == 'type' -%}
+                                {%- if item_value is string -%}
+                                    type:{{ format_argument(item_value | upper) }}
+                                {%- else -%}
+                                    type:{{ format_argument(item_value | map('upper') | list) }}
+                                {%- endif -%}
+                            {%- else -%}
+                                {{ item_key }}:{{ format_argument(item_value) }}
+                            {%- endif -%}
+                        {%- endif -%}
+                    {%- endfor -%}
+                    }
+                {%- endif -%}
+            {%- endif -%}
+            {%- if value['nullable'] %}
+                {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                nullable:true
+            {%- endif -%}
+            {%- if value['type'] | upper == 'OBJECT' -%}
+                {%- if value['properties'] is defined and value['properties'] is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value['properties'], value['required'] | default([])) -}}
+                    }
+                {%- elif value is mapping -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    properties:{
+                    {{- format_parameters(value, value['required'] | default([]), filter_keys=true) -}}
+                    }
+                {%- endif -%}
+                {%- if value['required'] -%}
+                    {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+                    required:[
+                    {%- for item in value['required'] | default([]) -%}
+                        <|"|>{{- item -}}<|"|>
+                        {%- if not loop.last %},{% endif -%}
+                    {%- endfor -%}
+                    ]
+                {%- endif -%}
+            {%- endif -%}
+            {%- if add_comma %},{%- else -%} {%- set add_comma = true -%} {% endif -%}
+            type:<|"|>{{ value['type'] | upper }}<|"|>}
+        {%- endif -%}
+    {%- endfor -%}
+{%- endmacro -%}
+{%- macro format_function_declaration(tool_data) -%}
+    declaration:{{- tool_data['function']['name'] -}}{description:<|"|>{{- tool_data['function']['description'] -}}<|"|>
+    {%- set params = tool_data['function']['parameters'] -%}
+    {%- if params -%}
+        ,parameters:{
+        {%- if params['properties'] -%}
+            properties:{ {{- format_parameters(params['properties'], params['required']) -}} },
+        {%- endif -%}
+        {%- if params['required'] -%}
+            required:[
+            {%- for item in params['required'] -%}
+                <|"|>{{- item -}}<|"|>
+                {{- ',' if not loop.last -}}
+            {%- endfor -%}
+            ],
+        {%- endif -%}
+        {%- if params['type'] -%}
+            type:<|"|>{{- params['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    {%- if 'response' in tool_data['function'] -%}
+        {%- set response_declaration = tool_data['function']['response'] -%}
+        ,response:{
+        {%- if response_declaration['description'] -%}
+            description:<|"|>{{- response_declaration['description'] -}}<|"|>,
+        {%- endif -%}
+        {%- if response_declaration['type'] | upper == 'OBJECT' -%}
+            type:<|"|>{{- response_declaration['type'] | upper -}}<|"|>}
+        {%- endif -%}
+    {%- endif -%}
+    }
+{%- endmacro -%}
+{%- macro format_argument(argument, escape_keys=True) -%}
+    {%- if argument is string -%}
+        {{- '<|"|>' + argument + '<|"|>' -}}
+    {%- elif argument is boolean -%}
+        {{- 'true' if argument else 'false' -}}
+    {%- elif argument is mapping -%}
+        {{- '{' -}}
+        {%- set ns = namespace(found_first=false) -%}
+        {%- for key, value in argument | dictsort -%}
+            {%- if ns.found_first %},{% endif -%}
+            {%- set ns.found_first = true -%}
+            {%- if escape_keys -%}
+                {{- '<|"|>' + key + '<|"|>' -}}
+            {%- else -%}
+                {{- key -}}
+            {%- endif -%}
+            :{{- format_argument(value, escape_keys=escape_keys) -}}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- elif argument is sequence -%}
+        {{- '[' -}}
+        {%- for item in argument -%}
+            {{- format_argument(item, escape_keys=escape_keys) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- ']' -}}
+    {%- else -%}
+        {{- argument -}}
+    {%- endif -%}
+{%- endmacro -%}
+{%- macro strip_thinking(text) -%}
+    {%- set ns = namespace(result='') -%}
+    {%- for part in text.split('<channel|>') -%}
+        {%- if '<|channel>' in part -%}
+            {%- set ns.result = ns.result + part.split('<|channel>')[0] -%}
+        {%- else -%}
+            {%- set ns.result = ns.result + part -%}
+        {%- endif -%}
+    {%- endfor -%}
+    {{- ns.result | trim -}}
+{%- endmacro -%}
+{%- macro format_tool_response_block(tool_name, response) -%}
+    {{- '<|tool_response>' -}}
+    {%- if response is mapping -%}
+        {{- 'response:' + tool_name + '{' -}}
+        {%- for key, value in response | dictsort -%}
+            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+            {%- if not loop.last %},{% endif -%}
+        {%- endfor -%}
+        {{- '}' -}}
+    {%- else -%}
+        {{- 'response:' + tool_name + '{value:' + format_argument(response, escape_keys=False) + '}' -}}
+    {%- endif -%}
+    {{- '<tool_response|>' -}}
+{%- endmacro -%}
+{%- set ns = namespace(prev_message_type=None) -%}
+{%- set loop_messages = messages -%}
+{{- bos_token -}}
+{#- Handle System/Tool Definitions Block -#}
+{%- if (enable_thinking is defined and enable_thinking) or tools or messages[0]['role'] in ['system', 'developer'] -%}
+    {{- '<|turn>system\n' -}}
+    {#- Inject Thinking token at the very top of the FIRST system turn -#}
+    {%- if enable_thinking is defined and enable_thinking -%}
+        {{- '<|think|>\n' -}}
+        {%- set ns.prev_message_type = 'think' -%}
+    {%- endif -%}
+    {%- if messages[0]['role'] in ['system', 'developer'] -%}
+        {%- if messages[0]['content'] is string -%}
+            {{- messages[0]['content'] | trim -}}
+        {%- elif messages[0]['content'] is sequence -%}
+            {%- for item in messages[0]['content'] -%}
+                {{- item['text'] | trim + ' '-}}
+            {%- endfor -%}
+        {%- endif -%}
+        {%- set loop_messages = messages[1:] -%}
+    {%- endif -%}
+    {%- if tools -%}
+        {%- for tool in tools %}
+            {{- '<|tool>' -}}
+            {{- format_function_declaration(tool) | trim -}}
+            {{- '<tool|>' -}}
+        {%- endfor %}
+        {%- set ns.prev_message_type = 'tool' -%}
+    {%- endif -%}
+    {{- '<turn|>\n' -}}
+{%- endif %}
+{#- Pre-scan: find last user message index for reasoning guard -#}
+{%- set ns_turn = namespace(last_user_idx=-1) -%}
+{%- for i in range(loop_messages | length) -%}
+    {%- if loop_messages[i]['role'] == 'user' -%}
+        {%- set ns_turn.last_user_idx = i -%}
+    {%- endif -%}
+{%- endfor -%}
+{#- Loop through messages -#}
+{%- for message in loop_messages -%}
+    {%- if message['role'] != 'tool' -%}
+    {%- set ns.prev_message_type = None -%}
+    {%- set role = 'model' if message['role'] == 'assistant' else message['role'] -%}
+    {#- Detect continuation: suppress duplicate <|turn>model when previous non-tool message was also assistant -#}
+    {%- set prev_nt = namespace(role=None, found=false) -%}
+    {%- if loop.index0 > 0 -%}
+        {%- for j in range(loop.index0 - 1, -1, -1) -%}
+            {%- if not prev_nt.found -%}
+                {%- if loop_messages[j]['role'] != 'tool' -%}
+                    {%- set prev_nt.role = loop_messages[j]['role'] -%}
+                    {%- set prev_nt.found = true -%}
+                {%- endif -%}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- endif -%}
+    {%- set continue_same_model_turn = (role == 'model' and prev_nt.role == 'assistant') -%}
+    {%- if not continue_same_model_turn -%}
+        {{- '<|turn>' + role + '\n' }}
+    {%- endif -%}
+    {#- Render reasoning/reasoning_content as thinking channel -#}
+    {%- set thinking_text = message.get('reasoning') or message.get('reasoning_content') -%}
+    {%- if thinking_text and loop.index0 > ns_turn.last_user_idx and message.get('tool_calls') -%}
+        {{- '<|channel>thought\n' + thinking_text + '\n<channel|>' -}}
+    {%- endif -%}
+            {%- if message['tool_calls'] -%}
+                {%- for tool_call in message['tool_calls'] -%}
+                    {%- set function = tool_call['function'] -%}
+                    {{- '<|tool_call>call:' + function['name'] + '{' -}}
+                    {%- if function['arguments'] is mapping -%}
+                        {%- set ns_args = namespace(found_first=false) -%}
+                        {%- for key, value in function['arguments'] | dictsort -%}
+                            {%- if ns_args.found_first %},{% endif -%}
+                            {%- set ns_args.found_first = true -%}
+                            {{- key -}}:{{- format_argument(value, escape_keys=False) -}}
+                        {%- endfor -%}
+                    {%- elif function['arguments'] is string -%}
+                        {{- function['arguments'] -}}
+                    {%- endif -%}
+                    {{- '}<tool_call|>' -}}
+                {%- endfor -%}
+                {%- set ns.prev_message_type = 'tool_call' -%}
+            {%- endif -%}
+            {%- set ns_tr_out = namespace(flag=false) -%}
+            {%- if message.get('tool_responses') -%}
+                {#- Legacy: tool_responses embedded on the assistant message (Google/Gemma native) -#}
+                {%- for tool_response in message['tool_responses'] -%}
+                    {{- format_tool_response_block(tool_response['name'] | default('unknown'), tool_response['response']) -}}
+                    {%- set ns_tr_out.flag = true -%}
+                    {%- set ns.prev_message_type = 'tool_response' -%}
+                {%- endfor -%}
+            {%- elif message.get('tool_calls') -%}
+                {#- OpenAI Chat Completions: forward-scan consecutive role:tool messages -#}
+                {%- set ns_tool_scan = namespace(stopped=false) -%}
+                {%- for k in range(loop.index0 + 1, loop_messages | length) -%}
+                    {%- if ns_tool_scan.stopped -%}
+                    {%- elif loop_messages[k]['role'] != 'tool' -%}
+                        {%- set ns_tool_scan.stopped = true -%}
+                    {%- else -%}
+                        {%- set follow = loop_messages[k] -%}
+                        {#- Resolve tool_call_id to function name -#}
+                        {%- set ns_tname = namespace(name=follow.get('name') | default('unknown')) -%}
+                        {%- for tc in message['tool_calls'] -%}
+                            {%- if tc.get('id') == follow.get('tool_call_id') -%}
+                                {%- set ns_tname.name = tc['function']['name'] -%}
+                            {%- endif -%}
+                        {%- endfor -%}
+                        {#- Handle content as string or content-parts array -#}
+                        {%- set tool_body = follow.get('content') -%}
+                        {%- if tool_body is string -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- elif tool_body is sequence and tool_body is not string -%}
+                            {%- set ns_txt = namespace(s='') -%}
+                            {%- for part in tool_body -%}
+                                {%- if part.get('type') == 'text' -%}
+                                    {%- set ns_txt.s = ns_txt.s + (part.get('text') | default('')) -%}
+                                {%- endif -%}
+                            {%- endfor -%}
+                            {{- format_tool_response_block(ns_tname.name, ns_txt.s) -}}
+                        {%- else -%}
+                            {{- format_tool_response_block(ns_tname.name, tool_body) -}}
+                        {%- endif -%}
+                        {%- set ns_tr_out.flag = true -%}
+                        {%- set ns.prev_message_type = 'tool_response' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- set captured_content -%}
+            {%- if message['content'] is string -%}
+                {%- if role == 'model' -%}
+                    {{- strip_thinking(message['content']) -}}
+                {%- else -%}
+                    {{- message['content'] | trim -}}
+                {%- endif -%}
+            {%- elif message['content'] is sequence -%}
+                {%- for item in message['content'] -%}
+                    {%- if item['type'] == 'text' -%}
+                        {%- if role == 'model' -%}
+                            {{- strip_thinking(item['text']) -}}
+                        {%- else -%}
+                            {{- item['text'] | trim -}}
+                        {%- endif -%}
+                    {%- elif item['type'] == 'image' -%}
+                        {{- '<|image|>' -}}
+                        {%- set ns.prev_message_type = 'image' -%}
+                    {%- elif item['type'] == 'audio' -%}
+                        {{- '<|audio|>' -}}
+                        {%- set ns.prev_message_type = 'audio' -%}
+                    {%- elif item['type'] == 'video' -%}
+                        {{- '<|video|>' -}}
+                        {%- set ns.prev_message_type = 'video' -%}
+                    {%- endif -%}
+                {%- endfor -%}
+            {%- endif -%}
+            {%- endset -%}
+            {{- captured_content -}}
+            {%- set has_content = captured_content | trim | length > 0 -%}
+        {%- if ns.prev_message_type == 'tool_call' and not ns_tr_out.flag -%}
+            {{- '<|tool_response>' -}}
+        {%- elif not (ns_tr_out.flag and not has_content) -%}
+            {{- '<turn|>\n' -}}
+        {%- endif -%}
+    {%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    {%- if ns.prev_message_type != 'tool_response' and ns.prev_message_type != 'tool_call' -%}
+        {{- '<|turn>model\n' -}}
+    {%- endif -%}
+{%- endif -%}
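
The `strip_thinking` macro in the template above removes thinking-channel spans from earlier model turns before they are re-rendered. As a rough Python equivalent, assuming the same `<|channel>` and `<channel|>` delimiters and nothing else about the template, the logic can be sketched as:

```python
def strip_thinking(text: str) -> str:
    """Python port of the template's strip_thinking macro (illustrative only)."""
    result = ""
    # Each '<channel|>' closes a thinking span, so split on it first.
    for part in text.split("<channel|>"):
        if "<|channel>" in part:
            # Keep only the text before the span opener; drop the thoughts.
            result += part.split("<|channel>")[0]
        else:
            result += part
    # The Jinja 'trim' filter strips leading and trailing whitespace.
    return result.strip()


cleaned = strip_thinking("<|channel>thought\nplan the answer\n<channel|>Hello")
print(cleaned)
```

Text outside any thinking span passes through unchanged apart from the final trim.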

checkpoint-1200/optimizer.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6e1e647229ebd58f619f9224174e6d5fab90526935a57bf68b5a5fbc119fb909
+size 72807355

checkpoint-1200/processor_config.json ADDED

	@@ -0,0 +1,75 @@

+{
+  "audio_ms_per_token": 40,
+  "audio_seq_length": 750,
+  "feature_extractor": {
+    "dither": 0.0,
+    "feature_extractor_type": "Gemma4AudioFeatureExtractor",
+    "feature_size": 128,
+    "fft_length": 512,
+    "fft_overdrive": false,
+    "frame_length": 320,
+    "hop_length": 160,
+    "input_scale_factor": 1.0,
+    "max_frequency": 8000.0,
+    "mel_floor": 0.001,
+    "min_frequency": 0.0,
+    "padding_side": "left",
+    "padding_value": 0.0,
+    "per_bin_mean": null,
+    "per_bin_stddev": null,
+    "preemphasis": 0.0,
+    "preemphasis_htk_flavor": true,
+    "return_attention_mask": true,
+    "sampling_rate": 16000
+  },
+  "image_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": false,
+    "do_rescale": true,
+    "do_resize": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_processor_type": "Gemma4ImageProcessor",
+    "image_seq_length": 280,
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 280,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098
+  },
+  "image_seq_length": 280,
+  "processor_class": "Gemma4Processor",
+  "video_processor": {
+    "do_convert_rgb": true,
+    "do_normalize": true,
+    "do_rescale": true,
+    "do_resize": true,
+    "do_sample_frames": true,
+    "image_mean": [
+      0.0,
+      0.0,
+      0.0
+    ],
+    "image_std": [
+      1.0,
+      1.0,
+      1.0
+    ],
+    "max_soft_tokens": 70,
+    "num_frames": 32,
+    "patch_size": 16,
+    "pooling_kernel_size": 3,
+    "resample": 3,
+    "rescale_factor": 0.00392156862745098,
+    "return_metadata": false,
+    "video_processor_type": "Gemma4VideoProcessor"
+  }
+}
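
Several of the numbers in this processor config are easier to read once converted to time units. The sketch below does that arithmetic with values copied from the diff above. Reading `audio_ms_per_token * audio_seq_length` as the maximum audio duration is an assumption about how the processor uses these fields, not something the config states:

```python
# Values copied from the processor_config.json diff above.
audio_ms_per_token = 40
audio_seq_length = 750
frame_length = 320      # samples per analysis frame
hop_length = 160        # samples between frame starts
sampling_rate = 16000   # Hz
rescale_factor = 0.00392156862745098

# Assumed interpretation: tokens * ms-per-token = longest audio clip covered.
max_audio_seconds = audio_ms_per_token * audio_seq_length / 1000

# Frame and hop sizes expressed in milliseconds at the given sampling rate.
frame_ms = 1000 * frame_length / sampling_rate
hop_ms = 1000 * hop_length / sampling_rate

# The image rescale factor maps 8-bit pixel values into [0, 1].
expected_rescale = 1 / 255

print(max_audio_seconds, frame_ms, hop_ms, expected_rescale)
```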

checkpoint-1200/rng_state.pth ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:098b29492211804ab324a36f37466821d948280bb74fce4ba895c03f13ecd878
+size 14645

checkpoint-1200/scheduler.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:efcf962131305188aae5d8c42fb21f39c330e15fc73bc76b4411e357b0d01cee
+size 1465

checkpoint-1200/tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cc8d3a0ce36466ccc1278bf987df5f71db1719b9ca6b4118264f45cb627bfe0f
+size 32169626