comethrusws committed
Commit ba2408a · verified · 1 Parent(s): c309eb2

fixed: readme

Files changed (1)
  1. README.md +318 -318
README.md CHANGED
@@ -1,319 +1,319 @@
1
- ---
2
- license: llama3.2
3
- library_name: transformers
4
- pipeline_tag: text-generation
5
- language:
6
- - en
7
- - ko
8
- - fr
9
- - zh
10
- - es
11
- ---
12
-
13
- <div align="center">
14
-
15
- ![License](https://img.shields.io/badge/License-Llama%203.2-blue.svg)
16
- ![Model](https://img.shields.io/badge/Model-3B%20Parameters-green.svg)
17
- ![Library](https://img.shields.io/badge/Library-Transformers-orange.svg)
18
- ![Pipeline](https://img.shields.io/badge/Pipeline-Text%20Generation-red.svg)
19
- ![Languages](https://img.shields.io/badge/Languages-30+-purple.svg)
20
- ![Context](https://img.shields.io/badge/Context%20Window-128k-brightgreen.svg)
21
-
22
- <img src="images/sagea-logo.png" alt="SAGE Logo" width="75%">
23
-
24
- # SAGE Reasoning 8B
25
-
26
- *Advanced Hybrid Reasoning Model with Tool-Calling Capabilities*
27
-
28
- [![Open in HuggingFace](https://img.shields.io/badge/🤗%20Hugging%20Face-Open%20Model-yellow)](https://huggingface.co/)
29
- [![License](https://img.shields.io/badge/License-Llama%203.2%20Community-blue)](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)
30
-
31
- </div>
32
-
33
- ---
34
-
35
- ## Table of Contents
36
-
37
- - [Overview](#overview)
38
- - [Key Features](#key-features)
39
- - [Evaluations](#evaluations)
40
- - [License](#license)
41
- - [Contact](#contact)
42
-
43
- ---
44
-
45
- ## Overview
46
-
47
- SAGE Reasoning Family Models are instruction-tuned, text-in/text-out generative systems released under a permissive open license for commercial use.
48
-
49
- ## Key Features
50
-
51
- ### **Hybrid Reasoning Architecture**
52
- - **Dual Mode Operation**: Capable of producing fast direct responses in standard LLM mode, or applying self-reflection before answering in reasoning mode
53
- - **Advanced Training**: Uses **Iterated Distillation and Amplification (IDA)** - a scalable alignment method based on iterative self-improvement
54
-
55
- ### **Specialized Capabilities**
56
- - **Code Generation**: Optimized for programming tasks with strong coding abilities
57
- - **STEM Excellence**: Enhanced performance on science, technology, engineering, and mathematics problems
58
- - **Instruction Following**: Superior adherence to complex instructions and prompts
59
- - **Tool Calling**: Notable strength in tool-calling ability compared to similar-sized models
60
-
61
- ### **Global Reach**
62
- - **Multilingual Support**: Over 30 languages supported
63
- - **Extended Context**: 128k context window for handling large documents and conversations
64
- - **Consistent Performance**: Both standard and reasoning variants consistently outperform other models in the same parameter class on public benchmarks
65
-
66
- ## Evaluations
67
-
68
- We compare our models against state-of-the-art size-equivalent models in both direct mode and reasoning mode. For direct mode, we compare against Llama/Qwen instruct counterparts. For reasoning, we use Deepseek's R1 distilled counterparts and Qwen's QwQ model.
69
-
70
- ### Overall Performance Benchmarks
71
-
72
- <div align="center">
73
- <img src="images/3b_benchmarks.png" alt="Overall Performance Benchmarks" width="85%">
74
- <p><em>Comprehensive benchmark results showing SAGE Reasoning 3B performance across multiple evaluation metrics</em></p>
75
- </div>
76
-
77
- ### Livebench Global Average
78
-
79
- <div align="center">
80
- <img src="images/3b_8b_tools.png" alt="Livebench Global Average Performance" width="75%">
81
- <p><em>Livebench global performance comparison demonstrating consistent superiority</em></p>
82
- </div>
83
-
84
- ### Tool Calling Performance
85
-
86
- <div align="center">
87
- <img src="images/3b_8b_tool_calling_benchmarks.png" alt="Tool Calling Benchmarks" width="85%">
88
- <p><em>Tool calling capabilities comparison showing enhanced performance in function calling and tool utilization</em></p>
89
- </div>
90
-
91
- ---
92
-
93
-
94
- # Usage
95
- Here is a snippet below for usage with Transformers:
96
-
97
- ```python
98
- import transformers
99
- import torch
100
-
101
- model_id = "sagea-ai/sage-reasoning-3b"
102
-
103
- pipeline = transformers.pipeline(
104
- "text-generation",
105
- model=model_id,
106
- model_kwargs={"torch_dtype": torch.bfloat16},
107
- device_map="auto",
108
- )
109
-
110
- messages = [
111
- {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
112
- {"role": "user", "content": "Give me a short introduction to LLMs."},
113
- ]
114
-
115
- outputs = pipeline(
116
- messages,
117
- max_new_tokens=512,
118
- )
119
-
120
- print(outputs[0]["generated_text"][-1])
121
- ```
122
-
123
-
124
-
125
- ## Implementing extended thinking
126
- - By default, the model will answer in the standard mode.
127
- - To enable thinking, you can do any one of the two methods:
128
- - Add a specific system prompt, or
129
- - Set `enable_thinking=True` while applying the chat template.
130
-
131
- > **_NOTE:_** For the SAGE reasoning 3b model, we suggest using `repetition_penalty=1.1` while implementing extended thinking.
132
-
133
- ### Method 1 - Add a specific system prompt.
134
- To enable thinking, simply use this in the system prompt `system_instruction = 'Enable deep thinking subroutine.'`
135
-
136
- If you already have a system_instruction, then use `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.
137
-
138
- Here is an example -
139
-
140
- ```python
141
- import transformers
142
- import torch
143
-
144
- model_id = "sagea-ai/sage-reasoning-3b"
145
-
146
- pipeline = transformers.pipeline(
147
- "text-generation",
148
- model=model_id,
149
- model_kwargs={"torch_dtype": torch.bfloat16},
150
- device_map="auto",
151
- )
152
-
153
- DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."
154
-
155
- messages = [
156
- {"role": "system", "content": DEEP_THINKING_INSTRUCTION},
157
- {"role": "user", "content": "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."},
158
- ]
159
-
160
- outputs = pipeline(
161
- messages,
162
- max_new_tokens=512,
163
- )
164
-
165
- print(outputs[0]["generated_text"][-1])
166
- ```
167
-
168
-
169
- Similarly, if you have a system prompt, you can append the `DEEP_THINKING_INSTRUCTION` to the beginning in this way -
170
-
171
- ```python
172
- DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."
173
-
174
- system_prompt = "Reply to each prompt with only the actual code - no explanations."
175
- prompt = "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."
176
-
177
- messages = [
178
- {"role": "system", "content": DEEP_THINKING_INSTRUCTION + '\n\n' + system_prompt},
179
- {"role": "user", "content": prompt}
180
- ]
181
- ```
182
-
183
- ### Method 2 - Set enable_thinking=True in the tokenizer
184
- If you are using Huggingface tokenizers, then you can simply use add the argument `enable_thinking=True` to the tokenization (this option is added to the chat template).
185
-
186
- Here is an example -
187
- ```python
188
- from transformers import AutoModelForCausalLM, AutoTokenizer
189
-
190
- model_name = "sagea-ai/sage-reasoning-3b"
191
-
192
- model = AutoModelForCausalLM.from_pretrained(
193
- model_name,
194
- torch_dtype="auto",
195
- device_map="auto"
196
- )
197
- tokenizer = AutoTokenizer.from_pretrained(model_name)
198
-
199
- prompt = "Give me a short introduction to LLMs."
200
- messages = [
201
- {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
202
- {"role": "user", "content": prompt}
203
- ]
204
-
205
- text = tokenizer.apply_chat_template(
206
- messages,
207
- tokenize=False,
208
- add_generation_prompt=True,
209
- enable_thinking=True
210
- )
211
- model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
212
-
213
- generated_ids = model.generate(
214
- **model_inputs,
215
- max_new_tokens=512
216
- )
217
- generated_ids = [
218
- output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
219
- ]
220
-
221
- response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
222
- print(response)
223
- ```
224
-
225
- # Tool Calling
226
- SAGE reasoning 3b models support tool calling (single, parallel, multiple and parallel_multiple) both in standard and extended thinking mode.
227
-
228
- Here is a snippet -
229
-
230
- ```python
231
- # First, define a tool
232
- def get_current_temperature(location: str) -> float:
233
- """
234
- Get the current temperature at a location.
235
-
236
- Args:
237
- location: The location to get the temperature for, in the format "City, Country"
238
- Returns:
239
- The current temperature at the specified location in the specified units, as a float.
240
- """
241
- return 22. # A real function should probably actually get the temperature!
242
-
243
- # Next, create a chat and apply the chat template
244
- messages = [
245
- {"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
246
- ]
247
-
248
- model_inputs = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True)
249
-
250
- text = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True, tokenize=False)
251
- inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
252
- outputs = model.generate(**inputs, max_new_tokens=512)
253
- output_text = tokenizer.batch_decode(outputs)[0][len(text):]
254
- print(output_text)
255
- ```
256
-
257
- This will result in the output -
258
- ```
259
- <tool_call>
260
- {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
261
- </tool_call><|eot_id|>
262
- ```
263
-
264
- You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat like so:
265
-
266
- ```python
267
- tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
268
- messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
269
- ```
270
-
271
- and then call the tool and append the result, with the `tool` role, like so:
272
-
273
- ```python
274
- messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
275
- ```
276
-
277
- After that, you can `generate()` again to let the model use the tool result in the chat:
278
-
279
- ```python
280
- text = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True, tokenize=False)
281
- inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
282
- outputs = model.generate(**inputs, max_new_tokens=512)
283
- output_text = tokenizer.batch_decode(outputs)[0][len(text):]
284
- ```
285
-
286
- This should result in the string -
287
-
288
- 'The current temperature in Paris is 22.0 degrees.<|eot_id|>'
289
-
290
- ## License
291
-
292
- This repository and the model weights are licensed under the [**Llama 3.2 Community License Agreement**](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (Llama models' default license agreement).
293
-
294
- <div align="center">
295
-
296
- [![License](https://img.shields.io/badge/License-Llama%203.2%20Community-blue.svg)](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)
297
-
298
- </div>
299
-
300
- ## Contact
301
-
302
- <div align="center">
303
-
304
- **Get in Touch with Our Team**
305
-
306
- For inquiries, collaborations, or support, please reach out to us:
307
-
308
- **Email**: [founders@sagea.space](mailto:founders@sagea.space)
309
-
310
- ---
311
-
312
- <p>
313
- <strong>SAGE Reasoning 3B</strong><br>
314
- <em>Advancing the frontier of hybrid reasoning models</em>
315
- </p>
316
-
317
- ![Made by SAGEA](https://img.shields.io/badge/Made%20by-SAGEA%20-red.svg)
318
-
319
  </div>
 
1
+ ---
2
+ license: llama3.2
3
+ library_name: transformers
4
+ pipeline_tag: text-generation
5
+ language:
6
+ - en
7
+ - ko
8
+ - fr
9
+ - zh
10
+ - es
11
+ ---
12
+
13
+ <div align="center">
14
+
15
+ ![License](https://img.shields.io/badge/License-Llama%203.2-blue.svg)
16
+ ![Model](https://img.shields.io/badge/Model-8B%20Parameters-green.svg)
17
+ ![Library](https://img.shields.io/badge/Library-Transformers-orange.svg)
18
+ ![Pipeline](https://img.shields.io/badge/Pipeline-Text%20Generation-red.svg)
19
+ ![Languages](https://img.shields.io/badge/Languages-30+-purple.svg)
20
+ ![Context](https://img.shields.io/badge/Context%20Window-128k-brightgreen.svg)
21
+
22
+ <img src="images/sagea-logo.png" alt="SAGE Logo" width="75%">
23
+
24
+ # SAGE Reasoning 8B
25
+
26
+ *Advanced Hybrid Reasoning Model with Tool-Calling Capabilities*
27
+
28
+ [![Open in HuggingFace](https://img.shields.io/badge/🤗%20Hugging%20Face-Open%20Model-yellow)](https://huggingface.co/)
29
+ [![License](https://img.shields.io/badge/License-Llama%203.2%20Community-blue)](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)
30
+
31
+ </div>
32
+
33
+ ---
34
+
35
+ ## Table of Contents
36
+
37
+ - [Overview](#overview)
38
+ - [Key Features](#key-features)
39
+ - [Evaluations](#evaluations)
40
+ - [License](#license)
41
+ - [Contact](#contact)
42
+
43
+ ---
44
+
45
+ ## Overview
46
+
47
+ SAGE Reasoning Family Models are instruction-tuned, text-in/text-out generative systems released under a permissive open license for commercial use.
48
+
49
+ ## Key Features
50
+
51
+ ### **Hybrid Reasoning Architecture**
52
+ - **Dual Mode Operation**: Capable of producing fast direct responses in standard LLM mode, or applying self-reflection before answering in reasoning mode
53
+ - **Advanced Training**: Uses **Iterated Distillation and Amplification (IDA)** - a scalable alignment method based on iterative self-improvement
54
+
55
+ ### **Specialized Capabilities**
56
+ - **Code Generation**: Optimized for programming tasks with strong coding abilities
57
+ - **STEM Excellence**: Enhanced performance on science, technology, engineering, and mathematics problems
58
+ - **Instruction Following**: Superior adherence to complex instructions and prompts
59
+ - **Tool Calling**: Notable strength in tool-calling ability compared to similar-sized models
60
+
61
+ ### **Global Reach**
62
+ - **Multilingual Support**: Over 30 languages supported
63
+ - **Extended Context**: 128k context window for handling large documents and conversations
64
+ - **Consistent Performance**: Both standard and reasoning variants consistently outperform other models in the same parameter class on public benchmarks
65
+
66
+ ## Evaluations
67
+
68
+ We compare our models against state-of-the-art size-equivalent models in both direct mode and reasoning mode. For direct mode, we compare against Llama/Qwen instruct counterparts. For reasoning, we use Deepseek's R1 distilled counterparts and Qwen's QwQ model.
69
+
70
+ ### Overall Performance Benchmarks
71
+
72
+ <div align="center">
73
+ <img src="images/8b_benchmarks.png" alt="Overall Performance Benchmarks" width="85%">
74
+ <p><em>Comprehensive benchmark results showing SAGE Reasoning 8B performance across multiple evaluation metrics</em></p>
75
+ </div>
76
+
77
+ ### Livebench Global Average
78
+
79
+ <div align="center">
80
+ <img src="images/3b_8b_tools.png" alt="Livebench Global Average Performance" width="75%">
81
+ <p><em>Livebench global performance comparison demonstrating consistent superiority</em></p>
82
+ </div>
83
+
84
+ ### Tool Calling Performance
85
+
86
+ <div align="center">
87
+ <img src="images/3b_8b_tool_calling_benchmarks (1).png" alt="Tool Calling Benchmarks" width="85%">
88
+ <p><em>Tool calling capabilities comparison showing enhanced performance in function calling and tool utilization</em></p>
89
+ </div>
90
+
91
+ ---
92
+
93
+
94
+ # Usage
95
+ Here is a snippet below for usage with Transformers:
96
+
97
+ ```python
98
+ import transformers
99
+ import torch
100
+
101
+ model_id = "sagea-ai/sage-reasoning-8b"
102
+
103
+ pipeline = transformers.pipeline(
104
+ "text-generation",
105
+ model=model_id,
106
+ model_kwargs={"torch_dtype": torch.bfloat16},
107
+ device_map="auto",
108
+ )
109
+
110
+ messages = [
111
+ {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
112
+ {"role": "user", "content": "Give me a short introduction to LLMs."},
113
+ ]
114
+
115
+ outputs = pipeline(
116
+ messages,
117
+ max_new_tokens=512,
118
+ )
119
+
120
+ print(outputs[0]["generated_text"][-1])
121
+ ```
122
+
123
+
124
+
125
+ ## Implementing extended thinking
126
+ - By default, the model will answer in the standard mode.
127
+ - To enable thinking, use either of these two methods:
128
+ - Add a specific system prompt, or
129
+ - Set `enable_thinking=True` while applying the chat template.
130
+
131
+ > **_NOTE:_** For the SAGE reasoning 3b model, we suggest using `repetition_penalty=1.1` while implementing extended thinking.
132
+
133
+ ### Method 1 - Add a specific system prompt.
134
+ To enable thinking, simply use this in the system prompt `system_instruction = 'Enable deep thinking subroutine.'`
135
+
136
+ If you already have a system_instruction, then use `system_instruction = 'Enable deep thinking subroutine.' + '\n\n' + system_instruction`.
137
+
138
+ Here is an example -
139
+
140
+ ```python
141
+ import transformers
142
+ import torch
143
+
144
+ model_id = "sagea-ai/sage-reasoning-8b"
145
+
146
+ pipeline = transformers.pipeline(
147
+ "text-generation",
148
+ model=model_id,
149
+ model_kwargs={"torch_dtype": torch.bfloat16},
150
+ device_map="auto",
151
+ )
152
+
153
+ DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."
154
+
155
+ messages = [
156
+ {"role": "system", "content": DEEP_THINKING_INSTRUCTION},
157
+ {"role": "user", "content": "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."},
158
+ ]
159
+
160
+ outputs = pipeline(
161
+ messages,
162
+ max_new_tokens=512,
163
+ )
164
+
165
+ print(outputs[0]["generated_text"][-1])
166
+ ```
167
+
168
+
169
+ Similarly, if you have a system prompt, you can prepend the `DEEP_THINKING_INSTRUCTION` to it in this way -
170
+
171
+ ```python
172
+ DEEP_THINKING_INSTRUCTION = "Enable deep thinking subroutine."
173
+
174
+ system_prompt = "Reply to each prompt with only the actual code - no explanations."
175
+ prompt = "Write a bash script that takes a matrix represented as a string with format '[1,2],[3,4],[5,6]' and prints the transpose in the same format."
176
+
177
+ messages = [
178
+ {"role": "system", "content": DEEP_THINKING_INSTRUCTION + '\n\n' + system_prompt},
179
+ {"role": "user", "content": prompt}
180
+ ]
181
+ ```
182
+
183
+ ### Method 2 - Set enable_thinking=True in the tokenizer
184
+ If you are using Hugging Face tokenizers, you can simply add the argument `enable_thinking=True` when applying the chat template (the chat template supports this option).
185
+
186
+ Here is an example -
187
+ ```python
188
+ from transformers import AutoModelForCausalLM, AutoTokenizer
189
+
190
+ model_name = "sagea-ai/sage-reasoning-8b"
191
+
192
+ model = AutoModelForCausalLM.from_pretrained(
193
+ model_name,
194
+ torch_dtype="auto",
195
+ device_map="auto"
196
+ )
197
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
198
+
199
+ prompt = "Give me a short introduction to LLMs."
200
+ messages = [
201
+ {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
202
+ {"role": "user", "content": prompt}
203
+ ]
204
+
205
+ text = tokenizer.apply_chat_template(
206
+ messages,
207
+ tokenize=False,
208
+ add_generation_prompt=True,
209
+ enable_thinking=True
210
+ )
211
+ model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
212
+
213
+ generated_ids = model.generate(
214
+ **model_inputs,
215
+ max_new_tokens=512
216
+ )
217
+ generated_ids = [
218
+ output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
219
+ ]
220
+
221
+ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
222
+ print(response)
223
+ ```
224
+
225
+ # Tool Calling
226
+ SAGE reasoning models support tool calling (single, parallel, multiple and parallel_multiple) both in standard and extended thinking mode.
227
+
228
+ Here is a snippet -
229
+
230
+ ```python
231
+ # First, define a tool
232
+ def get_current_temperature(location: str) -> float:
233
+ """
234
+ Get the current temperature at a location.
235
+
236
+ Args:
237
+ location: The location to get the temperature for, in the format "City, Country"
238
+ Returns:
239
+ The current temperature at the specified location in the specified units, as a float.
240
+ """
241
+ return 22. # A real function should probably actually get the temperature!
242
+
243
+ # Next, create a chat and apply the chat template
244
+ messages = [
245
+ {"role": "user", "content": "Hey, what's the temperature in Paris right now?"}
246
+ ]
247
+
248
+ model_inputs = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True)
249
+
250
+ text = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True, tokenize=False)
251
+ inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
252
+ outputs = model.generate(**inputs, max_new_tokens=512)
253
+ output_text = tokenizer.batch_decode(outputs)[0][len(text):]
254
+ print(output_text)
255
+ ```
256
+
257
+ This will result in the output -
258
+ ```
259
+ <tool_call>
260
+ {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
261
+ </tool_call><|eot_id|>
262
+ ```
263
+
264
+ You can then generate text from this input as normal. If the model generates a tool call, you should add it to the chat like so:
265
+
266
+ ```python
267
+ tool_call = {"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}
268
+ messages.append({"role": "assistant", "tool_calls": [{"type": "function", "function": tool_call}]})
269
+ ```
270
+
271
+ and then call the tool and append the result, with the `tool` role, like so:
272
+
273
+ ```python
274
+ messages.append({"role": "tool", "name": "get_current_temperature", "content": "22.0"})
275
+ ```
276
+
277
+ After that, you can `generate()` again to let the model use the tool result in the chat:
278
+
279
+ ```python
280
+ text = tokenizer.apply_chat_template(messages, tools=[get_current_temperature], add_generation_prompt=True, tokenize=False)
281
+ inputs = tokenizer(text, return_tensors="pt", add_special_tokens=False).to(model.device)
282
+ outputs = model.generate(**inputs, max_new_tokens=512)
283
+ output_text = tokenizer.batch_decode(outputs)[0][len(text):]
284
+ ```
285
+
286
+ This should result in the string -
287
+
288
+ 'The current temperature in Paris is 22.0 degrees.<|eot_id|>'
289
+
290
+ ## License
291
+
292
+ This repository and the model weights are licensed under the [**Llama 3.2 Community License Agreement**](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE) (Llama models' default license agreement).
293
+
294
+ <div align="center">
295
+
296
+ [![License](https://img.shields.io/badge/License-Llama%203.2%20Community-blue.svg)](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/LICENSE)
297
+
298
+ </div>
299
+
300
+ ## Contact
301
+
302
+ <div align="center">
303
+
304
+ **Get in Touch with Our Team**
305
+
306
+ For inquiries, collaborations, or support, please reach out to us:
307
+
308
+ **Email**: [founders@sagea.space](mailto:founders@sagea.space)
309
+
310
+ ---
311
+
312
+ <p>
313
+ <strong>SAGE Reasoning 8B</strong><br>
314
+ <em>Advancing the frontier of hybrid reasoning models</em>
315
+ </p>
316
+
317
+ ![Made by SAGEA](https://img.shields.io/badge/Made%20by-SAGEA%20-red.svg)
318
+
319
  </div>
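
A note on the tool-calling steps in the README above: the example copies the generated tool call into a dict by hand before appending it to `messages`. The sketch below shows one way that step could be automated, assuming the model wraps each call in `<tool_call>` tags exactly as in the README's sample output. The helper names `extract_tool_calls` and `run_tool_calls` and the `TOOLS` registry are hypothetical and are not part of the README or of Transformers.

```python
import json
import re

# Assumption: each tool call is a JSON object wrapped in <tool_call> tags,
# as in the README's sample output.
TOOL_CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def get_current_temperature(location: str) -> float:
    """Stub tool from the README example; a real tool would query a service."""
    return 22.0


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_current_temperature": get_current_temperature}


def extract_tool_calls(output_text: str) -> list:
    """Return the parsed JSON payload of every <tool_call> block in the text."""
    return [json.loads(payload) for payload in TOOL_CALL_RE.findall(output_text)]


def run_tool_calls(output_text: str, messages: list) -> list:
    """Append each tool call and its result to the chat, mirroring the README steps."""
    for call in extract_tool_calls(output_text):
        # Record the assistant's tool call in the chat history.
        messages.append(
            {"role": "assistant", "tool_calls": [{"type": "function", "function": call}]}
        )
        # Execute the named tool with the generated arguments.
        result = TOOLS[call["name"]](**call["arguments"])
        # Record the tool result with the `tool` role, as a string.
        messages.append({"role": "tool", "name": call["name"], "content": str(result)})
    return messages


sample_output = (
    "<tool_call>\n"
    '{"name": "get_current_temperature", "arguments": {"location": "Paris, France"}}\n'
    "</tool_call><|eot_id|>"
)
messages = [{"role": "user", "content": "Hey, what's the temperature in Paris right now?"}]
run_tool_calls(sample_output, messages)
print(messages[-1])
```

The updated `messages` list can then be passed back through `tokenizer.apply_chat_template(...)` and `model.generate(...)` as in the README's final snippet. A production version would likely also need to handle malformed JSON and unknown tool names, which this sketch does not.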