Commit 731c682 by Downtown-Case · verified · 1 Parent(s): 4120d51

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,296 @@
+ ---
+ inference: false
+ language:
+ - en
+ - fr
+ - de
+ - es
+ - it
+ - pt
+ - ja
+ - ko
+ - zh
+ - ar
+ license: cc-by-nc-4.0
+ library_name: transformers
+ extra_gated_prompt: >-
+   By submitting this form, you agree to the [License
+   Agreement](https://cohere.com/c4ai-cc-by-nc-license) and acknowledge that the
+   information you provide will be collected, used, and shared in accordance with
+   Cohere’s [Privacy Policy]( https://cohere.com/privacy). You’ll receive email
+   updates about Cohere Labs and Cohere research, events, products and services.
+   You can unsubscribe at any time.
+ extra_gated_fields:
+   Name: text
+   Affiliation: text
+   Country:
+     type: select
+     options:
+     - Aruba
+     - Afghanistan
+     - Angola
+     - Anguilla
+     - Åland Islands
+     - Albania
+     - Andorra
+     - United Arab Emirates
+     - Argentina
+     - Armenia
+     - American Samoa
+     - Antarctica
+     - French Southern Territories
+     - Antigua and Barbuda
+     - Australia
+     - Austria
+     - Azerbaijan
+     - Burundi
+     - Belgium
+     - Benin
+     - Bonaire Sint Eustatius and Saba
+     - Burkina Faso
+     - Bangladesh
+     - Bulgaria
+     - Bahrain
+     - Bahamas
+     - Bosnia and Herzegovina
+     - Saint Barthélemy
+     - Belarus
+     - Belize
+     - Bermuda
+     - Plurinational State of Bolivia
+     - Brazil
+     - Barbados
+     - Brunei-Darussalam
+     - Bhutan
+     - Bouvet-Island
+     - Botswana
+     - Central African Republic
+     - Canada
+     - Cocos (Keeling) Islands
+     - Switzerland
+     - Chile
+     - China
+     - Côte-dIvoire
+     - Cameroon
+     - Democratic Republic of the Congo
+     - Cook Islands
+     - Colombia
+     - Comoros
+     - Cabo Verde
+     - Costa Rica
+     - Cuba
+     - Curaçao
+     - Christmas Island
+     - Cayman Islands
+     - Cyprus
+     - Czechia
+     - Germany
+     - Djibouti
+     - Dominica
+     - Denmark
+     - Dominican Republic
+     - Algeria
+     - Ecuador
+     - Egypt
+     - Eritrea
+     - Western Sahara
+     - Spain
+     - Estonia
+     - Ethiopia
+     - Finland
+     - Fiji
+     - Falkland Islands (Malvinas)
+     - France
+     - Faroe Islands
+     - Federated States of Micronesia
+     - Gabon
+     - United Kingdom
+     - Georgia
+     - Guernsey
+     - Ghana
+     - Gibraltar
+     - Guinea
+     - Guadeloupe
+     - Gambia
+     - Guinea Bissau
+     - Equatorial Guinea
+     - Greece
+     - Grenada
+     - Greenland
+     - Guatemala
+     - French Guiana
+     - Guam
+     - Guyana
+     - Hong Kong
+     - Heard Island and McDonald Islands
+     - Honduras
+     - Croatia
+     - Haiti
+     - Hungary
+     - Indonesia
+     - Isle of Man
+     - India
+     - British Indian Ocean Territory
+     - Ireland
+     - Islamic Republic of Iran
+     - Iraq
+     - Iceland
+     - Israel
+     - Italy
+     - Jamaica
+     - Jersey
+     - Jordan
+     - Japan
+     - Kazakhstan
+     - Kenya
+     - Kyrgyzstan
+     - Cambodia
+     - Kiribati
+     - Saint-Kitts-and-Nevis
+     - South Korea
+     - Kuwait
+     - Lao-Peoples-Democratic-Republic
+     - Lebanon
+     - Liberia
+     - Libya
+     - Saint-Lucia
+     - Liechtenstein
+     - Sri Lanka
+     - Lesotho
+     - Lithuania
+     - Luxembourg
+     - Latvia
+     - Macao
+     - Saint Martin (French-part)
+     - Morocco
+     - Monaco
+     - Republic of Moldova
+     - Madagascar
+     - Maldives
+     - Mexico
+     - Marshall Islands
+     - North Macedonia
+     - Mali
+     - Malta
+     - Myanmar
+     - Montenegro
+     - Mongolia
+     - Northern Mariana Islands
+     - Mozambique
+     - Mauritania
+     - Montserrat
+     - Martinique
+     - Mauritius
+     - Malawi
+     - Malaysia
+     - Mayotte
+     - Namibia
+     - New Caledonia
+     - Niger
+     - Norfolk Island
+     - Nigeria
+     - Nicaragua
+     - Niue
+     - Netherlands
+     - Norway
+     - Nepal
+     - Nauru
+     - New Zealand
+     - Oman
+     - Pakistan
+     - Panama
+     - Pitcairn
+     - Peru
+     - Philippines
+     - Palau
+     - Papua New Guinea
+     - Poland
+     - Puerto Rico
+     - North Korea
+     - Portugal
+     - Paraguay
+     - State of Palestine
+     - French Polynesia
+     - Qatar
+     - Réunion
+     - Romania
+     - Russia
+     - Rwanda
+     - Saudi Arabia
+     - Sudan
+     - Senegal
+     - Singapore
+     - South Georgia and the South Sandwich Islands
+     - Saint Helena Ascension and Tristan da Cunha
+     - Svalbard and Jan Mayen
+     - Solomon Islands
+     - Sierra Leone
+     - El Salvador
+     - San Marino
+     - Somalia
+     - Saint Pierre and Miquelon
+     - Serbia
+     - South Sudan
+     - Sao Tome and Principe
+     - Suriname
+     - Slovakia
+     - Slovenia
+     - Sweden
+     - Eswatini
+     - Sint Maarten (Dutch-part)
+     - Seychelles
+     - Syrian Arab Republic
+     - Turks and Caicos Islands
+     - Chad
+     - Togo
+     - Thailand
+     - Tajikistan
+     - Tokelau
+     - Turkmenistan
+     - Timor Leste
+     - Tonga
+     - Trinidad and Tobago
+     - Tunisia
+     - Turkey
+     - Tuvalu
+     - Taiwan
+     - United Republic of Tanzania
+     - Uganda
+     - Ukraine
+     - United States Minor Outlying Islands
+     - Uruguay
+     - United-States
+     - Uzbekistan
+     - Holy See (Vatican City State)
+     - Saint Vincent and the Grenadines
+     - Bolivarian Republic of Venezuela
+     - Virgin Islands British
+     - Virgin Islands U.S.
+     - VietNam
+     - Vanuatu
+     - Wallis and Futuna
+     - Samoa
+     - Yemen
+     - South Africa
+     - Zambia
+     - Zimbabwe
+   I agree to use this model for non-commercial use ONLY: checkbox
+ base_model:
+ - CohereLabs/c4ai-command-r-v01
+ ---
+
+ EXL3 quant with 3bpw MLP projection layer and 4bpw for all other layers, to fit in 24GB cards with 16K context. Original description:
+
+
+ Merged [jukofyork/command-r-35b-writer-v3-multiplicative-lora](https://huggingface.co/jukofyork/command-r-35b-writer-v3-multiplicative-lora) into [CohereLabs/c4ai-command-r-v01](https://huggingface.co/CohereLabs/c4ai-command-r-v01) using [jukofyork/merge-lora](https://huggingface.co/spaces/jukofyork/merge-lora).
+
+ Untested... But appears to have worked:
+
+ ```
+ ✓ Successfully merged and uploaded model!
+ Model URL: https://huggingface.co/jukofyork/command-r-35b-writer-v3
+ Merge mode: Multiplicative
+ Scale factor: 1
+ Processed 15 shards
+ Merged 72 layers with LoRA weights
+ ```
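The README's "3bpw MLP projection layer and 4bpw for all other layers" can be sanity-checked against the shard sizes in this commit. The sketch below is a rough estimate, not the quantizer's own accounting: it uses `hidden_size` and `intermediate_size` from `config.json`, takes the size of `model-00002-of-00010.safetensors` (which the index maps to layers 0-4), and assumes that exactly one of the three MLP projections per layer is stored at 3 bpw. That reading of the README is an assumption, and the helper name `layer_bytes` is hypothetical.

```python
# Dimensions copied from config.json in this commit.
HIDDEN_SIZE = 8192
INTERMEDIATE_SIZE = 22528

# The weight map places layers 0-4 in shard 00002; its size comes from the LFS pointer.
LAYERS_PER_SHARD = 5
SHARD_BYTES = 1941536008


def layer_bytes(attn_bpw: float, mlp_bpws: list[float]) -> float:
    """Bytes for the quantized linear weights of one decoder layer.

    Ignores the small per-tensor extras (scale vectors, layer norm, file header).
    """
    attn_params = 4 * HIDDEN_SIZE * HIDDEN_SIZE       # q, k, v, o projections
    mlp_params = HIDDEN_SIZE * INTERMEDIATE_SIZE      # each of gate, up, down
    bits = attn_params * attn_bpw + sum(mlp_params * b for b in mlp_bpws)
    return bits / 8


# ASSUMPTION: one MLP projection at 3 bpw, everything else at 4 bpw.
estimate = layer_bytes(4.0, [4.0, 4.0, 3.0])
observed = SHARD_BYTES / LAYERS_PER_SHARD

print(f"estimated bytes per layer: {estimate:,.0f}")
print(f"observed bytes per layer:  {observed:,.0f}")
print(f"relative gap: {abs(estimate - observed) / observed:.4%}")
```

Under that assumption the estimate lands well within 1% of the observed per-layer size, whereas a uniform 4 bpw layer or a layer with all three MLP projections at 3 bpw would miss by several percent. This is only a consistency check; it does not identify which projection carries the lower bit width.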
config.json ADDED
@@ -0,0 +1,41 @@
+ {
+     "_name_or_path": "/home/ahmet_cohere_com/HF_Final_weight_tie",
+     "architectures": [
+         "CohereForCausalLM"
+     ],
+     "attention_bias": false,
+     "attention_dropout": 0.0,
+     "bos_token_id": 5,
+     "eos_token_id": 255001,
+     "hidden_act": "silu",
+     "hidden_size": 8192,
+     "initializer_range": 0.02,
+     "intermediate_size": 22528,
+     "layer_norm_eps": 1e-05,
+     "logit_scale": 0.0625,
+     "max_position_embeddings": 8192,
+     "model_max_length": 131072,
+     "model_type": "cohere",
+     "num_attention_heads": 64,
+     "num_hidden_layers": 40,
+     "num_key_value_heads": 64,
+     "pad_token_id": 0,
+     "pretraining_tp": 1,
+     "rope_theta": 8000000.0,
+     "torch_dtype": "float16",
+     "transformers_version": "4.38.2",
+     "use_cache": true,
+     "vocab_size": 256000,
+     "tie_word_embeddings": true,
+     "quantization_config": {
+         "quant_method": "exl3",
+         "version": "0.0.6",
+         "bits": 4.0,
+         "head_bits": 6,
+         "calibration": {
+             "rows": 100,
+             "cols": 2048
+         },
+         "out_scales": "auto"
+     }
+ }
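Because `num_key_value_heads` equals `num_attention_heads` here (no grouped-query attention), the key/value cache is a large part of the memory budget at long context. A minimal sketch of that arithmetic follows, using only values from the `config.json` above. The function name `kv_cache_bytes` and the cache bit widths are illustrative assumptions; the commit does not say which cache precision was intended for the 16K-context target.

```python
# Values copied from config.json in this commit.
CONFIG = {
    "hidden_size": 8192,
    "num_attention_heads": 64,
    "num_hidden_layers": 40,
    "num_key_value_heads": 64,
}


def kv_cache_bytes(cfg: dict, context_tokens: int, bits_per_element: float) -> float:
    """Approximate K+V cache size for one sequence of `context_tokens` tokens."""
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    # One key vector and one value vector per layer, per KV head, per token.
    elements_per_token = 2 * cfg["num_hidden_layers"] * cfg["num_key_value_heads"] * head_dim
    return elements_per_token * context_tokens * bits_per_element / 8


GIB = 1024 ** 3
CONTEXT = 16 * 1024  # the 16K context mentioned in the README

# Illustrative cache precisions only (assumption, not taken from the commit).
for bits in (16, 8, 4):
    size = kv_cache_bytes(CONFIG, CONTEXT, bits)
    print(f"{bits:>2}-bit cache at {CONTEXT} tokens: {size / GIB:.2f} GiB")
```

The estimate scales linearly with both context length and cache bit width, so halving either one halves the cache footprint.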
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+     "_from_model_config": true,
+     "bos_token_id": 5,
+     "eos_token_id": 255001,
+     "pad_token_id": 0,
+     "transformers_version": "4.38.2"
+ }
model-00001-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e9c3d16715f6f5f15c8ec52f24a7e87bfbb337ee123ab68e881359f2fb43a2cf
+ size 4194304112
model-00002-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3b7c3ddbd4076e6081c20115f042d2d74f42599c8fe2f87331f4b133dcc25250
+ size 1941536008
model-00003-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:83a83e3d456ffece81f5b84520e34f43d0624151d7b4164fa602c0acbac04c6c
+ size 1941536008
model-00004-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0e04e2c3c305cea41198ee9f7b741cc07b9b22d9b17ee8209954d7f844dca25f
+ size 1941536120
model-00005-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:309e8057cfcee12c55e9f18ed286d29d804cbf2e9ae6ff33e3ac6c350bb12d8e
+ size 1941536120
model-00006-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1c00016d7be8862c8c59b7c8b8e6387530eab9ef481771aedb3b7f37d058d14f
+ size 1941536120
model-00007-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3097d293d4bdb07fdeeee74a6a22e1b62af359415e33f99f2703de1fdead94e7
+ size 1941536120
model-00008-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:54e37498d697116b49a91f8d4d38f7fafb8a1e334b3b85f823571188e122f2f4
+ size 1941536120
model-00009-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0f4541257a168e88e45b248945a7bd8569311bd8bb9f473f47e067a49751feec
+ size 1941552584
model-00010-of-00010.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8a641d0c9c01e20497d6b45486ce996ebf1c1cb47f57061945a2ef8cb9b2a413
+ size 1573392632
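Each shard above is committed as a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object id, and the byte size. As a rough sketch, the pointers can be parsed and their sizes cross-checked against the `total_size` recorded in `model.safetensors.index.json` in this same commit. The helper name `parse_lfs_pointer` is hypothetical; the sizes are copied from the ten pointers above.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer for model-00001-of-00010.safetensors, copied from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:e9c3d16715f6f5f15c8ec52f24a7e87bfbb337ee123ab68e881359f2fb43a2cf
size 4194304112
"""

# `size` values of shards 00001 through 00010, in order.
SHARD_SIZES = [
    4194304112, 1941536008, 1941536008, 1941536120, 1941536120,
    1941536120, 1941536120, 1941536120, 1941552584, 1573392632,
]

# metadata.total_size from model.safetensors.index.json in this commit.
INDEX_TOTAL_SIZE = 21299908608

first = parse_lfs_pointer(POINTER)
on_disk = sum(SHARD_SIZES)
surplus = on_disk - INDEX_TOTAL_SIZE

print(first["oid"], int(first["size"]))
print(f"sum of shard files: {on_disk:,} bytes")
print(f"index total_size:   {INDEX_TOTAL_SIZE:,} bytes")
print(f"difference:         {surplus:,} bytes")
```

The shard files add up to slightly more than `total_size`. A small surplus like this is consistent with `total_size` counting tensor data only while each file also carries its own safetensors header, though that explanation is an inference rather than something the commit states.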
model.safetensors.index.json ADDED
@@ -0,0 +1,892 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "metadata": {
3
+ "total_size": 21299908608
4
+ },
5
+ "weight_map": {
6
+ "model.embed_tokens.weight": "model-00001-of-00010.safetensors",
7
+ "model.layers.0.self_attn.k_proj.suh": "model-00002-of-00010.safetensors",
8
+ "model.layers.0.self_attn.k_proj.svh": "model-00002-of-00010.safetensors",
9
+ "model.layers.0.self_attn.k_proj.trellis": "model-00002-of-00010.safetensors",
10
+ "model.layers.0.self_attn.o_proj.suh": "model-00002-of-00010.safetensors",
11
+ "model.layers.0.self_attn.o_proj.svh": "model-00002-of-00010.safetensors",
12
+ "model.layers.0.self_attn.o_proj.trellis": "model-00002-of-00010.safetensors",
13
+ "model.layers.0.self_attn.q_proj.suh": "model-00002-of-00010.safetensors",
14
+ "model.layers.0.self_attn.q_proj.svh": "model-00002-of-00010.safetensors",
15
+ "model.layers.0.self_attn.q_proj.trellis": "model-00002-of-00010.safetensors",
16
+ "model.layers.0.self_attn.v_proj.suh": "model-00002-of-00010.safetensors",
17
+ "model.layers.0.self_attn.v_proj.svh": "model-00002-of-00010.safetensors",
18
+ "model.layers.0.self_attn.v_proj.trellis": "model-00002-of-00010.safetensors",
19
+ "model.layers.0.mlp.down_proj.suh": "model-00002-of-00010.safetensors",
20
+ "model.layers.0.mlp.down_proj.svh": "model-00002-of-00010.safetensors",
21
+ "model.layers.0.mlp.down_proj.trellis": "model-00002-of-00010.safetensors",
22
+ "model.layers.0.mlp.gate_proj.suh": "model-00002-of-00010.safetensors",
23
+ "model.layers.0.mlp.gate_proj.svh": "model-00002-of-00010.safetensors",
24
+ "model.layers.0.mlp.gate_proj.trellis": "model-00002-of-00010.safetensors",
25
+ "model.layers.0.mlp.up_proj.suh": "model-00002-of-00010.safetensors",
26
+ "model.layers.0.mlp.up_proj.svh": "model-00002-of-00010.safetensors",
27
+ "model.layers.0.mlp.up_proj.trellis": "model-00002-of-00010.safetensors",
28
+ "model.layers.0.input_layernorm.weight": "model-00002-of-00010.safetensors",
29
+ "model.layers.1.self_attn.k_proj.suh": "model-00002-of-00010.safetensors",
30
+ "model.layers.1.self_attn.k_proj.svh": "model-00002-of-00010.safetensors",
31
+ "model.layers.1.self_attn.k_proj.trellis": "model-00002-of-00010.safetensors",
32
+ "model.layers.1.self_attn.o_proj.suh": "model-00002-of-00010.safetensors",
33
+ "model.layers.1.self_attn.o_proj.svh": "model-00002-of-00010.safetensors",
34
+ "model.layers.1.self_attn.o_proj.trellis": "model-00002-of-00010.safetensors",
35
+ "model.layers.1.self_attn.q_proj.suh": "model-00002-of-00010.safetensors",
36
+ "model.layers.1.self_attn.q_proj.svh": "model-00002-of-00010.safetensors",
37
+ "model.layers.1.self_attn.q_proj.trellis": "model-00002-of-00010.safetensors",
38
+ "model.layers.1.self_attn.v_proj.suh": "model-00002-of-00010.safetensors",
39
+ "model.layers.1.self_attn.v_proj.svh": "model-00002-of-00010.safetensors",
40
+ "model.layers.1.self_attn.v_proj.trellis": "model-00002-of-00010.safetensors",
41
+ "model.layers.1.mlp.down_proj.suh": "model-00002-of-00010.safetensors",
42
+ "model.layers.1.mlp.down_proj.svh": "model-00002-of-00010.safetensors",
43
+ "model.layers.1.mlp.down_proj.trellis": "model-00002-of-00010.safetensors",
44
+ "model.layers.1.mlp.gate_proj.suh": "model-00002-of-00010.safetensors",
45
+ "model.layers.1.mlp.gate_proj.svh": "model-00002-of-00010.safetensors",
46
+ "model.layers.1.mlp.gate_proj.trellis": "model-00002-of-00010.safetensors",
47
+ "model.layers.1.mlp.up_proj.suh": "model-00002-of-00010.safetensors",
48
+ "model.layers.1.mlp.up_proj.svh": "model-00002-of-00010.safetensors",
49
+ "model.layers.1.mlp.up_proj.trellis": "model-00002-of-00010.safetensors",
50
+ "model.layers.1.input_layernorm.weight": "model-00002-of-00010.safetensors",
51
+ "model.layers.2.self_attn.k_proj.suh": "model-00002-of-00010.safetensors",
52
+ "model.layers.2.self_attn.k_proj.svh": "model-00002-of-00010.safetensors",
53
+ "model.layers.2.self_attn.k_proj.trellis": "model-00002-of-00010.safetensors",
54
+ "model.layers.2.self_attn.o_proj.suh": "model-00002-of-00010.safetensors",
55
+ "model.layers.2.self_attn.o_proj.svh": "model-00002-of-00010.safetensors",
56
+ "model.layers.2.self_attn.o_proj.trellis": "model-00002-of-00010.safetensors",
57
+ "model.layers.2.self_attn.q_proj.suh": "model-00002-of-00010.safetensors",
58
+ "model.layers.2.self_attn.q_proj.svh": "model-00002-of-00010.safetensors",
59
+ "model.layers.2.self_attn.q_proj.trellis": "model-00002-of-00010.safetensors",
60
+ "model.layers.2.self_attn.v_proj.suh": "model-00002-of-00010.safetensors",
61
+ "model.layers.2.self_attn.v_proj.svh": "model-00002-of-00010.safetensors",
62
+ "model.layers.2.self_attn.v_proj.trellis": "model-00002-of-00010.safetensors",
63
+ "model.layers.2.mlp.down_proj.suh": "model-00002-of-00010.safetensors",
64
+ "model.layers.2.mlp.down_proj.svh": "model-00002-of-00010.safetensors",
65
+ "model.layers.2.mlp.down_proj.trellis": "model-00002-of-00010.safetensors",
66
+ "model.layers.2.mlp.gate_proj.suh": "model-00002-of-00010.safetensors",
67
+ "model.layers.2.mlp.gate_proj.svh": "model-00002-of-00010.safetensors",
68
+ "model.layers.2.mlp.gate_proj.trellis": "model-00002-of-00010.safetensors",
69
+ "model.layers.2.mlp.up_proj.suh": "model-00002-of-00010.safetensors",
70
+ "model.layers.2.mlp.up_proj.svh": "model-00002-of-00010.safetensors",
71
+ "model.layers.2.mlp.up_proj.trellis": "model-00002-of-00010.safetensors",
72
+ "model.layers.2.input_layernorm.weight": "model-00002-of-00010.safetensors",
73
+ "model.layers.3.self_attn.k_proj.suh": "model-00002-of-00010.safetensors",
74
+ "model.layers.3.self_attn.k_proj.svh": "model-00002-of-00010.safetensors",
75
+ "model.layers.3.self_attn.k_proj.trellis": "model-00002-of-00010.safetensors",
76
+ "model.layers.3.self_attn.o_proj.suh": "model-00002-of-00010.safetensors",
77
+ "model.layers.3.self_attn.o_proj.svh": "model-00002-of-00010.safetensors",
78
+ "model.layers.3.self_attn.o_proj.trellis": "model-00002-of-00010.safetensors",
79
+ "model.layers.3.self_attn.q_proj.suh": "model-00002-of-00010.safetensors",
80
+ "model.layers.3.self_attn.q_proj.svh": "model-00002-of-00010.safetensors",
81
+ "model.layers.3.self_attn.q_proj.trellis": "model-00002-of-00010.safetensors",
82
+ "model.layers.3.self_attn.v_proj.suh": "model-00002-of-00010.safetensors",
83
+ "model.layers.3.self_attn.v_proj.svh": "model-00002-of-00010.safetensors",
84
+ "model.layers.3.self_attn.v_proj.trellis": "model-00002-of-00010.safetensors",
85
+ "model.layers.3.mlp.down_proj.suh": "model-00002-of-00010.safetensors",
86
+ "model.layers.3.mlp.down_proj.svh": "model-00002-of-00010.safetensors",
87
+ "model.layers.3.mlp.down_proj.trellis": "model-00002-of-00010.safetensors",
88
+ "model.layers.3.mlp.gate_proj.suh": "model-00002-of-00010.safetensors",
89
+ "model.layers.3.mlp.gate_proj.svh": "model-00002-of-00010.safetensors",
90
+ "model.layers.3.mlp.gate_proj.trellis": "model-00002-of-00010.safetensors",
91
+ "model.layers.3.mlp.up_proj.suh": "model-00002-of-00010.safetensors",
92
+ "model.layers.3.mlp.up_proj.svh": "model-00002-of-00010.safetensors",
93
+ "model.layers.3.mlp.up_proj.trellis": "model-00002-of-00010.safetensors",
94
+ "model.layers.3.input_layernorm.weight": "model-00002-of-00010.safetensors",
95
+ "model.layers.4.self_attn.k_proj.suh": "model-00002-of-00010.safetensors",
96
+ "model.layers.4.self_attn.k_proj.svh": "model-00002-of-00010.safetensors",
97
+ "model.layers.4.self_attn.k_proj.trellis": "model-00002-of-00010.safetensors",
98
+ "model.layers.4.self_attn.o_proj.suh": "model-00002-of-00010.safetensors",
99
+ "model.layers.4.self_attn.o_proj.svh": "model-00002-of-00010.safetensors",
100
+ "model.layers.4.self_attn.o_proj.trellis": "model-00002-of-00010.safetensors",
101
+ "model.layers.4.self_attn.q_proj.suh": "model-00002-of-00010.safetensors",
102
+ "model.layers.4.self_attn.q_proj.svh": "model-00002-of-00010.safetensors",
103
+ "model.layers.4.self_attn.q_proj.trellis": "model-00002-of-00010.safetensors",
104
+ "model.layers.4.self_attn.v_proj.suh": "model-00002-of-00010.safetensors",
105
+ "model.layers.4.self_attn.v_proj.svh": "model-00002-of-00010.safetensors",
106
+ "model.layers.4.self_attn.v_proj.trellis": "model-00002-of-00010.safetensors",
107
+ "model.layers.4.mlp.down_proj.suh": "model-00002-of-00010.safetensors",
108
+ "model.layers.4.mlp.down_proj.svh": "model-00002-of-00010.safetensors",
109
+ "model.layers.4.mlp.down_proj.trellis": "model-00002-of-00010.safetensors",
110
+ "model.layers.4.mlp.gate_proj.suh": "model-00002-of-00010.safetensors",
111
+ "model.layers.4.mlp.gate_proj.svh": "model-00002-of-00010.safetensors",
112
+ "model.layers.4.mlp.gate_proj.trellis": "model-00002-of-00010.safetensors",
113
+ "model.layers.4.mlp.up_proj.suh": "model-00002-of-00010.safetensors",
114
+ "model.layers.4.mlp.up_proj.svh": "model-00002-of-00010.safetensors",
115
+ "model.layers.4.mlp.up_proj.trellis": "model-00002-of-00010.safetensors",
116
+ "model.layers.4.input_layernorm.weight": "model-00002-of-00010.safetensors",
117
+ "model.layers.5.self_attn.k_proj.suh": "model-00003-of-00010.safetensors",
118
+ "model.layers.5.self_attn.k_proj.svh": "model-00003-of-00010.safetensors",
119
+ "model.layers.5.self_attn.k_proj.trellis": "model-00003-of-00010.safetensors",
120
+ "model.layers.5.self_attn.o_proj.suh": "model-00003-of-00010.safetensors",
121
+ "model.layers.5.self_attn.o_proj.svh": "model-00003-of-00010.safetensors",
122
+ "model.layers.5.self_attn.o_proj.trellis": "model-00003-of-00010.safetensors",
123
+ "model.layers.5.self_attn.q_proj.suh": "model-00003-of-00010.safetensors",
124
+ "model.layers.5.self_attn.q_proj.svh": "model-00003-of-00010.safetensors",
125
+ "model.layers.5.self_attn.q_proj.trellis": "model-00003-of-00010.safetensors",
126
+ "model.layers.5.self_attn.v_proj.suh": "model-00003-of-00010.safetensors",
127
+ "model.layers.5.self_attn.v_proj.svh": "model-00003-of-00010.safetensors",
128
+ "model.layers.5.self_attn.v_proj.trellis": "model-00003-of-00010.safetensors",
129
+ "model.layers.5.mlp.down_proj.suh": "model-00003-of-00010.safetensors",
130
+ "model.layers.5.mlp.down_proj.svh": "model-00003-of-00010.safetensors",
131
+ "model.layers.5.mlp.down_proj.trellis": "model-00003-of-00010.safetensors",
132
+ "model.layers.5.mlp.gate_proj.suh": "model-00003-of-00010.safetensors",
133
+ "model.layers.5.mlp.gate_proj.svh": "model-00003-of-00010.safetensors",
134
+ "model.layers.5.mlp.gate_proj.trellis": "model-00003-of-00010.safetensors",
135
+ "model.layers.5.mlp.up_proj.suh": "model-00003-of-00010.safetensors",
136
+ "model.layers.5.mlp.up_proj.svh": "model-00003-of-00010.safetensors",
137
+ "model.layers.5.mlp.up_proj.trellis": "model-00003-of-00010.safetensors",
138
+ "model.layers.5.input_layernorm.weight": "model-00003-of-00010.safetensors",
139
+ "model.layers.6.self_attn.k_proj.suh": "model-00003-of-00010.safetensors",
140
+ "model.layers.6.self_attn.k_proj.svh": "model-00003-of-00010.safetensors",
141
+ "model.layers.6.self_attn.k_proj.trellis": "model-00003-of-00010.safetensors",
142
+ "model.layers.6.self_attn.o_proj.suh": "model-00003-of-00010.safetensors",
143
+ "model.layers.6.self_attn.o_proj.svh": "model-00003-of-00010.safetensors",
144
+ "model.layers.6.self_attn.o_proj.trellis": "model-00003-of-00010.safetensors",
145
+ "model.layers.6.self_attn.q_proj.suh": "model-00003-of-00010.safetensors",
146
+ "model.layers.6.self_attn.q_proj.svh": "model-00003-of-00010.safetensors",
147
+ "model.layers.6.self_attn.q_proj.trellis": "model-00003-of-00010.safetensors",
148
+ "model.layers.6.self_attn.v_proj.suh": "model-00003-of-00010.safetensors",
149
+ "model.layers.6.self_attn.v_proj.svh": "model-00003-of-00010.safetensors",
150
+ "model.layers.6.self_attn.v_proj.trellis": "model-00003-of-00010.safetensors",
151
+ "model.layers.6.mlp.down_proj.suh": "model-00003-of-00010.safetensors",
152
+ "model.layers.6.mlp.down_proj.svh": "model-00003-of-00010.safetensors",
153
+ "model.layers.6.mlp.down_proj.trellis": "model-00003-of-00010.safetensors",
154
+ "model.layers.6.mlp.gate_proj.suh": "model-00003-of-00010.safetensors",
155
+ "model.layers.6.mlp.gate_proj.svh": "model-00003-of-00010.safetensors",
156
+ "model.layers.6.mlp.gate_proj.trellis": "model-00003-of-00010.safetensors",
157
+ "model.layers.6.mlp.up_proj.suh": "model-00003-of-00010.safetensors",
158
+ "model.layers.6.mlp.up_proj.svh": "model-00003-of-00010.safetensors",
159
+ "model.layers.6.mlp.up_proj.trellis": "model-00003-of-00010.safetensors",
160
+ "model.layers.6.input_layernorm.weight": "model-00003-of-00010.safetensors",
161
+ "model.layers.7.self_attn.k_proj.suh": "model-00003-of-00010.safetensors",
162
+ "model.layers.7.self_attn.k_proj.svh": "model-00003-of-00010.safetensors",
163
+ "model.layers.7.self_attn.k_proj.trellis": "model-00003-of-00010.safetensors",
164
+ "model.layers.7.self_attn.o_proj.suh": "model-00003-of-00010.safetensors",
165
+ "model.layers.7.self_attn.o_proj.svh": "model-00003-of-00010.safetensors",
166
+ "model.layers.7.self_attn.o_proj.trellis": "model-00003-of-00010.safetensors",
167
+ "model.layers.7.self_attn.q_proj.suh": "model-00003-of-00010.safetensors",
168
+ "model.layers.7.self_attn.q_proj.svh": "model-00003-of-00010.safetensors",
169
+ "model.layers.7.self_attn.q_proj.trellis": "model-00003-of-00010.safetensors",
170
+ "model.layers.7.self_attn.v_proj.suh": "model-00003-of-00010.safetensors",
171
+ "model.layers.7.self_attn.v_proj.svh": "model-00003-of-00010.safetensors",
172
+ "model.layers.7.self_attn.v_proj.trellis": "model-00003-of-00010.safetensors",
173
+ "model.layers.7.mlp.down_proj.suh": "model-00003-of-00010.safetensors",
174
+ "model.layers.7.mlp.down_proj.svh": "model-00003-of-00010.safetensors",
175
+ "model.layers.7.mlp.down_proj.trellis": "model-00003-of-00010.safetensors",
176
+ "model.layers.7.mlp.gate_proj.suh": "model-00003-of-00010.safetensors",
177
+ "model.layers.7.mlp.gate_proj.svh": "model-00003-of-00010.safetensors",
178
+ "model.layers.7.mlp.gate_proj.trellis": "model-00003-of-00010.safetensors",
179
+ "model.layers.7.mlp.up_proj.suh": "model-00003-of-00010.safetensors",
180
+ "model.layers.7.mlp.up_proj.svh": "model-00003-of-00010.safetensors",
181
+ "model.layers.7.mlp.up_proj.trellis": "model-00003-of-00010.safetensors",
182
+ "model.layers.7.input_layernorm.weight": "model-00003-of-00010.safetensors",
183
+ "model.layers.8.self_attn.k_proj.suh": "model-00003-of-00010.safetensors",
184
+ "model.layers.8.self_attn.k_proj.svh": "model-00003-of-00010.safetensors",
185
+ "model.layers.8.self_attn.k_proj.trellis": "model-00003-of-00010.safetensors",
186
+ "model.layers.8.self_attn.o_proj.suh": "model-00003-of-00010.safetensors",
187
+ "model.layers.8.self_attn.o_proj.svh": "model-00003-of-00010.safetensors",
188
+ "model.layers.8.self_attn.o_proj.trellis": "model-00003-of-00010.safetensors",
189
+ "model.layers.8.self_attn.q_proj.suh": "model-00003-of-00010.safetensors",
190
+ "model.layers.8.self_attn.q_proj.svh": "model-00003-of-00010.safetensors",
191
+ "model.layers.8.self_attn.q_proj.trellis": "model-00003-of-00010.safetensors",
192
+ "model.layers.8.self_attn.v_proj.suh": "model-00003-of-00010.safetensors",
193
+ "model.layers.8.self_attn.v_proj.svh": "model-00003-of-00010.safetensors",
194
+ "model.layers.8.self_attn.v_proj.trellis": "model-00003-of-00010.safetensors",
195
+ "model.layers.8.mlp.down_proj.suh": "model-00003-of-00010.safetensors",
196
+ "model.layers.8.mlp.down_proj.svh": "model-00003-of-00010.safetensors",
197
+ "model.layers.8.mlp.down_proj.trellis": "model-00003-of-00010.safetensors",
198
+ "model.layers.8.mlp.gate_proj.suh": "model-00003-of-00010.safetensors",
199
+ "model.layers.8.mlp.gate_proj.svh": "model-00003-of-00010.safetensors",
200
+ "model.layers.8.mlp.gate_proj.trellis": "model-00003-of-00010.safetensors",
201
+ "model.layers.8.mlp.up_proj.suh": "model-00003-of-00010.safetensors",
202
+ "model.layers.8.mlp.up_proj.svh": "model-00003-of-00010.safetensors",
203
+ "model.layers.8.mlp.up_proj.trellis": "model-00003-of-00010.safetensors",
204
+ "model.layers.8.input_layernorm.weight": "model-00003-of-00010.safetensors",
205
+ "model.layers.9.self_attn.k_proj.suh": "model-00003-of-00010.safetensors",
206
+ "model.layers.9.self_attn.k_proj.svh": "model-00003-of-00010.safetensors",
207
+ "model.layers.9.self_attn.k_proj.trellis": "model-00003-of-00010.safetensors",
208
+ "model.layers.9.self_attn.o_proj.suh": "model-00003-of-00010.safetensors",
209
+ "model.layers.9.self_attn.o_proj.svh": "model-00003-of-00010.safetensors",
210
+ "model.layers.9.self_attn.o_proj.trellis": "model-00003-of-00010.safetensors",
211
+ "model.layers.9.self_attn.q_proj.suh": "model-00003-of-00010.safetensors",
212
+ "model.layers.9.self_attn.q_proj.svh": "model-00003-of-00010.safetensors",
213
+ "model.layers.9.self_attn.q_proj.trellis": "model-00003-of-00010.safetensors",
214
+ "model.layers.9.self_attn.v_proj.suh": "model-00003-of-00010.safetensors",
215
+ "model.layers.9.self_attn.v_proj.svh": "model-00003-of-00010.safetensors",
216
+ "model.layers.9.self_attn.v_proj.trellis": "model-00003-of-00010.safetensors",
217
+ "model.layers.9.mlp.down_proj.suh": "model-00003-of-00010.safetensors",
218
+ "model.layers.9.mlp.down_proj.svh": "model-00003-of-00010.safetensors",
219
+ "model.layers.9.mlp.down_proj.trellis": "model-00003-of-00010.safetensors",
220
+ "model.layers.9.mlp.gate_proj.suh": "model-00003-of-00010.safetensors",
221
+ "model.layers.9.mlp.gate_proj.svh": "model-00003-of-00010.safetensors",
222
+ "model.layers.9.mlp.gate_proj.trellis": "model-00003-of-00010.safetensors",
223
+ "model.layers.9.mlp.up_proj.suh": "model-00003-of-00010.safetensors",
224
+ "model.layers.9.mlp.up_proj.svh": "model-00003-of-00010.safetensors",
225
+ "model.layers.9.mlp.up_proj.trellis": "model-00003-of-00010.safetensors",
226
+ "model.layers.9.input_layernorm.weight": "model-00003-of-00010.safetensors",
227
+ "model.layers.10.self_attn.k_proj.suh": "model-00004-of-00010.safetensors",
228
+ "model.layers.10.self_attn.k_proj.svh": "model-00004-of-00010.safetensors",
229
+ "model.layers.10.self_attn.k_proj.trellis": "model-00004-of-00010.safetensors",
230
+ "model.layers.10.self_attn.o_proj.suh": "model-00004-of-00010.safetensors",
231
+ "model.layers.10.self_attn.o_proj.svh": "model-00004-of-00010.safetensors",
232
+ "model.layers.10.self_attn.o_proj.trellis": "model-00004-of-00010.safetensors",
233
+ "model.layers.10.self_attn.q_proj.suh": "model-00004-of-00010.safetensors",
234
+ "model.layers.10.self_attn.q_proj.svh": "model-00004-of-00010.safetensors",
235
+ "model.layers.10.self_attn.q_proj.trellis": "model-00004-of-00010.safetensors",
236
+ "model.layers.10.self_attn.v_proj.suh": "model-00004-of-00010.safetensors",
237
+ "model.layers.10.self_attn.v_proj.svh": "model-00004-of-00010.safetensors",
238
+ "model.layers.10.self_attn.v_proj.trellis": "model-00004-of-00010.safetensors",
239
+ "model.layers.10.mlp.down_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.down_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.down_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.gate_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.gate_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.gate_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.up_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.up_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.10.mlp.up_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.10.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.k_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.k_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.k_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.o_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.o_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.o_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.q_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.q_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.q_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.v_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.v_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.self_attn.v_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.down_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.down_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.down_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.gate_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.gate_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.gate_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.up_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.up_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.11.mlp.up_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.11.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.k_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.k_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.k_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.o_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.o_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.o_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.q_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.q_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.q_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.v_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.v_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.self_attn.v_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.down_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.down_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.down_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.gate_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.gate_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.gate_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.up_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.up_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.12.mlp.up_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.12.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.k_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.k_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.k_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.o_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.o_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.o_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.q_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.q_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.q_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.v_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.v_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.self_attn.v_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.down_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.down_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.down_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.gate_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.gate_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.gate_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.up_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.up_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.13.mlp.up_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.13.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.k_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.k_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.k_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.o_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.o_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.o_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.q_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.q_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.q_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.v_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.v_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.self_attn.v_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.down_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.down_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.down_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.gate_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.gate_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.gate_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.up_proj.suh": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.up_proj.svh": "model-00004-of-00010.safetensors",
+ "model.layers.14.mlp.up_proj.trellis": "model-00004-of-00010.safetensors",
+ "model.layers.14.input_layernorm.weight": "model-00004-of-00010.safetensors",
+ "model.layers.15.self_attn.k_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.k_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.k_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.o_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.o_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.o_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.q_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.q_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.q_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.v_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.v_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.self_attn.v_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.down_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.down_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.down_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.gate_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.gate_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.gate_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.up_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.up_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.15.mlp.up_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.15.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.k_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.k_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.k_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.o_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.o_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.o_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.q_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.q_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.q_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.v_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.v_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.self_attn.v_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.down_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.down_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.down_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.gate_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.gate_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.gate_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.up_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.up_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.16.mlp.up_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.16.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.k_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.k_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.k_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.o_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.o_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.o_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.q_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.q_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.q_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.v_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.v_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.self_attn.v_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.down_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.down_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.down_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.gate_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.gate_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.gate_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.up_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.up_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.17.mlp.up_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.17.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.k_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.k_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.k_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.o_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.o_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.o_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.q_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.q_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.q_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.v_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.v_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.self_attn.v_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.down_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.down_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.down_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.gate_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.gate_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.gate_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.up_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.up_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.18.mlp.up_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.18.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.k_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.k_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.k_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.o_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.o_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.o_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.q_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.q_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.q_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.v_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.v_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.self_attn.v_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.down_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.down_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.down_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.gate_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.gate_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.gate_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.up_proj.suh": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.up_proj.svh": "model-00005-of-00010.safetensors",
+ "model.layers.19.mlp.up_proj.trellis": "model-00005-of-00010.safetensors",
+ "model.layers.19.input_layernorm.weight": "model-00005-of-00010.safetensors",
+ "model.layers.20.self_attn.k_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.k_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.k_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.o_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.o_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.o_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.q_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.q_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.q_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.v_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.v_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.self_attn.v_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.down_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.down_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.down_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.gate_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.gate_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.gate_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.up_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.up_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.20.mlp.up_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.20.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.k_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.k_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.k_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.o_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.o_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.o_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.q_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.q_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.q_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.v_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.v_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.self_attn.v_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.down_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.down_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.down_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.gate_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.gate_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.gate_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.up_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.up_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.21.mlp.up_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.21.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.k_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.k_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.k_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.o_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.o_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.o_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.q_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.q_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.q_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.v_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.v_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.self_attn.v_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.down_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.down_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.down_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.gate_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.gate_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.gate_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.up_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.up_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.22.mlp.up_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.22.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.k_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.k_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.k_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.o_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.o_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.o_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.q_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.q_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.q_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.v_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.v_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.self_attn.v_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.down_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.down_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.down_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.gate_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.gate_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.gate_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.up_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.up_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.23.mlp.up_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.23.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.k_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.k_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.k_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.o_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.o_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.o_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.q_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.q_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.q_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.v_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.v_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.self_attn.v_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.down_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.down_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.down_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.gate_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.gate_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.gate_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.up_proj.suh": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.up_proj.svh": "model-00006-of-00010.safetensors",
+ "model.layers.24.mlp.up_proj.trellis": "model-00006-of-00010.safetensors",
+ "model.layers.24.input_layernorm.weight": "model-00006-of-00010.safetensors",
+ "model.layers.25.self_attn.k_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.k_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.k_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.o_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.o_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.o_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.q_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.q_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.q_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.v_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.v_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.self_attn.v_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.down_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.down_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.down_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.gate_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.gate_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.gate_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.up_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.up_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.25.mlp.up_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.25.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.k_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.k_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.k_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.o_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.o_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.o_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.q_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.q_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.q_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.v_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.v_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.self_attn.v_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.down_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.down_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.down_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.gate_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.gate_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.gate_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.up_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.up_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.26.mlp.up_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.26.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.k_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.k_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.k_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.o_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.o_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.o_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.q_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.q_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.q_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.v_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.v_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.self_attn.v_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.down_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.down_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.down_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.gate_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.gate_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.gate_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.up_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.up_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.27.mlp.up_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.27.input_layernorm.weight": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.k_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.k_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.k_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.o_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.o_proj.svh": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.o_proj.trellis": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.q_proj.suh": "model-00007-of-00010.safetensors",
+ "model.layers.28.self_attn.q_proj.svh": "model-00007-of-00010.safetensors",
631
+ "model.layers.28.self_attn.q_proj.trellis": "model-00007-of-00010.safetensors",
632
+ "model.layers.28.self_attn.v_proj.suh": "model-00007-of-00010.safetensors",
633
+ "model.layers.28.self_attn.v_proj.svh": "model-00007-of-00010.safetensors",
634
+ "model.layers.28.self_attn.v_proj.trellis": "model-00007-of-00010.safetensors",
635
+ "model.layers.28.mlp.down_proj.suh": "model-00007-of-00010.safetensors",
636
+ "model.layers.28.mlp.down_proj.svh": "model-00007-of-00010.safetensors",
637
+ "model.layers.28.mlp.down_proj.trellis": "model-00007-of-00010.safetensors",
638
+ "model.layers.28.mlp.gate_proj.suh": "model-00007-of-00010.safetensors",
639
+ "model.layers.28.mlp.gate_proj.svh": "model-00007-of-00010.safetensors",
640
+ "model.layers.28.mlp.gate_proj.trellis": "model-00007-of-00010.safetensors",
641
+ "model.layers.28.mlp.up_proj.suh": "model-00007-of-00010.safetensors",
642
+ "model.layers.28.mlp.up_proj.svh": "model-00007-of-00010.safetensors",
643
+ "model.layers.28.mlp.up_proj.trellis": "model-00007-of-00010.safetensors",
644
+ "model.layers.28.input_layernorm.weight": "model-00007-of-00010.safetensors",
645
+ "model.layers.29.self_attn.k_proj.suh": "model-00007-of-00010.safetensors",
646
+ "model.layers.29.self_attn.k_proj.svh": "model-00007-of-00010.safetensors",
647
+ "model.layers.29.self_attn.k_proj.trellis": "model-00007-of-00010.safetensors",
648
+ "model.layers.29.self_attn.o_proj.suh": "model-00007-of-00010.safetensors",
649
+ "model.layers.29.self_attn.o_proj.svh": "model-00007-of-00010.safetensors",
650
+ "model.layers.29.self_attn.o_proj.trellis": "model-00007-of-00010.safetensors",
651
+ "model.layers.29.self_attn.q_proj.suh": "model-00007-of-00010.safetensors",
652
+ "model.layers.29.self_attn.q_proj.svh": "model-00007-of-00010.safetensors",
653
+ "model.layers.29.self_attn.q_proj.trellis": "model-00007-of-00010.safetensors",
654
+ "model.layers.29.self_attn.v_proj.suh": "model-00007-of-00010.safetensors",
655
+ "model.layers.29.self_attn.v_proj.svh": "model-00007-of-00010.safetensors",
656
+ "model.layers.29.self_attn.v_proj.trellis": "model-00007-of-00010.safetensors",
657
+ "model.layers.29.mlp.down_proj.suh": "model-00007-of-00010.safetensors",
658
+ "model.layers.29.mlp.down_proj.svh": "model-00007-of-00010.safetensors",
659
+ "model.layers.29.mlp.down_proj.trellis": "model-00007-of-00010.safetensors",
660
+ "model.layers.29.mlp.gate_proj.suh": "model-00007-of-00010.safetensors",
661
+ "model.layers.29.mlp.gate_proj.svh": "model-00007-of-00010.safetensors",
662
+ "model.layers.29.mlp.gate_proj.trellis": "model-00007-of-00010.safetensors",
663
+ "model.layers.29.mlp.up_proj.suh": "model-00007-of-00010.safetensors",
664
+ "model.layers.29.mlp.up_proj.svh": "model-00007-of-00010.safetensors",
665
+ "model.layers.29.mlp.up_proj.trellis": "model-00007-of-00010.safetensors",
666
+ "model.layers.29.input_layernorm.weight": "model-00007-of-00010.safetensors",
667
+ "model.layers.30.self_attn.k_proj.suh": "model-00008-of-00010.safetensors",
668
+ "model.layers.30.self_attn.k_proj.svh": "model-00008-of-00010.safetensors",
669
+ "model.layers.30.self_attn.k_proj.trellis": "model-00008-of-00010.safetensors",
670
+ "model.layers.30.self_attn.o_proj.suh": "model-00008-of-00010.safetensors",
671
+ "model.layers.30.self_attn.o_proj.svh": "model-00008-of-00010.safetensors",
672
+ "model.layers.30.self_attn.o_proj.trellis": "model-00008-of-00010.safetensors",
673
+ "model.layers.30.self_attn.q_proj.suh": "model-00008-of-00010.safetensors",
674
+ "model.layers.30.self_attn.q_proj.svh": "model-00008-of-00010.safetensors",
675
+ "model.layers.30.self_attn.q_proj.trellis": "model-00008-of-00010.safetensors",
676
+ "model.layers.30.self_attn.v_proj.suh": "model-00008-of-00010.safetensors",
677
+ "model.layers.30.self_attn.v_proj.svh": "model-00008-of-00010.safetensors",
678
+ "model.layers.30.self_attn.v_proj.trellis": "model-00008-of-00010.safetensors",
679
+ "model.layers.30.mlp.down_proj.suh": "model-00008-of-00010.safetensors",
680
+ "model.layers.30.mlp.down_proj.svh": "model-00008-of-00010.safetensors",
681
+ "model.layers.30.mlp.down_proj.trellis": "model-00008-of-00010.safetensors",
682
+ "model.layers.30.mlp.gate_proj.suh": "model-00008-of-00010.safetensors",
683
+ "model.layers.30.mlp.gate_proj.svh": "model-00008-of-00010.safetensors",
684
+ "model.layers.30.mlp.gate_proj.trellis": "model-00008-of-00010.safetensors",
685
+ "model.layers.30.mlp.up_proj.suh": "model-00008-of-00010.safetensors",
686
+ "model.layers.30.mlp.up_proj.svh": "model-00008-of-00010.safetensors",
687
+ "model.layers.30.mlp.up_proj.trellis": "model-00008-of-00010.safetensors",
688
+ "model.layers.30.input_layernorm.weight": "model-00008-of-00010.safetensors",
689
+ "model.layers.31.self_attn.k_proj.suh": "model-00008-of-00010.safetensors",
690
+ "model.layers.31.self_attn.k_proj.svh": "model-00008-of-00010.safetensors",
691
+ "model.layers.31.self_attn.k_proj.trellis": "model-00008-of-00010.safetensors",
692
+ "model.layers.31.self_attn.o_proj.suh": "model-00008-of-00010.safetensors",
693
+ "model.layers.31.self_attn.o_proj.svh": "model-00008-of-00010.safetensors",
694
+ "model.layers.31.self_attn.o_proj.trellis": "model-00008-of-00010.safetensors",
695
+ "model.layers.31.self_attn.q_proj.suh": "model-00008-of-00010.safetensors",
696
+ "model.layers.31.self_attn.q_proj.svh": "model-00008-of-00010.safetensors",
697
+ "model.layers.31.self_attn.q_proj.trellis": "model-00008-of-00010.safetensors",
698
+ "model.layers.31.self_attn.v_proj.suh": "model-00008-of-00010.safetensors",
699
+ "model.layers.31.self_attn.v_proj.svh": "model-00008-of-00010.safetensors",
700
+ "model.layers.31.self_attn.v_proj.trellis": "model-00008-of-00010.safetensors",
701
+ "model.layers.31.mlp.down_proj.suh": "model-00008-of-00010.safetensors",
702
+ "model.layers.31.mlp.down_proj.svh": "model-00008-of-00010.safetensors",
703
+ "model.layers.31.mlp.down_proj.trellis": "model-00008-of-00010.safetensors",
704
+ "model.layers.31.mlp.gate_proj.suh": "model-00008-of-00010.safetensors",
705
+ "model.layers.31.mlp.gate_proj.svh": "model-00008-of-00010.safetensors",
706
+ "model.layers.31.mlp.gate_proj.trellis": "model-00008-of-00010.safetensors",
707
+ "model.layers.31.mlp.up_proj.suh": "model-00008-of-00010.safetensors",
708
+ "model.layers.31.mlp.up_proj.svh": "model-00008-of-00010.safetensors",
709
+ "model.layers.31.mlp.up_proj.trellis": "model-00008-of-00010.safetensors",
710
+ "model.layers.31.input_layernorm.weight": "model-00008-of-00010.safetensors",
711
+ "model.layers.32.self_attn.k_proj.suh": "model-00008-of-00010.safetensors",
712
+ "model.layers.32.self_attn.k_proj.svh": "model-00008-of-00010.safetensors",
713
+ "model.layers.32.self_attn.k_proj.trellis": "model-00008-of-00010.safetensors",
714
+ "model.layers.32.self_attn.o_proj.suh": "model-00008-of-00010.safetensors",
715
+ "model.layers.32.self_attn.o_proj.svh": "model-00008-of-00010.safetensors",
716
+ "model.layers.32.self_attn.o_proj.trellis": "model-00008-of-00010.safetensors",
717
+ "model.layers.32.self_attn.q_proj.suh": "model-00008-of-00010.safetensors",
718
+ "model.layers.32.self_attn.q_proj.svh": "model-00008-of-00010.safetensors",
719
+ "model.layers.32.self_attn.q_proj.trellis": "model-00008-of-00010.safetensors",
720
+ "model.layers.32.self_attn.v_proj.suh": "model-00008-of-00010.safetensors",
721
+ "model.layers.32.self_attn.v_proj.svh": "model-00008-of-00010.safetensors",
722
+ "model.layers.32.self_attn.v_proj.trellis": "model-00008-of-00010.safetensors",
723
+ "model.layers.32.mlp.down_proj.suh": "model-00008-of-00010.safetensors",
724
+ "model.layers.32.mlp.down_proj.svh": "model-00008-of-00010.safetensors",
725
+ "model.layers.32.mlp.down_proj.trellis": "model-00008-of-00010.safetensors",
726
+ "model.layers.32.mlp.gate_proj.suh": "model-00008-of-00010.safetensors",
727
+ "model.layers.32.mlp.gate_proj.svh": "model-00008-of-00010.safetensors",
728
+ "model.layers.32.mlp.gate_proj.trellis": "model-00008-of-00010.safetensors",
729
+ "model.layers.32.mlp.up_proj.suh": "model-00008-of-00010.safetensors",
730
+ "model.layers.32.mlp.up_proj.svh": "model-00008-of-00010.safetensors",
731
+ "model.layers.32.mlp.up_proj.trellis": "model-00008-of-00010.safetensors",
732
+ "model.layers.32.input_layernorm.weight": "model-00008-of-00010.safetensors",
733
+ "model.layers.33.self_attn.k_proj.suh": "model-00008-of-00010.safetensors",
734
+ "model.layers.33.self_attn.k_proj.svh": "model-00008-of-00010.safetensors",
735
+ "model.layers.33.self_attn.k_proj.trellis": "model-00008-of-00010.safetensors",
736
+ "model.layers.33.self_attn.o_proj.suh": "model-00008-of-00010.safetensors",
737
+ "model.layers.33.self_attn.o_proj.svh": "model-00008-of-00010.safetensors",
738
+ "model.layers.33.self_attn.o_proj.trellis": "model-00008-of-00010.safetensors",
739
+ "model.layers.33.self_attn.q_proj.suh": "model-00008-of-00010.safetensors",
740
+ "model.layers.33.self_attn.q_proj.svh": "model-00008-of-00010.safetensors",
741
+ "model.layers.33.self_attn.q_proj.trellis": "model-00008-of-00010.safetensors",
742
+ "model.layers.33.self_attn.v_proj.suh": "model-00008-of-00010.safetensors",
743
+ "model.layers.33.self_attn.v_proj.svh": "model-00008-of-00010.safetensors",
744
+ "model.layers.33.self_attn.v_proj.trellis": "model-00008-of-00010.safetensors",
745
+ "model.layers.33.mlp.down_proj.suh": "model-00008-of-00010.safetensors",
746
+ "model.layers.33.mlp.down_proj.svh": "model-00008-of-00010.safetensors",
747
+ "model.layers.33.mlp.down_proj.trellis": "model-00008-of-00010.safetensors",
748
+ "model.layers.33.mlp.gate_proj.suh": "model-00008-of-00010.safetensors",
749
+ "model.layers.33.mlp.gate_proj.svh": "model-00008-of-00010.safetensors",
750
+ "model.layers.33.mlp.gate_proj.trellis": "model-00008-of-00010.safetensors",
751
+ "model.layers.33.mlp.up_proj.suh": "model-00008-of-00010.safetensors",
752
+ "model.layers.33.mlp.up_proj.svh": "model-00008-of-00010.safetensors",
753
+ "model.layers.33.mlp.up_proj.trellis": "model-00008-of-00010.safetensors",
754
+ "model.layers.33.input_layernorm.weight": "model-00008-of-00010.safetensors",
755
+ "model.layers.34.self_attn.k_proj.suh": "model-00008-of-00010.safetensors",
756
+ "model.layers.34.self_attn.k_proj.svh": "model-00008-of-00010.safetensors",
757
+ "model.layers.34.self_attn.k_proj.trellis": "model-00008-of-00010.safetensors",
758
+ "model.layers.34.self_attn.o_proj.suh": "model-00008-of-00010.safetensors",
759
+ "model.layers.34.self_attn.o_proj.svh": "model-00008-of-00010.safetensors",
760
+ "model.layers.34.self_attn.o_proj.trellis": "model-00008-of-00010.safetensors",
761
+ "model.layers.34.self_attn.q_proj.suh": "model-00008-of-00010.safetensors",
762
+ "model.layers.34.self_attn.q_proj.svh": "model-00008-of-00010.safetensors",
763
+ "model.layers.34.self_attn.q_proj.trellis": "model-00008-of-00010.safetensors",
764
+ "model.layers.34.self_attn.v_proj.suh": "model-00008-of-00010.safetensors",
765
+ "model.layers.34.self_attn.v_proj.svh": "model-00008-of-00010.safetensors",
766
+ "model.layers.34.self_attn.v_proj.trellis": "model-00008-of-00010.safetensors",
767
+ "model.layers.34.mlp.down_proj.suh": "model-00008-of-00010.safetensors",
768
+ "model.layers.34.mlp.down_proj.svh": "model-00008-of-00010.safetensors",
769
+ "model.layers.34.mlp.down_proj.trellis": "model-00008-of-00010.safetensors",
770
+ "model.layers.34.mlp.gate_proj.suh": "model-00008-of-00010.safetensors",
771
+ "model.layers.34.mlp.gate_proj.svh": "model-00008-of-00010.safetensors",
772
+ "model.layers.34.mlp.gate_proj.trellis": "model-00008-of-00010.safetensors",
773
+ "model.layers.34.mlp.up_proj.suh": "model-00008-of-00010.safetensors",
774
+ "model.layers.34.mlp.up_proj.svh": "model-00008-of-00010.safetensors",
775
+ "model.layers.34.mlp.up_proj.trellis": "model-00008-of-00010.safetensors",
776
+ "model.layers.34.input_layernorm.weight": "model-00008-of-00010.safetensors",
777
+ "model.layers.35.self_attn.k_proj.suh": "model-00009-of-00010.safetensors",
778
+ "model.layers.35.self_attn.k_proj.svh": "model-00009-of-00010.safetensors",
779
+ "model.layers.35.self_attn.k_proj.trellis": "model-00009-of-00010.safetensors",
780
+ "model.layers.35.self_attn.o_proj.suh": "model-00009-of-00010.safetensors",
781
+ "model.layers.35.self_attn.o_proj.svh": "model-00009-of-00010.safetensors",
782
+ "model.layers.35.self_attn.o_proj.trellis": "model-00009-of-00010.safetensors",
783
+ "model.layers.35.self_attn.q_proj.suh": "model-00009-of-00010.safetensors",
784
+ "model.layers.35.self_attn.q_proj.svh": "model-00009-of-00010.safetensors",
785
+ "model.layers.35.self_attn.q_proj.trellis": "model-00009-of-00010.safetensors",
786
+ "model.layers.35.self_attn.v_proj.suh": "model-00009-of-00010.safetensors",
787
+ "model.layers.35.self_attn.v_proj.svh": "model-00009-of-00010.safetensors",
788
+ "model.layers.35.self_attn.v_proj.trellis": "model-00009-of-00010.safetensors",
789
+ "model.layers.35.mlp.down_proj.suh": "model-00009-of-00010.safetensors",
790
+ "model.layers.35.mlp.down_proj.svh": "model-00009-of-00010.safetensors",
791
+ "model.layers.35.mlp.down_proj.trellis": "model-00009-of-00010.safetensors",
792
+ "model.layers.35.mlp.gate_proj.suh": "model-00009-of-00010.safetensors",
793
+ "model.layers.35.mlp.gate_proj.svh": "model-00009-of-00010.safetensors",
794
+ "model.layers.35.mlp.gate_proj.trellis": "model-00009-of-00010.safetensors",
795
+ "model.layers.35.mlp.up_proj.suh": "model-00009-of-00010.safetensors",
796
+ "model.layers.35.mlp.up_proj.svh": "model-00009-of-00010.safetensors",
797
+ "model.layers.35.mlp.up_proj.trellis": "model-00009-of-00010.safetensors",
798
+ "model.layers.35.input_layernorm.weight": "model-00009-of-00010.safetensors",
799
+ "model.layers.36.self_attn.k_proj.suh": "model-00009-of-00010.safetensors",
800
+ "model.layers.36.self_attn.k_proj.svh": "model-00009-of-00010.safetensors",
801
+ "model.layers.36.self_attn.k_proj.trellis": "model-00009-of-00010.safetensors",
802
+ "model.layers.36.self_attn.o_proj.suh": "model-00009-of-00010.safetensors",
803
+ "model.layers.36.self_attn.o_proj.svh": "model-00009-of-00010.safetensors",
804
+ "model.layers.36.self_attn.o_proj.trellis": "model-00009-of-00010.safetensors",
805
+ "model.layers.36.self_attn.q_proj.suh": "model-00009-of-00010.safetensors",
806
+ "model.layers.36.self_attn.q_proj.svh": "model-00009-of-00010.safetensors",
807
+ "model.layers.36.self_attn.q_proj.trellis": "model-00009-of-00010.safetensors",
808
+ "model.layers.36.self_attn.v_proj.suh": "model-00009-of-00010.safetensors",
809
+ "model.layers.36.self_attn.v_proj.svh": "model-00009-of-00010.safetensors",
810
+ "model.layers.36.self_attn.v_proj.trellis": "model-00009-of-00010.safetensors",
811
+ "model.layers.36.mlp.down_proj.suh": "model-00009-of-00010.safetensors",
812
+ "model.layers.36.mlp.down_proj.svh": "model-00009-of-00010.safetensors",
813
+ "model.layers.36.mlp.down_proj.trellis": "model-00009-of-00010.safetensors",
814
+ "model.layers.36.mlp.gate_proj.suh": "model-00009-of-00010.safetensors",
815
+ "model.layers.36.mlp.gate_proj.svh": "model-00009-of-00010.safetensors",
816
+ "model.layers.36.mlp.gate_proj.trellis": "model-00009-of-00010.safetensors",
817
+ "model.layers.36.mlp.up_proj.suh": "model-00009-of-00010.safetensors",
818
+ "model.layers.36.mlp.up_proj.svh": "model-00009-of-00010.safetensors",
819
+ "model.layers.36.mlp.up_proj.trellis": "model-00009-of-00010.safetensors",
820
+ "model.layers.36.input_layernorm.weight": "model-00009-of-00010.safetensors",
821
+ "model.layers.37.self_attn.k_proj.suh": "model-00009-of-00010.safetensors",
822
+ "model.layers.37.self_attn.k_proj.svh": "model-00009-of-00010.safetensors",
823
+ "model.layers.37.self_attn.k_proj.trellis": "model-00009-of-00010.safetensors",
824
+ "model.layers.37.self_attn.o_proj.suh": "model-00009-of-00010.safetensors",
825
+ "model.layers.37.self_attn.o_proj.svh": "model-00009-of-00010.safetensors",
826
+ "model.layers.37.self_attn.o_proj.trellis": "model-00009-of-00010.safetensors",
827
+ "model.layers.37.self_attn.q_proj.suh": "model-00009-of-00010.safetensors",
828
+ "model.layers.37.self_attn.q_proj.svh": "model-00009-of-00010.safetensors",
829
+ "model.layers.37.self_attn.q_proj.trellis": "model-00009-of-00010.safetensors",
830
+ "model.layers.37.self_attn.v_proj.suh": "model-00009-of-00010.safetensors",
831
+ "model.layers.37.self_attn.v_proj.svh": "model-00009-of-00010.safetensors",
832
+ "model.layers.37.self_attn.v_proj.trellis": "model-00009-of-00010.safetensors",
833
+ "model.layers.37.mlp.down_proj.suh": "model-00009-of-00010.safetensors",
834
+ "model.layers.37.mlp.down_proj.svh": "model-00009-of-00010.safetensors",
835
+ "model.layers.37.mlp.down_proj.trellis": "model-00009-of-00010.safetensors",
836
+ "model.layers.37.mlp.gate_proj.suh": "model-00009-of-00010.safetensors",
837
+ "model.layers.37.mlp.gate_proj.svh": "model-00009-of-00010.safetensors",
838
+ "model.layers.37.mlp.gate_proj.trellis": "model-00009-of-00010.safetensors",
839
+ "model.layers.37.mlp.up_proj.suh": "model-00009-of-00010.safetensors",
840
+ "model.layers.37.mlp.up_proj.svh": "model-00009-of-00010.safetensors",
841
+ "model.layers.37.mlp.up_proj.trellis": "model-00009-of-00010.safetensors",
842
+ "model.layers.37.input_layernorm.weight": "model-00009-of-00010.safetensors",
843
+ "model.layers.38.self_attn.k_proj.suh": "model-00009-of-00010.safetensors",
844
+ "model.layers.38.self_attn.k_proj.svh": "model-00009-of-00010.safetensors",
845
+ "model.layers.38.self_attn.k_proj.trellis": "model-00009-of-00010.safetensors",
846
+ "model.layers.38.self_attn.o_proj.suh": "model-00009-of-00010.safetensors",
847
+ "model.layers.38.self_attn.o_proj.svh": "model-00009-of-00010.safetensors",
848
+ "model.layers.38.self_attn.o_proj.trellis": "model-00009-of-00010.safetensors",
849
+ "model.layers.38.self_attn.q_proj.suh": "model-00009-of-00010.safetensors",
850
+ "model.layers.38.self_attn.q_proj.svh": "model-00009-of-00010.safetensors",
851
+ "model.layers.38.self_attn.q_proj.trellis": "model-00009-of-00010.safetensors",
852
+ "model.layers.38.self_attn.v_proj.suh": "model-00009-of-00010.safetensors",
853
+ "model.layers.38.self_attn.v_proj.svh": "model-00009-of-00010.safetensors",
854
+ "model.layers.38.self_attn.v_proj.trellis": "model-00009-of-00010.safetensors",
855
+ "model.layers.38.mlp.down_proj.suh": "model-00009-of-00010.safetensors",
856
+ "model.layers.38.mlp.down_proj.svh": "model-00009-of-00010.safetensors",
857
+ "model.layers.38.mlp.down_proj.trellis": "model-00009-of-00010.safetensors",
858
+ "model.layers.38.mlp.gate_proj.suh": "model-00009-of-00010.safetensors",
859
+ "model.layers.38.mlp.gate_proj.svh": "model-00009-of-00010.safetensors",
860
+ "model.layers.38.mlp.gate_proj.trellis": "model-00009-of-00010.safetensors",
861
+ "model.layers.38.mlp.up_proj.suh": "model-00009-of-00010.safetensors",
862
+ "model.layers.38.mlp.up_proj.svh": "model-00009-of-00010.safetensors",
863
+ "model.layers.38.mlp.up_proj.trellis": "model-00009-of-00010.safetensors",
864
+ "model.layers.38.input_layernorm.weight": "model-00009-of-00010.safetensors",
865
+ "model.layers.39.self_attn.k_proj.suh": "model-00009-of-00010.safetensors",
866
+ "model.layers.39.self_attn.k_proj.svh": "model-00009-of-00010.safetensors",
867
+ "model.layers.39.self_attn.k_proj.trellis": "model-00009-of-00010.safetensors",
868
+ "model.layers.39.self_attn.o_proj.suh": "model-00009-of-00010.safetensors",
869
+ "model.layers.39.self_attn.o_proj.svh": "model-00009-of-00010.safetensors",
870
+ "model.layers.39.self_attn.o_proj.trellis": "model-00009-of-00010.safetensors",
871
+ "model.layers.39.self_attn.q_proj.suh": "model-00009-of-00010.safetensors",
872
+ "model.layers.39.self_attn.q_proj.svh": "model-00009-of-00010.safetensors",
873
+ "model.layers.39.self_attn.q_proj.trellis": "model-00009-of-00010.safetensors",
874
+ "model.layers.39.self_attn.v_proj.suh": "model-00009-of-00010.safetensors",
875
+ "model.layers.39.self_attn.v_proj.svh": "model-00009-of-00010.safetensors",
876
+ "model.layers.39.self_attn.v_proj.trellis": "model-00009-of-00010.safetensors",
877
+ "model.layers.39.mlp.down_proj.suh": "model-00009-of-00010.safetensors",
878
+ "model.layers.39.mlp.down_proj.svh": "model-00009-of-00010.safetensors",
879
+ "model.layers.39.mlp.down_proj.trellis": "model-00009-of-00010.safetensors",
880
+ "model.layers.39.mlp.gate_proj.suh": "model-00009-of-00010.safetensors",
881
+ "model.layers.39.mlp.gate_proj.svh": "model-00009-of-00010.safetensors",
882
+ "model.layers.39.mlp.gate_proj.trellis": "model-00009-of-00010.safetensors",
883
+ "model.layers.39.mlp.up_proj.suh": "model-00009-of-00010.safetensors",
884
+ "model.layers.39.mlp.up_proj.svh": "model-00009-of-00010.safetensors",
885
+ "model.layers.39.mlp.up_proj.trellis": "model-00009-of-00010.safetensors",
886
+ "model.layers.39.input_layernorm.weight": "model-00009-of-00010.safetensors",
887
+ "model.norm.weight": "model-00009-of-00010.safetensors",
888
+ "lm_head.suh": "model-00010-of-00010.safetensors",
889
+ "lm_head.svh": "model-00010-of-00010.safetensors",
890
+ "lm_head.trellis": "model-00010-of-00010.safetensors"
891
+ }
892
+ }
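The weight map that ends here assigns every tensor name to one of ten shard files. As a minimal sketch of how such a map can be inspected, the Python below groups tensor names by shard and pulls the layer number out of a name. The `sample_map` entries are copied from the diff above; `group_by_shard` and `layer_index` are hypothetical helper names used only for illustration, not part of any library.

```python
import re
from collections import defaultdict

# A few entries copied from the weight map above (the real map has many more).
sample_map = {
    "model.layers.29.input_layernorm.weight": "model-00007-of-00010.safetensors",
    "model.layers.30.self_attn.k_proj.suh": "model-00008-of-00010.safetensors",
    "model.layers.30.self_attn.k_proj.svh": "model-00008-of-00010.safetensors",
    "model.layers.30.self_attn.k_proj.trellis": "model-00008-of-00010.safetensors",
    "model.norm.weight": "model-00009-of-00010.safetensors",
    "lm_head.trellis": "model-00010-of-00010.safetensors",
}


def group_by_shard(weight_map):
    """Return {shard filename: sorted list of tensor names stored in it}."""
    shards = defaultdict(list)
    for tensor_name, shard_file in weight_map.items():
        shards[shard_file].append(tensor_name)
    return {shard: sorted(names) for shard, names in shards.items()}


def layer_index(tensor_name):
    """Return the decoder layer number in a tensor name, or None if absent."""
    match = re.match(r"model\.layers\.(\d+)\.", tensor_name)
    return int(match.group(1)) if match else None


shards = group_by_shard(sample_map)
for shard_file in sorted(shards):
    # Print each shard with the number of sample tensors mapped to it.
    print(shard_file, len(shards[shard_file]))
```

With the full map loaded from the index file instead of `sample_map`, the same grouping shows which shards must be downloaded to obtain a given layer.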
quantization_config.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,23 @@
1
+ {
2
+ "bos_token": {
3
+ "content": "<BOS_TOKEN>",
4
+ "lstrip": false,
5
+ "normalized": false,
6
+ "rstrip": false,
7
+ "single_word": false
8
+ },
9
+ "eos_token": {
10
+ "content": "<|END_OF_TURN_TOKEN|>",
11
+ "lstrip": false,
12
+ "normalized": false,
13
+ "rstrip": false,
14
+ "single_word": false
15
+ },
16
+ "pad_token": {
17
+ "content": "<PAD>",
18
+ "lstrip": false,
19
+ "normalized": false,
20
+ "rstrip": false,
21
+ "single_word": false
22
+ }
23
+ }
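The special-token map above is plain JSON: each role carries a `content` string plus a few stripping flags. A minimal sketch, assuming the file has been read as text, is shown below; it embeds a compact copy of the three entries from the diff and extracts the literal token strings. `token_strings` is a hypothetical helper name, not a library function.

```python
import json

# Compact copy of the three entries shown in the diff above.
raw = """
{
  "bos_token": {"content": "<BOS_TOKEN>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "eos_token": {"content": "<|END_OF_TURN_TOKEN|>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "pad_token": {"content": "<PAD>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
"""


def token_strings(special_tokens_map):
    """Map each role (bos_token, eos_token, ...) to its literal token string."""
    return {role: spec["content"] for role, spec in special_tokens_map.items()}


tokens = token_strings(json.loads(raw))
print(tokens["eos_token"])
```

Note that the end-of-sequence role is filled by `<|END_OF_TURN_TOKEN|>` in this map, not by `<EOS_TOKEN>`.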
tokenizer.json ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c69a7ea6c0927dfac8c349186ebcf0466a4723c21cbdb2e850cf559f0bee92b8
3
+ size 12777433
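The three added lines above are a Git LFS pointer rather than the tokenizer data itself: each line is a key, a space, and a value. A minimal sketch of reading such a pointer in Python follows. `parse_lfs_pointer` is a hypothetical helper written for illustration; it is not part of Git LFS or any library.

```python
# Pointer text copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:c69a7ea6c0927dfac8c349186ebcf0466a4723c21cbdb2e850cf559f0bee92b8
size 12777433
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as '<hash method>:<hex digest>'.
    method, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_method": method,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(pointer_text)
print(pointer["hash_method"], pointer["size_bytes"])
```

The `size` field is the byte length of the real `tokenizer.json` that the pointer stands in for, so it can be compared against a downloaded file as a quick sanity check.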
tokenizer_config.json ADDED
@@ -0,0 +1,330 @@
1
+ {
2
+ "add_bos_token": true,
3
+ "add_eos_token": false,
4
+ "add_prefix_space": false,
5
+ "added_tokens_decoder": {
6
+ "0": {
7
+ "content": "<PAD>",
8
+ "lstrip": false,
9
+ "normalized": false,
10
+ "rstrip": false,
11
+ "single_word": false,
12
+ "special": true
13
+ },
14
+ "1": {
15
+ "content": "<UNK>",
16
+ "lstrip": false,
17
+ "normalized": false,
18
+ "rstrip": false,
19
+ "single_word": false,
20
+ "special": true
21
+ },
22
+ "2": {
23
+ "content": "<CLS>",
24
+ "lstrip": false,
25
+ "normalized": false,
26
+ "rstrip": false,
27
+ "single_word": false,
28
+ "special": true
29
+ },
30
+ "3": {
31
+ "content": "<SEP>",
32
+ "lstrip": false,
33
+ "normalized": false,
34
+ "rstrip": false,
35
+ "single_word": false,
36
+ "special": true
37
+ },
38
+ "4": {
39
+ "content": "<MASK_TOKEN>",
40
+ "lstrip": false,
41
+ "normalized": false,
42
+ "rstrip": false,
43
+ "single_word": false,
44
+ "special": true
45
+ },
46
+ "5": {
47
+ "content": "<BOS_TOKEN>",
48
+ "lstrip": false,
49
+ "normalized": false,
50
+ "rstrip": false,
51
+ "single_word": false,
52
+ "special": true
53
+ },
54
+ "6": {
55
+ "content": "<EOS_TOKEN>",
56
+ "lstrip": false,
57
+ "normalized": false,
58
+ "rstrip": false,
59
+ "single_word": false,
60
+ "special": true
61
+ },
62
+ "7": {
63
+ "content": "<EOP_TOKEN>",
64
+ "lstrip": false,
65
+ "normalized": false,
66
+ "rstrip": false,
67
+ "single_word": false,
68
+ "special": true
69
+ },
70
+ "255000": {
71
+ "content": "<|START_OF_TURN_TOKEN|>",
72
+ "lstrip": false,
73
+ "normalized": false,
74
+ "rstrip": false,
75
+ "single_word": false,
76
+ "special": false
77
+ },
78
+ "255001": {
79
+ "content": "<|END_OF_TURN_TOKEN|>",
80
+ "lstrip": false,
81
+ "normalized": false,
82
+ "rstrip": false,
83
+ "single_word": false,
84
+ "special": true
85
+ },
86
+ "255002": {
87
+ "content": "<|YES_TOKEN|>",
88
+ "lstrip": false,
89
+ "normalized": false,
90
+ "rstrip": false,
91
+ "single_word": false,
92
+ "special": false
93
+ },
94
+ "255003": {
95
+ "content": "<|NO_TOKEN|>",
96
+ "lstrip": false,
97
+ "normalized": false,
98
+ "rstrip": false,
99
+ "single_word": false,
100
+ "special": false
101
+ },
102
+ "255004": {
103
+ "content": "<|GOOD_TOKEN|>",
104
+ "lstrip": false,
105
+ "normalized": false,
106
+ "rstrip": false,
107
+ "single_word": false,
108
+ "special": false
109
+ },
110
+ "255005": {
111
+ "content": "<|BAD_TOKEN|>",
112
+ "lstrip": false,
113
+ "normalized": false,
114
+ "rstrip": false,
115
+ "single_word": false,
116
+ "special": false
117
+ },
118
+ "255006": {
119
+ "content": "<|USER_TOKEN|>",
120
+ "lstrip": false,
121
+ "normalized": false,
122
+ "rstrip": false,
123
+ "single_word": false,
124
+ "special": false
125
+ },
126
+ "255007": {
127
+ "content": "<|CHATBOT_TOKEN|>",
128
+ "lstrip": false,
129
+ "normalized": false,
130
+ "rstrip": false,
131
+ "single_word": false,
132
+ "special": false
133
+ },
134
+ "255008": {
135
+ "content": "<|SYSTEM_TOKEN|>",
136
+ "lstrip": false,
137
+ "normalized": false,
138
+ "rstrip": false,
139
+ "single_word": false,
140
+ "special": false
141
+ },
142
+ "255009": {
143
+ "content": "<|USER_0_TOKEN|>",
144
+ "lstrip": false,
145
+ "normalized": false,
146
+ "rstrip": false,
147
+ "single_word": false,
148
+ "special": false
149
+ },
150
+ "255010": {
151
+ "content": "<|USER_1_TOKEN|>",
152
+ "lstrip": false,
153
+ "normalized": false,
154
+ "rstrip": false,
155
+ "single_word": false,
156
+ "special": false
157
+ },
158
+ "255011": {
159
+ "content": "<|USER_2_TOKEN|>",
160
+ "lstrip": false,
161
+ "normalized": false,
162
+ "rstrip": false,
163
+ "single_word": false,
164
+ "special": false
165
+ },
166
+ "255012": {
167
+ "content": "<|USER_3_TOKEN|>",
168
+ "lstrip": false,
169
+ "normalized": false,
170
+ "rstrip": false,
171
+ "single_word": false,
172
+ "special": false
173
+ },
174
+ "255013": {
175
+ "content": "<|USER_4_TOKEN|>",
176
+ "lstrip": false,
177
+ "normalized": false,
178
+ "rstrip": false,
179
+ "single_word": false,
180
+ "special": false
181
+ },
182
+ "255014": {
183
+ "content": "<|USER_5_TOKEN|>",
184
+ "lstrip": false,
185
+ "normalized": false,
186
+ "rstrip": false,
187
+ "single_word": false,
188
+ "special": false
189
+ },
190
+ "255015": {
191
+ "content": "<|USER_6_TOKEN|>",
192
+ "lstrip": false,
193
+ "normalized": false,
194
+ "rstrip": false,
195
+ "single_word": false,
196
+ "special": false
197
+ },
198
+ "255016": {
199
+ "content": "<|USER_7_TOKEN|>",
200
+ "lstrip": false,
201
+ "normalized": false,
202
+ "rstrip": false,
203
+ "single_word": false,
204
+ "special": false
205
+ },
206
+ "255017": {
207
+ "content": "<|USER_8_TOKEN|>",
208
+ "lstrip": false,
209
+ "normalized": false,
210
+ "rstrip": false,
211
+ "single_word": false,
212
+ "special": false
213
+ },
214
+ "255018": {
215
+ "content": "<|USER_9_TOKEN|>",
216
+ "lstrip": false,
217
+ "normalized": false,
218
+ "rstrip": false,
219
+ "single_word": false,
220
+ "special": false
221
+ },
222
+ "255019": {
223
+ "content": "<|EXTRA_0_TOKEN|>",
224
+ "lstrip": false,
225
+ "normalized": false,
226
+ "rstrip": false,
227
+ "single_word": false,
228
+ "special": false
229
+ },
230
+ "255020": {
231
+ "content": "<|EXTRA_1_TOKEN|>",
232
+ "lstrip": false,
233
+ "normalized": false,
234
+ "rstrip": false,
235
+ "single_word": false,
236
+ "special": false
237
+ },
238
+ "255021": {
239
+ "content": "<|EXTRA_2_TOKEN|>",
240
+ "lstrip": false,
241
+ "normalized": false,
242
+ "rstrip": false,
243
+ "single_word": false,
244
+ "special": false
245
+ },
246
+ "255022": {
247
+ "content": "<|EXTRA_3_TOKEN|>",
248
+ "lstrip": false,
249
+ "normalized": false,
250
+ "rstrip": false,
251
+ "single_word": false,
252
+ "special": false
253
+ },
254
+ "255023": {
255
+ "content": "<|EXTRA_4_TOKEN|>",
256
+ "lstrip": false,
257
+ "normalized": false,
258
+ "rstrip": false,
259
+ "single_word": false,
260
+ "special": false
261
+ },
262
+ "255024": {
263
+ "content": "<|EXTRA_5_TOKEN|>",
264
+ "lstrip": false,
265
+ "normalized": false,
266
+ "rstrip": false,
267
+ "single_word": false,
268
+ "special": false
269
+ },
270
+ "255025": {
271
+ "content": "<|EXTRA_6_TOKEN|>",
272
+ "lstrip": false,
273
+ "normalized": false,
274
+ "rstrip": false,
275
+ "single_word": false,
276
+ "special": false
277
+ },
278
+ "255026": {
279
+ "content": "<|EXTRA_7_TOKEN|>",
280
+ "lstrip": false,
281
+ "normalized": false,
282
+ "rstrip": false,
283
+ "single_word": false,
284
+ "special": false
285
+ },
286
+ "255027": {
287
+ "content": "<|EXTRA_8_TOKEN|>",
288
+ "lstrip": false,
289
+ "normalized": false,
290
+ "rstrip": false,
291
+ "single_word": false,
292
+ "special": false
293
+ },
294
+ "255028": {
295
+ "content": "<|EXTRA_9_TOKEN|>",
296
+ "lstrip": false,
297
+ "normalized": false,
298
+ "rstrip": false,
299
+ "single_word": false,
300
+ "special": false
301
+ }
302
+ },
303
+ "bos_token": "<BOS_TOKEN>",
304
+ "chat_template": [
305
+ {
306
+ "name": "default",
307
+ "template": "{{ bos_token }}{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% elif false == true %}{% set loop_messages = messages %}{% set system_message = 'You are Command-R, a brilliant, sophisticated, AI-assistant trained to assist human users by providing thorough responses. You are trained by Cohere.' %}{% else %}{% set loop_messages = messages %}{% set system_message = false %}{% endif %}{% if system_message != false %}{{ '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' + system_message + '<|END_OF_TURN_TOKEN|>' }}{% endif %}{% for message in loop_messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% set content = message['content'] %}{% if message['role'] == 'user' %}{{ '<|START_OF_TURN_TOKEN|><|USER_TOKEN|>' + content.strip() + '<|END_OF_TURN_TOKEN|>' }}{% elif message['role'] == 'assistant' %}{{ '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' + content.strip() + '<|END_OF_TURN_TOKEN|>' }}{% endif %}{% endfor %}{% if add_generation_prompt %}{{ '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' }}{% endif %}"
308
+ },
309
+ {
310
+ "name": "tool_use",
311
+ "template": "\n{%- macro json_to_python_type(json_spec) %}\n{%- set basic_type_map = {\n \"string\": \"str\",\n \"number\": \"float\",\n \"integer\": \"int\",\n \"boolean\": \"bool\"\n} %}\n\n{%- if basic_type_map[json_spec.type] is defined %}\n {{- basic_type_map[json_spec.type] }}\n{%- elif json_spec.type == \"array\" %}\n {{- \"List[\" + json_to_python_type(json_spec.items) + \"]\"}}\n{%- elif json_spec.type == \"object\" %}\n {{- \"Dict[str, \" + json_to_python_type(json_spec.additionalProperties) + ']'}}\n{%- elif json_spec.type is iterable %}\n {{- \"Union[\" }}\n {%- for t in json_spec.type %}\n {{- json_to_python_type({\"type\": t}) }}\n {%- if not loop.last %}\n {{- \",\" }} \n {%- endif %}\n {%- endfor %}\n {{- \"]\" }}\n{%- else %}\n {{- \"Any\" }}\n{%- endif %}\n{%- endmacro %}\n\n{%- macro old_tool_parser(tools) %}\n{%- for tool in tools %}\n {%- if loop.index0 != 0 %}\n {{- '\\n\\n' }}\n {%- endif %}\n {{- '```python\\ndef ' + tool.name + '(' }}\n {%- for param_name, param_fields in tool.parameter_definitions|items %}\n {%- if loop.index0 != 0 %}\n {{- ', '}}\n {%- endif %}\n {{- param_name + ': ' }}\n {%- if not param_fields.required %}\n {{- 'Optional[' + param_fields.type + '] = None'}}\n {%- else %}\n {{- param_fields.type }}\n {%- endif %}\n {%- endfor %}\n {{- ') -> List[Dict]:\\n \"\"\"'}}\n {{- tool.description }}\n {%- if tool.parameter_definitions|length != 0 %}\n {{- '\\n\\n Args:\\n '}}\n {%- for param_name, param_fields in tool.parameter_definitions|items %}\n {%- if loop.index0 != 0 %}\n {{- '\\n ' }}\n {%- endif %}\n {{- param_name + ' ('}}\n {%- if not param_fields.required %}\n {{- 'Optional[' + param_fields.type + ']'}}\n {%- else %}\n {{- param_fields.type }}\n {%- endif %}\n {{- '): ' + param_fields.description }}\n {%- endfor %}\n {%- endif %}\n {{- '\\n \"\"\"\\n pass\\n```' }}\n{%- endfor %}\n{%- endmacro %}\n\n{%- macro new_tool_parser(tools) %}\n{%- for tool in tools %}\n {%- if loop.index0 != 0 %}\n {{- '\\n\\n'}}\n {%- 
endif %}\n {%- if tool.function is defined %}\n {%- set tool = tool.function %}\n {%- endif %}\n {{-'```python\ndef ' + tool.name + '('}}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n {%- if loop.index0 != 0 %}\n {{- ', '}}\n {%- endif %}\n {{-param_name + \": \"}} \n {%- if not param_name in tool.parameters.required %}\n {{-'Optional[' + json_to_python_type(param_fields) + '] = None'}}\n {%- else %}\n {{- json_to_python_type(param_fields) }}\n {%- endif %}\n {%- endfor %}\n {{- ') -> List[Dict]:\n \"\"\"'}}\n {{- tool.description }}\n {%- if tool.parameters.properties|length != 0 %}\n {{- '\\n\\n Args:\\n '}}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n {%- if loop.index0 != 0 %}\n {{- '\\n ' }}\n {%- endif %}\n {{- param_name + ' ('}}\n {%- if not param_name in tool.parameters.required %}\n {{-'Optional[' + json_to_python_type(param_fields) + ']'}}\n {%- else %}\n {{- json_to_python_type(param_fields) }}\n {%- endif %}\n {{- '): ' + param_fields.description }}\n {%- endfor %}\n {%- endif %}\n {{- '\\n \"\"\"\\n pass\\n```' }}\n{%- endfor %}\n{%- endmacro %}\n\n{{- bos_token }}\n{%- if messages[0]['role'] == 'system' %}\n {%- set loop_messages = messages[1:] %}\n {%- set system_message = messages[0]['content'] %}\n{%- else %}\n {%- set loop_messages = messages %}\n {%- set system_message = '## Task and Context\\nYou help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user\\'s needs as best you can, which will be wide-ranging.\\n\\n## Style Guide\\nUnless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.' 
%}\n{%- endif %}\n{{- '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' }}\n{{- '# Safety Preamble' }}\n{{- '\nThe instructions in this section override those in the task description and style guide sections. Don\\'t answer questions that are harmful or immoral.' }}\n{{- '\n\n# System Preamble' }}\n{{- '\n## Basic Rules' }}\n{{- '\nYou are a powerful conversational AI trained by Cohere to help people. You are augmented by a number of tools, and your job is to use and consume the output of these tools to best help the user. You will see a conversation history between yourself and a user, ending with an utterance from the user. You will then see a specific instruction instructing you what kind of response to generate. When you answer the user\\'s requests, you cite your sources in your answers, according to those instructions.' }}\n{{- '\n\n# User Preamble' }}\n{{- '\n' + system_message }}\n{{-'\n\n## Available Tools\nHere is a list of tools that you have available to you:\n\n'}}\n{%- set ns = namespace(new_tools=true) %}\n{%- for tool in tools %}\n {%- if tool.parameter_definitions is defined %}\n {%- set ns.new_tools = false %}\n {%- endif %}\n{%- endfor %}\n{%- if ns.new_tools %}\n {{- new_tool_parser(tools) }}\n{%- else %}\n {{- old_tool_parser(tools) }}\n{%- endif %}\n{{- '<|END_OF_TURN_TOKEN|>'}}\n{%- for message in loop_messages %}\n {%- set content = message['content'] %}\n {%- if message.role == 'user' %}\n {{- '<|START_OF_TURN_TOKEN|><|USER_TOKEN|>' + content|trim + '<|END_OF_TURN_TOKEN|>' }}\n {%- elif message.role == 'system' %}\n {{- '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' + content|trim + '<|END_OF_TURN_TOKEN|>' }}\n {%- elif message.role == 'assistant' and message.tool_calls is defined %}\n {{- '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' }}\n {%- if message.content is defined %}\n {{- message.content|trim }}\n {%- endif %}\n {{- '\\nAction:\\n```json\\n[\\n' }}\n {%- for tool_call in message.tool_calls %}\n {%- if tool_call.function is defined %}\n {%- set 
tool_call = tool_call.function %}\n {%- endif %}\n {{- '{\\n'|indent(4, first=true) }}\n {{- '\"tool_name\": \"'|indent(8, first=true) + tool_call.name + '\",\\n' }}\n {{- '\"parameters\": '|indent(8, first=true) }}\n {%- if tool_call.arguments is defined and tool_call.arguments|length > 0 %} \n {{- tool_call.arguments|tojson(indent=4)|indent(8) }}\n {{- '\\n' }}\n {%- else %}\n {{- '{}\\n' }}\n {%- endif %}\n {{- '}'|indent(4, first=true) }}\n {%- if not loop.last %}\n {{- ',\\n' }}\n {%- endif %}\n {%- endfor %}\n {{- \"\\n]```\\n\" }}\n {%- elif message.role == 'assistant' %}\n {{- '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' + content|trim + '<|END_OF_TURN_TOKEN|>' }}\n {%- elif message.role == 'tool' %}\n {{- '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|><results>\\n' }}\n {{- message.content|trim }}\n {{- '</results><|END_OF_TURN_TOKEN|>' }}\n {%- endif %}\n{%- endfor %}\n{{-'<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>Write \\'Action:\\' followed by a json-formatted list of actions that you want to perform in order to produce a good response to the user\\'s last input. You can use any of the supplied tools any number of times, but you should aim to execute the minimum number of necessary actions for the input. You should use the `directly-answer` tool if calling the other tools is unnecessary. The list of actions you want to call should be formatted as a list of json objects, for example:\n```json\n[\n {\n \"tool_name\": title of the tool in the specification,\n \"parameters\": a dict of parameters to input into the tool as they are defined in the specs, or {} if it takes no parameters\n }\n]```<|END_OF_TURN_TOKEN|>'}}\n{%- if add_generation_prompt %}\n {{- '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' }}\n{%- endif %}\n"
+ },
+ {
+ "name": "rag",
+ "template": "{{ bos_token }}{% if messages[0]['role'] == 'system' %}{% set loop_messages = messages[1:] %}{% set system_message = messages[0]['content'] %}{% else %}{% set loop_messages = messages %}{% set system_message = '## Task and Context\\nYou help people answer their questions and other requests interactively. You will be asked a very wide array of requests on all kinds of topics. You will be equipped with a wide range of search engines or similar tools to help you, which you use to research your answer. You should focus on serving the user\\'s needs as best you can, which will be wide-ranging.\\n\\n## Style Guide\\nUnless the user asks for a different style of answer, you should answer in full sentences, using proper grammar and spelling.' %}{% endif %}{{ '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' }}{{ '# Safety Preamble' }}{{ '\nThe instructions in this section override those in the task description and style guide sections. Don\\'t answer questions that are harmful or immoral.' }}{{ '\n\n# System Preamble' }}{{ '\n## Basic Rules' }}{{ '\nYou are a powerful conversational AI trained by Cohere to help people. You are augmented by a number of tools, and your job is to use and consume the output of these tools to best help the user. You will see a conversation history between yourself and a user, ending with an utterance from the user. You will then see a specific instruction instructing you what kind of response to generate. When you answer the user\\'s requests, you cite your sources in your answers, according to those instructions.' 
}}{{ '\n\n# User Preamble' }}{{ '\n' + system_message }}{{ '<|END_OF_TURN_TOKEN|>'}}{% for message in loop_messages %}{% set content = message['content'] %}{% if message['role'] == 'user' %}{{ '<|START_OF_TURN_TOKEN|><|USER_TOKEN|>' + content.strip() + '<|END_OF_TURN_TOKEN|>' }}{% elif message['role'] == 'system' %}{{ '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' + content.strip() + '<|END_OF_TURN_TOKEN|>' }}{% elif message['role'] == 'assistant' %}{{ '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' + content.strip() + '<|END_OF_TURN_TOKEN|>' }}{% endif %}{% endfor %}{{ '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>'}}{{ '<results>' }}{% for document in documents %}{{ '\nDocument: ' }}{{ loop.index0 }}\n{% for key, value in document.items() %}{{ key }}: {{value}}\n{% endfor %}{% endfor %}{{ '</results>'}}{{ '<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>' }}{{ 'Carefully perform the following instructions, in order, starting each with a new line.\n' }}{{ 'Firstly, Decide which of the retrieved documents are relevant to the user\\'s last input by writing \\'Relevant Documents:\\' followed by comma-separated list of document numbers. If none are relevant, you should instead write \\'None\\'.\n' }}{{ 'Secondly, Decide which of the retrieved documents contain facts that should be cited in a good answer to the user\\'s last input by writing \\'Cited Documents:\\' followed a comma-separated list of document numbers. If you dont want to cite any of them, you should instead write \\'None\\'.\n' }}{% if citation_mode=='accurate' %}{{ 'Thirdly, Write \\'Answer:\\' followed by a response to the user\\'s last input in high quality natural english. Use the retrieved documents to help you. Do not insert any citations or grounding markup.\n' }}{% endif %}{{ 'Finally, Write \\'Grounded answer:\\' followed by a response to the user\\'s last input in high quality natural english. 
Use the symbols <co: doc> and </co: doc> to indicate when a fact comes from a document in the search result, e.g <co: 0>my fact</co: 0> for a fact from document 0.' }}{{ '<|END_OF_TURN_TOKEN|>' }}{% if add_generation_prompt %}{{ '<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>' }}{% endif %}"
+ }
+ ],
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|END_OF_TURN_TOKEN|>",
+ "legacy": true,
+ "merges_file": null,
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "<PAD>",
+ "sp_model_kwargs": {},
+ "spaces_between_special_tokens": false,
+ "tokenizer_class": "CohereTokenizer",
+ "unk_token": null,
+ "use_default_system_prompt": false,
+ "vocab_file": null
+ }