OloriBern committed on
Commit 45baf67 · verified · 1 Parent(s): 17e2a46

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,365 @@
+ ---
+ tags:
+ - sentence-transformers
+ - cross-encoder
+ - reranker
+ - generated_from_trainer
+ - dataset_size:6313
+ - loss:BinaryCrossEntropyLoss
+ pipeline_tag: text-ranking
+ library_name: sentence-transformers
+ metrics:
+ - pearson
+ - spearman
+ model-index:
+ - name: CrossEncoder
+   results:
+   - task:
+       type: cross-encoder-correlation
+       name: Cross Encoder Correlation
+     dataset:
+       name: explorer validation
+       type: explorer-validation
+     metrics:
+     - type: pearson
+       value: 0.9535121580256509
+       name: Pearson
+     - type: spearman
+       value: 0.9229114028784231
+       name: Spearman
+ ---
+
+ # CrossEncoder
+
+ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model trained using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Cross Encoder
+ <!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Output Labels:** 1 label
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
+ - **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import CrossEncoder
+
+ # Download from the 🤗 Hub
+ model = CrossEncoder("cross_encoder_model_id")
+ # Get scores for pairs of texts
+ pairs = [
+ ['How far from the city with the highest cost of living in the nation is Stanford?', 'Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters\'s only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.'],
+ ['What year was the university accepting classes from the Dockyard Technical College after its closure founded?', 'The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened in 1880 in Keyham, it trained engineering students for five years before they completed the remaining two years of the course at Greenwich. The college closed in 1910, but in 1940 a new college opened at Manadon. This was renamed Dockyard Technical College in 1959 before finally closing in 1994; training was transferred to the University of Southampton.'],
+ ['What is the main research library at the place where Torben Grodal is employed?', "The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education comparable to those of the Ivy League."],
+ ['When did the band which released Violent and Lazy form?', 'Grinspoon is an Australian rock band from Lismore, New South Wales formed in 1995 and fronted by Phil Jamieson on vocals and guitar with Pat Davern on guitar, Joe Hansen on bass guitar and Kristian Hopes on drums. Also in 1995, they won the Triple J-sponsored Unearthed competition for Lismore, with their post-grunge song "Sickfest". Their name was taken from Dr. Lester Grinspoon an Associate Professor Emeritus of Psychiatry at Harvard Medical School, who supports marijuana for medical use.'],
+ ['Who did the actor who plays toy Santa in Santa Clause 2 play in Toy Story?', "Victoria was the daughter of Prince Edward, Duke of Kent and Strathearn, the fourth son of King George III. Both the Duke of Kent and King George III died in 1820, and Victoria was raised under close supervision by her German-born mother Princess Victoria of Saxe-Coburg-Saalfeld. She inherited the throne aged 18, after her father's three elder brothers had all died, leaving no surviving legitimate children. The United Kingdom was already an established constitutional monarchy, in which the sovereign held relatively little direct political power. Privately, Victoria attempted to influence government policy and ministerial appointments; publicly, she became a national icon who was identified with strict standards of personal morality."],
+ ]
+ scores = model.predict(pairs)
+ print(scores.shape)
+ # (5,)
+
+ # Or rank different texts based on similarity to a single text
+ ranks = model.rank(
+ 'How far from the city with the highest cost of living in the nation is Stanford?',
+ [
+ 'Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters\'s only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.',
+ 'The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened in 1880 in Keyham, it trained engineering students for five years before they completed the remaining two years of the course at Greenwich. The college closed in 1910, but in 1940 a new college opened at Manadon. This was renamed Dockyard Technical College in 1959 before finally closing in 1994; training was transferred to the University of Southampton.',
+ "The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education comparable to those of the Ivy League.",
+ 'Grinspoon is an Australian rock band from Lismore, New South Wales formed in 1995 and fronted by Phil Jamieson on vocals and guitar with Pat Davern on guitar, Joe Hansen on bass guitar and Kristian Hopes on drums. Also in 1995, they won the Triple J-sponsored Unearthed competition for Lismore, with their post-grunge song "Sickfest". Their name was taken from Dr. Lester Grinspoon an Associate Professor Emeritus of Psychiatry at Harvard Medical School, who supports marijuana for medical use.',
+ "Victoria was the daughter of Prince Edward, Duke of Kent and Strathearn, the fourth son of King George III. Both the Duke of Kent and King George III died in 1820, and Victoria was raised under close supervision by her German-born mother Princess Victoria of Saxe-Coburg-Saalfeld. She inherited the throne aged 18, after her father's three elder brothers had all died, leaving no surviving legitimate children. The United Kingdom was already an established constitutional monarchy, in which the sovereign held relatively little direct political power. Privately, Victoria attempted to influence government policy and ministerial appointments; publicly, she became a national icon who was identified with strict standards of personal morality.",
+ ]
+ )
+ # [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
+ ```
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ ## Evaluation
+
+ ### Metrics
+
+ #### Cross Encoder Correlation
+
+ * Dataset: `explorer-validation`
+ * Evaluated with [<code>CECorrelationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CECorrelationEvaluator)
+
+ | Metric | Value |
+ |:-------------|:-----------|
+ | pearson | 0.9535 |
+ | **spearman** | **0.9229** |
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 6,313 training samples
+ * Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | sentence_0 | sentence_1 | label |
+ |:--------|:------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 31 characters</li><li>mean: 79.25 characters</li><li>max: 168 characters</li></ul> | <ul><li>min: 111 characters</li><li>mean: 805.66 characters</li><li>max: 7035 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.52</li><li>max: 1.0</li></ul> |
+ * Samples:
+ | sentence_0 | sentence_1 | label |
+ |:---------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+ | <code>How far from the city with the highest cost of living in the nation is Stanford?</code> | <code>Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters's only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.</code> | <code>0.0</code> |
+ | <code>What year was the university accepting classes from the Dockyard Technical College after its closure founded?</code> | <code>The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened...</code> | <code>1.0</code> |
+ | <code>What is the main research library at the place where Torben Grodal is employed?</code> | <code>The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education ...</code> | <code>0.0</code> |
+ * Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
+ ```json
+ {
+     "activation_fn": "torch.nn.modules.linear.Identity",
+     "pos_weight": null
+ }
+ ```
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `eval_strategy`: steps
+ - `per_device_train_batch_size`: 4
+ - `per_device_eval_batch_size`: 4
+ - `num_train_epochs`: 1
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: steps
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 4
+ - `per_device_eval_batch_size`: 4
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: None
+ - `warmup_ratio`: 0.0
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 42
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `bf16`: False
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 0
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `parallelism_config`: None
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch_fused
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `project`: huggingface
+ - `trackio_space_id`: trackio
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `hub_revision`: None
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: no
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `liger_kernel_config`: None
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: True
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+ - `router_mapping`: {}
+ - `learning_rate_mapping`: {}
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss | explorer-validation_spearman |
+ |:------:|:----:|:-------------:|:----------------------------:|
+ | 0.3167 | 500 | 0.3547 | 0.9187 |
+ | 0.6333 | 1000 | 0.3377 | 0.9136 |
+ | 0.9500 | 1500 | 0.3264 | 0.9231 |
+ | 1.0 | 1579 | - | 0.9239 |
+ | -1 | -1 | - | 0.9239 |
+ | 0.3167 | 500 | 0.3111 | 0.9127 |
+ | 0.6333 | 1000 | 0.3306 | 0.9199 |
+ | 0.9500 | 1500 | 0.3175 | 0.9221 |
+ | 1.0 | 1579 | - | 0.9224 |
+ | -1 | -1 | - | 0.9224 |
+ | 0.3167 | 500 | 0.2913 | 0.9160 |
+ | 0.6333 | 1000 | 0.3053 | 0.9229 |
+
+
+ ### Framework Versions
+ - Python: 3.12.11
+ - Sentence Transformers: 5.2.0
+ - Transformers: 4.57.6
+ - PyTorch: 2.9.1+cu128
+ - Accelerate: 1.12.0
+ - Datasets: 4.5.0
+ - Tokenizers: 0.22.2
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+     title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+     author = "Reimers, Nils and Gurevych, Iryna",
+     booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+     month = "11",
+     year = "2019",
+     publisher = "Association for Computational Linguistics",
+     url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
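Review note on the README's training logs: the step counts appear to follow from the figures the card itself states (6,313 samples, `per_device_train_batch_size` 4, `gradient_accumulation_steps` 1, `dataloader_drop_last` False). The sketch below reproduces that arithmetic, assuming a single training device; `steps_per_epoch` is a hypothetical helper written for this note, not a library function.

```python
import math


def steps_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of optimizer steps in one epoch for a single device."""
    if drop_last:
        # Incomplete final batch is discarded.
        return num_samples // batch_size
    # Incomplete final batch still counts as one step.
    return math.ceil(num_samples / batch_size)


total_steps = steps_per_epoch(6313, 4)
print(total_steps)  # 1579

# Fraction of the epoch completed at each logged step.
for step in (500, 1000, 1500):
    print(step, round(step / total_steps, 4))
```

Under these assumptions the fractions come out as 0.3167, 0.6333 and 0.95, which line up with the Epoch column in the logs.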
config.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "architectures": [
+     "XLMRobertaForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "LABEL_0"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "label2id": {
+     "LABEL_0": 0
+   },
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 8194,
+   "model_type": "xlm-roberta",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "output_past": true,
+   "pad_token_id": 1,
+   "position_embedding_type": "absolute",
+   "sentence_transformers": {
+     "activation_fn": "torch.nn.modules.activation.Sigmoid",
+     "version": "5.2.0"
+   },
+   "transformers_version": "4.57.6",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 250002
+ }
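Review note on `config.json`: the README lists `BinaryCrossEntropyLoss` with an `Identity` activation, while this file stores a `Sigmoid` under `sentence_transformers.activation_fn`. One plausible reading, stated here as an assumption rather than something the commit confirms, is that the loss is computed on raw logits during training and the sigmoid is applied only at prediction time to map scores into [0, 1]. The plain-Python sketch below illustrates that relationship; `sigmoid` and `bce_with_logits` are illustrative helpers, not library calls.

```python
import math


def sigmoid(logit):
    """Numerically stable logistic function."""
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


def bce_with_logits(logit, label):
    """Binary cross-entropy computed directly from a raw logit.

    Equivalent to -(y*log(p) + (1-y)*log(1-p)) with p = sigmoid(logit),
    written in the overflow-safe form max(x, 0) - x*y + log(1 + exp(-|x|)).
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


raw_logit = 2.0                          # hypothetical single-label model output
print(sigmoid(raw_logit))                # score reported at inference time
print(bce_with_logits(raw_logit, 1.0))   # training loss if the pair is relevant
print(bce_with_logits(raw_logit, 0.0))   # training loss if the pair is irrelevant
```

Working on logits during training avoids taking the log of a saturated probability, which is the usual reason for keeping the sigmoid out of the loss.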
eval/CrossEncoderCorrelationEvaluator_explorer-validation_results.csv ADDED
@@ -0,0 +1,7 @@
+ epoch,steps,Pearson_Correlation,Spearman_Correlation
+ 1.0,1578,0.8501751995258846,0.8905569617674473
+ 1.0,1578,0.8570334177160368,0.8942484497672712
+ 1.0,1578,0.8568699047421272,0.8848450033650819
+ 1.0,1579,0.9527875709381797,0.9238502423936007
+ 1.0,1579,0.9628350133547275,0.92242053984088
+ 1.0,1579,0.9621289158326546,0.9224219171223669
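Review note on the evaluation CSV: the two columns are Pearson and Spearman correlations between predicted scores and gold labels. As a reminder of what they measure, here is a dependency-free sketch; the evaluator in the library presumably relies on a statistics package instead, so treat the helpers below (`pearson`, `average_ranks`, `spearman`) and the toy data as illustrative only.

```python
import math


def pearson(xs, ys):
    """Linear correlation between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Rank correlation: Pearson applied to the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


labels = [0.0, 1.0, 0.0, 1.0]   # toy gold labels, not taken from the dataset
scores = [0.1, 0.9, 0.3, 0.6]   # toy model scores
print(pearson(scores, labels), spearman(scores, labels))
```

When many labels are tied, the tie handling in `average_ranks` affects the Spearman value, which is one reason the two columns can differ noticeably.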
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:aa3db09a2ef45f714a3cc2b712f624056f2137c11ddb5b84154e20160b26c846
+ size 2271071852
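Review note on `model.safetensors`: the diff shows only a Git LFS pointer (version, object id, byte size), not the weights. The sketch below parses those three fields and derives a rough parameter count from the size, assuming every weight is stored as 4-byte float32 (as `"dtype": "float32"` in `config.json` suggests) and ignoring the safetensors header. `parse_lfs_pointer` is a hypothetical helper written for this note.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:aa3db09a2ef45f714a3cc2b712f624056f2137c11ddb5b84154e20160b26c846
size 2271071852
"""

info = parse_lfs_pointer(pointer_text)
# Upper bound on the parameter count: 4 bytes per float32 weight,
# header overhead ignored (assumption).
approx_params = info["size"] // 4
print(info["hash_algo"], info["size"], approx_params)
```

Under that assumption the file holds roughly 568 million parameters.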
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d9a6af42442a3e3e9f05f618eae0bb2d98ca4f6a6406cb80ef7a4fa865204d61
+ size 17083052
tokenizer_config.json ADDED
@@ -0,0 +1,63 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "250001": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "mask_token": "<mask>",
+   "max_length": 512,
+   "model_max_length": 512,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "sp_model_kwargs": {},
+   "stride": 0,
+   "tokenizer_class": "XLMRobertaTokenizerFast",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "<unk>"
+ }
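Review note on `tokenizer_config.json`: with `model_max_length` 512, `truncation_strategy` `longest_first` and `truncation_side` `right`, a long passage gets cut from the end before the short query does. The sketch below is a simplified model of that behaviour on lists of token ids, assuming the XLM-RoBERTa pair layout `<s> A </s></s> B </s>` (four special tokens, ids 0 and 2 as in `config.json`). The tie-breaking rule and the helper names are assumptions made for this note; the real tokenizer may differ in detail.

```python
def truncate_longest_first(query_ids, passage_ids, max_length=512, num_special=4):
    """Drop tokens from the right of whichever sequence is longer until the pair fits."""
    budget = max_length - num_special
    query, passage = list(query_ids), list(passage_ids)
    while len(query) + len(passage) > budget:
        if len(query) > len(passage):
            query.pop()      # right-side truncation of the query
        else:
            passage.pop()    # right-side truncation of the passage (also on ties)
    return query, passage


def build_pair(query_ids, passage_ids, bos_id=0, eos_id=2):
    """Assemble <s> query </s></s> passage </s> as a flat list of ids."""
    return [bos_id] + query_ids + [eos_id, eos_id] + passage_ids + [eos_id]


query = list(range(10, 30))        # 20 made-up token ids
passage = list(range(100, 1100))   # 1000 made-up token ids
short_query, short_passage = truncate_longest_first(query, passage)
encoded = build_pair(short_query, short_passage)
print(len(short_query), len(short_passage), len(encoded))
```

In this toy case the query keeps all 20 tokens and the passage is cut to 488, so passages longer than the budget lose their tail; evidence that sits late in a long passage would not reach the model.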