---
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:6310
- loss:BinaryCrossEntropyLoss
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- pearson
- spearman
model-index:
- name: CrossEncoder
results:
- task:
type: cross-encoder-correlation
name: Cross Encoder Correlation
dataset:
name: explorer validation
type: explorer-validation
metrics:
- type: pearson
value: 0.904792870972471
name: Pearson
- type: spearman
value: 0.9035997861017825
name: Spearman
---
# CrossEncoder
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model trained using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
## Model Details
### Model Description
- **Model Type:** Cross Encoder
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
- **Maximum Sequence Length:** 512 tokens
- **Number of Output Labels:** 1 label
<!-- - **Training Dataset:** Unknown -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->
### Model Sources
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
## Usage
### Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("OloriBern/bge-m3-musique-hybrid-3ep")
# Get scores for pairs of texts
pairs = [
["In which year did the publisher of Roald Dahl's Guide to Railway Safety cease to exist?", "Glacier Park International Airport is in Flathead County, Montana, six miles northeast of Kalispell. The airport is owned and operated by the Flathead Municipal Airport Authority, a public agency created by the county in 1974.</s>Roald Dahl's Guide to Railway Safety was published in 1991 by the British Railways Board. The British Railways Board had asked Roald Dahl to write the text of the booklet, and Quentin Blake to illustrate it, to help young people enjoy using the railways safely.</s>The British Railways Board (BRB) was a nationalised industry in the United Kingdom that operated from 1963 to 2001. Until 1997 it was responsible for most railway services in Great Britain, trading under the brand name British Railways and, from 1965, British Rail. It did not operate railways in Northern Ireland, where railways were the responsibility of the Government of Northern Ireland."],
['Who is the brother of the developer of Buffy the Animated Series?', "Waylon Malloy Payne (born April 5, 1972) is an American country singer, songwriter, musician and actor. He is the son of the country singer Sammi Smith.</s>WVFM, known simply as FM 106.5 and formerly WQLR, is a Classic Hits-leaning Adult Contemporary outlet serving the Kalamazoo, Michigan radio market. The station's frequency is 106.5\xa0MHz on the FM dial with an ERP of 33\xa0kW. They are owned by Midwest Communications. WVFM 106.5 is located on a crowded Frequency across south-west lower Michigan. The station covers all of Kalamazoo/Battle Creek area, can be heard well in the Grand Rapids area, and reaches as far north and east as Lansing and Jackson. During summer months, the station can be received to Flint and Ann Arbor on occasion.</s>Fred Iltis (Brno, Czechoslovakia, April 20, 1923 – San Jose, California, December 11, 2008) was an American entomologist. His research focused on the biosystematics and life cycle of mosquitoes."],
['Who did the performer of So Under Pressure play in Home and Away?', 'The Acoma Massacre was fought in January 1599 between Spanish conquistadors and Acoma Native Americans in what is now New Mexico. After twelve soldiers were killed at Acoma Pueblo in 1598, the Spanish retaliated by launching a punitive expedition, which led to the deaths of around 800 men, women and children during a three - day battle. Several hundred survivors were also enslaved or otherwise severely punished.</s>The NBA Championship ring is an annual award given by the National Basketball Association to the team that wins the NBA Finals. Rings are presented to the team\'s players, coaches, and members of the executive front office. Red Auerbach has the most rings overall with 16. Phil Jackson has the most as coach and Bill Russell has the most as a player (11 each)</s>Tibetan sources say Deshin Shekpa also persuaded the Yongle Emperor not to impose his military might on Tibet as the Mongols had previously done. Thinley writes that before the Karmapa returned to Tibet, the Yongle Emperor began planning to send a military force into Tibet to forcibly give the Karmapa authority over all the Tibetan Buddhist schools but Deshin Shekpa dissuaded him. However, Hok-Lam Chan states that "there is little evidence that this was ever the emperor\'s intention" and that evidence indicates that Deshin Skekpa was invited strictly for religious purposes.'],
["Big Bill Morganfield's father is associated with which subgenre of the blues?", 'William "Big Bill" Morganfield (born June 19, 1956) is an American blues singer and guitarist, who is the son of Muddy Waters.</s>WNZK is a radio station in Dearborn Heights, Michigan, United States. It began broadcasting October 12, 1985. It broadcasts in the AM radio band at 690 kHz during the daytime and at 680\xa0kHz at night. This is to protect the nighttime pattern of Montreal, Quebec\'s CKGM, a clear-channel station on 690. WNZK is the only North American AM station to broadcast on two frequencies, at least according to the FCC online database.</s>McKinley Morganfield (April 4, 1913 -- April 30, 1983), known professionally as Muddy Waters, was an American blues musician who is often cited as the ``father of modern Chicago blues \'\'.'],
['Who was played by the Nothing That You Are performer in Princess Diaries?', 'The Madonna of Chancellor Rolin is an oil painting by the Early Netherlandish master Jan van Eyck, dating from around 1435. It is kept in the Musée du Louvre, Paris, and was commissioned by Nicolas Rolin, aged 60, chancellor of the Duchy of Burgundy, whose votive portrait takes up the left side of the picture, for his parish church, "Notre-Dame-du-Chastel" in Autun, where it remained until the church burnt down in 1793. After a period in Autun Cathedral, it was moved to the Louvre in 1805.</s>Jon Monday (born 1947 in San Jose, California) is an American producer and distributor of CDs and DVDs across an eclectic range of material such as Swami Prabhavananda, Aldous Huxley, Christopher Isherwood, Huston Smith, Chalmers Johnson, and Charles Bukowski. Monday directed and co-produced with Jennifer Douglas the feature-length documentary "Save KLSD: Media Consolidation and Local Radio". He is also President of Benchmark Recordings, which owns and distributes the early catalog of The Fabulous Thunderbirds CDs and a live recording of Mike Bloomfield.</s>Strictly Inc. is the self-titled project album, released by Genesis keyboardist Tony Banks, and Wang Chung lead vocalist Jack Hues, in 1995 on Virgin Records. Tony Banks wanted the album release—as the title suggested—with no reference to the band members; but the record company went against his wishes. This was Banks\' fifth studio album (his second issued under a band name and seventh album overall).'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
"In which year did the publisher of Roald Dahl's Guide to Railway Safety cease to exist?",
[
"Glacier Park International Airport is in Flathead County, Montana, six miles northeast of Kalispell. The airport is owned and operated by the Flathead Municipal Airport Authority, a public agency created by the county in 1974.</s>Roald Dahl's Guide to Railway Safety was published in 1991 by the British Railways Board. The British Railways Board had asked Roald Dahl to write the text of the booklet, and Quentin Blake to illustrate it, to help young people enjoy using the railways safely.</s>The British Railways Board (BRB) was a nationalised industry in the United Kingdom that operated from 1963 to 2001. Until 1997 it was responsible for most railway services in Great Britain, trading under the brand name British Railways and, from 1965, British Rail. It did not operate railways in Northern Ireland, where railways were the responsibility of the Government of Northern Ireland.",
"Waylon Malloy Payne (born April 5, 1972) is an American country singer, songwriter, musician and actor. He is the son of the country singer Sammi Smith.</s>WVFM, known simply as FM 106.5 and formerly WQLR, is a Classic Hits-leaning Adult Contemporary outlet serving the Kalamazoo, Michigan radio market. The station's frequency is 106.5\xa0MHz on the FM dial with an ERP of 33\xa0kW. They are owned by Midwest Communications. WVFM 106.5 is located on a crowded Frequency across south-west lower Michigan. The station covers all of Kalamazoo/Battle Creek area, can be heard well in the Grand Rapids area, and reaches as far north and east as Lansing and Jackson. During summer months, the station can be received to Flint and Ann Arbor on occasion.</s>Fred Iltis (Brno, Czechoslovakia, April 20, 1923 – San Jose, California, December 11, 2008) was an American entomologist. His research focused on the biosystematics and life cycle of mosquitoes.",
'The Acoma Massacre was fought in January 1599 between Spanish conquistadors and Acoma Native Americans in what is now New Mexico. After twelve soldiers were killed at Acoma Pueblo in 1598, the Spanish retaliated by launching a punitive expedition, which led to the deaths of around 800 men, women and children during a three - day battle. Several hundred survivors were also enslaved or otherwise severely punished.</s>The NBA Championship ring is an annual award given by the National Basketball Association to the team that wins the NBA Finals. Rings are presented to the team\'s players, coaches, and members of the executive front office. Red Auerbach has the most rings overall with 16. Phil Jackson has the most as coach and Bill Russell has the most as a player (11 each)</s>Tibetan sources say Deshin Shekpa also persuaded the Yongle Emperor not to impose his military might on Tibet as the Mongols had previously done. Thinley writes that before the Karmapa returned to Tibet, the Yongle Emperor began planning to send a military force into Tibet to forcibly give the Karmapa authority over all the Tibetan Buddhist schools but Deshin Shekpa dissuaded him. However, Hok-Lam Chan states that "there is little evidence that this was ever the emperor\'s intention" and that evidence indicates that Deshin Skekpa was invited strictly for religious purposes.',
'William "Big Bill" Morganfield (born June 19, 1956) is an American blues singer and guitarist, who is the son of Muddy Waters.</s>WNZK is a radio station in Dearborn Heights, Michigan, United States. It began broadcasting October 12, 1985. It broadcasts in the AM radio band at 690 kHz during the daytime and at 680\xa0kHz at night. This is to protect the nighttime pattern of Montreal, Quebec\'s CKGM, a clear-channel station on 690. WNZK is the only North American AM station to broadcast on two frequencies, at least according to the FCC online database.</s>McKinley Morganfield (April 4, 1913 -- April 30, 1983), known professionally as Muddy Waters, was an American blues musician who is often cited as the ``father of modern Chicago blues \'\'.',
'The Madonna of Chancellor Rolin is an oil painting by the Early Netherlandish master Jan van Eyck, dating from around 1435. It is kept in the Musée du Louvre, Paris, and was commissioned by Nicolas Rolin, aged 60, chancellor of the Duchy of Burgundy, whose votive portrait takes up the left side of the picture, for his parish church, "Notre-Dame-du-Chastel" in Autun, where it remained until the church burnt down in 1793. After a period in Autun Cathedral, it was moved to the Louvre in 1805.</s>Jon Monday (born 1947 in San Jose, California) is an American producer and distributor of CDs and DVDs across an eclectic range of material such as Swami Prabhavananda, Aldous Huxley, Christopher Isherwood, Huston Smith, Chalmers Johnson, and Charles Bukowski. Monday directed and co-produced with Jennifer Douglas the feature-length documentary "Save KLSD: Media Consolidation and Local Radio". He is also President of Benchmark Recordings, which owns and distributes the early catalog of The Fabulous Thunderbirds CDs and a live recording of Mike Bloomfield.</s>Strictly Inc. is the self-titled project album, released by Genesis keyboardist Tony Banks, and Wang Chung lead vocalist Jack Hues, in 1995 on Virgin Records. Tony Banks wanted the album release—as the title suggested—with no reference to the band members; but the record company went against his wishes. This was Banks\' fifth studio album (his second issued under a band name and seventh album overall).',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```
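The passages in the example above are several paragraphs joined with a `</s>` separator. A minimal sketch of preparing inputs in that shape and picking the best results from a list of scores follows. The helper names `join_paragraphs` and `top_k` are hypothetical, the scores are made-up numbers, and it is only an assumption that inference inputs should mirror the format of the training samples.

```python
from typing import Sequence

# Separator seen in the example passages; assumed to match the training data.
SEPARATOR = "</s>"


def join_paragraphs(paragraphs: Sequence[str], sep: str = SEPARATOR) -> str:
    """Join candidate paragraphs into one passage string, skipping empty ones."""
    return sep.join(p.strip() for p in paragraphs if p and p.strip())


def top_k(scores: Sequence[float], k: int = 3) -> list[tuple[int, float]]:
    """Return (index, score) pairs for the k highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda item: item[1], reverse=True)
    return [(index, float(score)) for index, score in ranked[:k]]


passage = join_paragraphs([
    "Roald Dahl's Guide to Railway Safety was published in 1991 by the British Railways Board.",
    "The British Railways Board (BRB) was a nationalised industry in the United Kingdom that operated from 1963 to 2001.",
])

# With the real model the scores would come from:
#   scores = model.predict([(query, passage), ...])
example_scores = [0.12, 2.7, -1.4]  # made-up values, for illustration only
print(top_k(example_scores, k=2))
```

`model.rank` already returns results ordered by score, so `top_k` is only useful when working from the raw output of `model.predict`.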
<!--
### Direct Usage (Transformers)
<details><summary>Click to see the direct usage in Transformers</summary>
</details>
-->
<!--
### Downstream Usage (Sentence Transformers)
You can finetune this model on your own dataset.
<details><summary>Click to expand</summary>
</details>
-->
<!--
### Out-of-Scope Use
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->
## Evaluation
### Metrics
#### Cross Encoder Correlation
* Dataset: `explorer-validation`
* Evaluated with [<code>CECorrelationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CECorrelationEvaluator)
| Metric | Value |
|:-------------|:-----------|
| pearson | 0.9048 |
| **spearman** | **0.9036** |
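
Both metrics compare the model's predicted scores against the gold labels: Pearson measures linear agreement, Spearman measures agreement of the rank orderings. The sketch below computes both in plain Python on made-up numbers. It illustrates the definitions only and is not the evaluator's implementation; the function names are hypothetical.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation: covariance divided by the product of the spreads."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


labels = [1.0, 0.0, 0.0, 1.0, 0.0]      # gold relevance labels (illustrative)
scores = [3.1, -2.4, -0.7, 1.9, -3.0]   # made-up model scores
print(pearson(scores, labels), spearman(scores, labels))
```

Because the labels here are binary, many of them tie, which is why the rank helper averages tied positions.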
<!--
## Bias, Risks and Limitations
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->
<!--
### Recommendations
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->
## Training Details
### Training Dataset
#### Unnamed Dataset
* Size: 6,310 training samples
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
* Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 | label |
|:--------|:------------------------------------------------------------------------------------------------|:----------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
| type | string | string | float |
| details | <ul><li>min: 31 characters</li><li>mean: 79.26 characters</li><li>max: 168 characters</li></ul> | <ul><li>min: 606 characters</li><li>mean: 1644.98 characters</li><li>max: 4551 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.51</li><li>max: 1.0</li></ul> |
* Samples:
| sentence_0 | sentence_1 | label |
|:-----------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
| <code>In which year did the publisher of Roald Dahl's Guide to Railway Safety cease to exist?</code> | <code>Glacier Park International Airport is in Flathead County, Montana, six miles northeast of Kalispell. The airport is owned and operated by the Flathead Municipal Airport Authority, a public agency created by the county in 1974.</s>Roald Dahl's Guide to Railway Safety was published in 1991 by the British Railways Board. The British Railways Board had asked Roald Dahl to write the text of the booklet, and Quentin Blake to illustrate it, to help young people enjoy using the railways safely.</s>The British Railways Board (BRB) was a nationalised industry in the United Kingdom that operated from 1963 to 2001. Until 1997 it was responsible for most railway services in Great Britain, trading under the brand name British Railways and, from 1965, British Rail. It did not operate railways in Northern Ireland, where railways were the responsibility of the Government of Northern Ireland.</code> | <code>1.0</code> |
| <code>Who is the brother of the developer of Buffy the Animated Series?</code> | <code>Waylon Malloy Payne (born April 5, 1972) is an American country singer, songwriter, musician and actor. He is the son of the country singer Sammi Smith.</s>WVFM, known simply as FM 106.5 and formerly WQLR, is a Classic Hits-leaning Adult Contemporary outlet serving the Kalamazoo, Michigan radio market. The station's frequency is 106.5 MHz on the FM dial with an ERP of 33 kW. They are owned by Midwest Communications. WVFM 106.5 is located on a crowded Frequency across south-west lower Michigan. The station covers all of Kalamazoo/Battle Creek area, can be heard well in the Grand Rapids area, and reaches as far north and east as Lansing and Jackson. During summer months, the station can be received to Flint and Ann Arbor on occasion.</s>Fred Iltis (Brno, Czechoslovakia, April 20, 1923 – San Jose, California, December 11, 2008) was an American entomologist. His research focused on the biosystematics and life cycle of mosquitoes.</code> | <code>0.0</code> |
| <code>Who did the performer of So Under Pressure play in Home and Away?</code> | <code>The Acoma Massacre was fought in January 1599 between Spanish conquistadors and Acoma Native Americans in what is now New Mexico. After twelve soldiers were killed at Acoma Pueblo in 1598, the Spanish retaliated by launching a punitive expedition, which led to the deaths of around 800 men, women and children during a three - day battle. Several hundred survivors were also enslaved or otherwise severely punished.</s>The NBA Championship ring is an annual award given by the National Basketball Association to the team that wins the NBA Finals. Rings are presented to the team's players, coaches, and members of the executive front office. Red Auerbach has the most rings overall with 16. Phil Jackson has the most as coach and Bill Russell has the most as a player (11 each)</s>Tibetan sources say Deshin Shekpa also persuaded the Yongle Emperor not to impose his military might on Tibet as the Mongols had previously done. Thinley writes that before the Karmapa returned to Tibet, the Yongle Empe...</code> | <code>0.0</code> |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
```json
{
"activation_fn": "torch.nn.modules.linear.Identity",
"pos_weight": null
}
```
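The parameters above leave the model's raw output untouched (`Identity`) and apply no extra weight to positive pairs (`pos_weight` is null). Below is a sketch of the per-pair quantity such a loss minimizes, assuming the sigmoid is folded into the loss the way PyTorch's `BCEWithLogitsLoss` does it. This is plain Python for illustration, not the library's implementation, and the function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw logit to a probability in (0, 1) without overflowing."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy for one raw logit and a label in [0, 1].

    Algebraically equal to
        -(label * log(sigmoid(logit)) + (1 - label) * log(1 - sigmoid(logit)))
    but written in the numerically stable form.
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


# A confident, correct prediction costs little; a confident, wrong one costs a lot.
print(bce_with_logits(2.0, 1.0))
print(bce_with_logits(2.0, 0.0))
print(sigmoid(2.0))
```

With no `pos_weight`, a positive and a negative pair that are equally wrong contribute the same loss.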
### Training Hyperparameters
#### Non-Default Hyperparameters
- `eval_strategy`: steps
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 4
- `num_train_epochs`: 1
#### All Hyperparameters
<details><summary>Click to expand</summary>
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 4
- `per_device_eval_batch_size`: 4
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: None
- `warmup_ratio`: 0.0
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
</details>
### Training Logs
| Epoch | Step | Training Loss | explorer-validation_spearman |
|:------:|:----:|:-------------:|:----------------------------:|
| 0.3169 | 500 | 0.3868 | 0.9090 |
| 0.6337 | 1000 | 0.4186 | 0.9041 |
| 0.9506 | 1500 | 0.3636 | 0.8895 |
| 1.0 | 1578 | - | 0.8906 |
| -1 | -1 | - | 0.8906 |
| 0.3169 | 500 | 0.3691 | 0.9014 |
| 0.6337 | 1000 | 0.3661 | 0.8900 |
| 0.9506 | 1500 | 0.3499 | 0.8936 |
| 1.0 | 1578 | - | 0.8942 |
| -1 | -1 | - | 0.8942 |
| 0.3169 | 500 | 0.3342 | 0.9036 |
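
The Epoch and Step columns are consistent with the dataset size and batch size reported above. A short worked check, assuming a single training device (the card does not state the device count) and using the listed `gradient_accumulation_steps` of 1:

```python
import math

train_samples = 6310           # from "Training Dataset"
per_device_batch_size = 4      # from the hyperparameters
num_devices = 1                # assumption: not stated in the card

# dataloader_drop_last is False, so the final partial batch still counts as a step.
steps_per_epoch = math.ceil(train_samples / (per_device_batch_size * num_devices))
print(steps_per_epoch)  # 1578


def epoch_at_step(step: int) -> float:
    """Fraction of an epoch completed after `step` optimizer steps."""
    return round(step / steps_per_epoch, 4)


print([epoch_at_step(s) for s in (500, 1000, 1500)])  # [0.3169, 0.6337, 0.9506]
```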
### Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.2.0
- Transformers: 4.57.6
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 4.5.0
- Tokenizers: 0.22.2
## Citation
### BibTeX
#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
```
<!--
## Glossary
*Clearly define terms in order to be accessible across audiences.*
-->
<!--
## Model Card Authors
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->
<!--
## Model Card Contact
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->