How to use OloriBern/bge-m3-musique-mixer-3ep with sentence-transformers:
from sentence_transformers import CrossEncoder

model = CrossEncoder("OloriBern/bge-m3-musique-mixer-3ep")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
metadata
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:6313
- loss:BinaryCrossEntropyLoss
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- pearson
- spearman
model-index:
- name: CrossEncoder
  results:
  - task:
      type: cross-encoder-correlation
      name: Cross Encoder Correlation
    dataset:
      name: explorer validation
      type: explorer-validation
    metrics:
    - type: pearson
      value: 0.9535121580256509
      name: Pearson
    - type: spearman
      value: 0.9229114028784231
      name: Spearman
CrossEncoder
This is a Cross Encoder model trained using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("OloriBern/bge-m3-musique-mixer-3ep")
# Get scores for pairs of texts
pairs = [
['How far from the city with the highest cost of living in the nation is Stanford?', 'Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters\'s only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.'],
['What year was the university accepting classes from the Dockyard Technical College after its closure founded?', 'The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened in 1880 in Keyham, it trained engineering students for five years before they completed the remaining two years of the course at Greenwich. The college closed in 1910, but in 1940 a new college opened at Manadon. This was renamed Dockyard Technical College in 1959 before finally closing in 1994; training was transferred to the University of Southampton.'],
['What is the main research library at the place where Torben Grodal is employed?', "The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education comparable to those of the Ivy League."],
['When did the band which released Violent and Lazy form?', 'Grinspoon is an Australian rock band from Lismore, New South Wales formed in 1995 and fronted by Phil Jamieson on vocals and guitar with Pat Davern on guitar, Joe Hansen on bass guitar and Kristian Hopes on drums. Also in 1995, they won the Triple J-sponsored Unearthed competition for Lismore, with their post-grunge song "Sickfest". Their name was taken from Dr. Lester Grinspoon an Associate Professor Emeritus of Psychiatry at Harvard Medical School, who supports marijuana for medical use.'],
['Who did the actor who plays toy Santa in Santa Clause 2 play in Toy Story?', "Victoria was the daughter of Prince Edward, Duke of Kent and Strathearn, the fourth son of King George III. Both the Duke of Kent and King George III died in 1820, and Victoria was raised under close supervision by her German-born mother Princess Victoria of Saxe-Coburg-Saalfeld. She inherited the throne aged 18, after her father's three elder brothers had all died, leaving no surviving legitimate children. The United Kingdom was already an established constitutional monarchy, in which the sovereign held relatively little direct political power. Privately, Victoria attempted to influence government policy and ministerial appointments; publicly, she became a national icon who was identified with strict standards of personal morality."],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'How far from the city with the highest cost of living in the nation is Stanford?',
[
'Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters\'s only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.',
'The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened in 1880 in Keyham, it trained engineering students for five years before they completed the remaining two years of the course at Greenwich. The college closed in 1910, but in 1940 a new college opened at Manadon. This was renamed Dockyard Technical College in 1959 before finally closing in 1994; training was transferred to the University of Southampton.',
"The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education comparable to those of the Ivy League.",
'Grinspoon is an Australian rock band from Lismore, New South Wales formed in 1995 and fronted by Phil Jamieson on vocals and guitar with Pat Davern on guitar, Joe Hansen on bass guitar and Kristian Hopes on drums. Also in 1995, they won the Triple J-sponsored Unearthed competition for Lismore, with their post-grunge song "Sickfest". Their name was taken from Dr. Lester Grinspoon an Associate Professor Emeritus of Psychiatry at Harvard Medical School, who supports marijuana for medical use.',
"Victoria was the daughter of Prince Edward, Duke of Kent and Strathearn, the fourth son of King George III. Both the Duke of Kent and King George III died in 1820, and Victoria was raised under close supervision by her German-born mother Princess Victoria of Saxe-Coburg-Saalfeld. She inherited the throne aged 18, after her father's three elder brothers had all died, leaving no surviving legitimate children. The United Kingdom was already an established constitutional monarchy, in which the sovereign held relatively little direct political power. Privately, Victoria attempted to influence government policy and ministerial appointments; publicly, she became a national icon who was identified with strict standards of personal morality.",
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
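The list-of-dicts shape shown above can also be rebuilt from the raw output of `predict`. The following is a minimal sketch, not the library's implementation: `rank_from_scores` is a hypothetical helper, and the scores are made-up placeholders rather than output from this model.

```python
from typing import Dict, List, Optional, Sequence, Union


def rank_from_scores(
    scores: Sequence[float], top_k: Optional[int] = None
) -> List[Dict[str, Union[int, float]]]:
    """Sort scores in descending order, keeping each passage's original index.

    Hypothetical helper that mirrors the {'corpus_id', 'score'} layout above.
    """
    ranked = sorted(
        ({"corpus_id": index, "score": float(score)} for index, score in enumerate(scores)),
        key=lambda item: item["score"],
        reverse=True,
    )
    return ranked if top_k is None else ranked[:top_k]


# Placeholder scores for five passages (assumed values, not model output).
example_scores = [0.02, 0.91, 0.15, 0.40, 0.07]
top_two = rank_from_scores(example_scores, top_k=2)
print(top_two)
```

Keeping `corpus_id` alongside the score lets a caller map each result back to its position in the original passage list after sorting.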
Evaluation
Metrics
Cross Encoder Correlation
- Dataset: explorer-validation
- Evaluated with CECorrelationEvaluator
| Metric | Value |
|---|---|
| pearson | 0.9535 |
| spearman | 0.9229 |
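For readers unfamiliar with the two metrics, here is a rough pure-Python sketch of how Pearson and Spearman correlation between predicted scores and gold labels can be computed. It is not the evaluator's own code; the function names are hypothetical and the sample values are placeholders, not the validation data.

```python
import math
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Placeholder labels and predictions (assumed values for illustration only).
labels = [0.0, 1.0, 0.0, 1.0, 0.0]
predictions = [0.1, 0.8, 0.3, 0.9, 0.2]
print(pearson(predictions, labels), spearman(predictions, labels))
```

Pearson rewards scores that are linearly related to the labels, while Spearman depends only on their ordering, which is why the two values in the table can differ.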
Training Details
Training Dataset
Unnamed Dataset
- Size: 6,313 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 31 characters; mean: 79.25 characters; max: 168 characters | min: 111 characters; mean: 805.66 characters; max: 7035 characters | min: 0.0; mean: 0.52; max: 1.0 |
- Samples:
| sentence_0 | sentence_1 | label |
|---|---|---|
| How far from the city with the highest cost of living in the nation is Stanford? | Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters's only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964. | 0.0 |
| What year was the university accepting classes from the Dockyard Technical College after its closure founded? | The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.The city was also home to the Royal Naval Engineering College; opened... | 1.0 |
| What is the main research library at the place where Torben Grodal is employed? | The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education ... | 0.0 |
- Loss: BinaryCrossEntropyLoss with these parameters:
{
    "activation_fn": "torch.nn.modules.linear.Identity",
    "pos_weight": null
}
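With `activation_fn` set to Identity, the model output reaches the loss unchanged, as a raw logit, and a null `pos_weight` means positive and negative pairs are weighted equally. The sketch below assumes the loss is the standard binary cross-entropy on logits (the quantity computed by `torch.nn.BCEWithLogitsLoss`); that equivalence is an assumption about the internals, and the function names and sample values are hypothetical.

```python
import math
from typing import Sequence


def bce_with_logits(logit: float, target: float) -> float:
    """Binary cross-entropy for one raw logit z and a target y in [0, 1].

    Uses the numerically stable form max(z, 0) - z * y + log(1 + exp(-|z|)),
    which equals -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))].
    """
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mean_bce_with_logits(logits: Sequence[float], targets: Sequence[float]) -> float:
    """Average the per-pair losses over a batch (no pos_weight applied)."""
    losses = [bce_with_logits(z, y) for z, y in zip(logits, targets)]
    return sum(losses) / len(losses)


# Placeholder batch of four (query, passage) pairs: assumed logits and labels.
batch_logits = [-3.2, 4.1, -0.5, 2.0]
batch_labels = [0.0, 1.0, 0.0, 1.0]
print(mean_bce_with_logits(batch_logits, batch_labels))
```

The stable form avoids computing `exp` of a large positive number, so very confident logits do not overflow.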
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- num_train_epochs: 1
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
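The combination of `lr_scheduler_type: linear`, `warmup_steps: 0`, and `learning_rate: 5e-05` implies a learning rate that decays in a straight line to zero over the run. The following sketch illustrates that shape under the assumption of 1579 optimizer steps per run (the final step count in the Training Logs table); `linear_lr` is a hypothetical function, not a call into the training library.

```python
def linear_lr(
    step: int,
    base_lr: float = 5e-05,
    total_steps: int = 1579,  # assumption: final step count from the training logs
    warmup_steps: int = 0,
) -> float:
    """Learning rate after `step` optimizer steps under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr (unused here because warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the logging steps that appear in the training logs.
for logged_step in (0, 500, 1000, 1500, 1579):
    print(logged_step, linear_lr(logged_step))
```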
Training Logs
| Epoch | Step | Training Loss | explorer-validation_spearman |
|---|---|---|---|
| 0.3167 | 500 | 0.3547 | 0.9187 |
| 0.6333 | 1000 | 0.3377 | 0.9136 |
| 0.9500 | 1500 | 0.3264 | 0.9231 |
| 1.0 | 1579 | - | 0.9239 |
| -1 | -1 | - | 0.9239 |
| 0.3167 | 500 | 0.3111 | 0.9127 |
| 0.6333 | 1000 | 0.3306 | 0.9199 |
| 0.9500 | 1500 | 0.3175 | 0.9221 |
| 1.0 | 1579 | - | 0.9224 |
| -1 | -1 | - | 0.9224 |
| 0.3167 | 500 | 0.2913 | 0.9160 |
| 0.6333 | 1000 | 0.3053 | 0.9229 |
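The Epoch and Step columns can be checked against the dataset size and batch size reported earlier. This is a small arithmetic sketch; the single-device setting is an assumption, since the card does not state how many devices were used.

```python
import math

dataset_size = 6313                # training samples, from the dataset section
per_device_train_batch_size = 4    # from the hyperparameters
gradient_accumulation_steps = 1    # from the hyperparameters
num_devices = 1                    # assumption: single-device training

# The last partial batch is kept (dataloader_drop_last: False), hence the ceiling.
steps_per_epoch = math.ceil(
    dataset_size / (per_device_train_batch_size * num_devices * gradient_accumulation_steps)
)


def epoch_at(step: int) -> float:
    """Fraction of an epoch completed after `step` optimizer steps, to 4 decimals."""
    return round(step / steps_per_epoch, 4)


print(steps_per_epoch, epoch_at(500), epoch_at(1000), epoch_at(1500))
```

Under that assumption the arithmetic reproduces the 1579 steps per run and the 0.3167, 0.6333, and 0.9500 epoch values in the table.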
Framework Versions
- Python: 3.12.11
- Sentence Transformers: 5.2.0
- Transformers: 4.57.6
- PyTorch: 2.9.1+cu128
- Accelerate: 1.12.0
- Datasets: 4.5.0
- Tokenizers: 0.22.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}