How to use Trelis/ms-marco-MiniLM-L-6-v2-2-constant-ep-MNRLpairs-2e-5-batch32-cuda-overlap with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("Trelis/ms-marco-MiniLM-L-6-v2-2-constant-ep-MNRLpairs-2e-5-batch32-cuda-overlap")
sentences = [
"What happens if a player in possession enters the defending team's seven-metre zone?",
"10. 8 if a touch is made in the in - goal area before the ball is grounded, the player in possession is to perform a rollball seven ( 7 ) metres from the team ’ s attacking try line, provided it is not the sixth touch and the player is not half. 10. 9 if a player in possession is touched while on or behind their defending try line, the touch counts and once the referee sets the mark seven ( 7 ) metres directly forward of the contact point from the defending team ’ s try line, a rollball is performed. 10. 10 if a player in possession intentionally makes a touch on an offside defender who is making every effort to retire and remain out of play, the touch counts. fit playing rules - 5th edition copyright © touch football australia 2020 9 10. 11 if a touch is made on a player in possession while the player is juggling the ball in an attempt to maintain control of it, the touch counts if the attacking player following the touch retains possession.",
"9. 2 on the change of possession due to an intercept, the first touch will be zero ( 0 ) touch. 9. 3 following the sixth touch or a loss of possession due to any other means, the ball must be returned to the mark without delay. ruling = a deliberate delay in the changeover procedure will result in a penalty awarded to the non - offending team ten ( 10 ) metres forward of the mark for the change of possession. 9. 4 if the ball is dropped or passed and goes to ground during play, a change of possession results. ruling = the mark for the change of possession is where the ball makes initial contact with the ground. 9. 5 if the ball, while still under the control of the half, contacts the ground in the in - goal area, possession is lost. ruling = play will restart with a rollball at the nearest point on the seven ( 7 ) metre line. fit playing rules - 5th edition 8 copyright © touch football australia 2020 9. 6 if a player mishandles the ball and even if in an effort to gain control, the ball is accidentally knocked forward into any other player, a change of possession results.",
"fit playing rules - 5th edition copyright © touch football australia 2020 9 10. 11 if a touch is made on a player in possession while the player is juggling the ball in an attempt to maintain control of it, the touch counts if the attacking player following the touch retains possession. 10. 12 if a player in possession is touched and subsequently makes contact with either the sideline, a field marker or the ground outside the field of play, the touch counts and play continues with a rollball at the mark where the touch occurred. 10. 13 when a player from the defending team enters its defensive seven metre zone, the defending team must move forward at a reasonable pace until a touch is imminent or made. ruling = a penalty to the attacking team at the point of the infringement. 10. 14 when a player in possession enters the defending teams ’ seven metre zone the defending team is not obliged to move forward but cannot retire back towards their try line until a touch is imminent or made. ruling = a penalty to the attacking team at the seven ( 7 ) metre line in line with the point of the infringement."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from cross-encoder/ms-marco-MiniLM-L-6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
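The Pooling module above has `pooling_mode_mean_tokens` set to True, so the sentence embedding is the average of the token embeddings. The following is a minimal pure-Python sketch of that idea, not the library's implementation: the function name `mean_pool`, the toy 2-dimensional vectors, and the choice to raise on an all-zero mask are assumptions made for illustration.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    totals = [0.0] * dim
    kept = 0
    for vector, mask in zip(token_embeddings, attention_mask):
        if mask:  # padding positions (mask == 0) are skipped
            kept += 1
            for i, value in enumerate(vector):
                totals[i] += value
    if kept == 0:
        # Illustrative choice for this sketch only.
        raise ValueError("attention mask selects no tokens")
    return [total / kept for total in totals]


# Toy input: three real tokens and one padding token,
# with 2 dimensions standing in for the model's 384.
tokens = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [3.0, 4.0]
```

The padding vector `[9.0, 9.0]` does not affect the result because its mask entry is 0.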
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("Trelis/ms-marco-MiniLM-L-6-v2-2-constant-ep-MNRLpairs-2e-5-batch32-cuda-overlap")
# Run inference
sentences = [
'What happens if a team is leading at the end of the two-minute period of extra time?',
'24. 1. 2 the drop - off commences with a tap from the centre of the halfway line by the team that did not commence the match with possession. 24. 1. 3 the drop - off will commence with a two ( 2 ) minute period of extra time. 24. 1. 4 should a team be leading at the expiration of the two ( 2 ) minute period of extra time then that team will be declared the winner and match complete. 24. 1. 5 should neither team be leading at the expiration of two ( 2 ) minutes, a signal is given and the match will pause at the next touch or dead ball. each team will then remove another player from the field of play. 24. 1. 6 the match will recommence immediately after the players have left the field at the same place where it paused ( i. e. the team retains possession at the designated number of touches, or at change of possession due to some infringement or the sixth touch ) and the match will continue until a try is scored. 24. 1. 7 there is no time off during the drop - off and the clock does not stop at the two ( 2 ) minute interval.',
'7. 7 the tap to commence or recommence play must be performed without delay. ruling = a penalty to the non - offending team at the centre of the halfway line. 8 match duration 8. 1 a match is 40 minutes in duration, consisting of two ( 2 ) x 20 minute halves with a half time break. 8. 1. 1 there is no time off for injury during a match. 8. 2 local competition and tournament conditions may vary the duration of a match. 8. 3 when time expires, play is to continue until the next touch or dead ball and end of play is signaled by the referee. 8. 3. 1 should a penalty be awarded during this period, the penalty is to be taken. 8. 4 if a match is abandoned in any circumstances other than those referred to in clause 24. 1. 6 the nta or nta competition provider in its sole discretion shall determine the result of the match. 9 possession 9. 1 the team with the ball is entitled to six ( 6 ) touches prior to a change of possession. 9. 2 on the change of possession due to an intercept, the first touch will be zero ( 0 ) touch.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
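In the snippet above, row 0 of `similarities` compares the question with every entry in `sentences`, so it can be used to rank the passages. The sketch below shows the underlying computation in pure Python, assuming cosine similarity is the scoring function (to my knowledge the library's default, but worth checking against the model's configuration). The helper names and the 3-dimensional vectors are hypothetical stand-ins for real 384-dimensional embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_passages(query_vec: Sequence[float], passage_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return passage indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical vectors; real ones would come from model.encode(...).
query = [1.0, 0.0, 1.0]
passages = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
order = rank_passages(query, passages)
print(order)  # [1, 2, 0]
```

Passage 1 ranks first because it points in the same direction as the query; cosine similarity ignores vector length.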
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
learning_rate: 2e-05
num_train_epochs: 2
lr_scheduler_type: constant
warmup_ratio: 0.3
bf16: True

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 32
per_device_eval_batch_size: 32
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 2
max_steps: -1
lr_scheduler_type: constant
lr_scheduler_kwargs: {}
warmup_ratio: 0.3
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | loss |
|---|---|---|---|
| 0.2857 | 2 | 3.7086 | 3.1734 |
| 0.5714 | 4 | 3.4714 | 3.0742 |
| 0.8571 | 6 | 3.41 | 3.0204 |
| 1.1429 | 8 | 3.042 | 2.9657 |
| 1.4286 | 10 | 3.3335 | 2.9125 |
| 1.7143 | 12 | 3.2224 | 2.8573 |
| 2.0 | 14 | 2.9969 | 2.8300 |
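The card does not name the training loss, but the model name contains "MNRLpairs" and the citations include Henderson et al. (2017), which suggests a multiple-negatives ranking loss over (question, passage) pairs. Under that assumption, the sketch below shows the core of such a loss: each question is scored against every passage in the batch, and cross-entropy rewards ranking its own passage first. The function name and the toy score matrices are hypothetical, and the similarity scaling a real implementation applies is omitted.

```python
import math
from typing import List


def in_batch_ranking_loss(scores: List[List[float]]) -> float:
    """Mean cross-entropy where the correct column for row i is column i.

    scores[i][j] is the similarity between question i and passage j;
    the other passages in the batch act as negatives.
    """
    losses = []
    for i, row in enumerate(scores):
        top = max(row)  # subtract the max for numerical stability
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in row))
        losses.append(log_sum_exp - row[i])
    return sum(losses) / len(losses)


# A scorer with no preference among 32 in-batch candidates.
uniform = [[0.0] * 32 for _ in range(32)]
print(round(in_batch_ranking_loss(uniform), 4))  # 3.4657

# A scorer that favours the matching passage gets a lower loss.
confident = [[5.0 if i == j else 0.0 for j in range(32)] for i in range(32)]
print(in_batch_ranking_loss(confident) < in_batch_ranking_loss(uniform))  # True
```

With a batch size of 32, an uninformative scorer gives ln(32) ≈ 3.47, which is the same range as the first logged losses in the table. That is only consistent with the assumption above and does not confirm it.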
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
microsoft/MiniLM-L12-H384-uncased