How to use tomaarsen/bert-base-uncased-gooaq-peft with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("tomaarsen/bert-base-uncased-gooaq-peft")
sentences = [
"are the sequels better than the prequels?",
"['Automatically.', 'When connected to car Bluetooth and,', 'Manually.']",
"The prequels are also not scared to take risks, making movies which are very different from the original trilogy. The sequel saga, on the other hand, are technically better made films, the acting is more consistent, the CGI is better and the writing is stronger, however it falls down in many other places.",
"While both public and private sectors use budgets as a key planning tool, public bodies balance budgets, while private sector firms use budgets to predict operating results. The public sector budget matches expenditures on mandated assets and services with receipts of public money such as taxes and fees."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from google-bert/bert-base-uncased on the gooaq dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
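The Pooling module above has pooling_mode_mean_tokens set to True, so a sentence embedding is the average of its token embeddings. The following is a minimal pure-Python sketch of that idea, assuming padded positions are excluded through an attention mask. The `mean_pool` helper and the toy 3-dimensional vectors are illustrative only and are not part of the library.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask value is 1 (hypothetical helper)."""
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input: three real tokens followed by one padding position.
tokens = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [2.0, 2.0, 2.0], [9.0, 9.0, 9.0]]
mask = [1, 1, 1, 0]

sentence_embedding = mean_pool(tokens, mask)
print(sentence_embedding)  # the padding vector does not influence the result
```

In the real model the same averaging runs over 768-dimensional BERT token outputs.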
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("tomaarsen/bert-base-uncased-gooaq-peft")
# Run inference
sentences = [
'how can i download youtube videos with internet download manager?',
"['Go to settings and then click on extensions (top left side in chrome).', 'Minimise your browser and open the location (folder) where IDM is installed. ... ', 'Find the file “IDMGCExt. ... ', 'Drag this file to your chrome browser and drop to install the IDM extension.']",
"Coca-Cola might rot your teeth and load your body with sugar and calories, but it's actually an effective and safe first line of treatment for some stomach blockages, researchers say.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
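The similarity call returns one score for every pair of inputs. The sketch below shows roughly what such a score matrix contains, assuming the similarity function is cosine similarity (the cosine_* metric names in this card suggest so). `cosine_similarity_matrix` and the toy 2-dimensional vectors are hypothetical stand-ins for the library call and for real embeddings.

```python
import math
from typing import List


def cosine_similarity_matrix(a: List[List[float]], b: List[List[float]]) -> List[List[float]]:
    """Return scores[i][j] = cosine similarity between a[i] and b[j]."""

    def norm(vec: List[float]) -> float:
        return math.sqrt(sum(x * x for x in vec))

    return [
        [sum(x * y for x, y in zip(u, v)) / (norm(u) * norm(v)) for v in b]
        for u in a
    ]


# Toy "embeddings": three vectors, so the result is a 3 x 3 matrix.
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
scores = cosine_similarity_matrix(toy, toy)

for row in scores:
    print([round(value, 4) for value in row])
```

The diagonal holds each vector compared with itself, and orthogonal vectors score 0.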
Datasets: NanoClimateFEVER, NanoDBPedia, NanoFEVER, NanoFiQA2018, NanoHotpotQA, NanoMSMARCO, NanoNFCorpus, NanoNQ, NanoQuoraRetrieval, NanoSCIDOCS, NanoArguAna, NanoSciFact and NanoTouche2020
Evaluated with InformationRetrievalEvaluator

| Metric | NanoClimateFEVER | NanoDBPedia | NanoFEVER | NanoFiQA2018 | NanoHotpotQA | NanoMSMARCO | NanoNFCorpus | NanoNQ | NanoQuoraRetrieval | NanoSCIDOCS | NanoArguAna | NanoSciFact | NanoTouche2020 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.3 | 0.48 | 0.6 | 0.22 | 0.64 | 0.22 | 0.32 | 0.4 | 0.84 | 0.3 | 0.16 | 0.38 | 0.3673 |
| cosine_accuracy@3 | 0.42 | 0.78 | 0.84 | 0.4 | 0.74 | 0.46 | 0.44 | 0.54 | 0.98 | 0.36 | 0.52 | 0.56 | 0.8571 |
| cosine_accuracy@5 | 0.48 | 0.82 | 0.9 | 0.5 | 0.82 | 0.54 | 0.46 | 0.62 | 0.98 | 0.54 | 0.72 | 0.64 | 0.9388 |
| cosine_accuracy@10 | 0.54 | 0.92 | 0.96 | 0.6 | 0.84 | 0.68 | 0.5 | 0.7 | 1.0 | 0.68 | 0.8 | 0.7 | 1.0 |
| cosine_precision@1 | 0.3 | 0.48 | 0.6 | 0.22 | 0.64 | 0.22 | 0.32 | 0.4 | 0.84 | 0.3 | 0.16 | 0.38 | 0.3673 |
| cosine_precision@3 | 0.16 | 0.46 | 0.28 | 0.18 | 0.3133 | 0.1533 | 0.2867 | 0.18 | 0.3867 | 0.2 | 0.1733 | 0.2 | 0.4966 |
| cosine_precision@5 | 0.116 | 0.416 | 0.184 | 0.14 | 0.224 | 0.108 | 0.244 | 0.124 | 0.24 | 0.192 | 0.144 | 0.14 | 0.449 |
| cosine_precision@10 | 0.066 | 0.39 | 0.098 | 0.098 | 0.118 | 0.068 | 0.178 | 0.072 | 0.13 | 0.142 | 0.08 | 0.078 | 0.3939 |
| cosine_recall@1 | 0.1483 | 0.0444 | 0.59 | 0.1144 | 0.32 | 0.22 | 0.0229 | 0.4 | 0.7573 | 0.0647 | 0.16 | 0.345 | 0.0307 |
| cosine_recall@3 | 0.21 | 0.1092 | 0.8 | 0.2189 | 0.47 | 0.46 | 0.0516 | 0.53 | 0.9287 | 0.1247 | 0.52 | 0.525 | 0.1124 |
| cosine_recall@5 | 0.2567 | 0.145 | 0.8567 | 0.3109 | 0.56 | 0.54 | 0.062 | 0.59 | 0.936 | 0.1967 | 0.72 | 0.615 | 0.1616 |
| cosine_recall@10 | 0.2867 | 0.2407 | 0.9067 | 0.4079 | 0.59 | 0.68 | 0.0734 | 0.67 | 0.9793 | 0.2907 | 0.8 | 0.68 | 0.2674 |
| cosine_ndcg@10 | 0.2613 | 0.4507 | 0.7556 | 0.2964 | 0.5584 | 0.4416 | 0.2241 | 0.5271 | 0.9154 | 0.2646 | 0.4714 | 0.5211 | 0.4291 |
| cosine_mrr@10 | 0.3718 | 0.6355 | 0.7192 | 0.3307 | 0.7015 | 0.3667 | 0.3782 | 0.4859 | 0.9053 | 0.3836 | 0.3663 | 0.4848 | 0.6237 |
| cosine_map@100 | 0.2163 | 0.3183 | 0.7017 | 0.2334 | 0.4954 | 0.3814 | 0.0878 | 0.4878 | 0.889 | 0.2058 | 0.3751 | 0.4707 | 0.3288 |
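The metric names in the table follow common information-retrieval definitions. The sketch below computes them for a single query with binary relevance, assuming the usual formulas; the evaluator's exact implementation may differ in details such as tie handling. `retrieval_metrics`, the document ids, and the relevance set are all made up for illustration.

```python
import math
from typing import Dict, List, Set


def retrieval_metrics(ranked_ids: List[str], relevant: Set[str], k: int) -> Dict[str, float]:
    """Per-query metrics at cutoff k; assumes `relevant` is non-empty."""
    top_k = ranked_ids[:k]
    hits = [1 if doc_id in relevant else 0 for doc_id in top_k]
    num_hits = sum(hits)
    # Rank (1-based) of the first relevant document inside the top k, if any.
    first = next((rank for rank, hit in enumerate(hits, start=1) if hit), None)
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal = sum(1 / math.log2(rank + 1) for rank in range(1, min(k, len(relevant)) + 1))
    return {
        f"accuracy@{k}": 1.0 if num_hits else 0.0,
        f"precision@{k}": num_hits / k,
        f"recall@{k}": num_hits / len(relevant),
        f"mrr@{k}": 1.0 / first if first else 0.0,
        f"ndcg@{k}": dcg / ideal,
    }


ranking = ["d7", "d2", "d9", "d4", "d1"]  # system output, best first
gold = {"d2", "d4", "d8"}                 # documents judged relevant

metrics = retrieval_metrics(ranking, gold, k=3)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions accuracy@k only asks whether any relevant document appears in the top k, which is why it can sit far above precision@k and recall@k for the same dataset.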
Dataset: NanoBEIR_mean
Evaluated with NanoBEIREvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4021 |
| cosine_accuracy@3 | 0.6075 |
| cosine_accuracy@5 | 0.6891 |
| cosine_accuracy@10 | 0.7631 |
| cosine_precision@1 | 0.4021 |
| cosine_precision@3 | 0.2669 |
| cosine_precision@5 | 0.2093 |
| cosine_precision@10 | 0.1471 |
| cosine_recall@1 | 0.2475 |
| cosine_recall@3 | 0.3893 |
| cosine_recall@5 | 0.4577 |
| cosine_recall@10 | 0.5287 |
| cosine_ndcg@10 | 0.4705 |
| cosine_mrr@10 | 0.5195 |
| cosine_map@100 | 0.3993 |
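The NanoBEIR_mean values appear to be unweighted averages over the 13 per-dataset results. As a quick check under that assumption, the snippet below averages the per-dataset cosine_ndcg@10 scores reported in this card; the dictionary is simply a hand-copied transcription of those numbers.

```python
# Per-dataset cosine_ndcg@10 values copied from the evaluation table in this card.
ndcg_at_10 = {
    "NanoClimateFEVER": 0.2613,
    "NanoDBPedia": 0.4507,
    "NanoFEVER": 0.7556,
    "NanoFiQA2018": 0.2964,
    "NanoHotpotQA": 0.5584,
    "NanoMSMARCO": 0.4416,
    "NanoNFCorpus": 0.2241,
    "NanoNQ": 0.5271,
    "NanoQuoraRetrieval": 0.9154,
    "NanoSCIDOCS": 0.2646,
    "NanoArguAna": 0.4714,
    "NanoSciFact": 0.5211,
    "NanoTouche2020": 0.4291,
}

# Unweighted (macro) average: every dataset counts equally.
mean_ndcg = sum(ndcg_at_10.values()) / len(ndcg_at_10)
print(round(mean_ndcg, 4))  # 0.4705
```

The result matches the reported cosine_ndcg@10 of 0.4705.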
Columns: question and answer

| | question | answer |
|---|---|---|
| type | string | string |
| details | | |

| question | answer |
|---|---|
| what is the difference between broilers and layers? | An egg laying poultry is called egger or layer whereas broilers are reared for obtaining meat. So a layer should be able to produce more number of large sized eggs, without growing too much. On the other hand, a broiler should yield more meat and hence should be able to grow well. |
| what is the difference between chronological order and spatial order? | As a writer, you should always remember that unlike chronological order and the other organizational methods for data, spatial order does not take into account the time. Spatial order is primarily focused on the location. All it does is take into account the location of objects and not the time. |
| is kamagra same as viagra? | Kamagra is thought to contain the same active ingredient as Viagra, sildenafil citrate. In theory, it should work in much the same way as Viagra, taking about 45 minutes to take effect, and lasting for around 4-6 hours. However, this will vary from person to person. |
Loss: MatryoshkaLoss with these parameters:
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64,
32
],
"matryoshka_weights": [
1,
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
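The parameters above describe a weighted sum of one inner loss evaluated on several prefix lengths of the embedding. The following pure-Python sketch illustrates that structure under stated assumptions: the inner loss is a plain in-batch-negatives cross-entropy over scaled cosine similarities (the gradient caching of CachedMultipleNegativesRankingLoss is omitted), the scale of 20.0 is an assumed value, and n_dims_per_step of -1 is taken to mean that every dimension in the list is used at each step. All function names and toy vectors are hypothetical.

```python
import math
from typing import List, Sequence


def _unit(vec: Sequence[float]) -> List[float]:
    """L2-normalize a vector, guarding against a zero norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, 1e-12) for x in vec]


def in_batch_ranking_loss(queries, answers, scale: float = 20.0) -> float:
    """Cross-entropy where answers[i] is the positive for queries[i]
    and every other answer in the batch acts as a negative."""
    q_unit = [_unit(q) for q in queries]
    a_unit = [_unit(a) for a in answers]
    total = 0.0
    for i, q in enumerate(q_unit):
        logits = [scale * sum(x * y for x, y in zip(q, a)) for a in a_unit]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(q_unit)


def matryoshka_loss(queries, answers, dims, weights) -> float:
    """Weighted sum of the inner loss on the first d dimensions, for each d."""
    return sum(
        w * in_batch_ranking_loss([q[:d] for q in queries], [a[:d] for a in answers])
        for d, w in zip(dims, weights)
    )


# Toy 4-dimensional batch; the card's real setting uses dims 768 down to 32.
queries = [[1.0, 0.0, 0.5, 0.0], [0.0, 1.0, 0.0, 0.5]]
answers = [[0.9, 0.1, 0.4, 0.0], [0.1, 0.9, 0.0, 0.4]]

matched_loss = matryoshka_loss(queries, answers, dims=[4, 2], weights=[1, 1])
shuffled_loss = matryoshka_loss(queries, answers[::-1], dims=[4, 2], weights=[1, 1])
print(matched_loss, shuffled_loss)
```

Correctly paired answers give a loss close to zero at every prefix length, while mismatched pairs are penalized at each of them, which is what pushes the leading dimensions to be useful on their own.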
Columns: question and answer

| | question | answer |
|---|---|---|
| type | string | string |
| details | | |

| question | answer |
|---|---|
| how do i program my directv remote with my tv? | ['Press MENU on your remote.', 'Select Settings & Help > Settings > Remote Control > Program Remote.', 'Choose the device (TV, audio, DVD) you wish to program. ... ', 'Follow the on-screen prompts to complete programming.'] |
| are rodrigues fruit bats nocturnal? | Before its numbers were threatened by habitat destruction, storms, and hunting, some of those groups could number 500 or more members. Sunrise, sunset. Rodrigues fruit bats are most active at dawn, at dusk, and at night. |
| why does your heart rate increase during exercise bbc bitesize? | During exercise there is an increase in physical activity and muscle cells respire more than they do when the body is at rest. The heart rate increases during exercise. The rate and depth of breathing increases - this makes sure that more oxygen is absorbed into the blood, and more carbon dioxide is removed from it. |
Loss: MatryoshkaLoss with these parameters:
{
"loss": "CachedMultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64,
32
],
"matryoshka_weights": [
1,
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
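Because the loss is applied at 768, 512, 256, 128, 64 and 32 dimensions, embeddings from this model can plausibly be shortened to any of those sizes at inference time. A minimal sketch of that step follows, assuming truncation simply keeps the leading dimensions and re-normalizes. `truncate_and_normalize` and the toy vector are hypothetical; the library is understood to offer a `truncate_dim` option for the same purpose, which is worth checking in its documentation.

```python
import math
from typing import List


def truncate_and_normalize(embedding: List[float], dim: int) -> List[float]:
    """Keep the first `dim` values and rescale them to unit length."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding size")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm else head


# Toy 4-dimensional embedding shortened to its first 2 dimensions.
full = [3.0, 4.0, 12.0, 0.5]
small = truncate_and_normalize(full, 2)
print(small)
```

Shorter vectors reduce storage and search cost, usually in exchange for some retrieval quality.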
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 1024
- per_device_eval_batch_size: 1024
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- batch_sampler: no_duplicates

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 1024
- per_device_eval_batch_size: 1024
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | Validation Loss | NanoClimateFEVER_cosine_ndcg@10 | NanoDBPedia_cosine_ndcg@10 | NanoFEVER_cosine_ndcg@10 | NanoFiQA2018_cosine_ndcg@10 | NanoHotpotQA_cosine_ndcg@10 | NanoMSMARCO_cosine_ndcg@10 | NanoNFCorpus_cosine_ndcg@10 | NanoNQ_cosine_ndcg@10 | NanoQuoraRetrieval_cosine_ndcg@10 | NanoSCIDOCS_cosine_ndcg@10 | NanoArguAna_cosine_ndcg@10 | NanoSciFact_cosine_ndcg@10 | NanoTouche2020_cosine_ndcg@10 | NanoBEIR_mean_cosine_ndcg@10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | - | - | 0.1046 | 0.2182 | 0.1573 | 0.0575 | 0.2597 | 0.1602 | 0.0521 | 0.0493 | 0.7310 | 0.1320 | 0.2309 | 0.1240 | 0.0970 | 0.1826 |
| 0.0010 | 1 | 28.4479 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0256 | 25 | 27.0904 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0512 | 50 | 19.016 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.0768 | 75 | 12.2306 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1024 | 100 | 9.0613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1279 | 125 | 7.393 | 3.7497 | 0.2787 | 0.4840 | 0.7029 | 0.2589 | 0.5208 | 0.4094 | 0.2117 | 0.4526 | 0.9042 | 0.2503 | 0.5280 | 0.4922 | 0.4132 | 0.4544 |
| 0.1535 | 150 | 6.6613 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.1791 | 175 | 6.1911 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2047 | 200 | 5.9305 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2303 | 225 | 5.6825 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.2559 | 250 | 5.5326 | 2.8771 | 0.2867 | 0.4619 | 0.7333 | 0.2835 | 0.5549 | 0.4056 | 0.2281 | 0.4883 | 0.9137 | 0.2555 | 0.5114 | 0.5220 | 0.4298 | 0.4673 |
| 0.2815 | 275 | 5.1671 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3071 | 300 | 5.2006 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3327 | 325 | 5.0447 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3582 | 350 | 4.9647 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.3838 | 375 | 4.8521 | 2.5709 | 0.2881 | 0.4577 | 0.7438 | 0.2909 | 0.5712 | 0.4093 | 0.2273 | 0.5141 | 0.9008 | 0.2668 | 0.5117 | 0.5253 | 0.4331 | 0.4723 |
| 0.4094 | 400 | 4.8423 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4350 | 425 | 4.7472 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4606 | 450 | 4.6527 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.4862 | 475 | 4.61 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5118 | 500 | 4.5451 | 2.4136 | 0.2786 | 0.4464 | 0.7485 | 0.2961 | 0.5638 | 0.4368 | 0.2269 | 0.5125 | 0.8998 | 0.2680 | 0.4938 | 0.5341 | 0.4383 | 0.4726 |
| 0.5374 | 525 | 4.5357 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5629 | 550 | 4.481 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.5885 | 575 | 4.4669 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6141 | 600 | 4.3886 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6397 | 625 | 4.2929 | 2.3091 | 0.2639 | 0.4475 | 0.7521 | 0.3095 | 0.5619 | 0.4448 | 0.2244 | 0.5178 | 0.9102 | 0.2655 | 0.4809 | 0.5253 | 0.4351 | 0.4722 |
| 0.6653 | 650 | 4.2558 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.6909 | 675 | 4.3228 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7165 | 700 | 4.2496 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7421 | 725 | 4.2304 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.7677 | 750 | 4.224 | 2.2440 | 0.2628 | 0.4514 | 0.7387 | 0.3028 | 0.5522 | 0.4313 | 0.2253 | 0.5266 | 0.9211 | 0.2675 | 0.4929 | 0.5232 | 0.4351 | 0.4716 |
| 0.7932 | 775 | 4.2821 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8188 | 800 | 4.2686 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8444 | 825 | 4.1657 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8700 | 850 | 4.2297 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.8956 | 875 | 4.1709 | 2.2142 | 0.2685 | 0.4520 | 0.7569 | 0.2930 | 0.5625 | 0.4486 | 0.2229 | 0.5280 | 0.9153 | 0.2601 | 0.4862 | 0.5199 | 0.4334 | 0.4729 |
| 0.9212 | 900 | 4.0771 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9468 | 925 | 4.1492 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9724 | 950 | 4.2074 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 0.9980 | 975 | 4.0993 | - | - | - | - | - | - | - | - | - | - | - | - | - | - | - |
| 1.0 | 977 | - | - | 0.2613 | 0.4507 | 0.7556 | 0.2964 | 0.5584 | 0.4416 | 0.2241 | 0.5271 | 0.9154 | 0.2646 | 0.4714 | 0.5211 | 0.4291 | 0.4705 |
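The hyperparameters list learning_rate 2e-05, warmup_ratio 0.1 and lr_scheduler_type linear, and the log ends at step 977. The sketch below shows what such a schedule would look like, assuming the common linear-warmup-then-linear-decay shape with the warmup length rounded up. `linear_schedule_lr` is a hypothetical helper, not the trainer's own code.

```python
import math


def linear_schedule_lr(
    step: int,
    total_steps: int = 977,      # last step in the training log
    base_lr: float = 2e-5,       # learning_rate
    warmup_ratio: float = 0.1,   # warmup_ratio
) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 25, 98, 500, 977):
    print(step, f"{linear_schedule_lr(step):.3e}")
```

Under these assumptions the rate peaks at 2e-05 around step 98 and falls to zero by step 977, so the steep early drop in training loss coincides with the warmup phase.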
Carbon emissions were measured using CodeCarbon.
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{gao2021scaling,
title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup},
author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
year={2021},
eprint={2101.06983},
archivePrefix={arXiv},
primaryClass={cs.LG}
}