Paper: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
How to use lucasflins/CE-BERT-Tiny_L-2_H-128_A-2-BCE-20260207-084638 with sentence-transformers:
from sentence_transformers import CrossEncoder
model = CrossEncoder("lucasflins/CE-BERT-Tiny_L-2_H-128_A-2-BCE-20260207-084638")
query = "Which planet is known as the Red Planet?"
passages = [
"Venus is often called Earth's twin because of its similar size and proximity.",
"Mars, known for its reddish appearance, is often referred to as the Red Planet.",
"Jupiter, the largest planet in our solar system, has a prominent red spot.",
"Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]
scores = model.predict([(query, passage) for passage in passages])
print(scores)

This is a Cross Encoder model finetuned from nreimers/BERT-Tiny_L-2_H-128_A-2 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("lucasflins/CE-BERT-Tiny_L-2_H-128_A-2-BCE-20260207-084638")
# Get scores for pairs of texts
pairs = [
['aparador off white', 'aparador para sala ambiente classic off white nature - imcal'],
['itatiaia renova', 'balcão itatiaia branco renova com 3 portas e 2 gavetas'],
['cozinhas moduladas completas 100 mdf', 'cozinha completa 7 peças 100% mdf com portas de vidro americana'],
['caixa de som', 'caixa de som torre double 12 2300w bluetooth pulse - ps736'],
['escrivaninha 90cm', 'escrivaninha mesa escritório dobrável 90cm industrial steel quadra'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'aparador off white',
[
'aparador para sala ambiente classic off white nature - imcal',
'balcão itatiaia branco renova com 3 portas e 2 gavetas',
'cozinha completa 7 peças 100% mdf com portas de vidro americana',
'caixa de som torre double 12 2300w bluetooth pulse - ps736',
'escrivaninha mesa escritório dobrável 90cm industrial steel quadra',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
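The shape of the ranking result can be illustrated without loading the model. The sketch below sorts documents by score and emits dictionaries with the `corpus_id` and `score` keys shown in the comment above. The helper `rank_by_score` and the placeholder scores are hypothetical and exist only for illustration; this is not the library's implementation, and real scores come from `model.predict` or `model.rank`.

```python
def rank_by_score(documents, scores):
    """Pair each document with its score and sort by descending score."""
    results = [
        {"corpus_id": idx, "score": float(score), "text": doc}
        for idx, (doc, score) in enumerate(zip(documents, scores))
    ]
    return sorted(results, key=lambda item: item["score"], reverse=True)


documents = [
    "aparador para sala ambiente classic off white nature - imcal",
    "balcão itatiaia branco renova com 3 portas e 2 gavetas",
    "caixa de som torre double 12 2300w bluetooth pulse - ps736",
]
# Placeholder scores, NOT produced by the model.
placeholder_scores = [0.2, 0.9, 0.5]

ranked = rank_by_score(documents, placeholder_scores)
for item in ranked:
    print(item["corpus_id"], item["score"], item["text"])
```

Each `corpus_id` is the position of the document in the input list, so the original order can be recovered after sorting.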
Columns: query, text, and label

| | query | text | label |
|---|---|---|---|
| type | string | string | float |

Samples:

| query | text | label |
|---|---|---|
| tenis under armour masculino | tênis under armour charged slight 3 se | 0.9281755975185405 |
| jogos de cama casal 100 algodao 4 pecas | lençol casal 400 fios 3 peças toque 100% macio com fronhas | 0.5433727088037827 |
| geladeira para caminhao | geladeira portátil 40l 12v/24v 110v/220v caminhão ônibus van | 0.7394712159210983 |
BinaryCrossEntropyLoss with these parameters:
{
    "activation_fn": "torch.nn.modules.linear.Identity",
    "pos_weight": null
}
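To make the loss configuration concrete, here is a minimal pure-Python sketch of binary cross-entropy computed on raw logits with fractional targets. It assumes that an identity activation means the loss receives unnormalized logits and that `pos_weight: null` means no class reweighting. The function names and the logits are hypothetical; only the three labels are taken from the samples table above.

```python
import math


def bce_with_logits(logit, target):
    """Numerically stable BCE on a raw logit: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * target + math.log1p(math.exp(-abs(logit)))


def mean_bce(logits, targets):
    """Average the per-pair loss over a batch."""
    return sum(bce_with_logits(x, y) for x, y in zip(logits, targets)) / len(logits)


# Labels copied from the samples table; logits are made up for illustration.
labels = [0.9281755975185405, 0.5433727088037827, 0.7394712159210983]
hypothetical_logits = [2.0, 0.0, 1.0]

print(mean_bce(hypothetical_logits, labels))

# A logit of 0 (predicted probability 0.5) costs ln 2 for any target.
print(bce_with_logits(0.0, 0.3), math.log(2))
```

The last line shows a useful reference point: a model that always outputs a probability of 0.5 has a loss of ln 2, about 0.6931, whatever the labels are.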
Columns: query, text, and label

| | query | text | label |
|---|---|---|---|
| type | string | string | float |

Samples:

| query | text | label |
|---|---|---|
| aparador off white | aparador para sala ambiente classic off white nature - imcal | 0.3710301443240389 |
| itatiaia renova | balcão itatiaia branco renova com 3 portas e 2 gavetas | 0.35256627704291904 |
| cozinhas moduladas completas 100 mdf | cozinha completa 7 peças 100% mdf com portas de vidro americana | 0.2038212525937755 |
BinaryCrossEntropyLoss with these parameters:
{
    "activation_fn": "torch.nn.modules.linear.Identity",
    "pos_weight": null
}
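Because the labels are fractional rather than 0 or 1, binary cross-entropy cannot reach zero: for a target p, the per-pair loss is smallest when the predicted probability equals p, and that minimum is the binary entropy of p. The sketch below computes this floor for the three labels shown in the table above. Three samples are not representative of the whole dataset, so the numbers are only an aid for reading loss values, and the helper `binary_entropy` is a hypothetical name.

```python
import math


def binary_entropy(p):
    """Lowest possible BCE for a soft target p, reached when the prediction equals p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


# Labels copied from the samples table above.
sample_labels = [0.3710301443240389, 0.35256627704291904, 0.2038212525937755]

floors = [binary_entropy(p) for p in sample_labels]
mean_floor = sum(floors) / len(floors)

for label, floor in zip(sample_labels, floors):
    print(f"label={label:.4f}  minimum BCE={floor:.4f}")
print(f"mean floor over these samples: {mean_floor:.4f}")
print(f"upper bound of the floor (p = 0.5): {math.log(2):.4f}")
```

The floor is largest, ln 2, for a label of exactly 0.5, so datasets with many mid-range labels will show loss values close to ln 2 even for a well-fitted model.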
Non-default hyperparameters:

- eval_strategy: epoch
- per_device_train_batch_size: 2048
- per_device_eval_batch_size: 32
- num_train_epochs: 10
- warmup_ratio: 0.1
- log_level: info
- tf32: True
- load_best_model_at_end: True
- hub_strategy: end
- hub_private_repo: True
- hub_always_push: True
- eval_on_start: True

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 2048
- per_device_eval_batch_size: 32
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 10
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: info
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- project: huggingface
- trackio_space_id: trackio
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: end
- hub_private_repo: True
- hub_always_push: True
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: no
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: True
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: True
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | Validation Loss |
|---|---|---|---|
| 0 | 0 | - | 0.6922 |
| 1.0 | 1901 | 0.6888 | 0.6872 |
| 2.0 | 3802 | 0.6802 | 0.6879 |
| 3.0 | 5703 | 0.675 | 0.6902 |
| 4.0 | 7604 | 0.6715 | 0.6912 |
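The log can be read against the hyperparameters with a short sketch. It picks the row with the lowest validation loss, which is the checkpoint `load_best_model_at_end: True` would restore if validation loss is the selection metric (an assumption, since the card does not name the metric). It also estimates the learning rate at each logged step for a linear warmup and decay schedule. The schedule formula, the rounding of warmup steps, and the total step count (10 configured epochs at 1901 steps each, although only 4 epochs appear in the log) are assumptions, not values reported by the card.

```python
# Values copied from the hyperparameters and the training log.
BASE_LR = 5e-05
WARMUP_RATIO = 0.1
NUM_TRAIN_EPOCHS = 10
STEPS_PER_EPOCH = 1901  # step count at epoch 1.0 in the log

# (epoch, step, training_loss, validation_loss)
LOG = [
    (0.0, 0, None, 0.6922),
    (1.0, 1901, 0.6888, 0.6872),
    (2.0, 3802, 0.6802, 0.6879),
    (3.0, 5703, 0.675, 0.6902),
    (4.0, 7604, 0.6715, 0.6912),
]

# Assumption: the schedule spans all configured epochs.
TOTAL_STEPS = NUM_TRAIN_EPOCHS * STEPS_PER_EPOCH
# Assumption: warmup length is the ratio applied to the total step count.
WARMUP_STEPS = round(WARMUP_RATIO * TOTAL_STEPS)


def linear_lr(step):
    """Linear warmup from 0 to BASE_LR, then linear decay to 0 at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    return BASE_LR * max(0, TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)


# Row with the lowest validation loss.
best = min(LOG, key=lambda row: row[3])

for epoch, step, train_loss, val_loss in LOG:
    marker = "  <- lowest validation loss" if (epoch, step) == best[:2] else ""
    print(f"epoch {epoch:>4}  step {step:>5}  lr {linear_lr(step):.2e}  val {val_loss}{marker}")
```

In the log above, training loss keeps falling after epoch 1.0 while validation loss rises, so the lowest validation loss is the 0.6872 recorded at epoch 1.0.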
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: nreimers/BERT-Tiny_L-2_H-128_A-2