metadata
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:11518
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: nomic-ai/modernbert-embed-base
widget:
- source_sentence: "= little to no difference (low certainty)\n \t\n= little to no difference (moderate certainty)\n= benefit that meets threshold for clinically important difference (very low certainty) = benefit that meets threshold for clinically important difference (low certainty)\n\t\n= benefit that meets threshold for clinically important difference (moderate certainty).\n53\n4. Evidence and recommendations\nIntervention class B: Physical interventions •\t In the comparison of any structured \nexercise programme with usual care \n(6 trials), trivial benefits were observed for \npain and function. Since the certainty of the \nevidence was very low for other time-points and outcomes, it was uncertain whether \nany structured exercise programme: \n\t›\n\tdecreased pain in the immediate term \n(trivial effect);\n\t›\n\tmade little to no difference to pain in the short term or long term;\n\t›\n\timproved function in the immediate \nterm (trivial effect);\n\t›\n\tmade little to no difference to function \nin the short term; or\n\t› improved function in the long term. \nIn the two trials on older people, trivial \nbenefit was observed for function. Since the \ncertainty of the evidence was very low, it was uncertain whether any structured exercise \nprogramme:\n\t›\n\tmade little to no difference to pain in \nthe immediate term (2 trials) or short \nterm (1 trial);\n\t› improved function the immediate term \n(trivial effect, 2 trials); or\n\t›\n\tmade little to no difference to function \nin the short term (1 trial).\nIn the two trials that monitored harms,"
sentences:
- voice functions assessment techniques and impairments
- physiotherapy exercises for patients with casts
- effectiveness of exercise on function in older adults
- source_sentence: "Material resources\_\nOccupations \n(rehabilitation specialists) \nAssistive products\nEquipment\_\nConsumables\_\n Cardiovascular and immunological functions\nTarget: Oedema control\nAssessment of oedema 10\n–\n•\tMeasuring tape\n–\n•\tNursing professional\n•\tOccupational therapist\n•\tPhysiotherapist \n•\tSpecialist medical practitioner/\nPRM physician\nRange of motion exercises\n15 –\n•\tTreatment table\n–\n•\tOccupational therapist \n•\tPhysiotherapist \nRetrograde massage\n30\n–\n•\tTreatment table\n•\tPillows •\tFoam rollers/wedges\n•\tCompression bandages\n•\tMassage lotion\n•\tOccupational therapist\n•\tPhysiotherapist\nPositioning for oedema \ncontrol\n10\n– •\tPillows\n•\tFoam rollers/wedges\n–\n•\tNursing professional\n•\tOccupational therapist\n•\tPhysiotherapist\nProvision and training \nin the use of assistive products for compression \ntherapy\n15\n•\tProducts for compression \ntherapy (garments, sockets, \nbandages)\n–\n–\n•\tNursing professional \n•\tOccupational therapist •\tPhysiotherapist \nMotor functions and mobility\nTarget: Mobility of joint functions\nAssessment of joint \nmobility\n10\n–\n•\tTreatment table\n•\tGoniometer\n•\tMeasuring tape\n–"
sentences:
- role of occupational therapists in oedema management
- protective measures during plaster application plastic sheeting
- materials needed for adult fracture immobilization with POP
- source_sentence: "benefits offered.\n•\t There are potentially serious adverse events \nassociated with SNRI antidepressants \namong older people, including \nhyponatraemia, memory impairment, \ngastrointestinal events and falls, without evidence of benefit. \n127\n4. Evidence and recommendations\nIntervention class D: Medicines\n\t›\nIt was uncertain whether SNRI \nantidepressants made little to no \ndifference to work-related outcomes (low to very low certainty evidence).\nHarms were monitored across the trials and \nwere identified for nausea, constipation, \ndizziness and somnolence.\n\t›\n\tSNRI antidepressants were probably associated with a large increase in the \nlikelihood of discontinuation due to \nadverse events, nausea, constipation, \ndizziness and somnolence (moderate \ncertainty evidence). \n\t› It was uncertain whether SNRI \nantidepressants were associated with \na small increased likelihood of serious \nadverse events (very low certainty \nevidence).\n•\t In the comparison of SNRI antidepressants (treatment duration < 12 weeks) with \nplacebo (5 trials, 263 participants), the \ncertainty of evidence was very low. It was \nuncertain whether SNRI antidepressants made little to no difference to pain or \npsychological well-being at < 1 month or at \n1–3 months; or to function or quality of life \nat 1–3 months. Harms were monitored in the included \ntrials, however, the certainty of evidence \nwas very low. It was therefore uncertain \nwhether SNRI antidepressants:\n\t›\nincreased the likelihood of treatment"
sentences:
- does organic milk have more n-3 PUFA than non-organic
- evidence-based recommendations for chronic low back pain management
- SNRI antidepressants and work-related outcomes evidence
- source_sentence: >-
Assessment of mobilityb
Mobility training (incl. wheelchair skills
training)b
Functional positioningb
Provision and training in the use of assistive
products for mobilityb
Assessment of hand and arm usec Functional training for hand and arm usec
Exercise and
fitness
Assessment of exercise capacityb
Fitness trainingb
Activity of daily
living
Assessment of activities of daily living
(ADL)b ADL trainingb
Provision and training in the use of assistive
products for self-careb
Modification of the home environmentb
[cont.]
82
Assessmentsa
Interventionsa
Interpersonal interactions and
relationships
Assessment of interpersonal interactions
and relationshipsb
Psychological support/counsellingb
Education and
vocation
Educational assessmentk
Educational counselling, training, and supportk Modification of the school
environmentk Vocational assessmentj
Vocational counselling, training, and supportd
Provision and training in the use of assistive
products for workd
Modification of the workplace environmentd
Community
and social life Assessment of participation in
community and social lifeb
Participation focused interventionsb
Peer supportb
Lifestyle
modification
Assessment of lifestyle risk factorsb
Education and advice on healthy lifestyleb
Self-
management Education, advice and support for self-
management of the health conditionb
Education and advice on self-directed
exercisesb
Carer and
family support
Assessment of carer and family needsl
sentences:
- adverse events in dietary and exercise interventions
- ACSM-CPT certification details
- assistive products for daily living activities
- source_sentence: "facilitate publication;\n• \aMobilise academic expertise for \ndeveloping training programmes and \nmobilising trainers.\n\t Weigh in on the debate around issues \nrelated to rehabilitation promotion and funding, promote best practices to \ninfluence policies that favour access \nto rehabilitation services and thereby \nmove toward advocacy actions.\n48\nUsers,\nDisabled people’s\norganisations\nService\nproviders\nDecision-makers User \ngroups\nLocal\nauthorities\nMinistry of \nHealth, Ministry \nof Social Action,\netc.\nUnited Nations \n(WHO, etc.)\nHospitals, \nReference\nrehabilitation centre\nProfessional \nassociations\nService provider groups\nTraining institutes\nCommunity- \nbased Services\nFederation\nand national\n associations\nHospital, \nHealth \ncare centres Network: actors that can be mobilised for physical \nand functional rehabilitation\nInternational\nNational\nLocal\nInstitutional donors\nFacilitation organisations* * \aOrganisations (IOs, NGOs, etc.), agencies, universities and research centres that facilitate the existence of physical \nand functional rehabilitation via national or international projects.\nInternational \n consortia (IDDC, etc.)\n International\n networks \n (CBR, WCPT, \n WFOT, ISPO,\nFATO, etc.)\nLevels of intervention © Handicap International, 2013\n \n \n49\n\_Intervention.\n\_modalities .\nThe Unit has technical resources specifically \npositioned to be able to reach the maximum"
sentences:
- risks of yo-yo dieting and heart disease
- training programmes for rehabilitation professionals
- >-
long-term effects of mobility assistive products on chronic low back
pain
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: ModernBERT Embed base fitness health Matryoshka
results:
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 768
type: dim_768
metrics:
- type: cosine_accuracy@1
value: 0.47890625
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.47890625
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.47890625
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.521875
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.47890625
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.47890625
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.47890625
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.439453125
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.060052083333333325
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.18015625000000002
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.3002604166666667
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5134114583333333
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.49903613322071383
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.4860677083333333
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5680816897290807
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 512
type: dim_512
metrics:
- type: cosine_accuracy@1
value: 0.47421875
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.47421875
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.47421875
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.5140625
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.47421875
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.47421875
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.47421875
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.436171875
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.058958333333333335
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.17687499999999998
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.2947916666666667
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.5077864583333334
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.4934297397023317
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.4808268229166666
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5631677376472394
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 256
type: dim_256
metrics:
- type: cosine_accuracy@1
value: 0.45546875
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.45546875
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.45546875
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.496875
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.45546875
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.45546875
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.45546875
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.41875
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.05701822916666667
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.1710546875
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.2850911458333333
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.4881510416666666
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.47460494952644156
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.4623697916666667
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5446289971505553
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 128
type: dim_128
metrics:
- type: cosine_accuracy@1
value: 0.43515625
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.43515625
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.43515625
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.47265625
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.43515625
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.43515625
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.43515625
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.398828125
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.05451822916666667
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.1635546875
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.2725911458333333
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.4639322916666666
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.4521791896913678
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.44140625000000017
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.5207625038942943
name: Cosine Map@100
- task:
type: information-retrieval
name: Information Retrieval
dataset:
name: dim 64
type: dim_64
metrics:
- type: cosine_accuracy@1
value: 0.39453125
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.39453125
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.39453125
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 0.4296875
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.39453125
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.39453125
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.39453125
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.359765625
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.049895833333333334
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.1496875
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.2494791666666667
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.4223958333333333
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.4108797528312945
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.40039062500000033
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.4763475810083717
name: Cosine Map@100
ModernBERT Embed base fitness health Matryoshka
This is a sentence-transformers model fine-tuned from nomic-ai/modernbert-embed-base on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: nomic-ai/modernbert-embed-base
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
- Training Dataset: json
- Language: en
- License: apache-2.0
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
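The Pooling and Normalize stages listed above can be illustrated with a minimal sketch. It assumes the transformer has already produced one vector per token; the toy 4-dimensional vectors and the helper names `mean_pool` and `l2_normalize` are hypothetical and only stand in for the 768-dimensional outputs of the real model.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [t / max(count, 1) for t in total]

def l2_normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: three tokens, the last one is padding (hypothetical values).
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final stage normalizes every embedding, cosine similarity between two outputs reduces to a plain dot product.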
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("kokojake/modernbert-embed-base-fitness-health-matryoshka")
# Run inference
sentences = [
'facilitate publication;\n•\u2009\x07Mobilise academic expertise for \ndeveloping training programmes and \nmobilising trainers.\n\t Weigh in on the debate around issues \nrelated to rehabilitation promotion and funding, promote best practices to \ninfluence policies that favour access \nto rehabilitation services and thereby \nmove toward advocacy actions.\n48\nUsers,\nDisabled people’s\norganisations\nService\nproviders\nDecision-makers User \ngroups\nLocal\nauthorities\nMinistry of \nHealth, Ministry \nof Social Action,\netc.\nUnited Nations \n(WHO, etc.)\nHospitals, \nReference\nrehabilitation centre\nProfessional \nassociations\nService provider groups\nTraining institutes\nCommunity- \nbased Services\nFederation\nand national\n associations\nHospital, \nHealth \ncare centres Network: actors that can be mobilised for physical \nand functional rehabilitation\nInternational\nNational\nLocal\nInstitutional donors\nFacilitation organisations* * \x07Organisations (IOs, NGOs, etc.), agencies, universities and research centres that facilitate the existence of physical \nand functional rehabilitation via national or international projects.\nInternational \n consortia (IDDC, etc.)\n International\n networks \n (CBR, WCPT, \n WFOT, ISPO,\nFATO, etc.)\nLevels of intervention © Handicap International, 2013\n \n \n49\n\xa0Intervention.\n\xa0modalities\u200a.\nThe Unit has technical resources specifically \npositioned to be able to reach the maximum',
'training programmes for rehabilitation professionals',
'risks of yo-yo dieting and heart disease',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
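Since the model was trained with MatryoshkaLoss at 768, 512, 256, 128 and 64 dimensions, embeddings can be shortened to one of those sizes at some cost in retrieval quality (see the Evaluation section). Recent Sentence Transformers releases accept a `truncate_dim` argument when loading the model; the sketch below shows what such truncation amounts to by hand, using toy 4-dimensional vectors rather than real model output. The helper names are hypothetical.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy embeddings standing in for model.encode(...) output (hypothetical values).
full_a = [0.5, 0.5, 0.5, 0.5]
full_b = [0.5, 0.5, -0.5, -0.5]

small_a = truncate_and_normalize(full_a, 2)
small_b = truncate_and_normalize(full_b, 2)

print(cosine(full_a, full_b))    # similarity using all components
print(cosine(small_a, small_b))  # similarity using only the first two
```

The toy vectors are chosen so that truncation visibly changes the score; with a Matryoshka-trained model the leading components are trained to carry most of the signal, so the change is expected to be much smaller.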
Evaluation
Metrics
Information Retrieval
- Dataset: dim_768
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 768 }
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4789 |
| cosine_accuracy@3 | 0.4789 |
| cosine_accuracy@5 | 0.4789 |
| cosine_accuracy@10 | 0.5219 |
| cosine_precision@1 | 0.4789 |
| cosine_precision@3 | 0.4789 |
| cosine_precision@5 | 0.4789 |
| cosine_precision@10 | 0.4395 |
| cosine_recall@1 | 0.0601 |
| cosine_recall@3 | 0.1802 |
| cosine_recall@5 | 0.3003 |
| cosine_recall@10 | 0.5134 |
| cosine_ndcg@10 | 0.499 |
| cosine_mrr@10 | 0.4861 |
| cosine_map@100 | 0.5681 |
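As a reading aid for the table above, here is a minimal sketch of how the @k metrics are commonly defined for a single query. The document ids and function names are hypothetical, and the evaluator averages such per-query values over all queries.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

# One toy query: documents ordered by cosine similarity (hypothetical ids).
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d7", "d4", "d8"}

print(accuracy_at_k(ranked, relevant, 1))
print(precision_at_k(ranked, relevant, 3))
print(recall_at_k(ranked, relevant, 5))
print(mrr_at_k(ranked, relevant, 5))
```

Under these definitions recall@1 can never exceed 1 divided by the number of relevant documents for a query, so a low recall@1 next to a much higher precision@1 is what one would expect when queries have several relevant passages.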
Information Retrieval
- Dataset: dim_512
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 512 }
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4742 |
| cosine_accuracy@3 | 0.4742 |
| cosine_accuracy@5 | 0.4742 |
| cosine_accuracy@10 | 0.5141 |
| cosine_precision@1 | 0.4742 |
| cosine_precision@3 | 0.4742 |
| cosine_precision@5 | 0.4742 |
| cosine_precision@10 | 0.4362 |
| cosine_recall@1 | 0.059 |
| cosine_recall@3 | 0.1769 |
| cosine_recall@5 | 0.2948 |
| cosine_recall@10 | 0.5078 |
| cosine_ndcg@10 | 0.4934 |
| cosine_mrr@10 | 0.4808 |
| cosine_map@100 | 0.5632 |
Information Retrieval
- Dataset: dim_256
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 256 }
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4555 |
| cosine_accuracy@3 | 0.4555 |
| cosine_accuracy@5 | 0.4555 |
| cosine_accuracy@10 | 0.4969 |
| cosine_precision@1 | 0.4555 |
| cosine_precision@3 | 0.4555 |
| cosine_precision@5 | 0.4555 |
| cosine_precision@10 | 0.4188 |
| cosine_recall@1 | 0.057 |
| cosine_recall@3 | 0.1711 |
| cosine_recall@5 | 0.2851 |
| cosine_recall@10 | 0.4882 |
| cosine_ndcg@10 | 0.4746 |
| cosine_mrr@10 | 0.4624 |
| cosine_map@100 | 0.5446 |
Information Retrieval
- Dataset: dim_128
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 128 }
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.4352 |
| cosine_accuracy@3 | 0.4352 |
| cosine_accuracy@5 | 0.4352 |
| cosine_accuracy@10 | 0.4727 |
| cosine_precision@1 | 0.4352 |
| cosine_precision@3 | 0.4352 |
| cosine_precision@5 | 0.4352 |
| cosine_precision@10 | 0.3988 |
| cosine_recall@1 | 0.0545 |
| cosine_recall@3 | 0.1636 |
| cosine_recall@5 | 0.2726 |
| cosine_recall@10 | 0.4639 |
| cosine_ndcg@10 | 0.4522 |
| cosine_mrr@10 | 0.4414 |
| cosine_map@100 | 0.5208 |
Information Retrieval
- Dataset: dim_64
- Evaluated with InformationRetrievalEvaluator with these parameters: { "truncate_dim": 64 }
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.3945 |
| cosine_accuracy@3 | 0.3945 |
| cosine_accuracy@5 | 0.3945 |
| cosine_accuracy@10 | 0.4297 |
| cosine_precision@1 | 0.3945 |
| cosine_precision@3 | 0.3945 |
| cosine_precision@5 | 0.3945 |
| cosine_precision@10 | 0.3598 |
| cosine_recall@1 | 0.0499 |
| cosine_recall@3 | 0.1497 |
| cosine_recall@5 | 0.2495 |
| cosine_recall@10 | 0.4224 |
| cosine_ndcg@10 | 0.4109 |
| cosine_mrr@10 | 0.4004 |
| cosine_map@100 | 0.4763 |
Training Details
Training Dataset
json
- Dataset: json
- Size: 11,518 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:

| | positive | anchor |
|---|---|---|
| type | string | string |
| details | min: 7 tokens, mean: 239.56 tokens, max: 410 tokens | min: 5 tokens, mean: 10.8 tokens, max: 26 tokens |
- Samples:

| positive | anchor |
|---|---|
| values and preferences among older people in relation to exercise, noting that older people valued the outcomes of exercise for maintaining health. They judged that the evidence for older people was likely to be relevant to all adults and agreed there was likely to be some uncertainty or variability with respect to people’s values and preferences for exercise and its outcomes. Some GDG members suggested that given reasonably consistent benefit and very little harms, there would be no important uncertainty or variability regarding people’s values on the outcomes of exercise. In the absence of direct qualitative evidence, the GDG judged from their own experience that resource requirements for structured exercise programmes would vary by country and setting, but in some settings might be associated with moderate costs (for structured exercise programmes, compared with self-managed physical activity). The GDG noted that costs could also vary according to the modality of ... | exercise preferences and outcomes variability among adults |
| ICRC, ICRC Hospital Design and Rehabilitation Guidelines, Vol. 1: Models Of Care, ICRC, Geneva, 2022: https://shop. icrc.org/icrc-hospital-design-and-rehabilitation-guidelines-volume-1-models-of-care-print-en.html | ICRC rehabilitation guidelines 2022 |
| fitness training is guided by a health worker or (if feasible) performed self-directed by the patient following education and advice. Metacognitive training Metacognitive training aims to improve social functioning through reducing cognitive biases/psychotic symptoms (e.g. delusion, impaired self-awareness or insight). Metacognitive training is usually provided as a structured group intervention during which participants perform exercises to reflect their own thinking and receive training in strategies to cope with cognitive biases during daily routines. Metacognitive training is guided by a health worker. Mindfulness- based approaches Mindfulness-based interventions aim to achieve a state of mindfulness in which a person becomes more aware of their physical, mental, and emotional condition in the present moment, without becoming judgemental. Mindfulness-based interventions (e.g. mindfulness-based cognitive therapy, acceptance and commitment therapy) help people to pay attentio... | structured group interventions for metacognitive training |

- Loss: MatryoshkaLoss with these parameters: { "loss": "MultipleNegativesRankingLoss", "matryoshka_dims": [768, 512, 256, 128, 64], "matryoshka_weights": [1, 1, 1, 1, 1], "n_dims_per_step": -1 }
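To make the loss configuration above more concrete, the following is a rough sketch of the idea, not the library's implementation: a multiple-negatives ranking loss (cross-entropy over in-batch candidates) is computed on embeddings truncated to each Matryoshka dimension, and the results are combined with the given weights. The toy vectors, the function names, and the similarity scale of 20 are assumptions made for illustration.

```python
import math

def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: positive i is the correct match for anchor i,
    every other positive in the batch acts as a negative.
    The scale value is an assumption for this sketch."""
    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [scale * sum(x * y for x, y in zip(anchor, pos)) for pos in p]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]  # cross-entropy with target index i
    return total / len(a)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated embedding prefixes."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        loss += weight * mnr_loss([v[:dim] for v in anchors],
                                  [v[:dim] for v in positives])
    return loss

# Toy batch of two (anchor, positive) pairs in 4 dimensions (hypothetical values).
anchors = [[1.0, 0.0, 0.2, 0.1], [0.0, 1.0, 0.1, 0.3]]
positives = [[0.9, 0.1, 0.0, 0.2], [0.1, 0.8, 0.3, 0.0]]

loss = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
print(loss)
```

With the configuration in this card the same sum would run over the five dimensions 768, 512, 256, 128 and 64, each with weight 1.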
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
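The arithmetic implied by these settings can be sketched as follows. This assumes training on a single device, ignores any batches the no_duplicates sampler might drop, and takes the total of 88 optimizer steps from the last row of the Training Logs table; the schedule function is a generic linear-warmup plus cosine-decay formula, not code taken from the training run.

```python
import math

train_samples = 11_518
per_device_batch = 32
grad_accum = 16
base_lr = 2e-05
warmup_ratio = 0.1

# Samples contributing to each optimizer update (single-device assumption).
effective_batch = per_device_batch * grad_accum

# Mini-batches per epoch, and optimizer updates per epoch.
batches_per_epoch = math.ceil(train_samples / per_device_batch)
updates_per_epoch = batches_per_epoch / grad_accum

def cosine_lr(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 88  # assumption: last step shown in the Training Logs table
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch, batches_per_epoch, updates_per_epoch)
print(10 / updates_per_epoch)  # epoch fraction reached after 10 updates
for step in (0, warmup_steps, total_steps // 2, total_steps):
    print(step, cosine_lr(step, total_steps, warmup_steps, base_lr))
```

Under these assumptions, 10 updates correspond to roughly 0.4444 of an epoch, which agrees with the first row of the training log.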
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|---|---|---|---|---|---|---|---|
| 0.4444 | 10 | 64.4729 | - | - | - | - | - |
| 0.8889 | 20 | 32.1029 | - | - | - | - | - |
| 1.0 | 23 | - | 0.4734 | 0.4741 | 0.4590 | 0.4271 | 0.3722 |
| 1.3111 | 30 | 23.9454 | - | - | - | - | - |
| 1.7556 | 40 | 19.7319 | - | - | - | - | - |
| 2.0 | 46 | - | 0.4934 | 0.4926 | 0.4723 | 0.4471 | 0.4021 |
| 2.1778 | 50 | 17.6381 | - | - | - | - | - |
| 2.6222 | 60 | 16.9329 | - | - | - | - | - |
| 3.0 | 69 | - | 0.498 | 0.4954 | 0.4746 | 0.4528 | 0.4089 |
| 3.0444 | 70 | 15.4096 | - | - | - | - | - |
| 3.4889 | 80 | 15.4012 | - | - | - | - | - |
| **3.8444** | **88** | **-** | **0.4990** | **0.4934** | **0.4746** | **0.4522** | **0.4109** |
- The bold row denotes the saved checkpoint.
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 4.0.2
- Transformers: 4.51.1
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.2
- Datasets: 3.5.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}