Text Ranking
sentence-transformers
Safetensors
xlm-roberta
cross-encoder
reranker
Generated from Trainer
dataset_size:6313
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
Instructions to use OloriBern/bge-m3-musique-mixer-3ep with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- sentence-transformers
How to use OloriBern/bge-m3-musique-mixer-3ep with sentence-transformers:
from sentence_transformers import CrossEncoder

model = CrossEncoder("OloriBern/bge-m3-musique-mixer-3ep")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet."
]

scores = model.predict([(query, passage) for passage in passages])
print(scores)
- Notebooks
- Google Colab
- Kaggle
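The usage snippet above prints raw scores in passage order. A minimal sketch of turning such scores into a ranking follows. The score values are made-up placeholders, not real model outputs, and `rank_passages` is a hypothetical helper rather than part of sentence-transformers (the library's own `CrossEncoder.rank` method, used in the README in this commit, returns similar `corpus_id`/`score` dictionaries).

```python
def rank_passages(passages, scores):
    """Return passages sorted by descending score, keeping their original index."""
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [
        {"corpus_id": i, "score": float(scores[i]), "text": passages[i]}
        for i in order
    ]


passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]
# Placeholder scores for illustration only; real values come from model.predict(...).
scores = [0.02, 0.97, 0.10, 0.05]

ranked = rank_passages(passages, scores)
for item in ranked:
    print(f"{item['score']:.2f}  {item['text']}")
```

Keeping `corpus_id` alongside the text makes it easy to map a ranked result back to the original passage list.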
Upload folder using huggingface_hub
Browse files
- .gitattributes +1 -0
- README.md +365 -0
- config.json +37 -0
- eval/CrossEncoderCorrelationEvaluator_explorer-validation_results.csv +7 -0
- model.safetensors +3 -0
- special_tokens_map.json +51 -0
- tokenizer.json +3 -0
- tokenizer_config.json +63 -0
.gitattributes
CHANGED
|
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
| 33 |
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
README.md
ADDED
|
@@ -0,0 +1,365 @@
| 1 |
+
---
|
| 2 |
+
tags:
|
| 3 |
+
- sentence-transformers
|
| 4 |
+
- cross-encoder
|
| 5 |
+
- reranker
|
| 6 |
+
- generated_from_trainer
|
| 7 |
+
- dataset_size:6313
|
| 8 |
+
- loss:BinaryCrossEntropyLoss
|
| 9 |
+
pipeline_tag: text-ranking
|
| 10 |
+
library_name: sentence-transformers
|
| 11 |
+
metrics:
|
| 12 |
+
- pearson
|
| 13 |
+
- spearman
|
| 14 |
+
model-index:
|
| 15 |
+
- name: CrossEncoder
|
| 16 |
+
results:
|
| 17 |
+
- task:
|
| 18 |
+
type: cross-encoder-correlation
|
| 19 |
+
name: Cross Encoder Correlation
|
| 20 |
+
dataset:
|
| 21 |
+
name: explorer validation
|
| 22 |
+
type: explorer-validation
|
| 23 |
+
metrics:
|
| 24 |
+
- type: pearson
|
| 25 |
+
value: 0.9535121580256509
|
| 26 |
+
name: Pearson
|
| 27 |
+
- type: spearman
|
| 28 |
+
value: 0.9229114028784231
|
| 29 |
+
name: Spearman
|
| 30 |
+
---
|
| 31 |
+
|
| 32 |
+
# CrossEncoder
|
| 33 |
+
|
| 34 |
+
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model trained using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
|
| 35 |
+
|
| 36 |
+
## Model Details
|
| 37 |
+
|
| 38 |
+
### Model Description
|
| 39 |
+
- **Model Type:** Cross Encoder
|
| 40 |
+
<!-- - **Base model:** [Unknown](https://huggingface.co/unknown) -->
|
| 41 |
+
- **Maximum Sequence Length:** 512 tokens
|
| 42 |
+
- **Number of Output Labels:** 1 label
|
| 43 |
+
<!-- - **Training Dataset:** Unknown -->
|
| 44 |
+
<!-- - **Language:** Unknown -->
|
| 45 |
+
<!-- - **License:** Unknown -->
|
| 46 |
+
|
| 47 |
+
### Model Sources
|
| 48 |
+
|
| 49 |
+
- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
|
| 50 |
+
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
|
| 51 |
+
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
|
| 52 |
+
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)
|
| 53 |
+
|
| 54 |
+
## Usage
|
| 55 |
+
|
| 56 |
+
### Direct Usage (Sentence Transformers)
|
| 57 |
+
|
| 58 |
+
First install the Sentence Transformers library:
|
| 59 |
+
|
| 60 |
+
```bash
|
| 61 |
+
pip install -U sentence-transformers
|
| 62 |
+
```
|
| 63 |
+
|
| 64 |
+
Then you can load this model and run inference.
|
| 65 |
+
```python
|
| 66 |
+
from sentence_transformers import CrossEncoder
|
| 67 |
+
|
| 68 |
+
# Download from the 🤗 Hub
|
| 69 |
+
model = CrossEncoder("OloriBern/bge-m3-musique-mixer-3ep")
|
| 70 |
+
# Get scores for pairs of texts
|
| 71 |
+
pairs = [
|
| 72 |
+
['How far from the city with the highest cost of living in the nation is Stanford?', 'Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters\'s only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.'],
|
| 73 |
+
['What year was the university accepting classes from the Dockyard Technical College after its closure founded?', 'The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened in 1880 in Keyham, it trained engineering students for five years before they completed the remaining two years of the course at Greenwich. The college closed in 1910, but in 1940 a new college opened at Manadon. This was renamed Dockyard Technical College in 1959 before finally closing in 1994; training was transferred to the University of Southampton.'],
|
| 74 |
+
['What is the main research library at the place where Torben Grodal is employed?', "The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education comparable to those of the Ivy League."],
|
| 75 |
+
['When did the band which released Violent and Lazy form?', 'Grinspoon is an Australian rock band from Lismore, New South Wales formed in 1995 and fronted by Phil Jamieson on vocals and guitar with Pat Davern on guitar, Joe Hansen on bass guitar and Kristian Hopes on drums. Also in 1995, they won the Triple J-sponsored Unearthed competition for Lismore, with their post-grunge song "Sickfest". Their name was taken from Dr. Lester Grinspoon an Associate Professor Emeritus of Psychiatry at Harvard Medical School, who supports marijuana for medical use.'],
|
| 76 |
+
['Who did the actor who plays toy Santa in Santa Clause 2 play in Toy Story?', "Victoria was the daughter of Prince Edward, Duke of Kent and Strathearn, the fourth son of King George III. Both the Duke of Kent and King George III died in 1820, and Victoria was raised under close supervision by her German-born mother Princess Victoria of Saxe-Coburg-Saalfeld. She inherited the throne aged 18, after her father's three elder brothers had all died, leaving no surviving legitimate children. The United Kingdom was already an established constitutional monarchy, in which the sovereign held relatively little direct political power. Privately, Victoria attempted to influence government policy and ministerial appointments; publicly, she became a national icon who was identified with strict standards of personal morality."],
|
| 77 |
+
]
|
| 78 |
+
scores = model.predict(pairs)
|
| 79 |
+
print(scores.shape)
|
| 80 |
+
# (5,)
|
| 81 |
+
|
| 82 |
+
# Or rank different texts based on similarity to a single text
|
| 83 |
+
ranks = model.rank(
|
| 84 |
+
'How far from the city with the highest cost of living in the nation is Stanford?',
|
| 85 |
+
[
|
| 86 |
+
'Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters\'s only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.',
|
| 87 |
+
'The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened in 1880 in Keyham, it trained engineering students for five years before they completed the remaining two years of the course at Greenwich. The college closed in 1910, but in 1940 a new college opened at Manadon. This was renamed Dockyard Technical College in 1959 before finally closing in 1994; training was transferred to the University of Southampton.',
|
| 88 |
+
"The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education comparable to those of the Ivy League.",
|
| 89 |
+
'Grinspoon is an Australian rock band from Lismore, New South Wales formed in 1995 and fronted by Phil Jamieson on vocals and guitar with Pat Davern on guitar, Joe Hansen on bass guitar and Kristian Hopes on drums. Also in 1995, they won the Triple J-sponsored Unearthed competition for Lismore, with their post-grunge song "Sickfest". Their name was taken from Dr. Lester Grinspoon an Associate Professor Emeritus of Psychiatry at Harvard Medical School, who supports marijuana for medical use.',
|
| 90 |
+
"Victoria was the daughter of Prince Edward, Duke of Kent and Strathearn, the fourth son of King George III. Both the Duke of Kent and King George III died in 1820, and Victoria was raised under close supervision by her German-born mother Princess Victoria of Saxe-Coburg-Saalfeld. She inherited the throne aged 18, after her father's three elder brothers had all died, leaving no surviving legitimate children. The United Kingdom was already an established constitutional monarchy, in which the sovereign held relatively little direct political power. Privately, Victoria attempted to influence government policy and ministerial appointments; publicly, she became a national icon who was identified with strict standards of personal morality.",
|
| 91 |
+
]
|
| 92 |
+
)
|
| 93 |
+
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
| 94 |
+
```
|
| 95 |
+
|
| 96 |
+
<!--
|
| 97 |
+
### Direct Usage (Transformers)
|
| 98 |
+
|
| 99 |
+
<details><summary>Click to see the direct usage in Transformers</summary>
|
| 100 |
+
|
| 101 |
+
</details>
|
| 102 |
+
-->
|
| 103 |
+
|
| 104 |
+
<!--
|
| 105 |
+
### Downstream Usage (Sentence Transformers)
|
| 106 |
+
|
| 107 |
+
You can finetune this model on your own dataset.
|
| 108 |
+
|
| 109 |
+
<details><summary>Click to expand</summary>
|
| 110 |
+
|
| 111 |
+
</details>
|
| 112 |
+
-->
|
| 113 |
+
|
| 114 |
+
<!--
|
| 115 |
+
### Out-of-Scope Use
|
| 116 |
+
|
| 117 |
+
*List how the model may foreseeably be misused and address what users ought not to do with the model.*
|
| 118 |
+
-->
|
| 119 |
+
|
| 120 |
+
## Evaluation
|
| 121 |
+
|
| 122 |
+
### Metrics
|
| 123 |
+
|
| 124 |
+
#### Cross Encoder Correlation
|
| 125 |
+
|
| 126 |
+
* Dataset: `explorer-validation`
|
| 127 |
+
* Evaluated with [<code>CECorrelationEvaluator</code>](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CECorrelationEvaluator)
|
| 128 |
+
|
| 129 |
+
| Metric | Value |
|
| 130 |
+
|:-------------|:-----------|
|
| 131 |
+
| pearson | 0.9535 |
|
| 132 |
+
| **spearman** | **0.9229** |
|
| 133 |
+
|
| 134 |
+
<!--
|
| 135 |
+
## Bias, Risks and Limitations
|
| 136 |
+
|
| 137 |
+
*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
|
| 138 |
+
-->
|
| 139 |
+
|
| 140 |
+
<!--
|
| 141 |
+
### Recommendations
|
| 142 |
+
|
| 143 |
+
*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
|
| 144 |
+
-->
|
| 145 |
+
|
| 146 |
+
## Training Details
|
| 147 |
+
|
| 148 |
+
### Training Dataset
|
| 149 |
+
|
| 150 |
+
#### Unnamed Dataset
|
| 151 |
+
|
| 152 |
+
* Size: 6,313 training samples
|
| 153 |
+
* Columns: <code>sentence_0</code>, <code>sentence_1</code>, and <code>label</code>
|
| 154 |
+
* Approximate statistics based on the first 1000 samples:
|
| 155 |
+
| | sentence_0 | sentence_1 | label |
|
| 156 |
+
|:--------|:------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 157 |
+
| type | string | string | float |
|
| 158 |
+
| details | <ul><li>min: 31 characters</li><li>mean: 79.25 characters</li><li>max: 168 characters</li></ul> | <ul><li>min: 111 characters</li><li>mean: 805.66 characters</li><li>max: 7035 characters</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.52</li><li>max: 1.0</li></ul> |
|
| 159 |
+
* Samples:
|
| 160 |
+
| sentence_0 | sentence_1 | label |
|
| 161 |
+
|:---------------------------------------------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 162 |
+
| <code>How far from the city with the highest cost of living in the nation is Stanford?</code> | <code>Folk Singer is the fourth studio album by Muddy Waters, released in April 1964 by Chess Records. The album features Waters on acoustic guitar, backed by Willie Dixon on string bass, Clifton James on drums, and Buddy Guy on acoustic guitar. It is Waters's only all-acoustic album. Numerous reissues of "Folk Singer" include bonus tracks from two subsequent sessions, in April 1964 and October 1964.</code> | <code>0.0</code> |
|
| 163 |
+
| <code>What year was the university accepting classes from the Dockyard Technical College after its closure founded?</code> | <code>The University of Southampton, which was founded in 1862 and received its Royal Charter as a university in 1952, has over 22,000 students. The university is ranked in the top 100 research universities in the world in the Academic Ranking of World Universities 2010. In 2010, the THES - QS World University Rankings positioned the University of Southampton in the top 80 universities in the world. The university considers itself one of the top 5 research universities in the UK. The university has a global reputation for research into engineering sciences, oceanography, chemistry, cancer sciences, sound and vibration research, computer science and electronics, optoelectronics and textile conservation at the Textile Conservation Centre (which is due to close in October 2009.) It is also home to the National Oceanography Centre, Southampton (NOCS), the focus of Natural Environment Research Council-funded marine research.</s>The city was also home to the Royal Naval Engineering College; opened...</code> | <code>1.0</code> |
|
| 164 |
+
| <code>What is the main research library at the place where Torben Grodal is employed?</code> | <code>The Pennsylvania State University (commonly referred to as Penn State or PSU) is a state - related, land - grant, doctoral university with campuses and facilities throughout Pennsylvania. Founded in 1855, the university has a stated threefold mission of teaching, research, and public service. Its instructional mission includes undergraduate, graduate, professional and continuing education offered through resident instruction and online delivery. Its University Park campus, the flagship campus, lies within the Borough of State College and College Township. It has two law schools: Penn State Law, on the school's University Park campus, and Dickinson Law, located in Carlisle, 90 miles south of State College. The College of Medicine is located in Hershey. Penn State has another 19 commonwealth campuses and 5 special mission campuses located across the state. Penn State has been labeled one of the ``Public Ivies, ''a publicly funded university considered as providing a quality of education ...</code> | <code>0.0</code> |
|
| 165 |
+
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 166 |
+
```json
|
| 167 |
+
{
|
| 168 |
+
"activation_fn": "torch.nn.modules.linear.Identity",
|
| 169 |
+
"pos_weight": null
|
| 170 |
+
}
|
| 171 |
+
```
|
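The loss parameters above list an identity activation, which suggests the loss is computed on raw logits, while `config.json` in this commit sets a sigmoid activation for inference. A minimal sketch of that relationship in plain Python, assuming the standard binary cross-entropy-with-logits formula; the function names are illustrative and are not library APIs.

```python
import math


def sigmoid(z: float) -> float:
    """Map a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bce_with_logits(z: float, y: float) -> float:
    """Numerically stable binary cross-entropy on a raw logit z with target y in [0, 1]."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))


# A confident, correct prediction has low loss; a confident, wrong one has high loss.
for logit, label in [(4.0, 1.0), (4.0, 0.0), (0.0, 1.0)]:
    score = sigmoid(logit)
    loss = bce_with_logits(logit, label)
    print(f"logit={logit:+.1f} label={label} score={score:.4f} loss={loss:.4f}")
```

Training on logits and applying the sigmoid only at prediction time avoids taking the logarithm of a saturated probability.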
| 172 |
+
|
| 173 |
+
### Training Hyperparameters
|
| 174 |
+
#### Non-Default Hyperparameters
|
| 175 |
+
|
| 176 |
+
- `eval_strategy`: steps
|
| 177 |
+
- `per_device_train_batch_size`: 4
|
| 178 |
+
- `per_device_eval_batch_size`: 4
|
| 179 |
+
- `num_train_epochs`: 1
|
| 180 |
+
|
| 181 |
+
#### All Hyperparameters
|
| 182 |
+
<details><summary>Click to expand</summary>
|
| 183 |
+
|
| 184 |
+
- `overwrite_output_dir`: False
|
| 185 |
+
- `do_predict`: False
|
| 186 |
+
- `eval_strategy`: steps
|
| 187 |
+
- `prediction_loss_only`: True
|
| 188 |
+
- `per_device_train_batch_size`: 4
|
| 189 |
+
- `per_device_eval_batch_size`: 4
|
| 190 |
+
- `per_gpu_train_batch_size`: None
|
| 191 |
+
- `per_gpu_eval_batch_size`: None
|
| 192 |
+
- `gradient_accumulation_steps`: 1
|
| 193 |
+
- `eval_accumulation_steps`: None
|
| 194 |
+
- `torch_empty_cache_steps`: None
|
| 195 |
+
- `learning_rate`: 5e-05
|
| 196 |
+
- `weight_decay`: 0.0
|
| 197 |
+
- `adam_beta1`: 0.9
|
| 198 |
+
- `adam_beta2`: 0.999
|
| 199 |
+
- `adam_epsilon`: 1e-08
|
| 200 |
+
- `max_grad_norm`: 1.0
|
| 201 |
+
- `num_train_epochs`: 1
|
| 202 |
+
- `max_steps`: -1
|
| 203 |
+
- `lr_scheduler_type`: linear
|
| 204 |
+
- `lr_scheduler_kwargs`: None
|
| 205 |
+
- `warmup_ratio`: 0.0
|
| 206 |
+
- `warmup_steps`: 0
|
| 207 |
+
- `log_level`: passive
|
| 208 |
+
- `log_level_replica`: warning
|
| 209 |
+
- `log_on_each_node`: True
|
| 210 |
+
- `logging_nan_inf_filter`: True
|
| 211 |
+
- `save_safetensors`: True
|
| 212 |
+
- `save_on_each_node`: False
|
| 213 |
+
- `save_only_model`: False
|
| 214 |
+
- `restore_callback_states_from_checkpoint`: False
|
| 215 |
+
- `no_cuda`: False
|
| 216 |
+
- `use_cpu`: False
|
| 217 |
+
- `use_mps_device`: False
|
| 218 |
+
- `seed`: 42
|
| 219 |
+
- `data_seed`: None
|
| 220 |
+
- `jit_mode_eval`: False
|
| 221 |
+
- `bf16`: False
|
| 222 |
+
- `fp16`: False
|
| 223 |
+
- `fp16_opt_level`: O1
|
| 224 |
+
- `half_precision_backend`: auto
|
| 225 |
+
- `bf16_full_eval`: False
|
| 226 |
+
- `fp16_full_eval`: False
|
| 227 |
+
- `tf32`: None
|
| 228 |
+
- `local_rank`: 0
|
| 229 |
+
- `ddp_backend`: None
|
| 230 |
+
- `tpu_num_cores`: None
|
| 231 |
+
- `tpu_metrics_debug`: False
|
| 232 |
+
- `debug`: []
|
| 233 |
+
- `dataloader_drop_last`: False
|
| 234 |
+
- `dataloader_num_workers`: 0
|
| 235 |
+
- `dataloader_prefetch_factor`: None
|
| 236 |
+
- `past_index`: -1
|
| 237 |
+
- `disable_tqdm`: False
|
| 238 |
+
- `remove_unused_columns`: True
|
| 239 |
+
- `label_names`: None
|
| 240 |
+
- `load_best_model_at_end`: False
|
| 241 |
+
- `ignore_data_skip`: False
|
| 242 |
+
- `fsdp`: []
|
| 243 |
+
- `fsdp_min_num_params`: 0
|
| 244 |
+
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
|
| 245 |
+
- `fsdp_transformer_layer_cls_to_wrap`: None
|
| 246 |
+
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
|
| 247 |
+
- `parallelism_config`: None
|
| 248 |
+
- `deepspeed`: None
|
| 249 |
+
- `label_smoothing_factor`: 0.0
|
| 250 |
+
- `optim`: adamw_torch_fused
|
| 251 |
+
- `optim_args`: None
|
| 252 |
+
- `adafactor`: False
|
| 253 |
+
- `group_by_length`: False
|
| 254 |
+
- `length_column_name`: length
|
| 255 |
+
- `project`: huggingface
|
| 256 |
+
- `trackio_space_id`: trackio
|
| 257 |
+
- `ddp_find_unused_parameters`: None
|
| 258 |
+
- `ddp_bucket_cap_mb`: None
|
| 259 |
+
- `ddp_broadcast_buffers`: False
|
| 260 |
+
- `dataloader_pin_memory`: True
|
| 261 |
+
- `dataloader_persistent_workers`: False
|
| 262 |
+
- `skip_memory_metrics`: True
|
| 263 |
+
- `use_legacy_prediction_loop`: False
|
| 264 |
+
- `push_to_hub`: False
|
| 265 |
+
- `resume_from_checkpoint`: None
|
| 266 |
+
- `hub_model_id`: None
|
| 267 |
+
- `hub_strategy`: every_save
|
| 268 |
+
- `hub_private_repo`: None
|
| 269 |
+
- `hub_always_push`: False
|
| 270 |
+
- `hub_revision`: None
|
| 271 |
+
- `gradient_checkpointing`: False
|
| 272 |
+
- `gradient_checkpointing_kwargs`: None
|
| 273 |
+
- `include_inputs_for_metrics`: False
|
| 274 |
+
- `include_for_metrics`: []
|
| 275 |
+
- `eval_do_concat_batches`: True
|
| 276 |
+
- `fp16_backend`: auto
|
| 277 |
+
- `push_to_hub_model_id`: None
|
| 278 |
+
- `push_to_hub_organization`: None
|
| 279 |
+
- `mp_parameters`:
|
| 280 |
+
- `auto_find_batch_size`: False
|
| 281 |
+
- `full_determinism`: False
|
| 282 |
+
- `torchdynamo`: None
|
| 283 |
+
- `ray_scope`: last
|
| 284 |
+
- `ddp_timeout`: 1800
|
| 285 |
+
- `torch_compile`: False
|
| 286 |
+
- `torch_compile_backend`: None
|
| 287 |
+
- `torch_compile_mode`: None
|
| 288 |
+
- `include_tokens_per_second`: False
|
| 289 |
+
- `include_num_input_tokens_seen`: no
|
| 290 |
+
- `neftune_noise_alpha`: None
|
| 291 |
+
- `optim_target_modules`: None
|
| 292 |
+
- `batch_eval_metrics`: False
|
| 293 |
+
- `eval_on_start`: False
|
| 294 |
+
- `use_liger_kernel`: False
|
| 295 |
+
- `liger_kernel_config`: None
|
| 296 |
+
- `eval_use_gather_object`: False
|
| 297 |
+
- `average_tokens_across_devices`: True
|
| 298 |
+
- `prompts`: None
|
| 299 |
+
- `batch_sampler`: batch_sampler
|
| 300 |
+
- `multi_dataset_batch_sampler`: proportional
|
| 301 |
+
- `router_mapping`: {}
|
| 302 |
+
- `learning_rate_mapping`: {}
|
| 303 |
+
|
| 304 |
+
</details>
|
| 305 |
+
|
| 306 |
+
### Training Logs
|
| 307 |
+
| Epoch | Step | Training Loss | explorer-validation_spearman |
|
| 308 |
+
|:------:|:----:|:-------------:|:----------------------------:|
|
| 309 |
+
| 0.3167 | 500 | 0.3547 | 0.9187 |
|
| 310 |
+
| 0.6333 | 1000 | 0.3377 | 0.9136 |
|
| 311 |
+
| 0.9500 | 1500 | 0.3264 | 0.9231 |
|
| 312 |
+
| 1.0 | 1579 | - | 0.9239 |
|
| 313 |
+
| -1 | -1 | - | 0.9239 |
|
| 314 |
+
| 0.3167 | 500 | 0.3111 | 0.9127 |
|
| 315 |
+
| 0.6333 | 1000 | 0.3306 | 0.9199 |
|
| 316 |
+
| 0.9500 | 1500 | 0.3175 | 0.9221 |
|
| 317 |
+
| 1.0 | 1579 | - | 0.9224 |
|
| 318 |
+
| -1 | -1 | - | 0.9224 |
|
| 319 |
+
| 0.3167 | 500 | 0.2913 | 0.9160 |
|
| 320 |
+
| 0.6333 | 1000 | 0.3053 | 0.9229 |
|
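The Epoch and Step columns in the log above are consistent with the reported dataset size and batch size. A quick check, assuming a single device and no gradient accumulation (the accumulation setting matches the listed hyperparameters, but the device count is not stated in the card):

```python
import math

num_samples = 6313  # "Size: 6,313 training samples"
batch_size = 4      # per_device_train_batch_size

# dataloader_drop_last is False, so the final partial batch still counts as a step.
steps_per_epoch = math.ceil(num_samples / batch_size)

print("steps per epoch:", steps_per_epoch)
for step in (500, 1000, 1500):
    print(step, round(step / steps_per_epoch, 4))
```

Under these assumptions each pass over the data takes 1579 steps, which is the step count the log shows at epoch 1.0.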
| 321 |
+
|
| 322 |
+
|
| 323 |
+
### Framework Versions
|
| 324 |
+
- Python: 3.12.11
|
| 325 |
+
- Sentence Transformers: 5.2.0
|
| 326 |
+
- Transformers: 4.57.6
|
| 327 |
+
- PyTorch: 2.9.1+cu128
|
| 328 |
+
- Accelerate: 1.12.0
|
| 329 |
+
- Datasets: 4.5.0
|
| 330 |
+
- Tokenizers: 0.22.2
|
| 331 |
+
|
| 332 |
+
## Citation
|
| 333 |
+
|
| 334 |
+
### BibTeX
|
| 335 |
+
|
| 336 |
+
#### Sentence Transformers
|
| 337 |
+
```bibtex
|
| 338 |
+
@inproceedings{reimers-2019-sentence-bert,
|
| 339 |
+
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
|
| 340 |
+
author = "Reimers, Nils and Gurevych, Iryna",
|
| 341 |
+
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
|
| 342 |
+
month = "11",
|
| 343 |
+
year = "2019",
|
| 344 |
+
publisher = "Association for Computational Linguistics",
|
| 345 |
+
url = "https://arxiv.org/abs/1908.10084",
|
| 346 |
+
}
|
| 347 |
+
```
|
| 348 |
+
|
| 349 |
+
<!--
|
| 350 |
+
## Glossary
|
| 351 |
+
|
| 352 |
+
*Clearly define terms in order to be accessible across audiences.*
|
| 353 |
+
-->
|
| 354 |
+
|
| 355 |
+
<!--
|
| 356 |
+
## Model Card Authors
|
| 357 |
+
|
| 358 |
+
*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
|
| 359 |
+
-->
|
| 360 |
+
|
| 361 |
+
<!--
|
| 362 |
+
## Model Card Contact
|
| 363 |
+
|
| 364 |
+
*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
|
| 365 |
+
-->
|
config.json
ADDED
|
@@ -0,0 +1,37 @@
| 1 |
+
{
|
| 2 |
+
"architectures": [
|
| 3 |
+
"XLMRobertaForSequenceClassification"
|
| 4 |
+
],
|
| 5 |
+
"attention_probs_dropout_prob": 0.1,
|
| 6 |
+
"bos_token_id": 0,
|
| 7 |
+
"classifier_dropout": null,
|
| 8 |
+
"dtype": "float32",
|
| 9 |
+
"eos_token_id": 2,
|
| 10 |
+
"hidden_act": "gelu",
|
| 11 |
+
"hidden_dropout_prob": 0.1,
|
| 12 |
+
"hidden_size": 1024,
|
| 13 |
+
"id2label": {
|
| 14 |
+
"0": "LABEL_0"
|
| 15 |
+
},
|
| 16 |
+
"initializer_range": 0.02,
|
| 17 |
+
"intermediate_size": 4096,
|
| 18 |
+
"label2id": {
|
| 19 |
+
"LABEL_0": 0
|
| 20 |
+
},
|
| 21 |
+
"layer_norm_eps": 1e-05,
|
| 22 |
+
"max_position_embeddings": 8194,
|
| 23 |
+
"model_type": "xlm-roberta",
|
| 24 |
+
"num_attention_heads": 16,
|
| 25 |
+
"num_hidden_layers": 24,
|
| 26 |
+
"output_past": true,
|
| 27 |
+
"pad_token_id": 1,
|
| 28 |
+
"position_embedding_type": "absolute",
|
| 29 |
+
"sentence_transformers": {
|
| 30 |
+
"activation_fn": "torch.nn.modules.activation.Sigmoid",
|
| 31 |
+
"version": "5.2.0"
|
| 32 |
+
},
|
| 33 |
+
"transformers_version": "4.57.6",
|
| 34 |
+
"type_vocab_size": 1,
|
| 35 |
+
"use_cache": true,
|
| 36 |
+
"vocab_size": 250002
|
| 37 |
+
}
|
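The `config.json` above declares a single output label (`LABEL_0`) and a `Sigmoid` activation under `sentence_transformers`, which suggests that the one raw classifier logit is squashed into a score between 0 and 1. The following is a minimal sketch of that mapping using only the standard library; the function name `sigmoid_score` and the sample logits are illustrative assumptions, not part of the repository.

```python
import math


def sigmoid_score(logit: float) -> float:
    """Map a raw classifier logit to a score between 0 and 1."""
    # Branch on the sign so that exp() is only ever called on a
    # non-positive argument, which avoids overflow for large |logit|.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


# Hypothetical logits for three (query, passage) pairs.
logits = [-2.0, 0.0, 3.5]
scores = [sigmoid_score(x) for x in logits]
print(scores)
```

A logit of 0 maps to exactly 0.5, so scores above 0.5 correspond to positive logits. Whether 0.5 is a sensible relevance threshold for this model is not something the diff states.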
eval/CrossEncoderCorrelationEvaluator_explorer-validation_results.csv
ADDED
@@ -0,0 +1,7 @@
epoch,steps,Pearson_Correlation,Spearman_Correlation
1.0,1578,0.8501751995258846,0.8905569617674473
1.0,1578,0.8570334177160368,0.8942484497672712
1.0,1578,0.8568699047421272,0.8848450033650819
1.0,1579,0.9527875709381797,0.9238502423936007
1.0,1579,0.9628350133547275,0.92242053984088
1.0,1579,0.9621289158326546,0.9224219171223669
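The evaluator CSV above can be inspected with the standard library alone. The sketch below embeds the six rows verbatim and picks the one with the highest Spearman correlation. The repeated epoch and step values look like several runs appended to the same file, but the diff does not say so; that reading is an assumption.

```python
import csv
import io

# Rows copied verbatim from the evaluator CSV in this commit.
CSV_TEXT = """epoch,steps,Pearson_Correlation,Spearman_Correlation
1.0,1578,0.8501751995258846,0.8905569617674473
1.0,1578,0.8570334177160368,0.8942484497672712
1.0,1578,0.8568699047421272,0.8848450033650819
1.0,1579,0.9527875709381797,0.9238502423936007
1.0,1579,0.9628350133547275,0.92242053984088
1.0,1579,0.9621289158326546,0.9224219171223669
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# DictReader yields strings, so convert before comparing.
best = max(rows, key=lambda r: float(r["Spearman_Correlation"]))
print(best["steps"], best["Pearson_Correlation"], best["Spearman_Correlation"])
```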
model.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:aa3db09a2ef45f714a3cc2b712f624056f2137c11ddb5b84154e20160b26c846
size 2271071852
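The `model.safetensors` entry is a Git LFS pointer rather than the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. As a rough sketch, the pointer can be parsed and the size turned into a parameter estimate, assuming every weight is stored as `float32` (as `"dtype": "float32"` in `config.json` suggests). The helper name `parse_lfs_pointer` is hypothetical, and the estimate is a slight overcount because the file also contains a safetensors header.

```python
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:aa3db09a2ef45f714a3cc2b712f624056f2137c11ddb5b84154e20160b26c846
size 2271071852
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    # Each line is "<key> <value>"; split on the first space only.
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    hash_algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


info = parse_lfs_pointer(POINTER)

# Assumption: all tensors are float32, i.e. 4 bytes per parameter.
BYTES_PER_FLOAT32 = 4
approx_params = info["size"] // BYTES_PER_FLOAT32
print(f"{info['size']} bytes, roughly {approx_params / 1e6:.0f}M parameters")
```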
special_tokens_map.json
ADDED
@@ -0,0 +1,51 @@
{
  "bos_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "cls_token": {
    "content": "<s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "eos_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "<pad>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "</s>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "<unk>",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d9a6af42442a3e3e9f05f618eae0bb2d98ca4f6a6406cb80ef7a4fa865204d61
size 17083052
tokenizer_config.json
ADDED
@@ -0,0 +1,63 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "250001": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
  "cls_token": "<s>",
  "eos_token": "</s>",
  "extra_special_tokens": {},
  "mask_token": "<mask>",
  "max_length": 512,
  "model_max_length": 512,
  "pad_to_multiple_of": null,
  "pad_token": "<pad>",
  "pad_token_type_id": 0,
  "padding_side": "right",
  "sep_token": "</s>",
  "sp_model_kwargs": {},
  "stride": 0,
  "tokenizer_class": "XLMRobertaTokenizerFast",
  "truncation_side": "right",
  "truncation_strategy": "longest_first",
  "unk_token": "<unk>"
}
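The tokenizer and model configs in this commit give two different length limits: `model_max_length` is 512 in `tokenizer_config.json`, while `max_position_embeddings` is 8194 in `config.json`. A small sketch of how the two interact, assuming the usual RoBERTa-family convention that position IDs start at `pad_token_id + 1`, so the first `pad_token_id + 1` embedding slots are not usable for tokens. The variable names are illustrative, and whether a given caller actually truncates at the tokenizer limit depends on how it invokes the tokenizer.

```python
# Values copied from config.json and tokenizer_config.json in this commit.
max_position_embeddings = 8194
pad_token_id = 1
model_max_length = 512

# Assumption: RoBERTa-style position IDs begin at pad_token_id + 1,
# which reserves the first (pad_token_id + 1) rows of the position table.
architecture_limit = max_position_embeddings - (pad_token_id + 1)

# With default truncation settings, the tokenizer limit is the tighter one.
effective_limit = min(model_max_length, architecture_limit)
print(architecture_limit, effective_limit)
```

Under these assumptions the model could in principle attend over 8192 tokens, but the shipped tokenizer settings cap a query and passage pair, special tokens included, at 512.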