How to use legaltextai/modernbert-embed-ft-const-legal-matryoshka with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("legaltextai/modernbert-embed-ft-const-legal-matryoshka")
sentences = [
"Discuss the implications of the Insular Cases on the application of the Citizenship Clause to American Samoa, particularly in distinguishing between incorporated and unincorporated territories. What are the practical concerns associated with this distinction?",
"To the extent jus soli is adopted into the Fourteenth Amendment, the concept of allegiance is manifested by the Citizenship Clause’s mandate that birthright citizens not merely be born within the territorial boundaries of the United States but also “subject to the jurisdiction thereof…” [citations omitted]\n\n \n\n Appellants would find any allegiance requirement of no moment because, as non-citizen nationals, American Samoans already “owe[ ] permanent allegiance to the United States.”[citations omitted] Yet, within the context of the Citizenship Clause, “[t]he evident meaning of the[ ] ... words [“subject to the jurisdiction thereof”] is, not merely subject in some respect or degree to the jurisdiction of the United States, but completely subject to their political jurisdiction, and owing them direct and immediate allegiance.” **375 [citations omitted] *306 It was on this basis that the Supreme Court declined to extend constitutional birthright citizenship to Native American tribes. [citations omitted]…Even assuming a background context grounded in principles of jus soli, we are skeptical the framers plainly intended to extend birthright citizenship to distinct, significantly self-governing political territories within the United States’s sphere of sovereignty—even where, as is the case with American Samoa, ultimate governance remains statutorily vested with the United States Government. 
[citations omitted]\n\nIII\n\nAnalysis of the Citizenship Clause’s application to American Samoa would be incomplete absent invocation of the sometimes contentious Insular Cases, where the Supreme Court “addressed whether the Constitution, by its own force, applies in any territory that is not a State.” [citations omitted]\n\n \n\n“The doctrine of ‘territorial incorporation’ announced in the Insular Cases distinguishes between incorporated territories, which are intended for statehood from the time of acquisition and in which the entire Constitution applies ex proprio vigore, and unincorporated territories [such as American Samoa], which are not intended for statehood and in which only [certain] fundamental constitutional rights apply by their own force.”[citations omitted].\n\n \n\nAppellants and Amici contend the Insular Cases have no application because the Citizenship Clause textually defines its own scope.[citations omitted].\n\n \n\nAmici Curiae suggest territorial incorporation doctrine should not be expanded to the Citizenship Clause because the doctrine rests on anachronistic views of race and imperialism. But the Court has continued to invoke the Insular framework when dealing with questions of territorial and extraterritorial application. [citations omitted] Although some aspects of the Insular Cases’ analysis may now be deemed politically incorrect, the framework remains both applicable and of pragmatic use in assessing the applicability of rights to unincorporated territories. [citations omitted]\n\n \n\nAs the Supreme Court…emphasized, the “common thread uniting the Insular Cases ... 
[is that] questions of extraterritoriality turn on objective factors and practical concerns, not formalism.” [citations omitted] While “fundamental limitations in favor of personal rights” remain guaranteed to persons born in the unincorporated territories, [citations omitted], the Insular framework recognizes the difficulties that frequently inure when “determin[ing] [whether a] particular provision of the Constitution is applicable,” absent inquiry into the impractical or anomalous. [citations omitted]\n\nA\n\n American citizenship “is one of the most valuable rights in the world today.” [citations omitted] “The freedoms and opportunities secured by United States citizenship long have been treasured by persons fortunate enough to be born with them, and are yearned for by countless less fortunate.” [citations omitted]. Accordingly, even if the Insular framework is applicable, Appellants cite to a bevy of cases to argue citizenship is a fundamental right. [citations omitted] But those cases do not arise in the territorial context. Such decisions do not reflect the Court’s considered judgment as to the existence of a fundamental right to citizenship for persons born in the United States’ unincorporated **377 *308 territories. [citations omitted].7\n\n \n\n “Fundamental” has a distinct and narrow meaning in the context of territorial rights. It is not sufficient that a right be considered fundamentally important in a colloquial sense or even that a right be “necessary to [the] [ ]American regime of ordered liberty.” [citations omitted]. Under the Insular framework the designation of fundamental extends only to the narrow category of rights and “principles which are the basis of all free government.” [citations omitted]\n\n \n\nIn this manner the Insular Cases distinguish as universally fundamental those rights so basic as to be integral to free and fair society.",
"633, 649 (concurring opinion).\n\nAn innkeeper or common carrier has always been allowed to' exclude drunks, criminals and' diseased persons, but only because the public’s interest in protecting his and his guests’ health and property outweighs its interest in providing accommodations for this small group of travelers. As a general rule, innkeepers and carriers cannot refuse their services on account of race; though the rule developed in this country that they can provide “separate but equal” facilities. And for a period of our history even,this Court upheld state laws giving sanction to such a rule. Compare Plessy v. Ferguson, 163 U. S. 537, with Gayle v. Browder, 352 U. S. 903, affirming, 142 F. Supp. 707. But surely Shelley v. Kraemer, supra, and Barrows v. Jackson, supra, show that the day has passed when an innkeeper, carrier, housing developer, or retailer can draw a• racial' line, refuse service to some on account of color, and obtain the aid of a State in enforcing his personal bias by sending outlawed customers to prison or exacting fines from them.\n\nBusiness, such as this restaurant, is still private property. ' Yet there is hardly any private enterprise that does not feel the pinch of some public regulation — from price control, to health and fire inspection, to zoning, to safety measures, to minimum wages and working conditions, to unemployment insurance. When the doors of a business are open to the public, they must be open to all regardless of race if apartheid is not to become engrained in our public places. It cannot by reason of the Equal Protection Clause become so engrained with the aid of state courts, state legislatures, or state police.\n\nII.\n\nThere is even greater reason to bar a State through its judiciary from throwing its weight on the side of racial discrimination in the present case, because we deal here with a place of public accommodation under license from, the State. This is the idea I expressed in Garner v. Louisiana, 368 U. 
S. 157, where another owner of a restaurant refused service to a customer because he was a Negro. That view is not novel; it.stems from the dissent of the first Mr. Justice Harlan in the Civil Rights Cases, 109 U. S. 3, 58-59:\n\n“In every material sense applicable to the practical enforcement of the Fourteenth Amendment, railroad corporations, keepers of inns, and managers of places of public amusement are agents or instrumentalities of the State, because they are charged with duties to the public, and are amenable, in respect of their duties and functions, to governmental regulation. It seems to me that, within the principle settled in Ex parte Virginia, a denial, by these instrumentalities of the State, to the citizen, because of his race, of that equality of civil rights secured to him by law, is a denial by the State, within the meaning of the Fourteenth Amendment. If it be not, then that race is left, in respect of the civil rights in question, practically at the mercy of corporations and individuals wielding power under the States.”\n\nThe nexus between the State and the private enterprise may be control, as in the case of a state agency. Pennsylvania v. Board of Trusts, 353 U. S. 230. Or the nexus may be one of numerous other devices. “State support of segregated schools through any arrangement, management, funds, or property cannot be squared” with the Equal Protection Clause. Cooper v. Aaron, 358 U. S. 1, 19. Cf. Hampton v. Jacksonville, 304 F. 2d 320. A state-assisted enterprise serving the public does not escape its constitutional duty to serve all customers irrespective of race, even though its actual operation is in the hands of a lessee. Burton v. Wilmington Parking Authority, 365 U. S. 715. Cf. Boynton v. Virginia, 364 U. S. 454. State licensing and surveillance.of a business serving the public also brings its service into the public domain. 
This restaurant needs a permit from Louisiana to operate; and during the existence of the license the State has broad powers of visitation and control. This restaurant is thus an instrumentality of the State since the State charges it with duties to the public and supervises its performance. The State's interest in and activity with regard to its restaurants extends far beyond any mere income-producing licensing requirement.",
"Among other things, courts at this second step have sometimes considered whether an employee’s speech interests are outweighed by “ ‘the interest of the State, as an employer, in promoting the efficiency of the public services it performs through its employees.’ ” Id., at 417, 126 S.Ct. 1951 *2424 (quoting Pickering, 391 U.S. at 568, 88 S.Ct. 1731).\n\n \n\nBoth sides ask us to employ at least certain aspects of this Pickering–Garcetti framework to resolve Mr. Kennedy’s free speech claim. They share additional common ground too. They agree that Mr. Kennedy’s speech implicates a matter of public concern. See App. to Pet. for Cert. 183; Brief for Respondent 44. They also appear to accept, at least for argument’s sake, that Mr. Kennedy’s speech does not raise questions of academic freedom that may or may not involve “additional” First Amendment “interests” beyond those captured by this framework. Garcetti, 547 U.S. at 425, 126 S.Ct. 1951; see also Keyishian v. Board of Regents of Univ. of State of N. Y., 385 U.S. 589, 603, 87 S.Ct. 675, 17 L.Ed.2d 629 (1967); Brief for Petitioner 26, n. 2. At the first step of the Pickering–Garcetti inquiry, the parties’ disagreement thus turns out to center on one question alone: Did Mr. Kennedy offer his prayers in his capacity as a private citizen, or did they amount to government speech attributable to the District?\n\n \n\nOur cases offer some helpful guidance for resolving this question. In Garcetti, the Court concluded that a prosecutor’s internal memorandum to a supervisor was made “pursuant to [his] official duties,” and thus ineligible for First Amendment protection. 547 U.S. at 421, 126 S.Ct. 1951. In reaching this conclusion, the Court relied on the fact that the prosecutor’s speech “fulfill[ed] a responsibility to advise his supervisor about how best to proceed with a pending case.” Ibid. 
In other words, the prosecutor’s memorandum was government speech because it was speech the government “itself ha[d] commissioned or created” and speech the employee was expected to deliver in the course of carrying out his job. Id., at 422, 126 S.Ct. 1951.\n\n \n\nBy contrast, in Lane a public employer sought to terminate an employee after he testified at a criminal trial about matters involving his government employment. 573 U.S. at 233, 134 S.Ct. 2369. The Court held that the employee’s speech was protected by the First Amendment. Id., at 231, 134 S.Ct. 2369. In doing so, the Court held that the fact the speech touched on matters related to public employment was not enough to render it government speech. Id., at 239–240, 134 S.Ct. 2369. Instead, the Court explained, the “critical question ... is whether the speech at issue is itself ordinarily within the scope of an employee’s duties.” Id., at 240, 134 S.Ct. 2369. It is an inquiry this Court has said should be undertaken “practical[ly],” rather than with a blinkered focus on the terms of some formal and capacious written job description. Garcetti, 547 U.S. at 424, 126 S.Ct. 1951. To proceed otherwise would be to allow public employers to use “excessively broad job descriptions” to subvert the Constitution’s protections. Ibid.\n\n \n\nApplying these lessons here, it seems clear to us that Mr. Kennedy has demonstrated that his speech was private speech, not government speech. When Mr. Kennedy uttered the three prayers that resulted in his suspension, he was not engaged in speech “ordinarily within the scope” of his duties as a coach. Lane, 573 U.S. at 240, 134 S.Ct. 2369. He did not speak pursuant to government policy. He was not seeking to convey a government-created message. He was not instructing players, discussing strategy, encouraging better on-field performance, or engaged in any other speech the District paid him to produce as a coach. See Part I–B, supra. Simply put: Mr. 
Kennedy’s prayers did not “ow[e their] existence” to Mr. Kennedy’s responsibilities as a public employee."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
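The Pooling and Normalize modules listed above can be illustrated with a minimal pure-Python sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration indicates; the helper names and the 4-dimensional toy vectors are hypothetical and stand in for the real 768-dimensional token embeddings.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                total[i] += value
    return [value / max(count, 1) for value in total]

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

# Toy input: three token vectors of width 4; the last one is padding.
tokens = [[1.0, 2.0, 0.0, 0.0], [3.0, 0.0, 0.0, 4.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_embedding = l2_normalize(mean_pool(tokens, mask))
print(sentence_embedding)
```

Because the output is unit length, the dot product of two such embeddings equals their cosine similarity.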
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("legaltextai/modernbert-embed-ft-const-legal-matryoshka")
# Run inference
sentences = [
"Based on the court's ruling, what are the implications of Title VII regarding discrimination against employees based on their transgender status or failure to conform to sex stereotypes?",
'Thus, even if we\xa0agreed with the Funeral Home that Rost\'s religious exercise would be substantially burdened by enforcing Title VII in this case, we would nevertheless REVERSE the district court\'s grant of summary judgment to the Funeral Home and hold instead that requiring the Funeral Home to comply with Title VII constitutes the least restrictive means of furthering the government\'s compelling interest in eradicating discrimination against Stephens on the basis of sex. Thus, even assuming Rost\'s religious exercise is substantially burdened by the EEOC\'s enforcement action in this case, we GRANT summary judgment to the EEOC on the Funeral Home\'s RFRA defense on this alternative ground.\n\n\xa0\n\n[ … ]\n\n[ … ]\n\n\xa0\n\nIII. CONCLUSION\n\nDiscrimination against employees, either because of their failure to conform to sex stereotypes or their transgender and transitioning status, is illegal under Title VII. The unrefuted facts show that the Funeral Home fired Stephens because she refused to abide by her employer\'s stereotypical conception of her sex, and therefore the EEOC is entitled to summary judgment as to its unlawful-termination claim. RFRA provides the Funeral Home with no relief because continuing to employ Stephens would not, as a matter of law, substantially burden Rost\'s religious exercise, and even if it did, the EEOC has shown that enforcing Title VII here is the least restrictive means of furthering its compelling interest in combating and eradicating sex discrimination. We therefore REVERSE the district court\'s grant of summary judgment in favor of the Funeral Home and GRANT summary judgment to the EEOC on its unlawful-termination claim. We also REVERSE the district court\'s grant of summary judgment on the EEOC\'s discriminatory-clothing-allowance claim, as the district court erred in failing to consider the EEOC\'s claim on the merits. 
We REMAND this case to the district court for further proceedings consistent with this opinion.\n\n[1]\xa0We refer to Stephens using female pronouns, in accordance with the preference she has expressed through her briefing to this court.\n\n[2]\xa0All facts drawn from Def.\'s Statement of Facts (R. 55) are undisputed.\xa0See\xa0R. 64 (Pl.\'s Counter Statement of Disputed Facts) (Page ID #2066-88).\n\n[3]\xa0See also\xa0Appellee Br. at 16 ("It is a helpful exercise to think about\xa0Price Waterhouse\xa0and imagine that there was a dress code imposed which obligated Ms. Hopkins to wear a skirt while her male colleagues were obliged to wear pants. Had she simply been fired for wearing pants rather than a skirt, the case would have ended there — both sexes would have been equally burdened by the requirement to comply with their respective sex-specific standard. But what the firm could not do was fire her for being aggressive or macho when it was tolerating or rewarding the behavior among men — and when it did, it relied on a stereotype to treat her disparately from the men in the firm.").\n\n[4]\xa0Moreover, discrimination because of a person\'s transgender, intersex, or sexually indeterminate status is no less actionable than discrimination because of a person\'s identification with two religions, an unorthodox religion, or no religion at all. And "religious identity" can be just as fluid, variable, and difficult to define as "gender identity"; after all, both have "a deeply personal, internal genesis that lacks a fixed external referent." Sue Landsittel,\xa0Strange Bedfellows? Sex, Religion, and Transgender Identity Under Title VII,\xa0104 NW. U. L. REV. 
1147, 1172 (2010) (advocating for "[t]he application of tests for religious identity to the problem of gender identity [because it] produces a more realistic, and therefore more appropriate, authentication framework than the current reliance on medical diagnoses and conformity with the gender binary").\n\n[5]\xa0On the other hand, there is also evidence that Stephens was fired only because of her nonconforming appearance and behavior at work, and not because of her transgender identity.\xa0See\xa0R. 53-6 (Rost Dep.',
'[citation omitted]\n\n\xa0\n\n*1994 The program imposes no geographic limitation: Parents may direct tuition payments to schools inside or outside the State, or even in foreign countries. [citation omitted] In schools that qualify for the program because they are accredited, teachers need not be certified by the State,…and Maine’s curricular requirements do not apply…Single-sex schools are eligible. [citation omitted]\n\n\xa0\n\nPrior to 1981, parents could also direct the tuition assistance payments to religious schools. Indeed, in the 1979–1980 school year, over 200 Maine students opted to attend such schools through the tuition assistance program. App. 72. In 1981, however, Maine imposed a new requirement that any school receiving tuition assistance payments must be “a nonsectarian school in accordance with the First Amendment of the United States Constitution.” [citation omitted] That provision was enacted in response to an opinion by the Maine attorney general taking the position that public funding of private religious schools violated the Establishment Clause of the First Amendment. We subsequently held, however, that a benefit program under which private citizens “direct government aid to religious schools wholly as a result of their own genuine and independent private choice” does not offend the Establishment Clause. [citation omitted] Following our decision in Zelman, the Maine Legislature considered a proposed bill to repeal the “nonsectarian” requirement, but rejected it. App. 100, 108.\n\n\xa0\n\nThe “nonsectarian” requirement for participation in Maine’s tuition assistance program remains in effect today. 
The Department has stated that, in administering this requirement, it “considers a sectarian school to be one that is associated with a particular faith or belief system and which, in addition to teaching academic subjects, promotes the faith or belief system with which it is associated and/or presents the material taught through the lens of this faith.” [citation omitted] “The Department’s focus is on what the school teaches through its curriculum and related activities, and how the material is presented.” …“[A]ffiliation or association with a church or religious institution is one potential indicator of a sectarian school,” but “it is not dispositive.”\n\n\xa0\n\n\xa0\n\nB\n\nThis case concerns two families that live in SAUs that neither maintain their own secondary schools nor contract with any nearby secondary school. App. 70, 71. Petitioners David and Amy Carson reside in Glenburn, Maine. Id., at 74. When this litigation commenced, the Carsons’ daughter attended high school at Bangor Christian Schools (BCS), which was founded in 1970 as a ministry of Bangor Baptist Church. Id., at 74, 80. The Carsons sent their daughter to BCS because of the school’s high academic standards and because the school’s Christian worldview aligns with their sincerely held religious beliefs. Id., at 74. Given that BCS is a “sectarian” school that cannot qualify for tuition assistance payments under Maine’s program, id., at 80, the Carsons paid the tuition for their daughter to attend BCS themselves, id., at 74.\n\n\xa0\n\nPetitioners Troy and Angela Nelson live in Palermo, Maine. Id., at 78. When this litigation commenced, the Nelsons’ daughter attended high school at Erskine Academy, a secular private school, and their son attended middle school at Temple Academy, a “sectarian” school affiliated with *1995 Centerpoint Community Church. Id., at 78, 90, 91. 
The Nelsons sent their son to Temple Academy because they believed it offered him a high-quality education that aligned with their sincerely held religious beliefs. Id., at 78. While they wished to send their daughter to Temple Academy too, they could not afford to pay the cost of the Academy’s tuition for both of their children. Id., at 79.\n\n\xa0\n\nBCS and Temple Academy are both accredited by the New England Association of Schools and Colleges (NEASC), and the Department considers each school a “private school approved for attendance purposes” under the State’s compulsory attendance requirement. Id., at 80, 90. Yet because neither school qualifies as “nonsectarian,” neither is eligible to receive tuition payments under Maine’s tuition assistance program. Id., at 80, 90. Absent the “nonsectarian” requirement, the Carsons and the Nelsons would have asked their respective SAUs to pay the tuition to send their children to BCS and Temple Academy, respectively. Id., at 79.\n\n\xa0\n\nIn 2018, petitioners brought suit against the commissioner of the Maine Department of Education. Id., at 11–12.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
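Since the model was trained with Matryoshka dimensions, a shorter prefix of each embedding can be used in place of the full 768 values. The sketch below shows the idea on toy vectors in pure Python: keep the first `dim` values, re-normalize, then compare with cosine similarity. The helper names and the 4-dimensional vectors are hypothetical. Sentence Transformers also offers a `truncate_dim` argument on `SentenceTransformer` for this purpose, which the card itself does not demonstrate.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy stand-ins for a query embedding and a document embedding.
query = [0.5, 0.5, 0.5, 0.5]
doc = [0.5, 0.5, -0.5, 0.5]

for dim in (4, 2):
    q = truncate_and_normalize(query, dim)
    d = truncate_and_normalize(doc, dim)
    print(dim, round(cosine(q, d), 4))
```

Scores generally change after truncation, which is why the evaluation reports metrics separately for each dimension.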
Information retrieval metrics for dim_768, dim_512, dim_256, dim_128 and dim_64, evaluated with InformationRetrievalEvaluator:
| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.4839 | 0.4839 | 0.4516 | 0.4409 | 0.3978 |
| cosine_accuracy@3 | 0.6989 | 0.7204 | 0.6882 | 0.6452 | 0.6022 |
| cosine_accuracy@5 | 0.7957 | 0.7849 | 0.7957 | 0.7634 | 0.7097 |
| cosine_accuracy@10 | 0.9247 | 0.9032 | 0.8817 | 0.8387 | 0.8065 |
| cosine_precision@1 | 0.4839 | 0.4839 | 0.4516 | 0.4409 | 0.3978 |
| cosine_precision@3 | 0.3799 | 0.3871 | 0.3656 | 0.3548 | 0.3405 |
| cosine_precision@5 | 0.2839 | 0.286 | 0.2796 | 0.2731 | 0.2602 |
| cosine_precision@10 | 0.172 | 0.1677 | 0.1656 | 0.1559 | 0.1538 |
| cosine_recall@1 | 0.2177 | 0.2231 | 0.2088 | 0.1873 | 0.1586 |
| cosine_recall@3 | 0.4884 | 0.5027 | 0.4718 | 0.4453 | 0.4059 |
| cosine_recall@5 | 0.5883 | 0.5936 | 0.5806 | 0.5726 | 0.526 |
| cosine_recall@10 | 0.7088 | 0.6944 | 0.6855 | 0.6541 | 0.6165 |
| cosine_ndcg@10 | 0.5864 | 0.5845 | 0.565 | 0.5356 | 0.5019 |
| cosine_mrr@10 | 0.5963 | 0.595 | 0.5674 | 0.5453 | 0.5082 |
| cosine_map@100 | 0.4916 | 0.4987 | 0.4761 | 0.4511 | 0.4182 |
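To make the metric names in the table concrete, here is a small pure-Python sketch of the per-query definitions of accuracy@k, precision@k, recall@k and MRR@k as they are commonly defined for retrieval evaluation. The document IDs and the ranking are invented for illustration; the reported numbers are averages of such per-query values over the evaluation queries.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if at least one relevant document appears in the top k, else 0.0."""
    return 1.0 if any(doc in relevant for doc in ranked[:k]) else 0.0

def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents that appear in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical ranking for one query, best match first.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d8"}

print(accuracy_at_k(ranked, relevant, 3))
print(precision_at_k(ranked, relevant, 3))
print(recall_at_k(ranked, relevant, 5))
print(mrr_at_k(ranked, relevant, 10))
```

With these definitions accuracy@1 and precision@1 always coincide, as they do in the table, while recall@1 is lower whenever a query has more than one relevant passage.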
Columns: anchor and positive
| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| Based on the court's ruling, under what circumstances can a college student be held accountable for off-campus speech, and how does this relate to the standards of professionalism in a professional school setting? | A serious question raised by Keefe in this case is whether the First Amendment protected his unprofessional speech from academic disadvantage because it was made in- on-line, off-campus Facebook postings. On appeal, Keefe framed this contention categorically, arguing that a college student may not be punished for off-campus speech unless it is speech that is unprotected by the First Amendment, such as obscenity. We reject this categorical contention. A student may demonstrate an unacceptable lack of professionalism off campus, as well as in the classroom, and by speech as well as conduct. See Yoder v. Univ. of Louisville, 526 Fed-Appx. 537, 545-46 (6th Cir.), cert. denied, — U.S. -, 134 S.Ct. 790, 187 L.Ed.2d 594 (2013); Tatro v. Univ. of Minn., 816 N.W.2d 509, 521 (Minn. 2012). Therefore, college administrators and educators in a professional school have discretion to require compliance with recognized standards of the profession, both on and off campus, “so long as their actions are ... |
| Describe the two-step framework that Courts of Appeals have developed for analyzing Second Amendment challenges. What are the implications of the Supreme Court's decision to reject this framework in favor of a historical tradition-based approach? | Petitioners sued respondents for declaratory and injunctive relief under…42 U.S.C. § 1983, alleging that respondents violated their Second and Fourteenth Amendment rights by denying their unrestricted-license applications on the basis that they had failed to show “proper cause,” i.e., had failed to demonstrate a unique need for self-defense. |
| Discuss the implications of the California Alien Land Law as it pertains to the rights of American citizens, specifically in the case of Fred Oyama. How does the law affect his privileges as a citizen, and what constitutional protections are being challenged? | 269 |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
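The loss configuration above can be read as follows: the in-batch ranking loss is evaluated once per Matryoshka dimension on truncated embeddings, and the results are combined using the listed weights. The pure-Python sketch below illustrates that reading on toy vectors. It is not the library implementation: the helper names are hypothetical, the similarity scale of 20 is an assumption, and `n_dims_per_step: -1` is taken to mean that every dimension is used at each step.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0

def in_batch_ranking_loss(anchors, positives, scale=20.0):
    """Cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the ranking loss over truncated embeddings."""
    loss = 0.0
    for dim, weight in zip(dims, weights):
        short_anchors = [a[:dim] for a in anchors]
        short_positives = [p[:dim] for p in positives]
        loss += weight * in_batch_ranking_loss(short_anchors, short_positives)
    return loss

# Toy batch of two (anchor, positive) pairs with 4-dimensional embeddings.
anchors = [[1.0, 0.0, 0.2, 0.0], [0.0, 1.0, 0.0, 0.2]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.1, 0.9, 0.0, 0.1]]

aligned = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
print(aligned, swapped)
```

Pairing each anchor with its true positive yields a much lower loss than the swapped pairing, at both the full and the truncated width.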
Non-default hyperparameters:
eval_strategy: epoch
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 32
learning_rate: 2e-05
num_train_epochs: 4
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
tf32: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 32
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: True
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training logs:
| Epoch | Step | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|---|---|---|---|---|---|---|
| 0.6038 | 1 | 0.5604 | 0.5631 | 0.5303 | 0.4907 | 0.4335 |
| 1.6038 | 2 | 0.5836 | 0.5758 | 0.5715 | 0.5180 | 0.4846 |
| 2.6038 | 3 | 0.5768 | 0.5841 | 0.5652 | 0.5296 | 0.4940 |
| 3.6038 | 4 | 0.5864 | 0.5845 | 0.565 | 0.5356 | 0.5019 |
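Two of the listed training settings lend themselves to a short worked example: the effective batch size implied by per_device_train_batch_size and gradient_accumulation_steps, and the shape of a cosine learning-rate schedule with a 10% linear warmup. The sketch below is an approximation of that schedule rather than the trainer's own code; `TOTAL_STEPS` is a hypothetical value chosen for illustration and is not taken from the card, and a single device is assumed.

```python
import math

PER_DEVICE_BATCH = 16     # per_device_train_batch_size
GRAD_ACCUM_STEPS = 32     # gradient_accumulation_steps
BASE_LR = 2e-05           # learning_rate
WARMUP_RATIO = 0.1        # warmup_ratio
TOTAL_STEPS = 100         # hypothetical number of optimizer steps

# Samples contributing to one optimizer update on a single device.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS

def lr_at_step(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR,
               warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to base_lr, then cosine decay towards 0."""
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(effective_batch)
for step in (0, 5, 10, 55, 100):
    print(step, lr_at_step(step))
```

A large effective batch matters for the in-batch ranking loss used here, because every other positive in the batch serves as a negative.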
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
answerdotai/ModernBERT-base