Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper: arXiv:1908.10084
How to use borntobeignored/qwen3-embedding-4b_lora with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("borntobeignored/qwen3-embedding-4b_lora")
sentences = [
"Company: Goldman Sachs | Year: 2017 | Question: What was the percentage change in the average daily Value-at-Risk (VaR) for interest rates from 2016 to 2017 for Goldman Sachs?",
"a reconciliation of the beginning and ending amount of unrecognized tax benefits , for the periods indicated , is as follows: .\n| ( dollars in thousands ) | 2010 | 2009 | 2008 |\n| --- | --- | --- | --- |\n| balance at january 1 | $ 29010 | $ 34366 | $ 29132 |\n| additions based on tax positions related to the current year | 7119 | 6997 | 5234 |\n| additions for tax positions of prior years | - | - | - |\n| reductions for tax positions of prior years | - | - | - |\n| settlements with taxing authorities | -12356 ( 12356 ) | -12353 ( 12353 ) | - |\n| lapses of applicable statutes of limitations | - | - | - |\n| balance at december 31 | $ 23773 | $ 29010 | $ 34366 |\nthe entire amount of the unrecognized tax benefits would affect the effective tax rate if recognized . in 2010 , the company favorably settled a 2003 and 2004 irs audit . the company recorded a net overall tax benefit including accrued interest of $ 25920 thousand . in addition , the company was also able to take down a $ 12356 thousand fin 48 reserve that had been established regarding the 2003 and 2004 irs audit . the company is no longer subject to u.s . federal , state and local or foreign income tax examinations by tax authorities for years before 2007 . the company recognizes accrued interest related to net unrecognized tax benefits and penalties in income taxes . during the years ended december 31 , 2010 , 2009 and 2008 , the company accrued and recognized a net expense ( benefit ) of approximately $ ( 9938 ) thousand , $ 1563 thousand and $ 2446 thousand , respectively , in interest and penalties . included within the 2010 net expense ( benefit ) of $ ( 9938 ) thousand is $ ( 10591 ) thousand of accrued interest related to the 2003 and 2004 irs audit . the company is not aware of any positions for which it is reasonably possible that the total amounts of unrecognized tax benefits will significantly increase or decrease within twelve months of the reporting date . for u.s . 
income tax purposes the company has foreign tax credit carryforwards of $ 55026 thousand that begin to expire in 2014 . in addition , for u.s . income tax purposes the company has $ 41693 thousand of alternative minimum tax credits that do not expire . management believes that it is more likely than not that the company will realize the benefits of its net deferred tax assets and , accordingly , no valuation allowance has been recorded for the periods presented . tax benefits of $ 629 thousand and $ 1714 thousand related to share-based compensation deductions for stock options exercised in 2010 and 2009 , respectively , are included within additional paid-in capital of the shareholders 2019 equity section of the consolidated balance sheets. .",
"from those currently anticipated and expressed in such forward-looking statements as a result of a number of factors , including those we discuss under 201crisk factors 201d and elsewhere in this form 10-k . you should read 201crisk factors 201d and 201cforward-looking statements . 201d executive overview general american water works company , inc . ( herein referred to as 201camerican water 201d or the 201ccompany 201d ) is the largest investor-owned united states water and wastewater utility company , as measured both by operating revenues and population served . our approximately 6400 employees provide drinking water , wastewater and other water related services to an estimated 15 million people in 47 states and in one canadian province . our primary business involves the ownership of water and wastewater utilities that provide water and wastewater services to residential , commercial , industrial and other customers . our regulated businesses that provide these services are generally subject to economic regulation by state regulatory agencies in the states in which they operate . the federal government and the states also regulate environmental , health and safety and water quality matters . our regulated businesses provide services in 16 states and serve approximately 3.2 million customers based on the number of active service connections to our water and wastewater networks . we report the results of these businesses in our regulated businesses segment . we also provide services that are not subject to economic regulation by state regulatory agencies . we report the results of these businesses in our market-based operations segment . in 2014 , we continued the execution of our strategic goals . 
our commitment to growth through investment in our regulated infrastructure and expansion of our regulated customer base and our market-based operations , combined with operational excellence led to continued improvement in regulated operating efficiency , improved performance of our market-based operations , and enabled us to provide increased value to our customers and investors . during the year , we focused on growth , addressed regulatory lag , made more efficient use of capital and improved our regulated operation and maintenance ( 201co&m 201d ) efficiency ratio . 2014 financial results for the year ended december 31 , 2014 , we continued to increase net income , while making significant capital investment in our infrastructure and implementing operational efficiency improvements to keep customer rates affordable . highlights of our 2014 operating results compared to 2013 and 2012 include: .\n| | 2014 | 2013 | 2012 |\n| --- | --- | --- | --- |\n| income from continuing operations | $ 2.39 | $ 2.07 | $ 2.10 |\n| income ( loss ) from discontinued operations net of tax | $ -0.04 ( 0.04 ) | $ -0.01 ( 0.01 ) | $ -0.09 ( 0.09 ) |\n| diluted earnings per share | $ 2.35 | $ 2.06 | $ 2.01 |\ncontinuing operations income from continuing operations included 4 cents per diluted share of costs resulting from the freedom industries chemical spill in west virginia in 2014 and included 14 cents per diluted share in 2013 related to a tender offer . earnings from continuing operations , adjusted for these two items , increased 10% ( 10 % ) , or 22 cents per share , mainly due to favorable operating results from our regulated businesses segment due to higher revenues and lower operating expenses , partially offset by higher depreciation expenses . also contributing to the overall increase in income from continuing operations was lower interest expense in 2014 compared to the same period in 2013. .",
"the goldman sachs group , inc . and subsidiaries management 2019s discussion and analysis the risk committee of the board and the risk governance committee ( through delegated authority from the firmwide risk committee ) approve market risk limits and sub-limits at firmwide , business and product levels , consistent with our risk appetite statement . in addition , market risk management ( through delegated authority from the risk governance committee ) sets market risk limits and sub-limits at certain product and desk levels . the purpose of the firmwide limits is to assist senior management in controlling our overall risk profile . sub-limits are set below the approved level of risk limits . sub-limits set the desired maximum amount of exposure that may be managed by any particular business on a day-to-day basis without additional levels of senior management approval , effectively leaving day-to-day decisions to individual desk managers and traders . accordingly , sub-limits are a management tool designed to ensure appropriate escalation rather than to establish maximum risk tolerance . sub-limits also distribute risk among various businesses in a manner that is consistent with their level of activity and client demand , taking into account the relative performance of each area . our market risk limits are monitored daily by market risk management , which is responsible for identifying and escalating , on a timely basis , instances where limits have been exceeded . when a risk limit has been exceeded ( e.g. , due to positional changes or changes in market conditions , such as increased volatilities or changes in correlations ) , it is escalated to senior managers in market risk management and/or the appropriate risk committee . such instances are remediated by an inventory reduction and/or a temporary or permanent increase to the risk limit . 
model review and validation our var and stress testing models are regularly reviewed by market risk management and enhanced in order to incorporate changes in the composition of positions included in our market risk measures , as well as variations in market conditions . prior to implementing significant changes to our assumptions and/or models , model risk management performs model validations . significant changes to our var and stress testing models are reviewed with our chief risk officer and chief financial officer , and approved by the firmwide risk committee . see 201cmodel risk management 201d for further information about the review and validation of these models . systems we have made a significant investment in technology to monitor market risk including : 2030 an independent calculation of var and stress measures ; 2030 risk measures calculated at individual position levels ; 2030 attribution of risk measures to individual risk factors of each position ; 2030 the ability to report many different views of the risk measures ( e.g. , by desk , business , product type or entity ) ; 2030 the ability to produce ad hoc analyses in a timely manner . metrics we analyze var at the firmwide level and a variety of more detailed levels , including by risk category , business , and region . the tables below present average daily var and period-end var , as well as the high and low var for the period . diversification effect in the tables below represents the difference between total var and the sum of the vars for the four risk categories . this effect arises because the four market risk categories are not perfectly correlated . the table below presents average daily var by risk category. 
.\n| $ in millions | year ended december 2017 | year ended december 2016 | year ended december 2015 |\n| --- | --- | --- | --- |\n| interest rates | $ 40 | $ 45 | $ 47 |\n| equity prices | 24 | 25 | 26 |\n| currency rates | 12 | 21 | 30 |\n| commodity prices | 13 | 17 | 20 |\n| diversification effect | -35 ( 35 ) | -45 ( 45 ) | -47 ( 47 ) |\n| total | $ 54 | $ 63 | $ 76 |\nour average daily var decreased to $ 54 million in 2017 from $ 63 million in 2016 , due to reductions across all risk categories , partially offset by a decrease in the diversification effect . the overall decrease was primarily due to lower levels of volatility . our average daily var decreased to $ 63 million in 2016 from $ 76 million in 2015 , due to reductions across all risk categories , partially offset by a decrease in the diversification effect . the overall decrease was primarily due to reduced exposures . goldman sachs 2017 form 10-k 91 ."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]
How to use borntobeignored/qwen3-embedding-4b_lora with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for borntobeignored/qwen3-embedding-4b_lora to start chatting

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for borntobeignored/qwen3-embedding-4b_lora to start chatting

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for borntobeignored/qwen3-embedding-4b_lora to start chatting
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="borntobeignored/qwen3-embedding-4b_lora",
max_seq_length=2048,
)
This is a sentence-transformers model finetuned from unsloth/Qwen3-Embedding-4B on the generator dataset. It maps sentences & paragraphs to a 2560-dimensional dense vector space and can be used for retrieval.
SentenceTransformer(
(0): Transformer({'transformer_task': 'feature-extraction', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'last_hidden_state'}, 'message': {'method': 'forward', 'method_output_name': 'last_hidden_state', 'format': 'flat'}}, 'module_output_name': 'token_embeddings', 'max_seq_length': 4096, 'do_lower_case': False, 'architecture': 'PeftModelForFeatureExtraction'})
(1): Pooling({'embedding_dimension': 2560, 'pooling_mode': 'lasttoken', 'include_prompt': True})
(2): Normalize({})
)
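The architecture above pools with `lasttoken` and then applies `Normalize`. As a rough illustration of what those two steps mean, here is a minimal pure-Python sketch on toy data. The helper names `last_token_pool` and `l2_normalize` are hypothetical and this is not the library's implementation; it only assumes that the attention mask marks real tokens with 1 and padding with 0.

```python
import math

def last_token_pool(token_embeddings, attention_mask):
    """Return the embedding of the last non-padding token of one sequence."""
    # attention_mask holds 1 for real tokens and 0 for padding (assumption).
    positions = [i for i, m in enumerate(attention_mask) if m == 1]
    if not positions:
        raise ValueError("sequence has no real tokens")
    return token_embeddings[positions[-1]]

def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy sequence: three token vectors of dimension 2, the last one is padding.
tokens = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

embedding = l2_normalize(last_token_pool(tokens, mask))
print(embedding)  # [0.6, 0.8]
```

Because the output is unit-length, the cosine similarity between two such embeddings reduces to a plain dot product.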
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("borntobeignored/qwen3-embedding-4b_lora")
# Run inference
queries = [
'Company: Hologic | Year: 2011 | Question: What is the maximum amount of additional cash that Hologic could pay as contingent payments for the acquisition of Sentinelle Medical?',
]
documents = [
'table of contents the company concluded that the acquisition of sentinelle medical did not represent a material business combination , and therefore , no pro forma financial information has been provided herein . subsequent to the acquisition date , the company 2019s results of operations include the results of sentinelle medical , which is included within the company 2019s breast health reporting segment . the company accounted for the sentinelle medical acquisition as a purchase of a business under asc 805 . the purchase price was comprised of an $ 84.8 million cash payment , which was net of certain adjustments , plus three contingent payments up to a maximum of an additional $ 250.0 million in cash . the contingent payments are based on a multiple of incremental revenue growth during the two-year period following the completion of the acquisition as follows : six months after acquisition , 12 months after acquisition , and 24 months after acquisition . pursuant to asc 805 , the company recorded its estimate of the fair value of the contingent consideration liability based on future revenue projections of the sentinelle medical business under various potential scenarios and weighted probability assumptions of these outcomes . as of the date of acquisition , these cash flow projections were discounted using a rate of 16.5% ( 16.5 % ) . the discount rate is based on the weighted-average cost of capital of the acquired business plus a credit risk premium for non-performance risk related to the liability pursuant to asc 820 . this analysis resulted in an initial contingent consideration liability of $ 29.5 million , which will be adjusted periodically as a component of operating expenses based on changes in the fair value of the liability driven by the accretion of the liability for the time value of money and changes in the assumptions pertaining to the achievement of the defined revenue growth milestones . 
this fair value measurement was based on significant inputs not observable in the market and thus represented a level 3 measurement as defined in asc during each quarter in fiscal 2011 , the company has re-evaluated its assumptions and updated the revenue and probability assumptions for future earn-out periods and lowered its projections . as a result of these adjustments , which were partially offset by the accretion of the liability , and using a current discount rate of approximately 17.0% ( 17.0 % ) , the company recorded a reversal of expense of $ 14.3 million in fiscal 2011 to record the contingent consideration liability at fair value . in addition , during the second quarter of fiscal 2011 , the first earn-out period ended , and the company adjusted the fair value of the contingent consideration liability for actual results during the earn-out period . this payment of $ 4.3 million was made in the third quarter of fiscal 2011 . at september 24 , 2011 , the fair value of the liability is $ 10.9 million . the company did not issue any equity awards in connection with this acquisition . the company incurred third-party transaction costs of $ 1.2 million , which were expensed within general and administrative expenses in fiscal 2010 . the purchase price was as follows: .\n| cash | $ 84751 |\n| --- | --- |\n| contingent consideration | 29500 |\n| total purchase price | $ 114251 |\nsource : hologic inc , 10-k , november 23 , 2011 powered by morningstar ae document research 2120 the information contained herein may not be copied , adapted or distributed and is not warranted to be accurate , complete or timely . the user assumes all risks for any damages or losses arising from any use of this information , except to the extent such damages or losses cannot be limited or excluded by applicable law . past financial performance is no guarantee of future results. .',
'determined that it will primarily be subject to the ietu in future periods , and as such it has recorded tax expense of approximately $ 20 million in 2007 for the deferred tax effects of the new ietu system . as of december 31 , 2007 , the company had us federal net operating loss carryforwards of approximately $ 206 million which will begin to expire in 2023 . of this amount , $ 47 million relates to the pre-acquisition period and is subject to limitation . the remaining $ 159 million is subject to limitation as a result of the change in stock ownership in may 2006 . this limitation is not expected to have a material impact on utilization of the net operating loss carryforwards . the company also had foreign net operating loss carryforwards as of december 31 , 2007 of approximately $ 564 million for canada , germany , mexico and other foreign jurisdictions with various expiration dates . net operating losses in canada have various carryforward periods and began expiring in 2007 . net operating losses in germany have no expiration date . net operating losses in mexico have a ten year carryforward period and begin to expire in 2009 . however , these losses are not available for use under the new ietu tax regulations in mexico . as the ietu is the primary system upon which the company will be subject to tax in future periods , no deferred tax asset has been reflected in the balance sheet as of december 31 , 2007 for these income tax loss carryforwards . the company adopted the provisions of fin 48 effective january 1 , 2007 . fin 48 clarifies the accounting for income taxes by prescribing a minimum recognition threshold a tax benefit is required to meet before being recognized in the financial statements . fin 48 also provides guidance on derecognition , measurement , classification , interest and penalties , accounting in interim periods , disclosure and transition . 
as a result of the implementation of fin 48 , the company increased retained earnings by $ 14 million and decreased goodwill by $ 2 million . in addition , certain tax liabilities for unrecognized tax benefits , as well as related potential penalties and interest , were reclassified from current liabilities to long-term liabilities . liabilities for unrecognized tax benefits as of december 31 , 2007 relate to various us and foreign jurisdictions . a reconciliation of the beginning and ending amount of unrecognized tax benefits is as follows : year ended december 31 , 2007 ( in $ millions ) .\n| | year ended december 31 2007 ( in $ millions ) |\n| --- | --- |\n| balance as of january 1 2007 | 193 |\n| increases in tax positions for the current year | 2 |\n| increases in tax positions for prior years | 28 |\n| decreases in tax positions of prior years | -21 ( 21 ) |\n| settlements | -2 ( 2 ) |\n| balance as of december 31 2007 | 200 |\nincluded in the unrecognized tax benefits of $ 200 million as of december 31 , 2007 is $ 56 million of tax benefits that , if recognized , would reduce the company 2019s effective tax rate . the company recognizes interest and penalties related to unrecognized tax benefits in the provision for income taxes . as of december 31 , 2007 , the company has recorded a liability of approximately $ 36 million for interest and penalties . this amount includes an increase of approximately $ 13 million for the year ended december 31 , 2007 . the company operates in the united states ( including multiple state jurisdictions ) , germany and approximately 40 other foreign jurisdictions including canada , china , france , mexico and singapore . examinations are ongoing in a number of those jurisdictions including , most significantly , in germany for the years 2001 to 2004 . during the quarter ended march 31 , 2007 , the company received final assessments in germany for the prior examination period , 1997 to 2000 . 
the effective settlement of those examinations resulted in a reduction to goodwill of approximately $ 42 million with a net expected cash outlay of $ 29 million . the company 2019s celanese corporation and subsidiaries notes to consolidated financial statements 2014 ( continued ) %%transmsg*** transmitting job : y48011 pcn : 122000000 ***%%pcmsg|f-49 |00023|yes|no|02/26/2008 22:07|0|0|page is valid , no graphics -- color : d| .',
'banking ) . the results of the first step of the impairment test showed no indication of impairment in any of the reporting units at any of the periods except december 31 , 2008 and , accordingly , the company did not perform the second step of the impairment test , except for the test performed as of december 31 , 2008 . as of december 31 , 2008 , there was an indication of impairment in the north america consumer banking , latin america consumer banking and emea consumer banking reporting units and , accordingly , the second step of testing was performed on these reporting units . based on the results of the second step of testing , the company recorded a $ 9.6 billion pretax ( $ 8.7 billion after tax ) goodwill impairment charge in the fourth quarter of 2008 , representing the entire amount of goodwill allocated to these reporting units . the primary cause for the goodwill impairment in the above reporting units was the rapid deterioration in the financial markets , as well as in the global economic outlook particularly during the period beginning mid-november through year end 2008 . this deterioration further weakened the near-term prospects for the financial services industry . these and other factors , including the increased possibility of further government intervention , also resulted in the decline in the company 2019s market capitalization from approximately $ 90 billion at july 1 , 2008 and approximately $ 74 billion at october 31 , 2008 to approximately $ 36 billion at december 31 , 2008 . the more significant fair-value adjustments in the pro forma purchase price allocation in the second step of testing were to fair-value loans and debt and were made to identify and value identifiable intangibles . the adjustments to measure the assets , liabilities and intangibles were for the purpose of measuring the implied fair value of goodwill and such adjustments are not reflected in the consolidated balance sheet . 
the following table shows reporting units with goodwill balances and the excess of fair value of allocated book value as of december 31 , 2008 . reporting unit ( $ in millions ) fair value as a % ( % ) of allocated book value goodwill ( post-impairment ) .\n| reporting unit ( $ inmillions ) | fair value as a % ( % ) of allocated book value | goodwill ( post-impairment ) |\n| --- | --- | --- |\n| north america cards | 139% ( 139 % ) | 6765 |\n| international cards | 218% ( 218 % ) | 4066 |\n| asia consumer banking | 293% ( 293 % ) | 3106 |\n| securities & banking | 109% ( 109 % ) | 9774 |\n| global transaction services | 994% ( 994 % ) | 1570 |\n| north america gwm | 386% ( 386 % ) | 1259 |\n| international gwm | 171% ( 171 % ) | 592 |\nwhile no impairment was noted in step one of our securities and banking reporting unit impairment test at october 31 , 2008 and december 31 , 2008 , goodwill present in that reporting unit may be particularly sensitive to further deterioration in economic conditions . under the market approach for valuing this reporting unit , the earnings multiples and transaction multiples were selected from multiples obtained using data from guideline companies and acquisitions . the selection of the actual multiple considers operating performance and financial condition such as return on equity and net income growth of securities and banking as compared to the guideline companies and acquisitions . for the valuation under the income approach , the company utilized a discount rate which it believes reflects the risk and uncertainty related to the projected cash flows , and selected 2013 as the terminal year . in 2013 , the value was derived assuming a return to historical levels of core-business profitability for the reporting unit , despite the significant losses experienced in 2008 . this assumption is based on management 2019s view that this recovery will occur based upon various macro- economic factors such as the recent u.s . 
government stimulus actions , restoring marketplace confidence and improved risk-management practices on an industry-wide basis . furthermore , company-specific actions such as its recently announced realignment of its businesses to optimize its global businesses for future profitable growth , will also be a factor in returning the company 2019s core securities and banking business to historical levels . small deterioration in the assumptions used in the valuations , in particular the discount rate and growth rate assumptions used in the net income projections , could significantly affect the company 2019s impairment evaluation and , hence , results . if the future were to differ adversely from management 2019s best estimate of key economic assumptions and associated cash flows were to decrease by a small margin , the company could potentially experience future material impairment charges with respect to the goodwill remaining in our securities and banking reporting unit . any such charges by themselves would not negatively affect the company 2019s tier 1 and total regulatory capital ratios , tangible capital or the company 2019s liquidity position. .',
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 2560] [3, 2560]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.7098, 0.3908, 0.3726]])
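For retrieval, the similarity row for a query is usually turned into a ranking. The sketch below does that for the scores printed above, copied in as plain floats so it runs without the model; `top_k` is a hypothetical helper, not part of sentence-transformers.

```python
# Scores copied from the printed tensor above: one query against three documents.
scores = [0.7098, 0.3908, 0.3726]

def top_k(scores, k):
    """Return (document_index, score) pairs for the k highest scores."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

for doc_index, score in top_k(scores, k=2):
    print(doc_index, score)
```

Here document 0 (the Hologic filing excerpt) ranks first, which matches the query about the Sentinelle Medical acquisition.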
InformationRetrievalEvaluator
| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.1505 |
| cosine_accuracy@3 | 0.343 |
| cosine_accuracy@5 | 0.437 |
| cosine_accuracy@10 | 0.5605 |
| cosine_precision@5 | 0.0874 |
| cosine_precision@10 | 0.056 |
| cosine_recall@5 | 0.437 |
| cosine_recall@10 | 0.5605 |
| cosine_ndcg@10 | 0.3407 |
| cosine_mrr@10 | 0.2721 |
| cosine_map@100 | 0.2857 |
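In the table above, cosine_recall@k equals cosine_accuracy@k, and cosine_precision@5 equals 0.437 / 5 = 0.0874. That pattern is what one would expect if each query has exactly one relevant document. Under that assumption (which the card does not state), the @k metrics can be sketched as follows. `metrics_at_k` and the toy ranks are hypothetical, and this is not the evaluator's own code.

```python
import math

def metrics_at_k(ranks, k):
    """Compute @k metrics when every query has exactly one relevant document.

    ranks: 1-based rank of the relevant document per query, or None if it
    was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        # At most one relevant document can appear among the top k.
        "precision": accuracy / k,
        # With a single relevant document, recall is hit-or-miss per query.
        "recall": accuracy,
        "mrr": sum(1.0 / r for r, hit in zip(ranks, hits) if hit) / n,
        # The ideal DCG is 1 (relevant document at rank 1), so DCG is already nDCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, hit in zip(ranks, hits) if hit) / n,
    }

# Toy example: four queries, relevant document found at ranks 1, 3, never, 2.
ranks = [1, 3, None, 2]
results = metrics_at_k(ranks, k=2)
print(results)
```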
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| modality | text | text |
| details | | |

Samples:

| anchor | positive |
|---|---|
| Company: United Parcel Service \| Year: 2010 \| Question: What was the cumulative total return on investment for United Parcel Service's Class B common stock at the end of 2010, assuming $100 was invested on December 31, 2005? | shareowner return performance graph the following performance graph and related information shall not be deemed 201csoliciting material 201d or to be 201cfiled 201d with the securities and exchange commission , nor shall such information be incorporated by reference into any future filing under the securities act of 1933 or securities exchange act of 1934 , each as amended , except to the extent that the company specifically incorporates such information by reference into such filing . the following graph shows a five year comparison of cumulative total shareowners 2019 returns for our class b common stock , the standard & poor 2019s 500 index , and the dow jones transportation average . the comparison of the total cumulative return on investment , which is the change in the quarterly stock price plus reinvested dividends for each of the quarterly periods , assumes that $ 100 was invested on december 31 , 2005 in the standard & poor 2019s 500 index , the dow jones transportation averag... |
| Company: United Parcel Service \| Year: 2013 \| Question: What was the change in net cash from operating activities at United Parcel Service between 2011 and 2012? | united parcel service , inc . and subsidiaries management's discussion and analysis of financial condition and results of operations liquidity and capital resources operating activities the following is a summary of the significant sources ( uses ) of cash from operating activities ( amounts in millions ) : . |
| Company: Marathon Oil \| Year: 2008 \| Question: As of December 31, 2008, what were the total undiscounted minimum capital lease obligations for Marathon Oil, excluding assets under construction, in millions? | marathon oil corporation notes to consolidated financial statements preferred shares 2013 in connection with the acquisition of western discussed in note 6 , the board of directors authorized a class of voting preferred stock consisting of 6 million shares . upon completion of the acquisition , we issued 5 million shares of this voting preferred stock to a trustee , who holds the shares for the benefit of the holders of the exchangeable shares discussed above . each share of voting preferred stock is entitled to one vote on all matters submitted to the holders of marathon common stock . each holder of exchangeable shares may direct the trustee to vote the number of shares of voting preferred stock equal to the number of shares of marathon common stock issuable upon the exchange of the exchangeable shares held by that holder . in no event will the aggregate number of votes entitled to be cast by the trustee with respect to the outstanding shares of voting preferred stock exceed the numb... |
MultipleNegativesRankingLoss with these parameters:
{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false,
"directions": [
"query_to_doc"
],
"partition_mode": "joint",
"hardness_mode": null,
"hardness_strength": 0.0
}
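To make the `scale` and `similarity_fct` parameters above concrete, here is a minimal pure-Python sketch of a multiple-negatives ranking objective with in-batch negatives. Each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by the scale, and cross-entropy is taken with the anchor's own positive as the target. The function names and toy vectors are hypothetical. The sketch covers only the query-to-document direction and does not model the other listed parameters (`gather_across_devices`, `partition_mode`, `hardness_mode`, `hardness_strength`), so it should not be read as the library's implementation.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    losses = []
    for i, anchor in enumerate(anchors):
        # All positives in the batch act as candidates; the others are negatives.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        peak = max(logits)
        log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch: each anchor is identical to its own positive and orthogonal to the other.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]

print(mnr_loss(anchors, positives))                   # close to 0: pairs are aligned
print(mnr_loss(anchors, list(reversed(positives))))   # large: pairs are mismatched
```

A larger scale sharpens the softmax over candidates, so a given gap in cosine similarity between the true positive and the in-batch negatives translates into a larger difference in loss.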
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| modality | text | text |
| details | | |

Samples:

| anchor | positive |
|---|---|
Company: Air Products | Year: 2015 | Question: What was the total amount of unconditional purchase obligations that Air Products was committed to in 2017? |
guarantees and warranties in april 2015 , we entered into joint venture arrangements in saudi arabia . an equity bridge loan has been provided to the joint venture until 2020 to fund equity commitments , and we guaranteed the repayment of our 25% ( 25 % ) share of this loan . our venture partner guaranteed repayment of their share . our maximum exposure under the guarantee is approximately $ 100 . as of 30 september 2015 , we recorded a noncurrent liability of $ 67.5 for our obligation to make future equity contributions based on the equity bridge loan . air products has also entered into a sale of equipment contract with the joint venture to engineer , procure , and construct the industrial gas facilities that will supply gases to saudi aramco . we will provide bank guarantees to the joint venture of up to $ 326 to support our performance under the contract . we are party to an equity support agreement and operations guarantee related to an air separation facility constructed in trini... |
Company: JPMorgan Chase | Year: 2003 | Question: In JPMorgan Chase's 2003 annual report, what was the ratio of securities purchased under resale agreements to securities borrowed, based on the values of $62,801 million and $41,834 million respectively? |
notes to consolidated financial statements j.p . morgan chase & co . 98 j.p . morgan chase & co . / 2003 annual report securities financing activities jpmorgan chase enters into resale agreements , repurchase agreements , securities borrowed transactions and securities loaned transactions primarily to finance the firm 2019s inventory positions , acquire securities to cover short positions and settle other securities obligations . the firm also enters into these transactions to accommodate customers 2019 needs . securities purchased under resale agreements ( 201cresale agreements 201d ) and securities sold under repurchase agreements ( 201crepurchase agreements 201d ) are generally treated as collateralized financing transactions and are carried on the consolidated bal- ance sheet at the amounts the securities will be subsequently sold or repurchased , plus accrued interest . where appropriate , resale and repurchase agreements with the same counterparty are reported on a net basis in a... |
Company: Altria | Year: 2016 | Question: As of December 31, 2016, how many individual smoking and health cases, plus smoking and health class actions and aggregated claims litigation, were pending against PM USA and, in some instances, Altria Group, Inc.? |
altria group , inc . and subsidiaries notes to consolidated financial statements _________________________ may not be obtainable in all cases . this risk has been substantially reduced given that 47 states and puerto rico limit the dollar amount of bonds or require no bond at all . as discussed below , however , tobacco litigation plaintiffs have challenged the constitutionality of florida 2019s bond cap statute in several cases and plaintiffs may challenge state bond cap statutes in other jurisdictions as well . such challenges may include the applicability of state bond caps in federal court . states , including florida , may also seek to repeal or alter bond cap statutes through legislation . although altria group , inc . cannot predict the outcome of such challenges , it is possible that the consolidated results of operations , cash flows or financial position of altria group , inc. , or one or more of its subsidiaries , could be materially affected in a particular fiscal quarter o... |
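The anchor samples above all follow the same `Company: … | Year: … | Question: …` layout. The snippet below is a minimal sketch of how a query could be put into that layout before encoding, and split back into its fields. The helper names `format_query` and `parse_query` are hypothetical and are not part of sentence-transformers or of this model's repository.

```python
def format_query(company, year, question):
    """Build an anchor string in the layout seen in the sample rows."""
    return f"Company: {company} | Year: {year} | Question: {question}"


def parse_query(text):
    """Split an anchor string back into its three fields.

    maxsplit=2 keeps any ' | ' that appears inside the question itself.
    """
    fields = {}
    for part in text.split(" | ", 2):
        key, _, value = part.partition(": ")
        fields[key.lower()] = value
    return fields


query = format_query(
    "Air Products",
    2015,
    "What was the total amount of unconditional purchase obligations?",
)
print(query)
print(parse_query(query))
```

Keeping inference-time queries in the same layout as the training anchors is usually a reasonable default, although the card itself does not state that it is required.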
MultipleNegativesRankingLoss with these parameters:{
"scale": 20.0,
"similarity_fct": "cos_sim",
"gather_across_devices": false,
"directions": [
"query_to_doc"
],
"partition_mode": "joint",
"hardness_mode": null,
"hardness_strength": 0.0
}
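To make the parameters above concrete, here is a minimal pure-Python sketch of the idea behind a multiple-negatives ranking loss in the `query_to_doc` direction: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by `scale` (20.0 here), and a cross-entropy loss treats the anchor's own positive as the correct class. This is an illustration under those assumptions, not the sentence-transformers implementation. The functions `cos_sim` and `mnr_loss` are defined locally for the example, and the toy vectors are made up.

```python
import math


def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mnr_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the correct document for anchors[i].

    All other positives in the batch act as in-batch negatives.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, doc) for doc in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy 2-dimensional "embeddings", invented for illustration only.
batch_anchors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
batch_positives = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.6]]

loss_three_pairs = mnr_loss(batch_anchors, batch_positives)
loss_single_pair = mnr_loss(batch_anchors[:1], batch_positives[:1])
print(loss_three_pairs, loss_single_pair)
```

In this sketch a batch holding a single pair always yields a loss of exactly zero, because the softmax runs over one candidate and there are no in-batch negatives. This card lists `per_device_train_batch_size: 1`, and gradient accumulation does not normally add in-batch negatives, so that may be one explanation for a logged training loss of 0.0. The card does not confirm this.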
Non-default hyperparameters:

- per_device_train_batch_size: 1
- gradient_accumulation_steps: 8
- learning_rate: 0.0002
- num_train_epochs: 1
- warmup_ratio: 0.1
- bf16: True
- gradient_checkpointing: unsloth

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- prediction_loss_only: True
- per_device_train_batch_size: 1
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 8
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 0.0002
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: unsloth
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss | cosine_ndcg@10 |
|---|---|---|---|
| -1 | -1 | - | 0.3407 |
| 0.2743 | 50 | 0.0 | - |
| 0.5487 | 100 | 0.0 | - |
| 0.8230 | 150 | 0.0 | - |
| -1 | -1 | - | 0.3407 |
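The `cosine_ndcg@10` column name suggests normalized discounted cumulative gain at a cutoff of 10, computed over a ranking by cosine similarity. The snippet below is a minimal sketch of that metric for a single query, assuming the input list covers the full ranking of candidates. The function `ndcg_at_k` is written for this example and is not the evaluator used to produce the numbers in the table.

```python
import math


def ndcg_at_k(ranked_relevance, k=10):
    """NDCG@k for one query.

    ranked_relevance holds relevance grades in ranked order, best-scored
    document first. It is assumed to cover the full ranking, so the ideal
    ordering can be derived by sorting the same grades.
    """
    def dcg(rels):
        # Rank 1 is discounted by log2(2) = 1, rank 2 by log2(3), and so on.
        return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(rels[:k]))

    ideal = dcg(sorted(ranked_relevance, reverse=True))
    return dcg(ranked_relevance) / ideal if ideal > 0 else 0.0


# One relevant document per query, as with anchor/positive pairs.
relevant_first = ndcg_at_k([1, 0, 0])
relevant_second = ndcg_at_k([0, 1, 0])
relevant_outside_top10 = ndcg_at_k([0] * 10 + [1])
print(relevant_first, relevant_second, relevant_outside_top10)
```

With exactly one relevant document per query, the score for that query depends only on the rank of that document, and drops to zero once it falls outside the top 10. A reported value would then be the mean over all evaluation queries.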
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}