Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Paper
• 1908.10084 • Published
• 12
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("ayushexel/embed-all-MiniLM-L6-v2-squad-5-epochs")
# Run inference
sentences = [
'How many professional sports leagues have their headquarters in New York?',
'New York City is home to the headquarters of the National Football League, Major League Baseball, the National Basketball Association, the National Hockey League, and Major League Soccer. The New York metropolitan area hosts the most sports teams in these five professional leagues. Participation in professional sports in the city predates all professional leagues, and the city has been continuously hosting professional sports since the birth of the Brooklyn Dodgers in 1882. The city has played host to over forty major professional teams in the five sports and their respective competing leagues, both current and historic. Four of the ten most expensive stadiums ever built worldwide (MetLife Stadium, the new Yankee Stadium, Madison Square Garden, and Citi Field) are located in the New York metropolitan area. Madison Square Garden, its predecessor, as well as the original Yankee Stadium and Ebbets Field, are some of the most famous sporting venues in the world, the latter two having been commemorated on U.S. postage stamps.',
'New York City is home to the headquarters of the National Football League, Major League Baseball, the National Basketball Association, the National Hockey League, and Major League Soccer. The New York metropolitan area hosts the most sports teams in these five professional leagues. Participation in professional sports in the city predates all professional leagues, and the city has been continuously hosting professional sports since the birth of the Brooklyn Dodgers in 1882. The city has played host to over forty major professional teams in the five sports and their respective competing leagues, both current and historic. Four of the ten most expensive stadiums ever built worldwide (MetLife Stadium, the new Yankee Stadium, Madison Square Garden, and Citi Field) are located in the New York metropolitan area. Madison Square Garden, its predecessor, as well as the original Yankee Stadium and Ebbets Field, are some of the most famous sporting venues in the world, the latter two having been commemorated on U.S. postage stamps.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
gooqa-devTripletEvaluator| Metric | Value |
|---|---|
| cosine_accuracy | 0.413 |
question, context, and negative| question | context | negative | |
|---|---|---|---|
| type | string | string | string |
| details |
|
|
|
| question | context | negative |
|---|---|---|
What percent were born outside of Switzerland? |
As of 2008[update], the population was 47.5% male and 52.5% female. The population was made up of 44,032 Swiss men (35.4% of the population) and 15,092 (12.1%) non-Swiss men. There were 51,531 Swiss women (41.4%) and 13,726 (11.0%) non-Swiss women. Of the population in the municipality, 39,008 or about 30.3% were born in Bern and lived there in 2000. There were 27,573 or 21.4% who were born in the same canton, while 25,818 or 20.1% were born somewhere else in Switzerland, and 27,812 or 21.6% were born outside of Switzerland. |
The city of Bern or Berne (German: Bern, pronounced [bɛrn] ( listen); French: Berne [bɛʁn]; Italian: Berna [ˈbɛrna]; Romansh: Berna [ˈbɛrnɐ] (help·info); Bernese German: Bärn [b̥æːrn]) is the de facto capital of Switzerland, referred to by the Swiss as their (e.g. in German) Bundesstadt, or "federal city".[note 1] With a population of 140,634 (November 2015), Bern is the fifth most populous city in Switzerland. The Bern agglomeration, which includes 36 municipalities, had a population of 406,900 in 2014. The metropolitan area had a population of 660,000 in 2000. Bern is also the capital of the Canton of Bern, the second most populous of Switzerland's cantons. |
Where do the dark-eyed junco migrate? |
The typical image of migration is of northern landbirds, such as swallows (Hirundinidae) and birds of prey, making long flights to the tropics. However, many Holarctic wildfowl and finch (Fringillidae) species winter in the North Temperate Zone, in regions with milder winters than their summer breeding grounds. For example, the pink-footed goose Anser brachyrhynchus migrates from Iceland to Britain and neighbouring countries, whilst the dark-eyed junco Junco hyemalis migrates from subarctic and arctic climates to the contiguous United States and the American goldfinch from taiga to wintering grounds extending from the American South northwestward to Western Oregon. Migratory routes and wintering grounds are traditional and learned by young during their first migration with their parents. Some ducks, such as the garganey Anas querquedula, move completely or partially into the tropics. The European pied flycatcher Ficedula hypoleuca also follows this migratory trend, breeding in Asia and... |
These advantages offset the high stress, physical exertion costs, and other risks of the migration. Predation can be heightened during migration: Eleonora's falcon Falco eleonorae, which breeds on Mediterranean islands, has a very late breeding season, coordinated with the autumn passage of southbound passerine migrants, which it feeds to its young. A similar strategy is adopted by the greater noctule bat, which preys on nocturnal passerine migrants. The higher concentrations of migrating birds at stopover sites make them prone to parasites and pathogens, which require a heightened immune response. |
When did United States begin to provide foreign aid to Israel? |
In July 2007, American business magnate and investor Warren Buffett's holding company Berkshire Hathaway bought an Israeli company, Iscar, its first non-U.S. acquisition, for $4 billion. Since the 1970s, Israel has received military aid from the United States, as well as economic assistance in the form of loan guarantees, which now account for roughly half of Israel's external debt. Israel has one of the lowest external debts in the developed world, and is a net lender in terms of net external debt (the total value of assets vs. liabilities in debt instruments owed abroad), which in December 2015[update] stood at a surplus of US$118 billion. |
In 1992, Yitzhak Rabin became Prime Minister following an election in which his party called for compromise with Israel's neighbors. The following year, Shimon Peres on behalf of Israel, and Mahmoud Abbas for the PLO, signed the Oslo Accords, which gave the Palestinian National Authority the right to govern parts of the West Bank and the Gaza Strip. The PLO also recognized Israel's right to exist and pledged an end to terrorism. In 1994, the Israel–Jordan Treaty of Peace was signed, making Jordan the second Arab country to normalize relations with Israel. Arab public support for the Accords was damaged by the continuation of Israeli settlements and checkpoints, and the deterioration of economic conditions. Israeli public support for the Accords waned as Israel was struck by Palestinian suicide attacks. Finally, while leaving a peace rally in November 1995, Yitzhak Rabin was assassinated by a far-right-wing Jew who opposed the Accords. |
MultipleNegativesRankingLoss with these parameters:{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
question, context, and negative_1| question | context | negative_1 | |
|---|---|---|---|
| type | string | string | string |
| details |
|
|
|
| question | context | negative_1 |
|---|---|---|
When did Massamba-Debat lose power in the Congo? |
Under the 1963 constitution, Massamba-Débat was elected President for a five-year term. During Massamba-Débat's term in office the regime adopted "scientific socialism" as the country's constitutional ideology. In 1965, Congo established relations with the Soviet Union, the People's Republic of China, North Korea and North Vietnam. Massamba-Débat's regime also invited several hundred Cuban army troops into the country to train his party's militia units and these troops helped his government survive a coup in 1966 led by paratroopers loyal to future President Marien Ngouabi. Nevertheless, Massamba-Débat was unable to reconcile various institutional, tribal and ideological factions within the country and his regime ended abruptly with a bloodless coup d'état in September 1968. |
Under the 1963 constitution, Massamba-Débat was elected President for a five-year term. During Massamba-Débat's term in office the regime adopted "scientific socialism" as the country's constitutional ideology. In 1965, Congo established relations with the Soviet Union, the People's Republic of China, North Korea and North Vietnam. Massamba-Débat's regime also invited several hundred Cuban army troops into the country to train his party's militia units and these troops helped his government survive a coup in 1966 led by paratroopers loyal to future President Marien Ngouabi. Nevertheless, Massamba-Débat was unable to reconcile various institutional, tribal and ideological factions within the country and his regime ended abruptly with a bloodless coup d'état in September 1968. |
I2a1b1 is found being highest where? |
On the other hand, I2a1b1 (P41.2) is typical of the South Slavic populations, being highest in Bosnia-Herzegovina (>50%). Haplogroup I2a2 is also commonly found in north-eastern Italians. There is also a high concentration of I2a2a in the Moldavian region of Romania, Moldova and western Ukraine. According to original studies, Hg I2a2 was believed to have arisen in the west Balkans sometime after the LGM, subsequently spreading from the Balkans through Central Russian Plain. Recently, Ken Nordtvedt has split I2a2 into two clades – N (northern) and S (southern), in relation where they arose compared to Danube river. He proposes that N is slightly older than S. He recalculated the age of I2a2 to be ~ 2550 years and proposed that the current distribution is explained by a Slavic expansion from the area north-east of the Carpathians. |
On the other hand, I2a1b1 (P41.2) is typical of the South Slavic populations, being highest in Bosnia-Herzegovina (>50%). Haplogroup I2a2 is also commonly found in north-eastern Italians. There is also a high concentration of I2a2a in the Moldavian region of Romania, Moldova and western Ukraine. According to original studies, Hg I2a2 was believed to have arisen in the west Balkans sometime after the LGM, subsequently spreading from the Balkans through Central Russian Plain. Recently, Ken Nordtvedt has split I2a2 into two clades – N (northern) and S (southern), in relation where they arose compared to Danube river. He proposes that N is slightly older than S. He recalculated the age of I2a2 to be ~ 2550 years and proposed that the current distribution is explained by a Slavic expansion from the area north-east of the Carpathians. |
What is a well-known lake of Hangzhou? |
Valleys and plains are found along the coastline and rivers. The north of the province lies just south of the Yangtze Delta, and consists of plains around the cities of Hangzhou, Jiaxing, and Huzhou, where the Grand Canal of China enters from the northern border to end at Hangzhou. Another relatively flat area is found along the Qu River around the cities of Quzhou and Jinhua. Major rivers include the Qiangtang and Ou Rivers. Most rivers carve out valleys in the highlands, with plenty of rapids and other features associated with such topography. Well-known lakes include the West Lake of Hangzhou and the South Lake of Jiaxing. |
Longjing tea (also called dragon well tea), originating in Hangzhou, is one of the most prestigious, if not the most prestigious Chinese tea. Hangzhou is also renowned for its silk umbrellas and hand fans. Zhejiang cuisine (itself subdivided into many traditions, including Hangzhou cuisine) is one of the eight great traditions of Chinese cuisine. |
MultipleNegativesRankingLoss with these parameters:{
"scale": 20.0,
"similarity_fct": "cos_sim"
}
eval_strategy: stepsper_device_train_batch_size: 128per_device_eval_batch_size: 128num_train_epochs: 5warmup_ratio: 0.1fp16: Truebatch_sampler: no_duplicatesoverwrite_output_dir: Falsedo_predict: Falseeval_strategy: stepsprediction_loss_only: Trueper_device_train_batch_size: 128per_device_eval_batch_size: 128per_gpu_train_batch_size: Noneper_gpu_eval_batch_size: Nonegradient_accumulation_steps: 1eval_accumulation_steps: Nonetorch_empty_cache_steps: Nonelearning_rate: 5e-05weight_decay: 0.0adam_beta1: 0.9adam_beta2: 0.999adam_epsilon: 1e-08max_grad_norm: 1.0num_train_epochs: 5max_steps: -1lr_scheduler_type: linearlr_scheduler_kwargs: {}warmup_ratio: 0.1warmup_steps: 0log_level: passivelog_level_replica: warninglog_on_each_node: Truelogging_nan_inf_filter: Truesave_safetensors: Truesave_on_each_node: Falsesave_only_model: Falserestore_callback_states_from_checkpoint: Falseno_cuda: Falseuse_cpu: Falseuse_mps_device: Falseseed: 42data_seed: Nonejit_mode_eval: Falseuse_ipex: Falsebf16: Falsefp16: Truefp16_opt_level: O1half_precision_backend: autobf16_full_eval: Falsefp16_full_eval: Falsetf32: Nonelocal_rank: 0ddp_backend: Nonetpu_num_cores: Nonetpu_metrics_debug: Falsedebug: []dataloader_drop_last: Falsedataloader_num_workers: 0dataloader_prefetch_factor: Nonepast_index: -1disable_tqdm: Falseremove_unused_columns: Truelabel_names: Noneload_best_model_at_end: Falseignore_data_skip: Falsefsdp: []fsdp_min_num_params: 0fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}tp_size: 0fsdp_transformer_layer_cls_to_wrap: Noneaccelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}deepspeed: Nonelabel_smoothing_factor: 0.0optim: adamw_torchoptim_args: Noneadafactor: Falsegroup_by_length: Falselength_column_name: lengthddp_find_unused_parameters: Noneddp_bucket_cap_mb: Noneddp_broadcast_buffers: Falsedataloader_pin_memory: Truedataloader_persistent_workers: Falseskip_memory_metrics: Trueuse_legacy_prediction_loop: Falsepush_to_hub: Falseresume_from_checkpoint: Nonehub_model_id: Nonehub_strategy: every_savehub_private_repo: Nonehub_always_push: Falsegradient_checkpointing: Falsegradient_checkpointing_kwargs: Noneinclude_inputs_for_metrics: Falseinclude_for_metrics: []eval_do_concat_batches: Truefp16_backend: autopush_to_hub_model_id: Nonepush_to_hub_organization: Nonemp_parameters: auto_find_batch_size: Falsefull_determinism: Falsetorchdynamo: Noneray_scope: lastddp_timeout: 1800torch_compile: Falsetorch_compile_backend: Nonetorch_compile_mode: Nonedispatch_batches: Nonesplit_batches: Noneinclude_tokens_per_second: Falseinclude_num_input_tokens_seen: Falseneftune_noise_alpha: Noneoptim_target_modules: Nonebatch_eval_metrics: Falseeval_on_start: Falseuse_liger_kernel: Falseeval_use_gather_object: Falseaverage_tokens_across_devices: Falseprompts: Nonebatch_sampler: no_duplicatesmulti_dataset_batch_sampler: proportional| Epoch | Step | Training Loss | Validation Loss | gooqa-dev_cosine_accuracy |
|---|---|---|---|---|
| -1 | -1 | - | - | 0.3284 |
| 0.2890 | 100 | 0.4443 | 0.7861 | 0.3900 |
| 0.5780 | 200 | 0.3909 | 0.7772 | 0.3950 |
| 0.8671 | 300 | 0.3944 | 0.7668 | 0.3932 |
| 1.1561 | 400 | 0.3208 | 0.7474 | 0.4066 |
| 1.4451 | 500 | 0.2724 | 0.7490 | 0.4034 |
| 1.7341 | 600 | 0.2778 | 0.7490 | 0.4056 |
| 2.0231 | 700 | 0.28 | 0.7495 | 0.3986 |
| 2.3121 | 800 | 0.205 | 0.7429 | 0.4030 |
| 2.6012 | 900 | 0.2101 | 0.7365 | 0.4100 |
| 2.8902 | 1000 | 0.2066 | 0.7348 | 0.4086 |
| 3.1792 | 1100 | 0.181 | 0.7382 | 0.4104 |
| 3.4682 | 1200 | 0.1686 | 0.7363 | 0.4114 |
| 3.7572 | 1300 | 0.1749 | 0.7438 | 0.4036 |
| 4.0462 | 1400 | 0.164 | 0.7396 | 0.4072 |
| 4.3353 | 1500 | 0.1463 | 0.7419 | 0.4098 |
| 4.6243 | 1600 | 0.1449 | 0.7389 | 0.4146 |
| 4.9133 | 1700 | 0.1475 | 0.7388 | 0.4102 |
| -1 | -1 | - | - | 0.4130 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
sentence-transformers/all-MiniLM-L6-v2