Sentence Similarity · Safetensors · roberta

zdanGL committed (verified) · Commit ac7dbf9 · Parent: 95bd3b1

Update README.md

Files changed (1): README.md (+0 -8)
README.md CHANGED
@@ -12,8 +12,6 @@ base_model:
 
 **roberta-large-entity-linking** is a [RoBERTa large model](https://huggingface.co/FacebookAI/roberta-large) fine-tuned as a [bi-encoder](https://arxiv.org/pdf/1811.08008) for [entity linking](https://en.wikipedia.org/wiki/Entity_linking) tasks. The model separately embeds mentions-in-context and entity descriptions to enable semantic matching between text mentions and knowledge base entities.
 
-## Intended Uses
-
 ### Primary Use Cases
 - **Entity Linking:** Link Wikipedia concepts mentioned in text to their corresponding Wikipedia pages. [Wikimedia](https://huggingface.co/wikimedia) makes this easy with [this dataset](https://huggingface.co/datasets/wikimedia/structured-wikipedia): embed the entries in the "abstract" column (you may need to do some cleanup to filter out irrelevant entries). A retrieval sketch follows this hunk.
 - **Zero-shot Entity Linking:** Link entities to knowledge bases without task-specific training
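The next hunk shows only the tail of the README's own usage snippet, so here is a rough, hypothetical sketch of the bi-encoder flow described above. The repo id, mean pooling, and the `[ENT]` mention markers are assumptions not confirmed by this diff:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed repo id; the commit page shows only the model name.
MODEL_ID = "zdanGL/roberta-large-entity-linking"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModel.from_pretrained(MODEL_ID)
model.eval()

def embed(texts):
    # Mean-pool last hidden states over non-padding tokens
    # (pooling strategy is an assumption; check the full model card).
    batch = tokenizer(texts, padding=True, truncation=True,
                      max_length=256, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Mention-in-context, with the mention wrapped in [ENT] markers
# (the marking convention is inferred from the training-data notes).
mention = "She moved to [ENT] Paris [ENT] after finishing her degree."
abstracts = [
    "Paris is the capital and most populous city of France.",
    "Paris is a city in Lamar County, Texas, United States.",
]

mention_emb = embed([mention])
abstract_embs = embed(abstracts)
sims = torch.nn.functional.cosine_similarity(mention_emb, abstract_embs)
best = int(sims.argmax())
print(f"Best match: {abstracts[best]} (similarity {sims[best].item():.4f})")
```

In practice you would pre-compute and index the abstract embeddings once (e.g. with a vector index such as FAISS) and only embed mentions at query time.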
@@ -103,14 +101,10 @@ for i, definition in enumerate(definitions):
     print(f"Similarity: {sim_value:.4f}\n")
 ```
 
-## Model Details
-
-
 ### Training Data
 - **Dataset:** 3 million pairs of Wikipedia anchor text links and Wikipedia page abstracts, derived from [this dataset](https://huggingface.co/datasets/wikimedia/structured-wikipedia)
 - **Special Token:** `[ENT]` token added to the vocabulary to mark entity mentions (registration sketched after this hunk)
 
-
 ### Training Details
 - **Hardware:** Single 80GB H100 GPU
 - **Batch Size:** 80
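The `[ENT]` bullet above implies the token was registered with the tokenizer and the embedding matrix resized before fine-tuning. A minimal sketch of that step, assuming the standard transformers APIs (the commit does not show the actual training code):

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("FacebookAI/roberta-large")
model = AutoModel.from_pretrained("FacebookAI/roberta-large")

# Register [ENT] as a special token so the tokenizer never splits it,
# then grow the embedding matrix to cover the new vocabulary entry.
tokenizer.add_special_tokens({"additional_special_tokens": ["[ENT]"]})
model.resize_token_embeddings(len(tokenizer))

# [ENT] now survives tokenization as a single unit.
print(tokenizer.tokenize("He visited [ENT] Rome [ENT] in 2019."))
```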
@@ -118,8 +112,6 @@ for i, definition in enumerate(definitions):
 - **Loss Function:** Batch hard triplet loss (margin=0.4; a sketch follows the diff)
 - **Max Sequence Length:** 256 tokens (both mentions and descriptions)
 
-## Performance
-
 ### Benchmark Results
 - **Dataset:** Zero-Shot Entity Linking [(Logeswaran et al., 2019)](https://arxiv.org/abs/1906.07348)
 - **Metric:** Recall@64
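The training details name batch hard triplet loss with a margin of 0.4. A hedged sketch of that objective, under the assumptions of one positive entity per in-batch mention and Euclidean distance (neither is confirmed by the diff):

```python
import torch

def batch_hard_triplet_loss(mention_embs, entity_embs, margin=0.4):
    """Batch-hard triplet loss for paired (mention, entity) embeddings.

    Row i of each tensor is a positive pair; every other entity in the
    batch acts as a candidate negative, and the hardest (closest) one
    is used. The Euclidean distance here is an assumption; the README
    states only the loss name and margin.
    """
    dists = torch.cdist(mention_embs, entity_embs)   # (B, B) pairwise distances
    pos = dists.diagonal()                           # anchor <-> its own entity
    masked = dists + torch.eye(dists.size(0)) * 1e9  # mask out the positives
    hardest_neg = masked.min(dim=1).values           # closest wrong entity
    return torch.relu(pos - hardest_neg + margin).mean()

# Smoke test with random embeddings (batch size 80 per the training details)
mentions = torch.randn(80, 1024)
entities = torch.randn(80, 1024)
print(batch_hard_triplet_loss(mentions, entities).item())
```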