Update README.md
README.md
CHANGED
@@ -41,7 +41,7 @@ This version of the Google T5-Base model has been fine-tuned on a bilingual data
 ### Model Sources
 
 <!-- Provide the basic links for the model. -->
-- **Repository:** https://github.com/leks-forever/
+- **Repository:** https://github.com/leks-forever/mt5-tuning
 <!-- - **Paper [optional]:** [More Information Needed] -->
 <!-- - **Demo [optional]:** [More Information Needed] -->
 
@@ -82,11 +82,6 @@ print(translation)
 
 The model was fine-tuned on the [bible-lezghian-russian](https://huggingface.co/datasets/leks-forever/bible-lezghian-russian) dataset, which contains 13,800 parallel sentences in Russian and Lezgian. The dataset was split into three parts: 90% for training, 5% for validation, and 5% for testing.
 
-### Preprocessing
-
-The preprocessing step included tokenization with a custom-trained SentencePiece NLLB-based tokenizer on the Russian-Lezgian corpus.
-
-
 
 #### Training Hyperparameters
 
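For reviewers checking the numbers in the training-data paragraph, the 90/5/5 split of 13,800 sentence pairs can be sketched as below. This is only an illustration: the README does not say how the split was produced, so the shuffling, the seed, and the helper name `split_indices` are all assumptions rather than the authors' procedure.

```python
import random


def split_indices(n_examples, train_frac=0.90, valid_frac=0.05, seed=42):
    """Shuffle example indices and cut them into train/validation/test parts.

    Hypothetical helper: the seed and the shuffle are assumptions,
    not something the model card documents.
    """
    indices = list(range(n_examples))
    random.Random(seed).shuffle(indices)

    n_train = round(n_examples * train_frac)
    n_valid = round(n_examples * valid_frac)

    train = indices[:n_train]
    valid = indices[n_train:n_train + n_valid]
    test = indices[n_train + n_valid:]  # whatever remains after train and validation
    return train, valid, test


# 13,800 is the dataset size stated in the README.
train_idx, valid_idx, test_idx = split_indices(13_800)
print(len(train_idx), len(valid_idx), len(test_idx))
```

With these fractions the three parts hold 12,420, 690, and 690 sentence pairs, and every index lands in exactly one part.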