Update README.md
README.md
CHANGED
|
@@ -36,29 +36,23 @@ tags:
|
|
| 36 |
|
| 37 |
## Summary
|
| 38 |
|
| 39 |
-
The "whisper-large-v3-tiny-caesar" is an acoustic model based on ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3) suitable for Automatic Speech Recognition in code
|
| 40 |
|
| 41 |
## Model Description
|
| 42 |
|
| 43 |
-
The "whisper-large-v3-tiny-caesar" is an acoustic model suitable for Automatic Speech Recognition in code
|
| 44 |
-
|
| 45 |
-
CAESAR is an acronym with the following meaning:
|
| 46 |
-
|
| 47 |
-
(CA)talan (ES)panish (A)utomatic (R)ecognition
|
| 48 |
-
|
| 49 |
-
While "tiny" indicates that this model was finetuned with a very small amount of synthetic data (2 hours only).
|
| 50 |
|
| 51 |
## Intended Uses and Limitations
|
| 52 |
|
| 53 |
-
This model can be used for Automatic Speech Recognition (ASR) in code
|
| 54 |
|
| 55 |
## How to Get Started with the Model
|
| 56 |
|
| 57 |
-
To see an updated and functional version of this code, please
|
| 58 |
|
| 59 |
### Installation
|
| 60 |
|
| 61 |
-
|
| 62 |
|
| 63 |
Create a virtual environment:
|
| 64 |
```bash
|
|
@@ -74,7 +68,7 @@ pip install datasets transformers
|
|
| 74 |
```
|
| 75 |
|
| 76 |
### For Inference
|
| 77 |
-
|
| 78 |
|
| 79 |
```bash
|
| 80 |
#Install Prerequisites
|
|
@@ -89,7 +83,7 @@ pip install jiwer
|
|
| 89 |
#This code works with GPU
|
| 90 |
|
| 91 |
#Notice that: load_metric is no longer part of datasets.
|
| 92 |
-
#
|
| 93 |
#(Note from November 2024)
|
| 94 |
|
| 95 |
import torch
|
|
@@ -136,7 +130,7 @@ print(WER)
|
|
| 136 |
|
| 137 |
### Training data
|
| 138 |
|
| 139 |
-
The specific dataset used to create the model is a corpus called CAESAR-tiny which has not been released at the moment.
|
| 140 |
|
| 141 |
### Training procedure
|
| 142 |
|
|
@@ -174,7 +168,7 @@ If this model contributes to your research, please cite the work:
|
|
| 174 |
|
| 175 |
### Author
|
| 176 |
|
| 177 |
-
The fine-tuning process was
|
| 178 |
|
| 179 |
### Contact
|
| 180 |
For further information, please send an email to <[email protected]>.
|
|
@@ -189,4 +183,4 @@ Copyright(c) 2024 by Language Technologies Unit, Barcelona Supercomputing Center
|
|
| 189 |
### Funding
|
| 190 |
This work has been promoted and financed by the Generalitat de Catalunya through the [Aina project](https://projecteaina.cat/).
|
| 191 |
|
| 192 |
-
The training of the model was possible thanks to the
|
|
|
|
| 36 |
|
| 37 |
## Summary
|
| 38 |
|
| 39 |
+
The "whisper-large-v3-tiny-caesar" is an acoustic model based on ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3) suitable for Automatic Speech Recognition in code-switching conditions between Spanish and Catalan.
|
| 40 |
|
| 41 |
## Model Description
|
| 42 |
|
| 43 |
+
The "whisper-large-v3-tiny-caesar" is an acoustic model suitable for Automatic Speech Recognition in code-switching conditions between Spanish and Catalan. It is the result of fine-tuning ["openai/whisper-large-v3"](https://huggingface.co/openai/whisper-large-v3) on [CAESAR-TINY](https://huggingface.co/datasets/BSC-LT/CAESAR-TINY), a 2-hour Spanish/Catalan code-switching dataset.
|
| 44 |
|
| 45 |
## Intended Uses and Limitations
|
| 46 |
|
| 47 |
+
This model can be used for Automatic Speech Recognition (ASR) in code-switching conditions between Spanish and Catalan. The model is intended to transcribe audio files to plain text.
|
| 48 |
|
| 49 |
## How to Get Started with the Model
|
| 50 |
|
| 51 |
+
To see an updated and functional version of this code, please check our [Notebook](https://colab.research.google.com/drive/1MHiPrffNTwiyWeUyMQvSdSbfkef_8aJC?usp=sharing).
|
| 52 |
|
| 53 |
### Installation
|
| 54 |
|
| 55 |
+
To use this model, install [datasets](https://huggingface.co/docs/datasets/installation) and [transformers](https://huggingface.co/docs/transformers/installation):
|
| 56 |
|
| 57 |
Create a virtual environment:
|
| 58 |
```bash
|
|
|
|
| 68 |
```
|
| 69 |
|
| 70 |
### For Inference
|
| 71 |
+
To transcribe Spanish/Catalan code-switching audio with this model, you can follow this example:
|
| 72 |
|
| 73 |
```bash
|
| 74 |
#Install Prerequisites
|
|
|
|
| 83 |
#This code works with GPU
|
| 84 |
|
| 85 |
#Notice that: load_metric is no longer part of datasets.
|
| 86 |
+
# Remove it and use evaluate.load instead.
|
| 87 |
#(Note from November 2024)
|
| 88 |
|
| 89 |
import torch
|
|
|
|
| 130 |
|
| 131 |
### Training data
|
| 132 |
|
| 133 |
+
The specific dataset used to create the model is a corpus called CAESAR-tiny, which has not been released at the moment.
|
| 134 |
|
| 135 |
### Training procedure
|
| 136 |
|
|
|
|
| 168 |
|
| 169 |
### Author
|
| 170 |
|
| 171 |
+
The fine-tuning process was performed in November 2024 in the [Language Technologies Unit](https://huggingface.co/BSC-LT) of the [Barcelona Supercomputing Center](https://www.bsc.es/) by [Carlos Daniel Hernández Mena](https://huggingface.co/carlosdanielhernandezmena).
|
| 172 |
|
| 173 |
### Contact
|
| 174 |
For further information, please send an email to <[email protected]>.
|
|
|
|
| 183 |
### Funding
|
| 184 |
This work has been promoted and financed by the Generalitat de Catalunya through the [Aina project](https://projecteaina.cat/).
|
| 185 |
|
| 186 |
+
The training of the model was made possible by the computing time provided by [Barcelona Supercomputing Center](https://www.bsc.es/) through MareNostrum 5.
|