AiLab-IMCS-UL/mbert-lv-emotions-ekman

Safetensors

Latvian

bert

normundsg committed on Nov 10, 2025

Commit b17fa13 (verified) · 1 parent: e9e806d

Update README.md

README.md +90 -113

@@ -1,113 +1,90 @@
----
-license: mit
-datasets:
-- SkyWater21/lv_emotions
-language:
-- lv
-base_model:
-- google-bert/bert-base-multilingual-cased
----
-Fine-tuned [Multilingual BERT](https://huggingface.co/google-bert/bert-base-multilingual-cased) for multi-label emotion classification task.
-Model was trained on [lv_emotions](https://huggingface.co/datasets/SkyWater21/lv_emotions) dataset. This dataset is Latvian translation of [GoEmotions](https://huggingface.co/datasets/go_emotions) and [Twitter Emotions](https://huggingface.co/datasets/SkyWater21/lv_twitter_emotions) dataset. Google Translate was used to generate the machine translation.
-Original 26 emotions were mapped to 6 base emotions as per Dr. Ekman theory.
-Labels predicted by classifier:
-```yaml
-0: anger
-1: disgust
-2: fear
-3: joy
-4: sadness
-5: surprise
-6: neutral
-```
-Label mapping from 27 emotions from GoEmotion to 6 base emotions as per Dr. Ekman theory:
-|GoEmotion|Ekman|
-|---|---|
-| admiration | joy|
-| amusement | joy|
-| anger | anger|
-| annoyance | anger|
-| approval | joy|
-| caring | joy|
-| confusion | surprise|
-| curiosity | surprise|
-| desire | joy|
-| disappointment | sadness|
-| disapproval | anger|
-| disgust | disgust|
-| embarrassment | sadness|
-| excitement | joy|
-| fear | fear|
-| gratitude | joy|
-| grief | sadness|
-| joy | joy|
-| love | joy|
-| nervousness | fear|
-| optimism | joy|
-| pride | joy|
-| realization | surprise|
-| relief | joy|
-| remorse | sadness|
-| sadness | sadness|
-| surprise | surprise|
-| neutral | neutral|
-Seed used for random number generator is 42:
-```python
-def set_seed(seed=42):
-    random.seed(seed)
-    np.random.seed(seed)
-    torch.manual_seed(seed)
-    if torch.cuda.is_available():
-        torch.cuda.manual_seed_all(seed)
-```
-Training parameters:
-```yaml
-max_length: null
-batch_size: 32
-shuffle: True
-num_workers: 4
-pin_memory: False
-drop_last: False
-optimizer: adam
-lr: 0.00001
-weight_decay: 0
-problem_type: multi_label_classification
-num_epochs: 4
-```
-Evaluation results on test split of [lv_go_emotions](https://huggingface.co/datasets/SkyWater21/lv_emotions/viewer/combined/lv_go_emotions_test)
-|              |Precision|Recall|F1-Score|Support|
-|--------------|---------|------|--------|-------|
-|anger         |     0.50|  0.35|    0.41|    726|
-|disgust       |     0.44|  0.28|    0.35|    123|
-|fear          |     0.58|  0.47|    0.52|     98|
-|joy           |     0.80|  0.76|    0.78|   2104|
-|sadness       |     0.66|  0.41|    0.51|    379|
-|surprise      |     0.59|  0.55|    0.57|    677|
-|neutral       |     0.71|  0.43|    0.54|   1787|
-|micro avg     |     0.70|  0.55|    0.62|   5894|
-|macro avg     |     0.61|  0.46|    0.52|   5894|
-|weighted avg  |     0.69|  0.55|    0.61|   5894|
-|samples avg   |     0.58|  0.56|    0.57|   5894|
-Evaluation results on test split of [lv_twitter_emotions](https://huggingface.co/datasets/SkyWater21/lv_emotions/viewer/combined/lv_twitter_emotions_test)
-|              |Precision|Recall|F1-Score|Support|
-|--------------|---------|------|--------|-------|
-|anger         |     0.92|  0.88|    0.90|  12013|
-|disgust       |     0.90|  0.94|    0.92|  14117|
-|fear          |     0.82|  0.67|    0.74|   3342|
-|joy           |     0.88|  0.84|    0.86|   5913|
-|sadness       |     0.86|  0.75|    0.80|   4786|
-|surprise      |     0.94|  0.56|    0.70|   1510|
-|neutral       |     0.00|  0.00|    0.00|      0|
-|micro avg     |     0.90|  0.85|    0.87|  41681|
-|macro avg     |     0.76|  0.66|    0.70|  41681|
-|weighted avg  |     0.90|  0.85|    0.87|  41681|
-|samples avg   |     0.85|  0.85|    0.85|  41681|

+---
+license: apache-2.0
+datasets:
+- AiLab-IMCS-UL/go_emotions-lv
+- AiLab-IMCS-UL/twitter_emotions-lv
+language:
+- lv
+base_model:
+- google-bert/bert-base-multilingual-cased
+---
+# Latvian Basic Emotion Classifier
+A fine-tuned version of [Multilingual BERT](https://huggingface.co/google-bert/bert-base-multilingual-cased) for multi-label classification of Latvian text into the six basic emotions of Ekman’s theory, plus a neutral class.
+The model is trained on a combined dataset of [go_emotions-lv](https://huggingface.co/datasets/AiLab-IMCS-UL/go_emotions-lv) and [twitter_emotions-lv](https://huggingface.co/datasets/AiLab-IMCS-UL/twitter_emotions-lv).
+Predicted labels:
+```yaml
+0: anger
+1: disgust
+2: fear
+3: joy
+4: sadness
+5: surprise
+6: neutral
+```
+The random seed used for initialization was 42:
+```python
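+import random
+import numpy as np
+import torch
+
+# Seed Python, NumPy, and PyTorch (CPU and all GPUs) for reproducibility.
+def set_seed(seed=42):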
+    random.seed(seed)
+    np.random.seed(seed)
+    torch.manual_seed(seed)
+    if torch.cuda.is_available():
+        torch.cuda.manual_seed_all(seed)
+```
+Training parameters:
+```yaml
+max_length: null
+batch_size: 32
+shuffle: True
+num_workers: 4
+pin_memory: False
+drop_last: False
+optimizer: adam
+lr: 0.00001
+weight_decay: 0
+problem_type: multi_label_classification
+num_epochs: 4
+```
+## Evaluation
+Evaluation results on the test split of [go_emotions-lv](https://huggingface.co/datasets/AiLab-IMCS-UL/go_emotions-lv/viewer/simplified_ekman/test):
+|              |Precision|Recall|F1-Score|Support|
+|--------------|---------|------|--------|-------|
+|anger         |     0.50|  0.35|    0.41|    726|
+|disgust       |     0.44|  0.28|    0.35|    123|
+|fear          |     0.58|  0.47|    0.52|     98|
+|joy           |     0.80|  0.76|    0.78|   2104|
+|sadness       |     0.66|  0.41|    0.51|    379|
+|surprise      |     0.59|  0.55|    0.57|    677|
+|neutral       |     0.71|  0.43|    0.54|   1787|
+|micro avg     |     0.70|  0.55|    0.62|   5894|
+|macro avg     |     0.61|  0.46|    0.52|   5894|
+|weighted avg  |     0.69|  0.55|    0.61|   5894|
+|samples avg   |     0.58|  0.56|    0.57|   5894|
+Evaluation results on the test split of [twitter_emotions-lv](https://huggingface.co/datasets/AiLab-IMCS-UL/twitter_emotions-lv/viewer/simplified_ekman/test):
+|              |Precision|Recall|F1-Score|Support|
+|--------------|---------|------|--------|-------|
+|anger         |     0.92|  0.88|    0.90|  12013|
+|disgust       |     0.90|  0.94|    0.92|  14117|
+|fear          |     0.82|  0.67|    0.74|   3342|
+|joy           |     0.88|  0.84|    0.86|   5913|
+|sadness       |     0.86|  0.75|    0.80|   4786|
+|surprise      |     0.94|  0.56|    0.70|   1510|
+|micro avg     |     0.90|  0.85|    0.87|  41681|
+|macro avg     |     0.76|  0.66|    0.70|  41681|
+|weighted avg  |     0.90|  0.85|    0.87|  41681|
+|samples avg   |     0.85|  0.85|    0.85|  41681|
+## See also
+https://huggingface.co/AiLab-IMCS-UL/lvbert-emotions-ekman
+## Acknowledgements
+This work was supported by the EU Recovery and Resilience Facility project [Language Technology Initiative](https://www.vti.lu.lv) (2.3.1.1.i.0/1/22/I/CFLA/002).
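
The README above lists seven output labels and sets `problem_type: multi_label_classification`, which means each label is scored independently and more than one can apply to a text. The sketch below is not part of the commit; it only illustrates how raw per-label logits could be turned into label names under that setup. The `decode_multilabel` helper, the example logits, and the 0.5 threshold are assumptions made for illustration, since the README does not state a decision threshold. In practice the logits would come from the model itself, for example via `AutoModelForSequenceClassification` in `transformers`.

```python
import math

# Label order copied from the "Predicted labels" list in the README.
ID2LABEL = {
    0: "anger",
    1: "disgust",
    2: "fear",
    3: "joy",
    4: "sadness",
    5: "surprise",
    6: "neutral",
}


def sigmoid(x: float) -> float:
    """Map a single logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Return (label, probability) pairs whose sigmoid score meets the threshold.

    `threshold` is an assumed value; the README does not specify one.
    """
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    probs = [sigmoid(v) for v in logits]
    return [
        (ID2LABEL[i], round(p, 3))
        for i, p in enumerate(probs)
        if p >= threshold
    ]


# Made-up logits for one sentence, in label-id order (not real model output).
example_logits = [-2.0, -3.1, -4.0, 1.5, -1.0, 0.3, -0.5]
print(decode_multilabel(example_logits))
```

Because each label is thresholded on its own, a text can receive several labels or none at all. Lowering the threshold trades precision for recall, which may matter given the recall gap visible in the go_emotions-lv table.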