How to use seara/rubert-tiny2-russian-emotion-detection-cedr with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="seara/rubert-tiny2-russian-emotion-detection-cedr")
# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("seara/rubert-tiny2-russian-emotion-detection-cedr")
model = AutoModelForSequenceClassification.from_pretrained("seara/rubert-tiny2-russian-emotion-detection-cedr")

This is a RuBERT-tiny2 model fine-tuned for emotion classification of short Russian texts. The task is multi-label classification with the following labels:
0: no_emotion
1: joy
2: sadness
3: surprise
4: fear
5: anger
Label to Russian label:
no_emotion: нет эмоции
joy: радость
sadness: грусть
surprise: удивление
fear: страх
anger: злость
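The two mappings above can be kept as plain dictionaries when post-processing predictions. This is a minimal sketch; the names `ID2LABEL`, `LABEL2RU`, and `to_russian` are illustrative and are not part of the model or of Transformers.

```python
# Index-to-label mapping, copied from the label list in this card.
ID2LABEL = {
    0: "no_emotion",
    1: "joy",
    2: "sadness",
    3: "surprise",
    4: "fear",
    5: "anger",
}

# English label to Russian label, copied from this card.
LABEL2RU = {
    "no_emotion": "нет эмоции",
    "joy": "радость",
    "sadness": "грусть",
    "surprise": "удивление",
    "fear": "страх",
    "anger": "злость",
}


def to_russian(label_id: int) -> str:
    """Return the Russian label for a class index (hypothetical helper)."""
    return LABEL2RU[ID2LABEL[label_id]]


russian_labels = [to_russian(i) for i in sorted(ID2LABEL)]
```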
from transformers import pipeline
model = pipeline(model="seara/rubert-tiny2-russian-emotion-detection-cedr")
model("Привет, ты мне нравишься!")
# [{'label': 'joy', 'score': 0.9605025053024292}]
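Since the task is multi-label, a single text may carry more than one emotion, while the output above shows only the top-scoring label. The sketch below illustrates one common way to turn per-label logits into a set of labels: apply a sigmoid to each logit independently and keep every label at or above a threshold. The 0.5 threshold, the helper name `decode_multilabel`, and the example logits are assumptions made for illustration; the card does not state which threshold was used.

```python
import math

# Index-to-label mapping, copied from the label list in this card.
ID2LABEL = {
    0: "no_emotion",
    1: "joy",
    2: "sadness",
    3: "surprise",
    4: "fear",
    5: "anger",
}


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Keep every label whose sigmoid score reaches the threshold.

    The threshold is an assumed value, not one taken from the model card.
    """
    scores = {ID2LABEL[i]: sigmoid(z) for i, z in enumerate(logits)}
    return {label: s for label, s in scores.items() if s >= threshold}


# Made-up logits for one text, in label-index order (not real model output).
example_logits = [-2.0, 3.2, -1.5, 0.4, -3.0, -2.5]
predicted = decode_multilabel(example_logits)
```

With Transformers, per-label scores can be requested from the text-classification pipeline by passing `top_k=None` at call time, and the same thresholding can then be applied to the returned scores.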
This model was trained on the CEDR dataset.
An overview of the training data can be found in its Hugging Face card or in the source article.
Training was done in this project with the following parameters:
tokenizer.max_length: null
batch_size: 64
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 30
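For scripts that need these values programmatically, the parameters above could be mirrored in a small configuration object. This is only a sketch: the class name `TrainConfig` and its field names are hypothetical, and `null` is rendered as Python `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container mirroring the parameters listed in this card."""

    tokenizer_max_length: Optional[int] = None  # "null" in the card
    batch_size: int = 64
    optimizer: str = "adam"
    lr: float = 0.00001
    weight_decay: float = 0.0
    num_epochs: int = 30


config = TrainConfig()
```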
| | no_emotion | joy | sadness | surprise | fear | anger | micro avg | macro avg | weighted avg |
|---|---|---|---|---|---|---|---|---|---|
| precision | 0.82 | 0.84 | 0.84 | 0.79 | 0.78 | 0.55 | 0.81 | 0.77 | 0.8 |
| recall | 0.84 | 0.83 | 0.85 | 0.66 | 0.67 | 0.33 | 0.78 | 0.7 | 0.78 |
| f1-score | 0.83 | 0.83 | 0.84 | 0.72 | 0.72 | 0.41 | 0.79 | 0.73 | 0.79 |
| auc-roc | 0.92 | 0.96 | 0.96 | 0.91 | 0.91 | 0.77 | 0.94 | 0.91 | 0.93 |
| support | 734 | 353 | 379 | 170 | 141 | 125 | 1902 | 1902 | 1902 |
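As a reading aid for the table, the sketch below recomputes the F1 scores and the macro and weighted F1 averages from the per-class rows. It assumes the usual definitions (F1 as the harmonic mean of precision and recall, macro average as the unweighted mean over classes, weighted average as the support-weighted mean). Because the inputs are rounded to two decimals, the recomputed values can differ from the table by about 0.01.

```python
# Per-class values copied from the table, in label order:
# no_emotion, joy, sadness, surprise, fear, anger.
precision = [0.82, 0.84, 0.84, 0.79, 0.78, 0.55]
recall = [0.84, 0.83, 0.85, 0.66, 0.67, 0.33]
f1 = [0.83, 0.83, 0.84, 0.72, 0.72, 0.41]
support = [734, 353, 379, 170, 141, 125]


def f1_from(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


def macro_avg(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean over classes, weighted by class support."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


recomputed_f1 = [f1_from(p, r) for p, r in zip(precision, recall)]
macro_f1 = macro_avg(f1)
weighted_f1 = weighted_avg(f1, support)

print([round(x, 3) for x in recomputed_f1])
print(round(macro_f1, 3), round(weighted_f1, 3))
```

The low recall for anger (0.33 on 125 examples) pulls the macro average below the weighted average, since the macro average gives every class equal weight regardless of support.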