How to use tatiana-merz/turkic-cyrillic-classifier with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-classification", model="tatiana-merz/turkic-cyrillic-classifier")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("tatiana-merz/turkic-cyrillic-classifier")
model = AutoModelForSequenceClassification.from_pretrained("tatiana-merz/turkic-cyrillic-classifier")

This model is a fine-tuned version of bert-base-multilingual-cased on the tatiana-merz/cyrillic_turkic_langs dataset. It achieves the following results on the evaluation set:
{'test_loss': 0.013604652136564255,
'test_accuracy': 0.997,
'test_f1': 0.9969996069718668,
'test_runtime': 60.5479,
'test_samples_per_second': 148.643,
'test_steps_per_second': 2.329}
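The runtime and throughput figures above also imply roughly how many samples and steps the evaluation covered. The sketch below is only a back-of-the-envelope check: it multiplies the reported rates by the reported runtime, then looks for batch sizes consistent with the result. The batch-size inference assumes a single device and a final partial batch, neither of which is stated in the card.

```python
import math

# Figures copied from the reported evaluation results.
test_runtime = 60.5479          # seconds
samples_per_second = 148.643
steps_per_second = 2.329

# Rate x time gives the approximate totals (the rates are rounded in the report).
n_samples = round(test_runtime * samples_per_second)
n_steps = round(test_runtime * steps_per_second)

# Assumption: one device, last batch may be partial, so steps = ceil(samples / batch).
candidate_batch_sizes = [
    b for b in range(1, 513) if math.ceil(n_samples / b) == n_steps
]

print(n_samples)              # 9000
print(n_steps)                # 141
print(candidate_batch_sizes)  # [64]
```

Under those assumptions the evaluation covered about 9,000 samples in 141 steps, which is consistent with an evaluation batch size of 64.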
The model identifies which Turkic language a given text in Cyrillic script is written in.
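A minimal sketch of batch inference with the pipeline helper follows. The helper names `summarize` and `classify` are hypothetical, introduced only for illustration. The label strings the model returns depend on its configuration and are not listed in this card, so the code makes no assumption about them.

```python
from typing import Dict, List

MODEL_ID = "tatiana-merz/turkic-cyrillic-classifier"


def summarize(texts: List[str], predictions: List[Dict]) -> List[str]:
    """Pair each input text with its predicted label and score."""
    lines = []
    for text, pred in zip(texts, predictions):
        lines.append(f"{pred['label']} ({pred['score']:.3f}): {text}")
    return lines


def classify(texts: List[str]) -> List[str]:
    """Run the classifier on a list of texts (downloads the model on first use)."""
    # Imported lazily so summarize() works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model=MODEL_ID)
    # For a list input, the pipeline returns one {"label", "score"} dict per text.
    predictions = pipe(texts)
    return summarize(texts, predictions)
```

Calling `classify([...])` with a list of Cyrillic-script sentences returns one formatted line per input, showing the top label and its score.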
Training results by epoch:
| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
|---|---|---|---|---|---|
| 0.1063 | 1.0 | 1000 | 0.0204 | 0.9950 | 0.9950 |
| 0.0126 | 2.0 | 2000 | 0.0136 | 0.9970 | 0.9970 |