# dewata_bert_gelu_new
This model is a fine-tuned version of [pijarcandra22/dewata_bert_gelu_new](https://huggingface.co/pijarcandra22/dewata_bert_gelu_new) on an unspecified dataset; note that the auto-generated base-model reference points back to this repository itself. It achieves the following results on the evaluation set:
- Loss: 0.1709
## Model description
More information needed
## Intended uses & limitations
More information needed
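
While the intended use is not documented, the checkpoint can be loaded with the `transformers` Auto classes. The sketch below only loads the encoder and extracts hidden states, because the task head is not described in this card; swap `AutoModel` for the appropriate task-specific class (for example `AutoModelForMaskedLM`) if the task is known.

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "pijarcandra22/dewata_bert_gelu_new"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(model_id)  # task head is undocumented, so load the bare encoder

inputs = tokenizer("Example input text.", return_tensors="pt")  # placeholder text
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)
```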
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch is given after the list):
- learning_rate: 5e-05
- train_batch_size: 80
- eval_batch_size: 80
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 150
- mixed_precision_training: Native AMP
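
The same configuration can be expressed as Hugging Face `TrainingArguments`. This is a minimal sketch reconstructed from the list above; `output_dir` and the per-epoch evaluation strategy are assumptions, not values reported in the card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="dewata_bert_gelu_new",  # assumed output path
    learning_rate=5e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    seed=42,
    adam_beta1=0.9,                     # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=150,
    fp16=True,                          # "Native AMP" mixed precision
    eval_strategy="epoch",              # assumed; validation loss is reported per epoch
)
```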
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.0053 | 1.0 | 2153 | 0.8893 |
| 0.9725 | 2.0 | 4306 | 0.9588 |
| 0.923 | 3.0 | 6459 | 0.9913 |
| 0.8701 | 4.0 | 8612 | 1.0152 |
| 0.831 | 5.0 | 10765 | 1.0373 |
| 0.7833 | 6.0 | 12918 | 1.0701 |
| 0.7444 | 7.0 | 15071 | 1.0928 |
| 0.7085 | 8.0 | 17224 | 1.1052 |
| 0.6736 | 9.0 | 19377 | 1.0928 |
| 0.6428 | 10.0 | 21530 | 1.1113 |
| 0.6135 | 11.0 | 23683 | 1.1086 |
| 0.5882 | 12.0 | 25836 | 1.1334 |
| 0.5636 | 13.0 | 27989 | 1.1329 |
| 0.5429 | 14.0 | 30142 | 1.1266 |
| 0.5198 | 15.0 | 32295 | 1.1369 |
| 0.4972 | 16.0 | 34448 | 1.1462 |
| 0.4785 | 17.0 | 36601 | 1.1437 |
| 0.4631 | 18.0 | 38754 | 1.1468 |
| 0.4416 | 19.0 | 40907 | 1.1401 |
| 0.4281 | 20.0 | 43060 | 1.1498 |
| 0.4158 | 21.0 | 45213 | 1.1579 |
| 0.4005 | 22.0 | 47366 | 1.1539 |
| 0.3932 | 23.0 | 49519 | 1.1429 |
| 0.3774 | 24.0 | 51672 | 1.1516 |
| 0.3704 | 25.0 | 53825 | 1.1487 |
| 0.358 | 26.0 | 55978 | 1.1580 |
| 0.3505 | 27.0 | 58131 | 1.1465 |
| 0.3434 | 28.0 | 60284 | 1.1290 |
| 0.3394 | 29.0 | 62437 | 1.1210 |
| 0.3341 | 30.0 | 64590 | 1.1304 |
| 0.432 | 31.0 | 66743 | 0.3157 |
| 0.4259 | 32.0 | 68896 | 0.3127 |
| 0.4162 | 33.0 | 71049 | 0.3207 |
| 0.4097 | 34.0 | 73202 | 0.3155 |
| 0.4061 | 35.0 | 75355 | 0.3281 |
| 0.4256 | 36.0 | 77508 | 0.3150 |
| 0.4182 | 37.0 | 79661 | 0.3209 |
| 0.4097 | 38.0 | 81814 | 0.3272 |
| 0.3989 | 39.0 | 83967 | 0.3271 |
| 0.3918 | 40.0 | 86120 | 0.3401 |
| 0.383 | 41.0 | 88273 | 0.3412 |
| 0.3802 | 42.0 | 90426 | 0.3564 |
| 0.3677 | 43.0 | 92579 | 0.3669 |
| 0.3644 | 44.0 | 94732 | 0.3714 |
| 0.3558 | 45.0 | 96885 | 0.3840 |
| 0.3474 | 46.0 | 99038 | 0.3870 |
| 0.345 | 47.0 | 101191 | 0.3910 |
| 0.3403 | 48.0 | 103344 | 0.3809 |
| 0.335 | 49.0 | 105497 | 0.3842 |
| 0.3355 | 50.0 | 107650 | 0.3871 |
| 0.423 | 51.0 | 109803 | 0.3118 |
| 0.4372 | 52.0 | 111956 | 0.3450 |
| 0.4371 | 53.0 | 114109 | 0.3708 |
| 0.4318 | 54.0 | 116262 | 0.4065 |
| 0.4251 | 55.0 | 118415 | 0.4411 |
| 0.417 | 56.0 | 120568 | 0.4701 |
| 0.4108 | 57.0 | 122721 | 0.5073 |
| 0.4037 | 58.0 | 124874 | 0.5471 |
| 0.3965 | 59.0 | 127027 | 0.5702 |
| 0.385 | 60.0 | 129180 | 0.5940 |
| 0.381 | 61.0 | 131333 | 0.6328 |
| 0.3684 | 62.0 | 133486 | 0.6490 |
| 0.3613 | 63.0 | 135639 | 0.6735 |
| 0.3544 | 64.0 | 137792 | 0.6931 |
| 0.3513 | 65.0 | 139945 | 0.7139 |
| 0.3437 | 66.0 | 142098 | 0.7597 |
| 0.3382 | 67.0 | 144251 | 0.7609 |
| 0.3295 | 68.0 | 146404 | 0.7867 |
| 0.3229 | 69.0 | 148557 | 0.7975 |
| 0.3178 | 70.0 | 150710 | 0.8265 |
| 0.3131 | 71.0 | 152863 | 0.8474 |
| 0.3096 | 72.0 | 155016 | 0.8449 |
| 0.304 | 73.0 | 157169 | 0.8579 |
| 0.2979 | 74.0 | 159322 | 0.8673 |
| 0.2948 | 75.0 | 161475 | 0.8785 |
| 0.2891 | 76.0 | 163628 | 0.9001 |
| 0.2863 | 77.0 | 165781 | 0.8918 |
| 0.2814 | 78.0 | 167934 | 0.9475 |
| 0.3586 | 79.0 | 170087 | 0.2512 |
| 0.345 | 80.0 | 172240 | 0.2598 |
| 0.3332 | 81.0 | 174393 | 0.2644 |
| 0.3243 | 82.0 | 176546 | 0.3046 |
| 0.3155 | 83.0 | 178699 | 0.2897 |
| 0.3081 | 84.0 | 180852 | 0.3066 |
| 0.2992 | 85.0 | 183005 | 0.3316 |
| 0.2951 | 86.0 | 185158 | 0.3526 |
| 0.292 | 87.0 | 187311 | 0.3605 |
| 0.2818 | 88.0 | 189464 | 0.3753 |
| 0.2795 | 89.0 | 191617 | 0.4040 |
| 0.2721 | 90.0 | 193770 | 0.4169 |
| 0.2631 | 91.0 | 195923 | 0.4250 |
| 0.2627 | 92.0 | 198076 | 0.4417 |
| 0.2562 | 93.0 | 200229 | 0.4658 |
| 0.2532 | 94.0 | 202382 | 0.4780 |
| 0.2509 | 95.0 | 204535 | 0.4964 |
| 0.2435 | 96.0 | 206688 | 0.5008 |
| 0.2419 | 97.0 | 208841 | 0.5169 |
| 0.2395 | 98.0 | 210994 | 0.5423 |
| 0.2327 | 99.0 | 213147 | 0.5314 |
| 0.2307 | 100.0 | 215300 | 0.5656 |
| 0.2315 | 101.0 | 217453 | 0.5440 |
| 0.2242 | 102.0 | 219606 | 0.5671 |
| 0.2233 | 103.0 | 221759 | 0.5865 |
| 0.2187 | 104.0 | 223912 | 0.5996 |
| 0.2152 | 105.0 | 226065 | 0.5873 |
| 0.2136 | 106.0 | 228218 | 0.6175 |
| 0.2117 | 107.0 | 230371 | 0.6169 |
| 0.2094 | 108.0 | 232524 | 0.6271 |
| 0.2056 | 109.0 | 234677 | 0.6647 |
| 0.2047 | 110.0 | 236830 | 0.6386 |
| 0.2005 | 111.0 | 238983 | 0.6509 |
| 0.2019 | 112.0 | 241136 | 0.6425 |
| 0.1978 | 113.0 | 243289 | 0.6766 |
| 0.1967 | 114.0 | 245442 | 0.6739 |
| 0.1933 | 115.0 | 247595 | 0.6835 |
| 0.192 | 116.0 | 249748 | 0.6923 |
| 0.1863 | 117.0 | 251901 | 0.7049 |
| 0.1846 | 118.0 | 254054 | 0.6914 |
| 0.1859 | 119.0 | 256207 | 0.7283 |
| 0.1828 | 120.0 | 258360 | 0.7262 |
| 0.1801 | 121.0 | 260513 | 0.7205 |
| 0.1762 | 122.0 | 262666 | 0.7326 |
| 0.1759 | 123.0 | 264819 | 0.7486 |
| 0.174 | 124.0 | 266972 | 0.7398 |
| 0.1752 | 125.0 | 269125 | 0.7233 |
| 0.2345 | 126.0 | 271278 | 0.1523 |
| 0.2226 | 127.0 | 273431 | 0.1528 |
| 0.2151 | 128.0 | 275584 | 0.1552 |
| 0.2109 | 129.0 | 277737 | 0.1500 |
| 0.2066 | 130.0 | 279890 | 0.1569 |
| 0.2028 | 131.0 | 282043 | 0.1485 |
| 0.1976 | 132.0 | 284196 | 0.1588 |
| 0.1925 | 133.0 | 286349 | 0.1544 |
| 0.1921 | 134.0 | 288502 | 0.1567 |
| 0.1864 | 135.0 | 290655 | 0.1557 |
| 0.184 | 136.0 | 292808 | 0.1609 |
| 0.1819 | 137.0 | 294961 | 0.1631 |
| 0.1801 | 138.0 | 297114 | 0.1555 |
| 0.1802 | 139.0 | 299267 | 0.1624 |
| 0.1757 | 140.0 | 301420 | 0.1633 |
| 0.1759 | 141.0 | 303573 | 0.1739 |
| 0.1713 | 142.0 | 305726 | 0.1635 |
| 0.1707 | 143.0 | 307879 | 0.1718 |
| 0.1694 | 144.0 | 310032 | 0.1659 |
| 0.1665 | 145.0 | 312185 | 0.1646 |
| 0.1669 | 146.0 | 314338 | 0.1779 |
| 0.1674 | 147.0 | 316491 | 0.1765 |
| 0.1673 | 148.0 | 318644 | 0.1753 |
| 0.1643 | 149.0 | 320797 | 0.1697 |
| 0.1649 | 150.0 | 322950 | 0.1709 |
### Framework versions
- Transformers 4.45.2
- Pytorch 2.5.1+cu124
- Datasets 2.19.0
- Tokenizers 0.20.3
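
As a quick reproducibility check, the installed versions can be compared against the list above; this is a generic sketch and nothing in it is specific to this model.

```python
import datasets
import tokenizers
import torch
import transformers

# Print installed versions to compare against the card's Framework versions list.
print("transformers:", transformers.__version__)  # expected 4.45.2
print("torch:", torch.__version__)                # expected 2.5.1+cu124
print("datasets:", datasets.__version__)          # expected 2.19.0
print("tokenizers:", tokenizers.__version__)      # expected 0.20.3
```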