dewata_bert_gelu_new

This model is a fine-tuned version of pijarcandra22/dewata_bert_gelu_new on an unspecified dataset (the card's base-model reference points back to the model itself). It achieves the following results on the evaluation set:

  • Loss: 0.1709
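
The task head and intended usage are not documented (see "Intended uses & limitations" below), so the following is only a minimal loading sketch; it assumes the checkpoint is compatible with the generic Auto classes and does not commit to any particular task:

```python
# Minimal loading sketch (an assumption, not documented in the card):
# loads the encoder weights and tokenizer without choosing a task head.
from transformers import AutoModel, AutoTokenizer

repo_id = "pijarcandra22/dewata_bert_gelu_new"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)

inputs = tokenizer("Example input text", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```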

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the corresponding TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 150
  • mixed_precision_training: Native AMP
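
A minimal sketch of how these values map onto a Transformers TrainingArguments object; the output directory and evaluation strategy are assumptions (the results table below logs one evaluation per epoch), while the numeric values are taken verbatim from the list above:

```python
# Hedged reconstruction of the configuration above. Only the numeric
# hyperparameters come from the card; output_dir and eval_strategy are
# assumptions. fp16=True ("Native AMP") requires a CUDA-capable GPU.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="dewata_bert_gelu_new",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=150,
    fp16=True,              # "Native AMP" mixed-precision training
    eval_strategy="epoch",  # assumed; the table below logs per epoch
)
# These arguments would then be passed to a Trainer together with the
# model and the (undocumented) train/eval datasets.
```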

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.0053 | 1.0 | 2153 | 0.8893 |
| 0.9725 | 2.0 | 4306 | 0.9588 |
| 0.923 | 3.0 | 6459 | 0.9913 |
| 0.8701 | 4.0 | 8612 | 1.0152 |
| 0.831 | 5.0 | 10765 | 1.0373 |
| 0.7833 | 6.0 | 12918 | 1.0701 |
| 0.7444 | 7.0 | 15071 | 1.0928 |
| 0.7085 | 8.0 | 17224 | 1.1052 |
| 0.6736 | 9.0 | 19377 | 1.0928 |
| 0.6428 | 10.0 | 21530 | 1.1113 |
| 0.6135 | 11.0 | 23683 | 1.1086 |
| 0.5882 | 12.0 | 25836 | 1.1334 |
| 0.5636 | 13.0 | 27989 | 1.1329 |
| 0.5429 | 14.0 | 30142 | 1.1266 |
| 0.5198 | 15.0 | 32295 | 1.1369 |
| 0.4972 | 16.0 | 34448 | 1.1462 |
| 0.4785 | 17.0 | 36601 | 1.1437 |
| 0.4631 | 18.0 | 38754 | 1.1468 |
| 0.4416 | 19.0 | 40907 | 1.1401 |
| 0.4281 | 20.0 | 43060 | 1.1498 |
| 0.4158 | 21.0 | 45213 | 1.1579 |
| 0.4005 | 22.0 | 47366 | 1.1539 |
| 0.3932 | 23.0 | 49519 | 1.1429 |
| 0.3774 | 24.0 | 51672 | 1.1516 |
| 0.3704 | 25.0 | 53825 | 1.1487 |
| 0.358 | 26.0 | 55978 | 1.1580 |
| 0.3505 | 27.0 | 58131 | 1.1465 |
| 0.3434 | 28.0 | 60284 | 1.1290 |
| 0.3394 | 29.0 | 62437 | 1.1210 |
| 0.3341 | 30.0 | 64590 | 1.1304 |
| 0.432 | 31.0 | 66743 | 0.3157 |
| 0.4259 | 32.0 | 68896 | 0.3127 |
| 0.4162 | 33.0 | 71049 | 0.3207 |
| 0.4097 | 34.0 | 73202 | 0.3155 |
| 0.4061 | 35.0 | 75355 | 0.3281 |
| 0.4256 | 36.0 | 77508 | 0.3150 |
| 0.4182 | 37.0 | 79661 | 0.3209 |
| 0.4097 | 38.0 | 81814 | 0.3272 |
| 0.3989 | 39.0 | 83967 | 0.3271 |
| 0.3918 | 40.0 | 86120 | 0.3401 |
| 0.383 | 41.0 | 88273 | 0.3412 |
| 0.3802 | 42.0 | 90426 | 0.3564 |
| 0.3677 | 43.0 | 92579 | 0.3669 |
| 0.3644 | 44.0 | 94732 | 0.3714 |
| 0.3558 | 45.0 | 96885 | 0.3840 |
| 0.3474 | 46.0 | 99038 | 0.3870 |
| 0.345 | 47.0 | 101191 | 0.3910 |
| 0.3403 | 48.0 | 103344 | 0.3809 |
| 0.335 | 49.0 | 105497 | 0.3842 |
| 0.3355 | 50.0 | 107650 | 0.3871 |
| 0.423 | 51.0 | 109803 | 0.3118 |
| 0.4372 | 52.0 | 111956 | 0.3450 |
| 0.4371 | 53.0 | 114109 | 0.3708 |
| 0.4318 | 54.0 | 116262 | 0.4065 |
| 0.4251 | 55.0 | 118415 | 0.4411 |
| 0.417 | 56.0 | 120568 | 0.4701 |
| 0.4108 | 57.0 | 122721 | 0.5073 |
| 0.4037 | 58.0 | 124874 | 0.5471 |
| 0.3965 | 59.0 | 127027 | 0.5702 |
| 0.385 | 60.0 | 129180 | 0.5940 |
| 0.381 | 61.0 | 131333 | 0.6328 |
| 0.3684 | 62.0 | 133486 | 0.6490 |
| 0.3613 | 63.0 | 135639 | 0.6735 |
| 0.3544 | 64.0 | 137792 | 0.6931 |
| 0.3513 | 65.0 | 139945 | 0.7139 |
| 0.3437 | 66.0 | 142098 | 0.7597 |
| 0.3382 | 67.0 | 144251 | 0.7609 |
| 0.3295 | 68.0 | 146404 | 0.7867 |
| 0.3229 | 69.0 | 148557 | 0.7975 |
| 0.3178 | 70.0 | 150710 | 0.8265 |
| 0.3131 | 71.0 | 152863 | 0.8474 |
| 0.3096 | 72.0 | 155016 | 0.8449 |
| 0.304 | 73.0 | 157169 | 0.8579 |
| 0.2979 | 74.0 | 159322 | 0.8673 |
| 0.2948 | 75.0 | 161475 | 0.8785 |
| 0.2891 | 76.0 | 163628 | 0.9001 |
| 0.2863 | 77.0 | 165781 | 0.8918 |
| 0.2814 | 78.0 | 167934 | 0.9475 |
| 0.3586 | 79.0 | 170087 | 0.2512 |
| 0.345 | 80.0 | 172240 | 0.2598 |
| 0.3332 | 81.0 | 174393 | 0.2644 |
| 0.3243 | 82.0 | 176546 | 0.3046 |
| 0.3155 | 83.0 | 178699 | 0.2897 |
| 0.3081 | 84.0 | 180852 | 0.3066 |
| 0.2992 | 85.0 | 183005 | 0.3316 |
| 0.2951 | 86.0 | 185158 | 0.3526 |
| 0.292 | 87.0 | 187311 | 0.3605 |
| 0.2818 | 88.0 | 189464 | 0.3753 |
| 0.2795 | 89.0 | 191617 | 0.4040 |
| 0.2721 | 90.0 | 193770 | 0.4169 |
| 0.2631 | 91.0 | 195923 | 0.4250 |
| 0.2627 | 92.0 | 198076 | 0.4417 |
| 0.2562 | 93.0 | 200229 | 0.4658 |
| 0.2532 | 94.0 | 202382 | 0.4780 |
| 0.2509 | 95.0 | 204535 | 0.4964 |
| 0.2435 | 96.0 | 206688 | 0.5008 |
| 0.2419 | 97.0 | 208841 | 0.5169 |
| 0.2395 | 98.0 | 210994 | 0.5423 |
| 0.2327 | 99.0 | 213147 | 0.5314 |
| 0.2307 | 100.0 | 215300 | 0.5656 |
| 0.2315 | 101.0 | 217453 | 0.5440 |
| 0.2242 | 102.0 | 219606 | 0.5671 |
| 0.2233 | 103.0 | 221759 | 0.5865 |
| 0.2187 | 104.0 | 223912 | 0.5996 |
| 0.2152 | 105.0 | 226065 | 0.5873 |
| 0.2136 | 106.0 | 228218 | 0.6175 |
| 0.2117 | 107.0 | 230371 | 0.6169 |
| 0.2094 | 108.0 | 232524 | 0.6271 |
| 0.2056 | 109.0 | 234677 | 0.6647 |
| 0.2047 | 110.0 | 236830 | 0.6386 |
| 0.2005 | 111.0 | 238983 | 0.6509 |
| 0.2019 | 112.0 | 241136 | 0.6425 |
| 0.1978 | 113.0 | 243289 | 0.6766 |
| 0.1967 | 114.0 | 245442 | 0.6739 |
| 0.1933 | 115.0 | 247595 | 0.6835 |
| 0.192 | 116.0 | 249748 | 0.6923 |
| 0.1863 | 117.0 | 251901 | 0.7049 |
| 0.1846 | 118.0 | 254054 | 0.6914 |
| 0.1859 | 119.0 | 256207 | 0.7283 |
| 0.1828 | 120.0 | 258360 | 0.7262 |
| 0.1801 | 121.0 | 260513 | 0.7205 |
| 0.1762 | 122.0 | 262666 | 0.7326 |
| 0.1759 | 123.0 | 264819 | 0.7486 |
| 0.174 | 124.0 | 266972 | 0.7398 |
| 0.1752 | 125.0 | 269125 | 0.7233 |
| 0.2345 | 126.0 | 271278 | 0.1523 |
| 0.2226 | 127.0 | 273431 | 0.1528 |
| 0.2151 | 128.0 | 275584 | 0.1552 |
| 0.2109 | 129.0 | 277737 | 0.1500 |
| 0.2066 | 130.0 | 279890 | 0.1569 |
| 0.2028 | 131.0 | 282043 | 0.1485 |
| 0.1976 | 132.0 | 284196 | 0.1588 |
| 0.1925 | 133.0 | 286349 | 0.1544 |
| 0.1921 | 134.0 | 288502 | 0.1567 |
| 0.1864 | 135.0 | 290655 | 0.1557 |
| 0.184 | 136.0 | 292808 | 0.1609 |
| 0.1819 | 137.0 | 294961 | 0.1631 |
| 0.1801 | 138.0 | 297114 | 0.1555 |
| 0.1802 | 139.0 | 299267 | 0.1624 |
| 0.1757 | 140.0 | 301420 | 0.1633 |
| 0.1759 | 141.0 | 303573 | 0.1739 |
| 0.1713 | 142.0 | 305726 | 0.1635 |
| 0.1707 | 143.0 | 307879 | 0.1718 |
| 0.1694 | 144.0 | 310032 | 0.1659 |
| 0.1665 | 145.0 | 312185 | 0.1646 |
| 0.1669 | 146.0 | 314338 | 0.1779 |
| 0.1674 | 147.0 | 316491 | 0.1765 |
| 0.1673 | 148.0 | 318644 | 0.1753 |
| 0.1643 | 149.0 | 320797 | 0.1697 |
| 0.1649 | 150.0 | 322950 | 0.1709 |

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 2.19.0
  • Tokenizers 0.20.3
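
Matching these versions is the safest way to reproduce the numbers above. A small illustrative helper (not part of the original card) that compares the local environment against the listed versions:

```python
# Compares installed package versions against those listed above.
# Purely illustrative; the card itself only lists the version numbers.
from importlib.metadata import PackageNotFoundError, version

card_versions = {
    "transformers": "4.45.2",
    "torch": "2.5.1+cu124",
    "datasets": "2.19.0",
    "tokenizers": "0.20.3",
}
for package, expected in card_versions.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    marker = "OK" if installed == expected else "MISMATCH"
    print(f"{package}: installed {installed}, card lists {expected} [{marker}]")
```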