dewata_bert_gelu_new

This model is a fine-tuned version of pijarcandra22/dewata_bert_gelu_new on an unspecified dataset (the card's base-model reference points back to the model itself). It achieves the following results on the evaluation set:

  • Loss: 0.1709
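
The task head and intended usage are not documented (see "Intended uses & limitations" below), so the following is only a minimal loading sketch; it assumes the checkpoint is compatible with the generic Auto classes and does not commit to any particular task:

```python
# Minimal loading sketch (an assumption, not documented in the card):
# loads the encoder weights and tokenizer without choosing a task head.
from transformers import AutoModel, AutoTokenizer

repo_id = "pijarcandra22/dewata_bert_gelu_new"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)

inputs = tokenizer("Example input text", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence_length, hidden_size)
```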

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged sketch of the corresponding TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 80
  • eval_batch_size: 80
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 150
  • mixed_precision_training: Native AMP
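
A minimal sketch of how these values map onto a Transformers TrainingArguments object; the output directory and evaluation strategy are assumptions (the results table below logs one evaluation per epoch), while the numeric values are taken verbatim from the list above:

```python
# Hedged reconstruction of the configuration above. Only the numeric
# hyperparameters come from the card; output_dir and eval_strategy are
# assumptions. fp16=True ("Native AMP") requires a CUDA-capable GPU.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="dewata_bert_gelu_new",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=80,
    per_device_eval_batch_size=80,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=150,
    fp16=True,              # "Native AMP" mixed-precision training
    eval_strategy="epoch",  # assumed; the table below logs per epoch
)
# These arguments would then be passed to a Trainer together with the
# model and the (undocumented) train/eval datasets.
```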

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.0053 | 1.0 | 2153 | 0.8893 |
| 0.9725 | 2.0 | 4306 | 0.9588 |
| 0.923 | 3.0 | 6459 | 0.9913 |
| 0.8701 | 4.0 | 8612 | 1.0152 |
| 0.831 | 5.0 | 10765 | 1.0373 |
| 0.7833 | 6.0 | 12918 | 1.0701 |
| 0.7444 | 7.0 | 15071 | 1.0928 |
| 0.7085 | 8.0 | 17224 | 1.1052 |
| 0.6736 | 9.0 | 19377 | 1.0928 |
| 0.6428 | 10.0 | 21530 | 1.1113 |
| 0.6135 | 11.0 | 23683 | 1.1086 |
| 0.5882 | 12.0 | 25836 | 1.1334 |
| 0.5636 | 13.0 | 27989 | 1.1329 |
| 0.5429 | 14.0 | 30142 | 1.1266 |
| 0.5198 | 15.0 | 32295 | 1.1369 |
| 0.4972 | 16.0 | 34448 | 1.1462 |
| 0.4785 | 17.0 | 36601 | 1.1437 |
| 0.4631 | 18.0 | 38754 | 1.1468 |
| 0.4416 | 19.0 | 40907 | 1.1401 |
| 0.4281 | 20.0 | 43060 | 1.1498 |
| 0.4158 | 21.0 | 45213 | 1.1579 |
| 0.4005 | 22.0 | 47366 | 1.1539 |
| 0.3932 | 23.0 | 49519 | 1.1429 |
| 0.3774 | 24.0 | 51672 | 1.1516 |
| 0.3704 | 25.0 | 53825 | 1.1487 |
| 0.358 | 26.0 | 55978 | 1.1580 |
| 0.3505 | 27.0 | 58131 | 1.1465 |
| 0.3434 | 28.0 | 60284 | 1.1290 |
| 0.3394 | 29.0 | 62437 | 1.1210 |
| 0.3341 | 30.0 | 64590 | 1.1304 |
| 0.432 | 31.0 | 66743 | 0.3157 |
| 0.4259 | 32.0 | 68896 | 0.3127 |
| 0.4162 | 33.0 | 71049 | 0.3207 |
| 0.4097 | 34.0 | 73202 | 0.3155 |
| 0.4061 | 35.0 | 75355 | 0.3281 |
| 0.4256 | 36.0 | 77508 | 0.3150 |
| 0.4182 | 37.0 | 79661 | 0.3209 |
| 0.4097 | 38.0 | 81814 | 0.3272 |
| 0.3989 | 39.0 | 83967 | 0.3271 |
| 0.3918 | 40.0 | 86120 | 0.3401 |
| 0.383 | 41.0 | 88273 | 0.3412 |
| 0.3802 | 42.0 | 90426 | 0.3564 |
| 0.3677 | 43.0 | 92579 | 0.3669 |
| 0.3644 | 44.0 | 94732 | 0.3714 |
| 0.3558 | 45.0 | 96885 | 0.3840 |
| 0.3474 | 46.0 | 99038 | 0.3870 |
| 0.345 | 47.0 | 101191 | 0.3910 |
| 0.3403 | 48.0 | 103344 | 0.3809 |
| 0.335 | 49.0 | 105497 | 0.3842 |
| 0.3355 | 50.0 | 107650 | 0.3871 |
| 0.423 | 51.0 | 109803 | 0.3118 |
| 0.4372 | 52.0 | 111956 | 0.3450 |
| 0.4371 | 53.0 | 114109 | 0.3708 |
| 0.4318 | 54.0 | 116262 | 0.4065 |
| 0.4251 | 55.0 | 118415 | 0.4411 |
| 0.417 | 56.0 | 120568 | 0.4701 |
| 0.4108 | 57.0 | 122721 | 0.5073 |
| 0.4037 | 58.0 | 124874 | 0.5471 |
| 0.3965 | 59.0 | 127027 | 0.5702 |
| 0.385 | 60.0 | 129180 | 0.5940 |
| 0.381 | 61.0 | 131333 | 0.6328 |
| 0.3684 | 62.0 | 133486 | 0.6490 |
| 0.3613 | 63.0 | 135639 | 0.6735 |
| 0.3544 | 64.0 | 137792 | 0.6931 |
| 0.3513 | 65.0 | 139945 | 0.7139 |
| 0.3437 | 66.0 | 142098 | 0.7597 |
| 0.3382 | 67.0 | 144251 | 0.7609 |
| 0.3295 | 68.0 | 146404 | 0.7867 |
| 0.3229 | 69.0 | 148557 | 0.7975 |
| 0.3178 | 70.0 | 150710 | 0.8265 |
| 0.3131 | 71.0 | 152863 | 0.8474 |
| 0.3096 | 72.0 | 155016 | 0.8449 |
| 0.304 | 73.0 | 157169 | 0.8579 |
| 0.2979 | 74.0 | 159322 | 0.8673 |
| 0.2948 | 75.0 | 161475 | 0.8785 |
| 0.2891 | 76.0 | 163628 | 0.9001 |
| 0.2863 | 77.0 | 165781 | 0.8918 |
| 0.2814 | 78.0 | 167934 | 0.9475 |
| 0.3586 | 79.0 | 170087 | 0.2512 |
| 0.345 | 80.0 | 172240 | 0.2598 |
| 0.3332 | 81.0 | 174393 | 0.2644 |
| 0.3243 | 82.0 | 176546 | 0.3046 |
| 0.3155 | 83.0 | 178699 | 0.2897 |
| 0.3081 | 84.0 | 180852 | 0.3066 |
| 0.2992 | 85.0 | 183005 | 0.3316 |
| 0.2951 | 86.0 | 185158 | 0.3526 |
| 0.292 | 87.0 | 187311 | 0.3605 |
| 0.2818 | 88.0 | 189464 | 0.3753 |
| 0.2795 | 89.0 | 191617 | 0.4040 |
| 0.2721 | 90.0 | 193770 | 0.4169 |
| 0.2631 | 91.0 | 195923 | 0.4250 |
| 0.2627 | 92.0 | 198076 | 0.4417 |
| 0.2562 | 93.0 | 200229 | 0.4658 |
| 0.2532 | 94.0 | 202382 | 0.4780 |
| 0.2509 | 95.0 | 204535 | 0.4964 |
| 0.2435 | 96.0 | 206688 | 0.5008 |
| 0.2419 | 97.0 | 208841 | 0.5169 |
| 0.2395 | 98.0 | 210994 | 0.5423 |
| 0.2327 | 99.0 | 213147 | 0.5314 |
| 0.2307 | 100.0 | 215300 | 0.5656 |
| 0.2315 | 101.0 | 217453 | 0.5440 |
| 0.2242 | 102.0 | 219606 | 0.5671 |
| 0.2233 | 103.0 | 221759 | 0.5865 |
| 0.2187 | 104.0 | 223912 | 0.5996 |
| 0.2152 | 105.0 | 226065 | 0.5873 |
| 0.2136 | 106.0 | 228218 | 0.6175 |
| 0.2117 | 107.0 | 230371 | 0.6169 |
| 0.2094 | 108.0 | 232524 | 0.6271 |
| 0.2056 | 109.0 | 234677 | 0.6647 |
| 0.2047 | 110.0 | 236830 | 0.6386 |
| 0.2005 | 111.0 | 238983 | 0.6509 |
| 0.2019 | 112.0 | 241136 | 0.6425 |
| 0.1978 | 113.0 | 243289 | 0.6766 |
| 0.1967 | 114.0 | 245442 | 0.6739 |
| 0.1933 | 115.0 | 247595 | 0.6835 |
| 0.192 | 116.0 | 249748 | 0.6923 |
| 0.1863 | 117.0 | 251901 | 0.7049 |
| 0.1846 | 118.0 | 254054 | 0.6914 |
| 0.1859 | 119.0 | 256207 | 0.7283 |
| 0.1828 | 120.0 | 258360 | 0.7262 |
| 0.1801 | 121.0 | 260513 | 0.7205 |
| 0.1762 | 122.0 | 262666 | 0.7326 |
| 0.1759 | 123.0 | 264819 | 0.7486 |
| 0.174 | 124.0 | 266972 | 0.7398 |
| 0.1752 | 125.0 | 269125 | 0.7233 |
| 0.2345 | 126.0 | 271278 | 0.1523 |
| 0.2226 | 127.0 | 273431 | 0.1528 |
| 0.2151 | 128.0 | 275584 | 0.1552 |
| 0.2109 | 129.0 | 277737 | 0.1500 |
| 0.2066 | 130.0 | 279890 | 0.1569 |
| 0.2028 | 131.0 | 282043 | 0.1485 |
| 0.1976 | 132.0 | 284196 | 0.1588 |
| 0.1925 | 133.0 | 286349 | 0.1544 |
| 0.1921 | 134.0 | 288502 | 0.1567 |
| 0.1864 | 135.0 | 290655 | 0.1557 |
| 0.184 | 136.0 | 292808 | 0.1609 |
| 0.1819 | 137.0 | 294961 | 0.1631 |
| 0.1801 | 138.0 | 297114 | 0.1555 |
| 0.1802 | 139.0 | 299267 | 0.1624 |
| 0.1757 | 140.0 | 301420 | 0.1633 |
| 0.1759 | 141.0 | 303573 | 0.1739 |
| 0.1713 | 142.0 | 305726 | 0.1635 |
| 0.1707 | 143.0 | 307879 | 0.1718 |
| 0.1694 | 144.0 | 310032 | 0.1659 |
| 0.1665 | 145.0 | 312185 | 0.1646 |
| 0.1669 | 146.0 | 314338 | 0.1779 |
| 0.1674 | 147.0 | 316491 | 0.1765 |
| 0.1673 | 148.0 | 318644 | 0.1753 |
| 0.1643 | 149.0 | 320797 | 0.1697 |
| 0.1649 | 150.0 | 322950 | 0.1709 |

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 2.19.0
  • Tokenizers 0.20.3
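
Matching these versions is the safest way to reproduce the numbers above. A small illustrative helper (not part of the original card) that compares the local environment against the listed versions:

```python
# Compares installed package versions against those listed above.
# Purely illustrative; the card itself only lists the version numbers.
from importlib.metadata import PackageNotFoundError, version

card_versions = {
    "transformers": "4.45.2",
    "torch": "2.5.1+cu124",
    "datasets": "2.19.0",
    "tokenizers": "0.20.3",
}
for package, expected in card_versions.items():
    try:
        installed = version(package)
    except PackageNotFoundError:
        installed = "not installed"
    marker = "OK" if installed == expected else "MISMATCH"
    print(f"{package}: installed {installed}, card lists {expected} [{marker}]")
```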