life2lang-it

This model is a fine-tuned version of khairi/life2lang-pt on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.6408 (last logged validation loss, at step 4500)

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 12
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 48
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.15
num_epochs: 1
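
The settings above fit together as sketched below: the effective batch size is the per-device batch times the accumulation steps, and the learning rate warms up linearly before following a cosine decay. This is a minimal pure-Python sketch, not the training script. `NUM_DEVICES = 1` and `TOTAL_STEPS = 4567` are assumptions (the step count is a rough estimate read off the training log), the helper names are hypothetical, and the schedule shape assumes a single half-cosine decay to zero.

```python
import math

# Values copied from the hyperparameter list.
LEARNING_RATE = 3e-05
TRAIN_BATCH_SIZE = 12
GRADIENT_ACCUMULATION_STEPS = 4
WARMUP_RATIO = 0.15

# Assumption: training ran on a single device.
NUM_DEVICES = 1
# Assumption: approximate optimizer steps in the one epoch, estimated from the log.
TOTAL_STEPS = 4567


def effective_batch_size(per_device, accumulation, devices=NUM_DEVICES):
    """Number of examples contributing to one optimizer step."""
    return per_device * accumulation * devices


def cosine_lr(step, total_steps=TOTAL_STEPS, peak_lr=LEARNING_RATE,
              warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch:", effective_batch_size(TRAIN_BATCH_SIZE,
                                                   GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 343, 686, 2000, 4567):
        print(step, cosine_lr(step))
```

Under these assumptions the warmup covers roughly the first 686 optimizer steps, which would place the peak learning rate near step 700 in the training log.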

Training Loss	Epoch	Step	Validation Loss
20.6891	0.0219	100	11.6935
4.5604	0.0438	200	1.7273
1.6816	0.0657	300	1.6981
1.6611	0.0876	400	1.6912
1.6509	0.1095	500	1.6862
1.6474	0.1314	600	1.6836
1.6497	0.1533	700	1.6798
1.6509	0.1752	800	1.6782
1.6374	0.1971	900	1.6766
1.6437	0.2190	1000	1.6746
1.6267	0.2409	1100	1.6735
1.6151	0.2628	1200	1.6709
1.637	0.2847	1300	1.6681
1.6363	0.3066	1400	1.6667
1.6104	0.3285	1500	1.6655
1.6337	0.3504	1600	1.6653
1.6317	0.3723	1700	1.6616
1.6205	0.3942	1800	1.6607
1.6119	0.4161	1900	1.6582
1.6124	0.4380	2000	1.6576
1.6296	0.4599	2100	1.6566
1.6288	0.4818	2200	1.6555
1.6092	0.5037	2300	1.6539
1.6246	0.5256	2400	1.6532
1.6122	0.5475	2500	1.6513
1.6069	0.5694	2600	1.6507
1.625	0.5913	2700	1.6495
1.6034	0.6132	2800	1.6481
1.6121	0.6351	2900	1.6474
1.605	0.6570	3000	1.6463
1.6168	0.6789	3100	1.6457
1.6058	0.7008	3200	1.6446
1.6081	0.7227	3300	1.6437
1.6073	0.7446	3400	1.6435
1.5987	0.7665	3500	1.6430
1.6056	0.7883	3600	1.6420
1.6292	0.8102	3700	1.6420
1.5917	0.8321	3800	1.6419
1.609	0.8540	3900	1.6413
1.6061	0.8759	4000	1.6411
1.5914	0.8978	4100	1.6410
1.6153	0.9197	4200	1.6408
1.6006	0.9416	4300	1.6409
1.5895	0.9635	4400	1.6408
1.611	0.9854	4500	1.6408
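
The Epoch and Step columns also allow a rough estimate of how long one epoch is and how many training examples it covers. The sketch below is an illustration only: it assumes every optimizer step consumes the full total_train_batch_size of 48 examples, the function names are hypothetical, and the result is approximate because the logged epoch values are rounded to four decimals.

```python
# (step, epoch) pairs copied from the training results table.
LOGGED = [(100, 0.0219), (2000, 0.4380), (4500, 0.9854)]

# total_train_batch_size from the hyperparameter list.
TOTAL_TRAIN_BATCH_SIZE = 48


def steps_per_epoch(step, epoch):
    """Estimate optimizer steps in one full epoch from a single log row."""
    return step / epoch


def estimate_examples(step, epoch, batch=TOTAL_TRAIN_BATCH_SIZE):
    """Estimate training-set size, assuming every step uses a full batch."""
    return steps_per_epoch(step, epoch) * batch


if __name__ == "__main__":
    for step, epoch in LOGGED:
        print(step, round(steps_per_epoch(step, epoch), 1),
              round(estimate_examples(step, epoch)))
```

Later rows give tighter estimates because the rounding error in the epoch value matters less as the step count grows.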

Safetensors

Model size: 0.2B params
Tensor type: F32
