7625b181d2e92ac5b1fc024c03df47c9

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ru on the Helsinki-NLP/opus_books [it-sv] dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7958
  • Data Size: 1.0
  • Epoch Runtime: 5.5030
  • Bleu: 1.2188
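
For scale, the reported loss can be converted to a perplexity. This is a minimal sketch that assumes the loss is the mean token-level cross-entropy in nats, which is the usual convention for this kind of training setup but is not stated in the card.

```python
import math

# Evaluation loss copied from the results list above.
eval_loss = 2.7958

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity: {perplexity:.2f}")
```

Under that assumption the perplexity is roughly 16.4, which fits the low BLEU score reported alongside it.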

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 32
  • total_eval_batch_size: 32
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • num_epochs: 50
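
The batch-size figures in this list are related by simple arithmetic, sketched below. The sketch assumes no gradient accumulation (none is listed) and takes the 74 steps per epoch from the Step column of the training results. The derived range of examples per epoch is only an estimate, and it would apply only to epochs that use the full dataset.

```python
# Values copied from the hyperparameter list.
train_batch_size = 8   # per device
num_devices = 4

# Assumption: no gradient accumulation, so the global batch is
# per-device batch size times the number of devices.
total_train_batch_size = train_batch_size * num_devices

# The Step column in the results table advances by 74 per epoch.
steps_per_epoch = 74

# Rough bounds on the examples seen in one full epoch:
# the final batch may hold anywhere from 1 to 32 examples.
max_examples = steps_per_epoch * total_train_batch_size
min_examples = (steps_per_epoch - 1) * total_train_batch_size + 1

print(total_train_batch_size)
print(min_examples, max_examples)
```

The first value matches the listed total_train_batch_size of 32.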

Training results

| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu   |
|---------------|-------|------|-----------------|-----------|---------------|--------|
| No log        | 0     | 0    | 7.9482          | 0         | 1.0405        | 0.0175 |
| No log        | 1     | 74   | 7.0727          | 0.0078    | 1.2689        | 0.0379 |
| No log        | 2     | 148  | 6.4936          | 0.0156    | 1.3069        | 0.0400 |
| 0.2321        | 3     | 222  | 6.0663          | 0.0312    | 1.4860        | 0.0441 |
| 0.2321        | 4     | 296  | 5.5789          | 0.0625    | 1.5789        | 0.0654 |
| 0.3904        | 5     | 370  | 4.9782          | 0.125     | 1.9256        | 0.0736 |
| 0.3904        | 6     | 444  | 4.3889          | 0.25      | 2.4105        | 0.0475 |
| 0.9876        | 7     | 518  | 3.9695          | 0.5       | 3.5018        | 0.0651 |
| 2.5677        | 8.0   | 592  | 3.5792          | 1.0       | 5.7064        | 0.2648 |
| 3.525         | 9.0   | 666  | 3.3921          | 1.0       | 5.3753        | 0.4170 |
| 3.3565        | 10.0  | 740  | 3.2510          | 1.0       | 5.5686        | 0.5257 |
| 3.1458        | 11.0  | 814  | 3.1299          | 1.0       | 5.6335        | 0.6348 |
| 3.0346        | 12.0  | 888  | 3.0564          | 1.0       | 5.5839        | 0.7650 |
| 2.8539        | 13.0  | 962  | 2.9883          | 1.0       | 6.1253        | 0.7988 |
| 2.7784        | 14.0  | 1036 | 2.9352          | 1.0       | 5.4016        | 0.9171 |
| 2.6314        | 15.0  | 1110 | 2.9048          | 1.0       | 5.6046        | 0.9230 |
| 2.5463        | 16.0  | 1184 | 2.8656          | 1.0       | 5.3371        | 0.9870 |
| 2.4282        | 17.0  | 1258 | 2.8403          | 1.0       | 5.4282        | 0.9432 |
| 2.3614        | 18.0  | 1332 | 2.8179          | 1.0       | 5.5174        | 1.0111 |
| 2.2422        | 19.0  | 1406 | 2.8004          | 1.0       | 5.9597        | 1.1130 |
| 2.1763        | 20.0  | 1480 | 2.7917          | 1.0       | 5.3992        | 1.0947 |
| 2.075         | 21.0  | 1554 | 2.7932          | 1.0       | 5.9755        | 1.1365 |
| 1.9988        | 22.0  | 1628 | 2.8020          | 1.0       | 5.8151        | 1.1447 |
| 1.9175        | 23.0  | 1702 | 2.7943          | 1.0       | 5.5309        | 1.2010 |
| 1.8459        | 24.0  | 1776 | 2.7958          | 1.0       | 5.5030        | 1.2188 |
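
The training results can be summarised with a few lines of Python. The sketch below finds the epochs with the lowest validation loss and the highest BLEU, and checks one pattern in the Data Size column: it appears to double each epoch until it reaches 1.0 at epoch 8. The doubling rule is only an observation from the logged values, not a documented schedule, and the names `rows` and `observed_schedule` are illustrative.

```python
# (epoch, validation_loss, data_size, bleu) copied from the training results.
rows = [
    (0, 7.9482, 0.0, 0.0175), (1, 7.0727, 0.0078, 0.0379),
    (2, 6.4936, 0.0156, 0.0400), (3, 6.0663, 0.0312, 0.0441),
    (4, 5.5789, 0.0625, 0.0654), (5, 4.9782, 0.125, 0.0736),
    (6, 4.3889, 0.25, 0.0475), (7, 3.9695, 0.5, 0.0651),
    (8, 3.5792, 1.0, 0.2648), (9, 3.3921, 1.0, 0.4170),
    (10, 3.2510, 1.0, 0.5257), (11, 3.1299, 1.0, 0.6348),
    (12, 3.0564, 1.0, 0.7650), (13, 2.9883, 1.0, 0.7988),
    (14, 2.9352, 1.0, 0.9171), (15, 2.9048, 1.0, 0.9230),
    (16, 2.8656, 1.0, 0.9870), (17, 2.8403, 1.0, 0.9432),
    (18, 2.8179, 1.0, 1.0111), (19, 2.8004, 1.0, 1.1130),
    (20, 2.7917, 1.0, 1.0947), (21, 2.7932, 1.0, 1.1365),
    (22, 2.8020, 1.0, 1.1447), (23, 2.7943, 1.0, 1.2010),
    (24, 2.7958, 1.0, 1.2188),
]

# Epoch with the lowest validation loss and epoch with the highest BLEU.
best_loss_epoch, best_loss, _, _ = min(rows, key=lambda r: r[1])
best_bleu_epoch, _, _, best_bleu = max(rows, key=lambda r: r[3])


def observed_schedule(epoch: int) -> float:
    """Data fraction inferred from the table (an assumption, not documented)."""
    if epoch == 0:
        return 0.0
    return min(1.0, 2.0 ** (epoch - 8))


# Compare the inferred schedule with the logged values (4-decimal precision).
schedule_matches = all(
    abs(data_size - observed_schedule(epoch)) < 1e-4
    for epoch, _, data_size, _ in rows
)

print(best_loss_epoch, best_loss)
print(best_bleu_epoch, best_bleu)
print(schedule_matches)
```

Validation loss is lowest at epoch 20 (2.7917) and essentially flat afterwards, while BLEU is highest at the last logged epoch, 24 (1.2188). The reported evaluation results correspond to that final row.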

Framework versions

  • Transformers 4.57.0
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.1