metadata
library_name: transformers
license: apache-2.0
base_model: facebook/bart-large
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: bart-large-wiki-doc-full
    results: []

bart-large-wiki-doc-full

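This model is a fine-tuned version of facebook/bart-large. The training dataset was not recorded in the card. On the evaluation set it achieves the following results, which match the step-6500 row (epoch 3.7208) of the training results table: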

  • Loss: 4.9843
  • Sari: 51.0933
  • Sari Add: 12.516
  • Sari Keep: 45.354
  • Sari Del: 95.4099
  • Fkgl: 6.642
  • Bleu: 19.5688
  • D Sari: 0.4555
  • D Sari Keep: 0.3857
  • D Sari Del: 0.8046
  • D Sari Add: 0.1763
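
The aggregate scores above can be cross-checked against their components. SARI is conventionally the arithmetic mean of its add, keep, and delete scores, and the D-SARI figures reported here appear to combine the same way on a 0 to 1 scale. The sketch below only re-derives the two aggregates from the numbers in this card; the helper name `mean_of_components` is illustrative and is not taken from the evaluation code used for this model.

```python
def mean_of_components(add: float, keep: float, delete: float) -> float:
    """Arithmetic mean of the add, keep, and delete operation scores."""
    return (add + keep + delete) / 3.0


# Component scores copied from the evaluation results in this card.
sari = mean_of_components(add=12.516, keep=45.354, delete=95.4099)
d_sari = mean_of_components(add=0.1763, keep=0.3857, delete=0.8046)

print(round(sari, 4))    # 51.0933, matches the reported Sari
print(round(d_sari, 4))  # 0.4555, matches the reported D Sari
```

The low add score relative to keep and delete is what pulls both aggregates down.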

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: ADAMW_TORCH_FUSED (fused PyTorch AdamW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 15
  • label_smoothing_factor: 0.3
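
Two of the values above are derived rather than set directly: the total train batch size and the per-step learning rate. The following is a minimal sketch of how they are usually computed, assuming a single device and the standard linear warmup followed by linear decay. The function names and the `total_steps` value in the usage lines are illustrative assumptions; the card does not state the total number of optimizer steps.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accumulation_steps * devices


def linear_lr(step: int, total_steps: int, base_lr: float = 5e-05,
              warmup_ratio: float = 0.06) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# train_batch_size=2 with gradient_accumulation_steps=4 on one device.
print(effective_batch_size(2, 4))  # 8

# Hypothetical run length of 1000 optimizer steps, for illustration only.
print(linear_lr(30, total_steps=1000))    # halfway through warmup
print(linear_lr(60, total_steps=1000))    # peak, equal to base_lr
print(linear_lr(1000, total_steps=1000))  # end of training
```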

Training results

| Training Loss | Epoch | Step | Validation Loss | Sari | Sari Add | Sari Keep | Sari Del | Fkgl | Bleu | D Sari | D Sari Keep | D Sari Del | D Sari Add |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 5.6237 | 0.2862 | 500 | 5.1875 | 47.043 | 5.9566 | 40.0258 | 95.1466 | 7.2471 | 12.9924 | 0.4057 | 0.3169 | 0.7965 | 0.1038 |
| 5.0682 | 0.5725 | 1000 | 5.1042 | 48.192 | 9.0538 | 40.5192 | 95.0031 | 7.2139 | 15.7017 | 0.4288 | 0.3477 | 0.8055 | 0.1332 |
| 5.0363 | 0.8587 | 1500 | 5.0920 | 48.4536 | 9.2349 | 41.6306 | 94.4954 | 7.6174 | 20.3138 | 0.4245 | 0.3505 | 0.7871 | 0.1358 |
| 4.9284 | 1.1448 | 2000 | 5.0709 | 49.1161 | 9.9088 | 42.5012 | 94.9385 | 7.3117 | 18.5046 | 0.4339 | 0.3599 | 0.7961 | 0.1455 |
| 4.8644 | 1.4311 | 2500 | 5.0343 | 49.604 | 10.9617 | 42.8337 | 95.0164 | 7.1277 | 18.8397 | 0.4403 | 0.3681 | 0.7991 | 0.1536 |
| 4.8532 | 1.7173 | 3000 | 5.0192 | 49.3028 | 10.3908 | 42.1003 | 95.4172 | 6.9579 | 14.4955 | 0.4389 | 0.3588 | 0.8116 | 0.1462 |
| 4.8343 | 2.0034 | 3500 | 5.0058 | 49.2487 | 11.0341 | 42.1651 | 94.5467 | 7.2777 | 21.4311 | 0.4401 | 0.3693 | 0.7926 | 0.1583 |
| 4.719 | 2.2897 | 4000 | 5.0043 | 49.8632 | 11.2301 | 42.9018 | 95.4578 | 6.5783 | 15.364 | 0.4509 | 0.3735 | 0.8138 | 0.1654 |
| 4.6898 | 2.5759 | 4500 | 4.9904 | 49.9624 | 11.3392 | 43.3503 | 95.1978 | 6.9214 | 18.1433 | 0.449 | 0.3764 | 0.8068 | 0.1637 |
| 4.7061 | 2.8622 | 5000 | 4.9840 | 49.9367 | 11.8606 | 42.6481 | 95.3013 | 6.9934 | 16.9996 | 0.4516 | 0.3748 | 0.8158 | 0.1643 |
| 4.6392 | 3.1483 | 5500 | 4.9898 | 50.5502 | 11.928 | 44.3611 | 95.3614 | 6.9017 | 18.7169 | 0.4523 | 0.3813 | 0.8091 | 0.1665 |
| 4.578 | 3.4345 | 6000 | 4.9893 | 50.39 | 12.0634 | 43.6891 | 95.4175 | 6.8384 | 17.0137 | 0.4567 | 0.3837 | 0.8154 | 0.171 |
| 4.5981 | 3.7208 | 6500 | 4.9843 | 51.0933 | 12.516 | 45.354 | 95.4099 | 6.642 | 19.5688 | 0.4555 | 0.3857 | 0.8046 | 0.1763 |
| 4.5826 | 4.0069 | 7000 | 4.9818 | 50.788 | 12.8273 | 44.4249 | 95.1118 | 6.6555 | 21.2107 | 0.4594 | 0.3942 | 0.8049 | 0.1791 |
| 4.4927 | 4.2931 | 7500 | 5.0009 | 50.4098 | 12.2269 | 43.9055 | 95.0968 | 6.754 | 20.3637 | 0.4602 | 0.3948 | 0.8096 | 0.1763 |
| 4.5009 | 4.5794 | 8000 | 4.9908 | 50.7179 | 12.8769 | 43.783 | 95.4938 | 6.5702 | 17.565 | 0.4601 | 0.3863 | 0.8152 | 0.1787 |
| 4.5128 | 4.8656 | 8500 | 4.9830 | 50.7341 | 12.5935 | 44.2741 | 95.3346 | 6.7056 | 18.9024 | 0.4588 | 0.3905 | 0.8076 | 0.1783 |
| 4.457 | 5.1517 | 9000 | 5.0095 | 50.8947 | 12.7442 | 44.829 | 95.111 | 6.8278 | 21.5163 | 0.462 | 0.3987 | 0.8062 | 0.181 |
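
The log can be scanned to see which evaluation step is best under different criteria. The sketch below copies the Step, Validation Loss, and Sari columns from the table and picks the best row by each one. Which criterion, if any, was used to select the published checkpoint is not stated in this card, so the comparison is illustrative only; the names `LOG`, `best_by_sari`, and `best_by_loss` are hypothetical.

```python
# (step, validation_loss, sari) copied from the training results table.
LOG = [
    (500, 5.1875, 47.043), (1000, 5.1042, 48.192), (1500, 5.0920, 48.4536),
    (2000, 5.0709, 49.1161), (2500, 5.0343, 49.604), (3000, 5.0192, 49.3028),
    (3500, 5.0058, 49.2487), (4000, 5.0043, 49.8632), (4500, 4.9904, 49.9624),
    (5000, 4.9840, 49.9367), (5500, 4.9898, 50.5502), (6000, 4.9893, 50.39),
    (6500, 4.9843, 51.0933), (7000, 4.9818, 50.788), (7500, 5.0009, 50.4098),
    (8000, 4.9908, 50.7179), (8500, 4.9830, 50.7341), (9000, 5.0095, 50.8947),
]

# Highest SARI versus lowest validation loss.
best_by_sari = max(LOG, key=lambda row: row[2])
best_by_loss = min(LOG, key=lambda row: row[1])

print(best_by_sari)  # (6500, 4.9843, 51.0933)
print(best_by_loss)  # (7000, 4.9818, 50.788)
```

The step-6500 row has the highest SARI in the log and is also the row whose values match the headline evaluation results, while the lowest validation loss occurs one evaluation later, at step 7000.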

Framework versions

  • Transformers 4.57.3
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.1