bart-large-wikilarge

This model is a fine-tuned version of facebook/bart-large on an unknown dataset. It achieves the following results on the evaluation set (the values match the step 38000 row, epoch 4.1023, of the training results table):

  • Loss: 0.9051
  • Sari: 37.0457
  • Paper Sari: 37.1345
  • Sari Add: 4.1443
  • Sari Keep: 77.5177
  • Sari Del: 29.475
  • Paper Sari Add: 4.1443
  • Paper Sari Keep: 77.7084
  • Paper Sari Del: 29.5508
  • Fkgl: 8.4619
  • Bleu: 92.6286
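
SARI is conventionally reported as the arithmetic mean of its add, keep, and delete components, and the figures above are consistent with that. The short check below is only an illustration using the numbers from this list; the helper name `mean_of_components` is made up for the example and is not part of any library.

```python
# Component scores copied from the evaluation results listed above.
sari_components = {"add": 4.1443, "keep": 77.5177, "del": 29.475}
paper_sari_components = {"add": 4.1443, "keep": 77.7084, "del": 29.5508}


def mean_of_components(components):
    """Average the add, keep, and delete scores into a single SARI value."""
    return sum(components.values()) / len(components)


sari = mean_of_components(sari_components)
paper_sari = mean_of_components(paper_sari_components)

print(f"Sari: {sari:.4f}")              # Sari: 37.0457
print(f"Paper Sari: {paper_sari:.4f}")  # Paper Sari: 37.1345
```

Both headline values are reproduced to four decimal places, so the low add score (about 4.1) pulls the average well below the keep score (about 77.5).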

Model description

More information needed

Intended uses & limitations

More information needed
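
Until the intended uses are documented, the following is only a minimal loading sketch. It assumes the checkpoint is published under the repository id `taiypeo/bart-large-wikilarge` and that it is meant for sentence simplification, which the SARI and FKGL metrics suggest but this card does not state. The function name `simplify` and the generation settings (`num_beams=4`, `max_new_tokens=128`) are illustrative choices, not the author's configuration.

```python
MODEL_ID = "taiypeo/bart-large-wikilarge"  # assumed repository id


def simplify(sentences, model_id=MODEL_ID, num_beams=4, max_new_tokens=128):
    """Generate one output sentence per input sentence (hypothetical helper)."""
    # Imported lazily so that defining the function does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    # Tokenize the batch; padding aligns sentences of different lengths.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)

    # Beam search settings are assumptions, not values taken from this card.
    output_ids = model.generate(
        **inputs, num_beams=num_beams, max_new_tokens=max_new_tokens
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

Calling `simplify(["Some complex sentence."])` downloads the weights on first use (about 0.4B parameters in F32) and returns a list of strings.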

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.06
  • num_epochs: 20
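
To make the batch-size and scheduler settings above concrete, here is a small sketch of how they combine. It assumes a single training device and the usual linear schedule (linear warmup to the peak rate, then linear decay to zero). The total step count is not given in this card; the figure used below is a rough estimate from the training results table (48000 steps at epoch 5.1819, scaled to 20 epochs) and should be read as hypothetical.

```python
import math


def linear_schedule_lr(step, total_steps, peak_lr=1e-05, warmup_ratio=0.06):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall linearly from peak_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


train_batch_size = 16
gradient_accumulation_steps = 2
num_devices = 1  # assumption: the card does not state the device count
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 32

# Hypothetical total: roughly 9263 steps per epoch times 20 epochs.
estimated_total_steps = 185_260
for step in (0, 5_000, 48_000, estimated_total_steps):
    print(step, linear_schedule_lr(step, estimated_total_steps))
```

Under these assumptions the warmup would last about 11,116 steps, so every evaluation row after step 12000 would fall in the decay phase.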

Training results

| Training Loss | Epoch | Step | Validation Loss | Sari | Paper Sari | Sari Add | Sari Keep | Sari Del | Paper Sari Add | Paper Sari Keep | Paper Sari Del | Fkgl | Bleu |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.0874 | 0.1080 | 1000 | 0.9401 | 31.9299 | 32.0978 | 0.5451 | 77.7799 | 17.4648 | 0.5456 | 78.1089 | 17.6388 | 8.868 | 96.2302 |
| 1.5465 | 0.2159 | 2000 | 0.8906 | 31.4483 | 31.5849 | 0.7191 | 78.6496 | 14.9763 | 0.7195 | 78.9791 | 15.056 | 9.0121 | 96.5849 |
| 1.4814 | 0.3239 | 3000 | 0.8681 | 31.2134 | 31.3383 | 0.9158 | 79.1129 | 13.6116 | 0.9163 | 79.4448 | 13.6539 | 9.1436 | 97.0903 |
| 1.4484 | 0.4318 | 4000 | 0.8677 | 31.4137 | 31.5424 | 0.9679 | 79.2607 | 14.0123 | 0.9683 | 79.5972 | 14.0619 | 9.133 | 97.3217 |
| 1.4237 | 0.5398 | 5000 | 0.8712 | 32.6532 | 32.7787 | 1.3592 | 78.7355 | 17.8648 | 1.3601 | 79.0417 | 17.9343 | 8.9947 | 96.5928 |
| 1.4088 | 0.6477 | 6000 | 0.8711 | 33.8505 | 33.9891 | 1.6794 | 77.7335 | 22.1385 | 1.6804 | 78.0123 | 22.2745 | 8.7814 | 95.2315 |
| 1.3827 | 0.7557 | 7000 | 0.8577 | 33.4554 | 33.5788 | 1.8273 | 78.6389 | 19.8999 | 1.8278 | 78.9318 | 19.9767 | 8.9175 | 96.2523 |
| 1.3724 | 0.8637 | 8000 | 0.8546 | 33.9822 | 34.1092 | 1.9029 | 78.1136 | 21.93 | 1.9038 | 78.3902 | 22.0335 | 8.8339 | 95.6744 |
| 1.3765 | 0.9716 | 9000 | 0.8560 | 34.3858 | 34.5199 | 2.0618 | 77.5883 | 23.5074 | 2.0627 | 77.8534 | 23.6436 | 8.7336 | 94.8992 |
| 1.3394 | 1.0796 | 10000 | 0.8543 | 33.9466 | 34.0591 | 2.2936 | 78.893 | 20.6532 | 2.2939 | 79.1763 | 20.7072 | 8.8746 | 95.9382 |
| 1.3215 | 1.1875 | 11000 | 0.8586 | 34.8154 | 34.9403 | 2.5226 | 77.7114 | 24.2122 | 2.5232 | 77.967 | 24.3307 | 8.6369 | 94.8519 |
| 1.3201 | 1.2955 | 12000 | 0.8635 | 35.0642 | 35.1821 | 2.6475 | 77.9966 | 24.5485 | 2.648 | 78.2487 | 24.6497 | 8.6923 | 94.8076 |
| 1.305 | 1.4034 | 13000 | 0.8538 | 34.3125 | 34.4191 | 2.5477 | 78.9463 | 21.4435 | 2.548 | 79.2199 | 21.4895 | 8.8855 | 95.7623 |
| 1.3007 | 1.5114 | 14000 | 0.8594 | 35.2924 | 35.4213 | 2.6975 | 77.1695 | 26.0102 | 2.698 | 77.4097 | 26.1563 | 8.5157 | 93.9044 |
| 1.2941 | 1.6193 | 15000 | 0.8670 | 35.1955 | 35.3046 | 2.9201 | 78.0836 | 24.5827 | 2.9204 | 78.3297 | 24.6637 | 8.6739 | 94.4185 |
| 1.2843 | 1.7273 | 16000 | 0.8618 | 34.7826 | 34.9015 | 2.5214 | 78.3308 | 23.4956 | 2.5216 | 78.5955 | 23.5875 | 8.7743 | 95.1965 |
| 1.2845 | 1.8353 | 17000 | 0.8661 | 34.4209 | 34.5391 | 2.2746 | 78.1474 | 22.8407 | 2.275 | 78.4122 | 22.9303 | 8.7781 | 95.3004 |
| 1.2752 | 1.9432 | 18000 | 0.8763 | 35.4414 | 35.5519 | 2.9318 | 78.0589 | 25.3335 | 2.9321 | 78.3016 | 25.422 | 8.6975 | 94.5009 |
| 1.2275 | 2.0512 | 19000 | 0.8691 | 35.6042 | 35.7092 | 3.2336 | 78.0741 | 25.5048 | 3.2338 | 78.3111 | 25.5826 | 8.6568 | 94.5273 |
| 1.212 | 2.1591 | 20000 | 0.8709 | 35.7725 | 35.8786 | 3.1674 | 77.5545 | 26.5955 | 3.1677 | 77.7781 | 26.6901 | 8.5776 | 93.7323 |
| 1.2132 | 2.2671 | 21000 | 0.8662 | 34.9038 | 35.0133 | 2.7852 | 78.5778 | 23.3486 | 2.7853 | 78.8389 | 23.4158 | 8.81 | 95.2465 |
| 1.2052 | 2.3750 | 22000 | 0.8707 | 35.4538 | 35.5547 | 3.1225 | 78.438 | 24.801 | 3.1226 | 78.6803 | 24.8613 | 8.7035 | 94.7578 |
| 1.2175 | 2.4830 | 23000 | 0.8745 | 35.6625 | 35.7636 | 3.2546 | 78.2719 | 25.4612 | 3.2548 | 78.5084 | 25.5276 | 8.6879 | 94.7659 |
| 1.2014 | 2.5910 | 24000 | 0.8725 | 35.6727 | 35.779 | 3.0206 | 77.7743 | 26.2232 | 3.0209 | 78.0033 | 26.313 | 8.5494 | 93.8561 |
| 1.195 | 2.6989 | 25000 | 0.8726 | 35.7564 | 35.8621 | 3.12 | 77.9879 | 26.1613 | 3.1202 | 78.2202 | 26.246 | 8.596 | 93.9884 |
| 1.2045 | 2.8069 | 26000 | 0.8702 | 35.579 | 35.6877 | 3.0601 | 77.8844 | 25.7926 | 3.0603 | 78.1198 | 25.883 | 8.6031 | 94.1975 |
| 1.1882 | 2.9148 | 27000 | 0.8659 | 35.1077 | 35.2157 | 2.9039 | 78.3907 | 24.0286 | 2.9041 | 78.6437 | 24.0993 | 8.7458 | 94.7959 |
| 1.1946 | 3.0228 | 28000 | 0.8895 | 36.2244 | 36.3277 | 3.4612 | 77.5469 | 27.6651 | 3.4613 | 77.7619 | 27.7599 | 8.5522 | 93.2346 |
| 1.1346 | 3.1307 | 29000 | 0.8777 | 36.4978 | 36.5959 | 3.6593 | 77.8239 | 28.0103 | 3.6594 | 78.0366 | 28.0917 | 8.5201 | 93.3092 |
| 1.1325 | 3.2387 | 30000 | 0.8841 | 35.8627 | 35.9653 | 3.3274 | 77.7601 | 26.5006 | 3.3275 | 77.9843 | 26.5841 | 8.5529 | 93.4478 |
| 1.1345 | 3.3466 | 31000 | 0.8774 | 36.0619 | 36.159 | 3.4726 | 77.8183 | 26.8947 | 3.4728 | 78.0361 | 26.9679 | 8.5371 | 93.6775 |
| 1.1456 | 3.4546 | 32000 | 0.8806 | 36.154 | 36.258 | 3.3219 | 77.7428 | 27.3972 | 3.3221 | 77.9628 | 27.4892 | 8.5411 | 93.3262 |
| 1.1239 | 3.5626 | 33000 | 0.8762 | 36.2594 | 36.3554 | 3.5953 | 77.9167 | 27.2661 | 3.5955 | 78.1329 | 27.3379 | 8.5613 | 93.631 |
| 1.1227 | 3.6705 | 34000 | 0.8809 | 35.6 | 35.7039 | 3.1605 | 78.0969 | 25.5426 | 3.1606 | 78.3333 | 25.6179 | 8.687 | 94.0505 |
| 1.1413 | 3.7785 | 35000 | 0.8794 | 36.6955 | 36.7995 | 3.708 | 77.1888 | 29.1897 | 3.7082 | 77.3896 | 29.3007 | 8.4227 | 92.7837 |
| 1.1423 | 3.8864 | 36000 | 0.8956 | 36.1318 | 36.2347 | 3.3576 | 77.581 | 27.4568 | 3.3578 | 77.7975 | 27.5488 | 8.5717 | 93.1018 |
| 1.1339 | 3.9944 | 37000 | 0.8829 | 36.4258 | 36.5211 | 3.6958 | 77.4683 | 28.1132 | 3.696 | 77.6717 | 28.1957 | 8.5213 | 92.9871 |
| 1.0747 | 4.1023 | 38000 | 0.9051 | 37.0457 | 37.1345 | 4.1443 | 77.5177 | 29.475 | 4.1443 | 77.7084 | 29.5508 | 8.4619 | 92.6286 |
| 1.0711 | 4.2103 | 39000 | 0.8930 | 36.2805 | 36.3798 | 3.5565 | 78.0056 | 27.2794 | 3.5565 | 78.2265 | 27.3563 | 8.611 | 93.7963 |
| 1.0785 | 4.3183 | 40000 | 0.8929 | 36.618 | 36.7162 | 3.6923 | 77.392 | 28.7699 | 3.6924 | 77.5932 | 28.8631 | 8.4574 | 92.7116 |
| 1.0818 | 4.4262 | 41000 | 0.8896 | 36.8682 | 36.9674 | 3.8285 | 77.4044 | 29.3716 | 3.8286 | 77.6034 | 29.4702 | 8.4212 | 92.9933 |
| 1.0811 | 4.5342 | 42000 | 0.9119 | 36.7358 | 36.8258 | 3.888 | 77.669 | 28.6503 | 3.8881 | 77.8679 | 28.7214 | 8.484 | 92.7557 |
| 1.0872 | 4.6421 | 43000 | 0.8924 | 36.3606 | 36.46 | 3.5563 | 77.7412 | 27.7843 | 3.5564 | 77.9549 | 27.8688 | 8.512 | 93.3203 |
| 1.0968 | 4.7501 | 44000 | 0.8921 | 36.4742 | 36.5676 | 3.735 | 77.7654 | 27.9221 | 3.7352 | 77.9725 | 27.995 | 8.5476 | 93.3622 |
| 1.0835 | 4.8580 | 45000 | 0.9040 | 36.7224 | 36.8112 | 4.053 | 77.4397 | 28.6746 | 4.0531 | 77.6335 | 28.747 | 8.5208 | 92.7691 |
| 1.0895 | 4.9660 | 46000 | 0.8906 | 36.706 | 36.7989 | 3.9402 | 77.6053 | 28.5724 | 3.9404 | 77.8063 | 28.6501 | 8.5282 | 93.1717 |
| 1.0533 | 5.0740 | 47000 | 0.9085 | 36.4749 | 36.5574 | 3.8835 | 78.1434 | 27.3978 | 3.8837 | 78.3486 | 27.4398 | 8.6253 | 93.4822 |
| 1.027 | 5.1819 | 48000 | 0.9055 | 36.7132 | 36.8084 | 3.7814 | 77.1845 | 29.1737 | 3.7815 | 77.3773 | 29.2664 | 8.4275 | 92.3575 |
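
The evaluation results at the top of this card coincide with the step 38000 row, which has the highest Sari in the table, while the lowest validation loss appears much earlier, at step 13000. The sketch below illustrates that kind of checkpoint selection on a handful of rows copied from the table. Whether the author actually selected the checkpoint this way is an assumption, and the `EvalRow` type is invented for the example.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    sari: float


# A subset of rows copied from the training results table above.
rows = [
    EvalRow(1000, 0.9401, 31.9299),
    EvalRow(13000, 0.8538, 34.3125),
    EvalRow(29000, 0.8777, 36.4978),
    EvalRow(38000, 0.9051, 37.0457),
    EvalRow(41000, 0.8896, 36.8682),
    EvalRow(48000, 0.9055, 36.7132),
]

# Higher Sari is better; lower validation loss is better.
best_by_sari = max(rows, key=lambda row: row.sari)
best_by_loss = min(rows, key=lambda row: row.validation_loss)

print("best Sari:", best_by_sari.step, best_by_sari.sari)
print("lowest validation loss:", best_by_loss.step, best_by_loss.validation_loss)
```

The two criteria pick different checkpoints here, which is worth keeping in mind when comparing this model against others by loss alone.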

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.1

Model size: 0.4B params (Safetensors, tensor type F32)