learn3r/bart_large_gov

	---
	license: apache-2.0
	base_model: facebook/bart-large
	tags:
	- generated_from_trainer
	datasets:
	- learn3r/gov_report_memsum_oracle
	metrics:
	- rouge
	model-index:
	- name: bart_large_gov
	  results:
	  - task:
	      name: Summarization
	      type: summarization
	    dataset:
	      name: learn3r/gov_report_memsum_oracle
	      type: learn3r/gov_report_memsum_oracle
	    metrics:
	    - name: Rouge1
	      type: rouge
	      value: 71.9948
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bart_large_gov

	This model is a fine-tuned version of [facebook/bart-large](https://huggingface.co/facebook/bart-large) on the learn3r/gov_report_memsum_oracle dataset.
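	It achieves the following results on the evaluation set (the values match the step-1227 checkpoint, epoch 8.99, which has the lowest validation loss in the training results table):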
	- Loss: 1.4266
	- Rouge1: 71.9948
	- Rouge2: 41.0084
	- Rougel: 38.0938
	- Rougelsum: 69.4488
	- Gen Len: 751.0288
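
A minimal usage sketch, assuming the checkpoint is published on the Hub under the repository id `learn3r/bart_large_gov` and that `transformers` and PyTorch are installed. The 1024-token input cap reflects the position limit of `facebook/bart-large`. The `summarize` helper, the beam count, and the `max_new_tokens` default are illustrative assumptions, not settings taken from the training run.

```python
MODEL_ID = "learn3r/bart_large_gov"  # assumed Hub repository id
MAX_INPUT_TOKENS = 1024              # position limit of facebook/bart-large


def summarize(text, max_new_tokens=800, num_beams=4):
    """Summarize `text` with the fine-tuned checkpoint.

    Inputs longer than MAX_INPUT_TOKENS are truncated, so long reports
    should be shortened (for example by extracting salient sentences)
    before calling this function.
    """
    # Imported lazily so the module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,  # assumption: above the ~751-token average Gen Len
        num_beams=num_beams,            # assumption: not stated in this card
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first use.
    print(summarize("Replace this string with the report text to summarize."))
```

The average generated length reported above is about 751 tokens, so a small `max_new_tokens` would likely cut summaries short.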

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 128
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20.0
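
The arithmetic behind these settings can be sketched as follows. The sketch assumes a single device (consistent with 8 x 16 = 128), no warmup steps (none are listed), and 2720 total optimizer steps (the last step in the training results table). The helper names and the `TRAINING_KWARGS` dictionary are illustrative; the keys follow `transformers.TrainingArguments` parameter names, but the mapping is an assumption about how the run was configured.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 16
TOTAL_STEPS = 2720  # last step logged in the training results table


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under a linear schedule that decays to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical mapping of the listed values onto TrainingArguments keyword names.
TRAINING_KWARGS = {
    "learning_rate": LEARNING_RATE,
    "per_device_train_batch_size": TRAIN_BATCH_SIZE,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "gradient_accumulation_steps": GRAD_ACCUM_STEPS,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 20.0,
}

if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))  # 128
    for step in (0, 1360, 2720):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 5e-05, reaches half that value at step 1360, and hits zero at step 2720.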

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len  |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:--------:|
	| 1.7352        | 1.0   | 136  | 1.5224          | 72.0472 | 41.3267 | 36.4817 | 69.4011   | 685.9300 |
	| 1.6874        | 1.99  | 272  | 1.4779          | 71.7737 | 40.8546 | 36.8472 | 69.2034   | 699.4866 |
	| 1.5695        | 3.0   | 409  | 1.4583          | 72.2243 | 41.372  | 37.8382 | 69.6295   | 695.0977 |
	| 1.4951        | 3.99  | 545  | 1.4495          | 71.5808 | 40.5556 | 37.152  | 69.0536   | 753.5967 |
	| 1.496         | 5.0   | 682  | 1.4386          | 72.1271 | 41.1645 | 38.4096 | 69.6176   | 700.2160 |
	| 1.4258        | 6.0   | 818  | 1.4374          | 71.9975 | 41.0013 | 37.9947 | 69.449    | 743.7068 |
	| 1.4301        | 7.0   | 955  | 1.4296          | 71.8896 | 40.8303 | 38.346  | 69.357    | 724.5062 |
	| 1.4015        | 8.0   | 1091 | 1.4313          | 72.0031 | 40.9229 | 38.2581 | 69.4154   | 731.2685 |
	| 1.391         | 8.99  | 1227 | 1.4266          | 71.9948 | 41.0084 | 38.0938 | 69.4488   | 751.0288 |
	| 1.3642        | 10.0  | 1364 | 1.4287          | 71.9115 | 40.8683 | 38.1602 | 69.3514   | 756.9568 |
	| 1.3516        | 10.99 | 1500 | 1.4289          | 72.3822 | 41.5074 | 38.8088 | 69.8232   | 719.2798 |
	| 1.3243        | 12.0  | 1637 | 1.4301          | 71.83   | 40.764  | 38.1124 | 69.2767   | 749.9475 |
	| 1.3582        | 12.99 | 1773 | 1.4283          | 71.9495 | 40.9556 | 38.4201 | 69.4394   | 736.6698 |
	| 1.3149        | 14.0  | 1910 | 1.4298          | 71.9599 | 40.8875 | 38.2722 | 69.4209   | 753.3230 |
	| 1.288         | 15.0  | 2046 | 1.4326          | 72.1615 | 41.1549 | 38.611  | 69.5977   | 744.8858 |
	| 1.2937        | 16.0  | 2183 | 1.4315          | 71.9783 | 40.9073 | 38.4263 | 69.4109   | 755.5340 |
	| 1.258         | 17.0  | 2319 | 1.4328          | 72.0298 | 40.931  | 38.4845 | 69.4823   | 734.6399 |
	| 1.2617        | 17.99 | 2455 | 1.4336          | 71.9488 | 40.8816 | 38.4521 | 69.4151   | 744.7068 |
	| 1.2864        | 19.0  | 2592 | 1.4346          | 72.1334 | 40.9965 | 38.5682 | 69.5666   | 744.2449 |
	| 1.2936        | 19.94 | 2720 | 1.4351          | 72.0397 | 40.9431 | 38.4161 | 69.5028   | 744.4588 |
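
To make the table easier to read, the short script below picks out the checkpoint with the lowest validation loss and the one with the highest Rouge1. The tuples are copied from the Step, Validation Loss, and Rouge1 columns above; the variable and function names are illustrative.

```python
# (step, validation_loss, rouge1) copied from the training results table.
ROWS = [
    (136, 1.5224, 72.0472),
    (272, 1.4779, 71.7737),
    (409, 1.4583, 72.2243),
    (545, 1.4495, 71.5808),
    (682, 1.4386, 72.1271),
    (818, 1.4374, 71.9975),
    (955, 1.4296, 71.8896),
    (1091, 1.4313, 72.0031),
    (1227, 1.4266, 71.9948),
    (1364, 1.4287, 71.9115),
    (1500, 1.4289, 72.3822),
    (1637, 1.4301, 71.83),
    (1773, 1.4283, 71.9495),
    (1910, 1.4298, 71.9599),
    (2046, 1.4326, 72.1615),
    (2183, 1.4315, 71.9783),
    (2319, 1.4328, 72.0298),
    (2455, 1.4336, 71.9488),
    (2592, 1.4346, 72.1334),
    (2720, 1.4351, 72.0397),
]


def best_by_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def best_by_rouge1(rows):
    """Return the row with the highest Rouge1 score."""
    return max(rows, key=lambda row: row[2])


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))   # step 1227
    print("highest Rouge1:", best_by_rouge1(ROWS))         # step 1500
```

The lowest validation loss (1.4266) occurs at step 1227, which matches the headline evaluation numbers in this card, while the highest Rouge1 (72.3822) occurs at step 1500. Validation loss drifts upward slightly after step 1227 while training loss keeps falling.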


	### Framework versions

	- Transformers 4.37.0.dev0
	- PyTorch 2.0.1+cu117
	- Datasets 2.14.5
	- Tokenizers 0.15.0