language: en
license: apache-2.0
tags:
  - grammar-correction
  - gec
  - english
  - flan-t5
  - coedit
datasets:
  - grammarly/coedit
base_model: google/flan-t5-small
pipeline_tag: text-generation

FlanT5-Small Grammar Correction

A google/flan-t5-small model fine-tuned on the GEC subset of the grammarly/coedit dataset for English grammatical error correction (GEC).

Training Details

Base model: google/flan-t5-small (77M params)
Dataset: grammarly/coedit (GEC subset, 2000 training examples)
Training recipe: Based on the CoEdIT paper (Raheja et al., EMNLP 2023)
Epochs: 3
Learning rate: 3e-4
Final training loss: 0.27

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("xhimanshuz/flan-t5-small-grammar-correction")
model = AutoModelForSeq2SeqLM.from_pretrained("xhimanshuz/flan-t5-small-grammar-correction")

# The instruction prefix ("Fix the grammar: ") tells the model which edit to perform
text = "Fix the grammar: I goes to school yesterday and learn many thing."
# Inputs longer than 128 tokens are truncated
inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Output: I went to school yesterday and learned many things.

Supported Instructions

Prepend one of the following CoEdIT-style instruction prefixes to the input text:

"Fix the grammar: <text>"
"Fix grammatical errors in this sentence: <text>"
"Improve the grammaticality: <text>"
"Remove all grammatical errors from this text: <text>"
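
The prefix convention above can be wrapped in a small helper so that every input is formatted the same way. This is a minimal sketch: `GEC_INSTRUCTIONS` and `build_prompt` are hypothetical names introduced here for illustration and are not part of the model or of `transformers`.

```python
# Instruction prefixes listed in this model card (CoEdIT format).
GEC_INSTRUCTIONS = (
    "Fix the grammar",
    "Fix grammatical errors in this sentence",
    "Improve the grammaticality",
    "Remove all grammatical errors from this text",
)


def build_prompt(text: str, instruction: str = GEC_INSTRUCTIONS[0]) -> str:
    """Return '<instruction>: <text>' for one of the listed prefixes.

    Hypothetical helper: it only formats the string; it does not call the model.
    """
    if instruction not in GEC_INSTRUCTIONS:
        raise ValueError(f"Unsupported instruction: {instruction!r}")
    return f"{instruction}: {text.strip()}"


print(build_prompt("I goes to school yesterday and learn many thing."))
```

The returned string can be passed to the tokenizer in place of `text` in the Usage snippet.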

Example Results

Input	Output
I goes to school yesterday and learn many thing.	I went to school yesterday and learned many things.
She don't know what are she doing.	She doesn't know what she is doing.
The informations was very helpfull for our researchs.	The information was very helpful for our research.
He have went to the market and buyed some apple.	He has gone to the market and bought some apple.
The childs was playing in park when it start raining.	The children were playing in the park when it started raining.
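
To see which words the model changed in a given pair, a word-level diff can help. The sketch below uses Python's standard `difflib` module; `word_edits` is a hypothetical helper name, and splitting on whitespace is a simplifying assumption (punctuation stays attached to words).

```python
import difflib


def word_edits(source: str, corrected: str) -> list[tuple[str, str, str]]:
    """Return (operation, original words, corrected words) for each changed span."""
    src_words = source.split()
    out_words = corrected.split()
    matcher = difflib.SequenceMatcher(a=src_words, b=out_words, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only replace / delete / insert spans
            edits.append((op, " ".join(src_words[i1:i2]), " ".join(out_words[j1:j2])))
    return edits


# Second row of the table above
for edit in word_edits(
    "She don't know what are she doing.",
    "She doesn't know what she is doing.",
):
    print(edit)
```

An empty list means the output is identical to the input at the word level.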

Training Loss Curve

Step	Loss	Epoch
1	0.669	0.00
100	0.484	0.40
250	0.448	1.00
500	0.325	2.00
750	0.292	3.00
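
The table also implies a few quantities that the card does not state directly. The arithmetic below is a sketch under the assumption that each logged step is one optimizer step, so the batch size it prints is an inference (the effective batch size across devices and any gradient accumulation), not a documented setting.

```python
TRAIN_EXAMPLES = 2000  # from Training Details

# (step, loss, epoch) rows copied from the table above
LOSS_LOG = [
    (1, 0.669, 0.00),
    (100, 0.484, 0.40),
    (250, 0.448, 1.00),
    (500, 0.325, 2.00),
    (750, 0.292, 3.00),
]

# Derive steps per epoch from the last logged row: 750 steps / 3 epochs
last_step, last_loss, last_epoch = LOSS_LOG[-1]
steps_per_epoch = last_step / last_epoch

# Inferred, not stated in the card: 2000 examples / 250 steps per epoch
implied_batch_size = TRAIN_EXAMPLES / steps_per_epoch

# Fraction by which the logged loss fell between the first and last rows
first_loss = LOSS_LOG[0][1]
relative_drop = 1 - last_loss / first_loss

print(f"steps per epoch: {steps_per_epoch:.0f}")
print(f"implied effective batch size: {implied_batch_size:.0f}")
print(f"relative loss reduction: {relative_drop:.0%}")
```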

Scaling Up

This model was trained on a 2000-example subset on CPU as a demonstration. For better performance:

More data: Train on the full 19K GEC examples from grammarly/coedit, or all 69K examples (including simplification, paraphrasing, etc.)
Larger model: Use google/flan-t5-base (250M) or google/flan-t5-large (780M)
GPU training: Use A10G or A100 GPUs for faster training with larger batch sizes
More epochs: Train for 5 epochs with early stopping (CoEdIT paper recipe)
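
For the "more data" option, one way to select the full GEC portion of the dataset might look like the sketch below. It assumes the `datasets` library is installed and that grammarly/coedit exposes a `task` column whose GEC rows are labeled `"gec"`; check both against the dataset viewer before relying on them. `is_gec_example` and `load_full_gec_split` are hypothetical helper names.

```python
def is_gec_example(example: dict) -> bool:
    """True if a row belongs to the GEC task (assumes a `task` column labeled "gec")."""
    return example["task"] == "gec"


def load_full_gec_split(split: str = "train"):
    """Load grammarly/coedit and keep only GEC rows.

    Requires the `datasets` package and network access; not called at import time.
    """
    from datasets import load_dataset  # lazy import so is_gec_example works without it

    dataset = load_dataset("grammarly/coedit", split=split)
    return dataset.filter(is_gec_example)


# Example call (downloads the dataset):
# gec_train = load_full_gec_split()
# print(len(gec_train))
```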

Citation

@inproceedings{raheja2023coedit,
  title={CoEdIT: Text Editing by Task-Specific Instruction Tuning},
  author={Raheja, Vipul and Kumar, Dhruv and Koo, Ryan and Kang, Dongyeop},
  booktitle={EMNLP 2023},
  year={2023}
}