metadata
language: en
license: apache-2.0
tags:
  - grammar-correction
  - gec
  - english
  - flan-t5
  - coedit
datasets:
  - grammarly/coedit
base_model: google/flan-t5-small
pipeline_tag: text-generation

FlanT5-Small Grammar Correction

google/flan-t5-small fine-tuned on the GEC subset of the grammarly/coedit dataset for English grammatical error correction (GEC).

Training Details

  • Base model: google/flan-t5-small (77M params)
  • Dataset: grammarly/coedit (GEC subset, 2,000 training examples)
  • Training recipe: based on the CoEdIT paper (EMNLP 2023)
  • Epochs: 3
  • Learning rate: 3e-4
  • Final training loss: 0.27
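
The card does not show how the 2,000-example GEC subset was selected. The sketch below is one plausible way to do it, not the author's actual preprocessing. It assumes each record is a dict with a `task` field equal to `"gec"` for grammar-correction examples, plus `src` and `tgt` text fields; those field names, the helper `select_gec_subset`, and the seed are assumptions made for illustration.

```python
import random


def select_gec_subset(records, n_examples=2000, seed=0):
    """Keep only GEC records and return a reproducible sample of them.

    Assumption: every record is a dict whose "task" field is "gec"
    for grammar-correction examples (hypothetical schema for this sketch).
    """
    gec = [r for r in records if r.get("task") == "gec"]
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    rng.shuffle(gec)
    return gec[:n_examples]


# Tiny in-memory stand-in for the real dataset.
sample = [
    {"task": "gec", "src": "Fix the grammar: She go home.", "tgt": "She goes home."},
    {"task": "paraphrase", "src": "Paraphrase this: Hello there.", "tgt": "Hi there."},
    {"task": "gec", "src": "Fix the grammar: He have a dog.", "tgt": "He has a dog."},
]

subset = select_gec_subset(sample, n_examples=2)
print(len(subset), {r["task"] for r in subset})
```

With the real dataset, the same function could be applied to a list of rows loaded from grammarly/coedit, provided the field names match.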

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model_id = "xhimanshuz/flan-t5-small-grammar-correction"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# The input must start with one of the supported instruction prefixes.
text = "Fix the grammar: I goes to school yesterday and learn many thing."
# Inputs longer than 128 tokens are truncated.
inputs = tokenizer(text, return_tensors="pt", max_length=128, truncation=True)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Output: I went to school yesterday and learned many things.

Supported Instructions

Prefix the text to correct with one of the following CoEdIT-format instructions:

  • "Fix the grammar: <text>"
  • "Fix grammatical errors in this sentence: <text>"
  • "Improve the grammaticality: <text>"
  • "Remove all grammatical errors from this text: <text>"
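
A small helper can keep prompts consistent with the prefixes listed above. This is a minimal sketch; `INSTRUCTION_PREFIXES` and `build_prompt` are hypothetical names introduced here, not part of the model or of the transformers library.

```python
# Instruction prefixes copied from the list above.
INSTRUCTION_PREFIXES = (
    "Fix the grammar",
    "Fix grammatical errors in this sentence",
    "Improve the grammaticality",
    "Remove all grammatical errors from this text",
)


def build_prompt(text, prefix=INSTRUCTION_PREFIXES[0]):
    """Return '<prefix>: <text>', rejecting prefixes the card does not list."""
    if prefix not in INSTRUCTION_PREFIXES:
        raise ValueError(f"Unsupported instruction prefix: {prefix!r}")
    return f"{prefix}: {text.strip()}"


prompt = build_prompt("She don't know what are she doing.")
print(prompt)
```

The returned string can be passed to the tokenizer in place of the hand-written `text` in the usage example.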

Example Results

| Input | Output |
|---|---|
| I goes to school yesterday and learn many thing. | I went to school yesterday and learned many things. |
| She don't know what are she doing. | She doesn't know what she is doing. |
| The informations was very helpfull for our researchs. | The information was very helpful for our research. |
| He have went to the market and buyed some apple. | He has gone to the market and bought some apple. |
| The childs was playing in park when it start raining. | The children were playing in the park when it started raining. |
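
To see which words the model changed in pairs like the ones above, a word-level diff is often enough. The sketch below uses Python's standard `difflib`; the helper name `word_edits` is hypothetical, and splitting on whitespace is a simplifying assumption (punctuation stays attached to words).

```python
import difflib


def word_edits(source, corrected):
    """List (operation, source words, corrected words) for each changed span."""
    src, tgt = source.split(), corrected.split()
    matcher = difflib.SequenceMatcher(a=src, b=tgt, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only replace / insert / delete spans
            edits.append((op, " ".join(src[i1:i2]), " ".join(tgt[j1:j2])))
    return edits


for edit in word_edits("She go home", "She goes home"):
    print(edit)
```

Running the same function over an input/output pair from the table gives a quick view of which errors were corrected and which words were left unchanged.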

Training Loss Curve

| Step | Loss | Epoch |
|---|---|---|
| 1 | 0.669 | 0.00 |
| 100 | 0.484 | 0.40 |
| 250 | 0.448 | 1.00 |
| 500 | 0.325 | 2.00 |
| 750 | 0.292 | 3.00 |
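
A few quantities can be read off the logged values with simple arithmetic. The sketch below derives the steps per epoch from the last row and, assuming every one of the 2,000 training examples is seen exactly once per epoch, the implied effective batch size. The card does not state the batch size, so that figure is an inference rather than a reported setting.

```python
# (step, loss, epoch) rows copied from the training log above.
loss_log = [
    (1, 0.669, 0.00),
    (100, 0.484, 0.40),
    (250, 0.448, 1.00),
    (500, 0.325, 2.00),
    (750, 0.292, 3.00),
]
TRAIN_EXAMPLES = 2000  # from the Training Details section

last_step, last_loss, last_epoch = loss_log[-1]
first_loss = loss_log[0][1]

# Optimizer steps per epoch, derived from the final logged row.
steps_per_epoch = round(last_step / last_epoch)

# Inferred, not reported: examples consumed per optimizer step.
effective_batch = TRAIN_EXAMPLES // steps_per_epoch

# Relative drop in logged loss between the first and last rows.
reduction = (first_loss - last_loss) / first_loss

print(steps_per_epoch, effective_batch, f"{reduction:.1%}")
```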

Scaling Up

This model was trained on CPU on a 2,000-example subset as a demonstration. To improve performance:

  1. More data: Train on the full 19K GEC examples from grammarly/coedit, or all 69K examples (including simplification, paraphrasing, etc.)
  2. Larger model: Use google/flan-t5-base (250M) or google/flan-t5-large (770M)
  3. GPU training: Use A10G or A100 GPUs for faster training with larger batch sizes
  4. More epochs: Train for 5 epochs with early stopping (CoEdIT paper recipe)

Citation

@inproceedings{raheja2023coedit,
  title={CoEdIT: Text Editing by Task-Specific Instruction Tuning},
  author={Raheja, Vipul and Kumar, Dhruv and Koo, Ryan and Kang, Dongyeop},
  booktitle={EMNLP 2023},
  year={2023}
}