---
license: apache-2.0
language:
- en
base_model:
- facebook/bart-large
---

# legal-bart-summarizer

This is a fine-tuned version of `facebook/bart-large`, trained specifically to summarize long legal documents using the LegalSum dataset. It’s designed to take dense legal texts and produce clear, concise summaries, which is especially useful for making legal content more digestible.

## About the Model

The base model is `facebook/bart-large`, an encoder-decoder architecture that works well for sequence-to-sequence tasks like summarization. I fine-tuned it on the full LegalSum dataset, which contains legal documents paired with human-written extractive summaries.

The model handles input sequences up to 1024 tokens and generates summaries capped at 512 tokens. It was trained over 5 epochs using a batch size of 4 and a learning rate of 3e-5, with mixed precision (fp16) to speed up training and reduce memory use.

## How to Use

Here’s a quick example using 🤗 Transformers:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

# Load the fine-tuned tokenizer and model from the Hugging Face Hub
tokenizer = BartTokenizer.from_pretrained("whyredfire/legal-bart-summarizer")
model = BartForConditionalGeneration.from_pretrained("whyredfire/legal-bart-summarizer")

text = "Insert your legal document here..."

# Tokenize, truncating anything beyond the 1024-token input limit
inputs = tokenizer([text], max_length=1024, truncation=True, return_tensors="pt")

# Generate a summary of at most 512 tokens using beam search
summary_ids = model.generate(inputs["input_ids"], max_length=512, num_beams=4, early_stopping=True)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

## Training Details

- **Base model**: `facebook/bart-large`
- **Epochs**: ~5
- **Batch size**: 4
- **Max input length**: 1024
- **Max summary length**: 512
- **Learning rate**: 3e-5
- **Gradient accumulation**: 4 steps
- **Warmup ratio**: 10%
- **Weight decay**: 0.01
- **Mixed precision**: Enabled (fp16)
- **Seed**: 42

## Evaluation Results

- **Test loss**: 0.98
- **Test runtime**: ~19.5 seconds
- **Samples/sec**: ~55.7
- **Steps/sec**: ~13.96

Evaluation was run with Hugging Face’s `Trainer`, with ROUGE and similar metrics under the hood.

## Use Cases

This model is a good fit if you're working with:

- Legal research tools
- Brief generation for lawyers and law students
- Making court rulings and legal judgments more readable

## Limitations

While the model does a decent job on civil law-style documents, it's not bulletproof. It might struggle with out-of-domain legal texts like criminal or tax law and, like most generative models, it can sometimes hallucinate or omit important legal details. Always double-check the output before relying on it for anything serious.

## License

Please make sure your use complies with the licensing terms of both the base model (`facebook/bart-large`) and the LegalSum dataset. This model is shared for research and experimentation purposes.

## Reference

If you’re looking for more context on the dataset and task setup, check out the paper:

> **CivilSum: A Dataset for Abstractive Summarization of Indian Court Decisions**
> [https://dl.acm.org/doi/pdf/10.1145/3626772.3657859](https://dl.acm.org/doi/pdf/10.1145/3626772.3657859)
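
## Worked Example: Effective Batch Size and Warmup

The Training Details list a batch size of 4, gradient accumulation over 4 steps, and a 10% warmup ratio. The sketch below shows roughly what those numbers imply for the optimizer schedule. It rests on two assumptions that this card does not state: training on a single device, and a hypothetical training-set size of 16,000 examples (picked only to give round numbers, not the real dataset size). The function name `schedule_summary` is illustrative, and the arithmetic is a simplification of what a training loop computes.

```python
import math


def schedule_summary(num_examples, per_device_batch=4, grad_accum=4,
                     epochs=5, warmup_ratio=0.10, num_devices=1):
    """Approximate optimizer-step counts implied by the listed hyperparameters.

    num_examples and num_devices are assumptions, not values from the card.
    """
    # Examples consumed per optimizer update
    effective_batch = per_device_batch * grad_accum * num_devices
    # Optimizer updates per pass over the data (last partial batch rounds up)
    steps_per_epoch = math.ceil(num_examples / effective_batch)
    total_steps = steps_per_epoch * epochs
    # Number of updates spent ramping the learning rate up to its peak
    warmup_steps = round(total_steps * warmup_ratio)
    return {
        "effective_batch_size": effective_batch,
        "steps_per_epoch": steps_per_epoch,
        "total_steps": total_steps,
        "warmup_steps": warmup_steps,
    }


if __name__ == "__main__":
    # 16,000 is a hypothetical dataset size used purely for illustration
    print(schedule_summary(16_000))
```

Under these assumptions each optimizer update sees 16 examples, so the learning rate of 3e-5 applies to an effective batch of 16 rather than 4. Keep that in mind if you change the batch size or accumulation steps when reproducing the setup.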