---
license: mit
language: en
datasets:
- SetFit/amazon_reviews_multi_en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- sentiment-analysis
- roberta
- multi-class-classification
---
# RoBERTa Fine-tuned on Amazon Reviews (5-Star Rating)
## Model Description
This model is a fine-tuned version of `roberta-base` for 5-class sentiment classification, predicting star ratings (1-5) from Amazon product reviews.
## Comparison with DistilBERT
This model was trained as part of a model comparison study:
| Model | Parameters | Accuracy | Off-by-one Accuracy | Inference Speed |
|-------|------------|----------|---------------------|-----------------|
| DistilBERT | 67M | 54.95% | 92.45% | 1.83x faster |
| **RoBERTa** | **125M** | **59.90%** | **95.10%** | Baseline |

RoBERTa scores 4.95 percentage points higher on exact accuracy (59.90% vs. 54.95%) and 2.65 points higher on off-by-one accuracy, while DistilBERT runs inference 1.83x faster.
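
The two accuracy columns can be illustrated with a small sketch. This card does not define "off-by-one accuracy", so the code below assumes the common reading: a prediction counts as correct when it lands within one star of the true rating. The function names and the toy labels are illustrative only and are not taken from the evaluation code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true star rating exactly."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def off_by_one_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions within one star of the true rating.

    Assumed definition; the model card does not spell it out.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy star ratings (1-5), made up for illustration
y_true = [1, 2, 3, 4, 5, 5]
y_pred = [1, 3, 3, 2, 5, 4]

print(accuracy(y_true, y_pred))             # 3 of 6 exact matches -> 0.5
print(off_by_one_accuracy(y_true, y_pred))  # 5 of 6 within one star
```

Under this definition, off-by-one accuracy is never lower than exact accuracy, which is consistent with the gap between the two columns in the table.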
## Training Data
- **Dataset**: SetFit/amazon_reviews_multi_en
- **Train samples**: 20,000 (subset)
- **Test samples**: 2,000 (subset)
- **Classes**: 1 star, 2 stars, 3 stars, 4 stars, 5 stars
## Training Procedure
- **Base model**: roberta-base
- **Epochs**: 3
- **Batch size**: 16
- **Learning rate**: 2e-5
- **Max sequence length**: 256
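
As a rough sanity check on the scale of this run, the hyperparameters above can be collected into a small config object and used to estimate the number of optimizer steps. This is only a sketch: `FineTuneConfig` is a hypothetical helper, not part of the training code, and the step count assumes a single device, no gradient accumulation, and that a final partial batch is kept.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters listed in this card (hypothetical container)."""

    base_model: str = "roberta-base"
    epochs: int = 3
    batch_size: int = 16
    learning_rate: float = 2e-5
    max_seq_length: int = 256
    train_samples: int = 20_000  # from the Training Data section

    def steps_per_epoch(self) -> int:
        # Assumes one device, no gradient accumulation, last partial batch kept
        return math.ceil(self.train_samples / self.batch_size)

    def total_steps(self) -> int:
        return self.steps_per_epoch() * self.epochs


config = FineTuneConfig()
print(config.steps_per_epoch())  # 20,000 / 16 = 1250
print(config.total_steps())      # 1250 * 3 = 3750
```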
## Usage
```python
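from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub
classifier = pipeline("text-classification", model="Nav772/roberta-amazon-reviews-5star")

# Returns a list with one dict per input, each holding a "label" and a "score"
result = classifier("This product exceeded my expectations! Great quality.")
print(result)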
```
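
The pipeline returns label strings rather than integers, and this card does not document what those strings look like for this checkpoint. The sketch below shows one way to turn a result into a star rating under two stated assumptions: either the labels follow the `transformers` default naming `LABEL_0` to `LABEL_4` with index 0 meaning 1 star, or they contain the star count as a digit (for example `4 stars`). `label_to_stars` and `top_prediction` are hypothetical helpers; check the checkpoint's `id2label` config before relying on this mapping.

```python
import re
from typing import Dict, List, Tuple, Union

Prediction = Dict[str, Union[str, float]]


def label_to_stars(label: str) -> int:
    """Map a pipeline label string to a 1-5 star rating (assumed label formats)."""
    match = re.fullmatch(r"LABEL_(\d+)", label)
    if match:
        # Assumption: class index 0 corresponds to 1 star
        stars = int(match.group(1)) + 1
    else:
        digits = re.search(r"\d+", label)
        if digits is None:
            raise ValueError(f"cannot parse a star rating from {label!r}")
        stars = int(digits.group())
    if not 1 <= stars <= 5:
        raise ValueError(f"star rating out of range for label {label!r}")
    return stars


def top_prediction(result: List[Prediction]) -> Tuple[int, float]:
    """Return (stars, score) for the highest-scoring entry in a pipeline result."""
    best = max(result, key=lambda item: item["score"])
    return label_to_stars(str(best["label"])), float(best["score"])


# Hand-written stand-in for classifier(...) output, not real model output
example = [{"label": "LABEL_4", "score": 0.91}]
print(top_prediction(example))  # (5, 0.91)
```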
## When to Use This Model
- Choose **RoBERTa** when accuracy is the priority and latency is less critical
- Choose **DistilBERT** when you need faster inference or have resource constraints
## Demo
Try the model comparison demo: [sentiment-model-comparison](https://huggingface.co/spaces/Nav772/sentiment-model-comparison)
## Limitations
- Trained on Amazon product reviews; may not generalize to other review domains
- Adjacent star ratings (e.g., 2 vs 3 stars) are inherently difficult to distinguish
- English language only