---
license: mit
language: en
datasets:
- SetFit/amazon_reviews_multi_en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- sentiment-analysis
- roberta
- multi-class-classification
---
# RoBERTa Fine-tuned on Amazon Reviews (5-Star Rating)
## Model Description
This model is a fine-tuned version of `roberta-base` for 5-class sentiment classification, predicting star ratings (1-5) from Amazon product reviews.
## Comparison with DistilBERT
This model was trained as part of a model comparison study:
| Model | Parameters | Accuracy | Off-by-one Accuracy | Inference Speed |
|-------|------------|----------|---------------------|-----------------|
| DistilBERT | 67M | 54.95% | 92.45% | 1.83x faster |
| **RoBERTa** | **125M** | **59.90%** | **95.10%** | Baseline |
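RoBERTa scores 4.95 percentage points higher on exact accuracy (59.90% vs. 54.95%) and 2.65 points higher on off-by-one accuracy (95.10% vs. 92.45%), at the cost of slower inference: DistilBERT runs 1.83x faster.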
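This card does not define "off-by-one accuracy". The sketch below assumes the common reading: a prediction counts as correct when it is within one star of the true rating. The function names and the toy ratings are illustrative only and are not taken from the study's evaluation code.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true star rating exactly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def off_by_one_accuracy(
    y_true: Sequence[int], y_pred: Sequence[int], tolerance: int = 1
) -> float:
    """Fraction of predictions within `tolerance` stars of the true rating.

    ASSUMPTION: this is the usual definition; the card does not spell it out.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy ratings (made up for illustration, not real model output).
true_stars = [1, 2, 3, 4, 5, 5]
pred_stars = [1, 3, 5, 4, 4, 5]

print(f"exact:      {accuracy(true_stars, pred_stars):.3f}")
print(f"off-by-one: {off_by_one_accuracy(true_stars, pred_stars):.3f}")
```

Under this definition the gap between the two metrics shows how many errors land on an adjacent star rather than far from the true rating.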
## Training Data
- **Dataset**: SetFit/amazon_reviews_multi_en
- **Train samples**: 20,000 (subset)
- **Test samples**: 2,000 (subset)
- **Classes**: 1 star, 2 stars, 3 stars, 4 stars, 5 stars
## Training Procedure
- **Base model**: roberta-base
- **Epochs**: 3
- **Batch size**: 16
- **Learning rate**: 2e-5
- **Max sequence length**: 256
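As a rough sanity check on the settings above, the number of optimizer steps follows from the train-set size, batch size, and epoch count. This is a back-of-the-envelope sketch that assumes a single device and no gradient accumulation, neither of which the card states.

```python
import math

# Values taken from this card.
train_samples = 20_000
batch_size = 16
epochs = 3

# ASSUMPTION: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch, total_steps)  # 1250 3750
```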
## Usage
```python
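from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
classifier = pipeline("text-classification", model="Nav772/roberta-amazon-reviews-5star")

# Returns a list with one dict per input, each holding a `label` and a `score`.
result = classifier("This product exceeded my expectations! Great quality.")
print(result)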
```
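Because the classes are ordered star ratings, it can be useful to collapse per-class scores into a single expected rating. The sketch below works on a hand-written list shaped like the pipeline's per-class output (a list of `label`/`score` dicts, such as the pipeline can return when asked for all class scores). The `LABEL_0` to `LABEL_4` names and their mapping to stars are hypothetical; check the model's `id2label` config for the real names before relying on this.

```python
from typing import Any, Mapping, Sequence

# ASSUMPTION: hypothetical label names; the card does not document them.
LABEL_TO_STARS = {f"LABEL_{i}": i + 1 for i in range(5)}


def expected_stars(
    scores: Sequence[Mapping[str, Any]],
    label_to_stars: Mapping[str, int] = LABEL_TO_STARS,
) -> float:
    """Probability-weighted average star rating from per-class scores."""
    total = sum(item["score"] for item in scores)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    weighted = sum(label_to_stars[item["label"]] * item["score"] for item in scores)
    return weighted / total


# Hand-written example scores (not real model output).
sample_scores = [
    {"label": "LABEL_0", "score": 0.0},
    {"label": "LABEL_1", "score": 0.0},
    {"label": "LABEL_2", "score": 0.1},
    {"label": "LABEL_3", "score": 0.4},
    {"label": "LABEL_4", "score": 0.5},
]

print(round(expected_stars(sample_scores), 2))
```

An expected rating keeps information that the top label discards, for instance when the model is split between 4 and 5 stars.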
## When to Use This Model
- Choose **RoBERTa** when accuracy is the priority and latency is less critical
- Choose **DistilBERT** when you need faster inference or have resource constraints
## Demo
Try the model comparison demo: [sentiment-model-comparison](https://huggingface.co/spaces/Nav772/sentiment-model-comparison)
## Limitations
- Trained on Amazon product reviews; may not generalize to other review domains
- Adjacent star ratings (e.g., 2 vs 3 stars) are inherently difficult to distinguish
- English language only