---
license: mit
tags:
  - phishing-detection
  - email-security
  - text-classification
  - deberta
pipeline_tag: text-classification
datasets:
  - custom
---

# Email Phishing Detector V2

A binary text classifier for email phishing detection. Given the text of an email, it predicts label `1` (phishing) or `0` (safe).

## Model Description

The model is **microsoft/deberta-v3-base** fine-tuned with a sequence-classification head on a custom dataset of emails labeled as phishing or safe.

## Training Details

- **Base Model**: microsoft/deberta-v3-base
- **Training Samples**: 37,395 (75% of 49,860)
- **Validation Samples**: 7,479 (15%)
- **Test Samples**: 4,986 (10%)
- **Epochs**: 5
- **Batch Size**: 14
- **Learning Rate**: 2e-05
- **Max Length**: 512
- **Loss Type**: focal
- **LR Scheduler**: cosine
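
The list above names a focal loss and a cosine learning-rate schedule but does not give their settings. The following is a minimal, framework-free sketch of both ideas, not the training code used for this model. The focusing parameter `gamma=2.0`, the absence of class weighting and warmup, and the step count (one device, no gradient accumulation) are all assumptions made for illustration.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0) -> float:
    """Focal loss for one example, given the probability of its true class.

    gamma=2.0 is a common default, NOT a value confirmed by this model card.
    With gamma=0 this reduces to ordinary cross-entropy.
    """
    # (1 - p)^gamma shrinks the loss of well-classified examples (p close to 1).
    return -((1.0 - p_true) ** gamma) * math.log(p_true)


def cosine_lr(step: int, total_steps: int, base_lr: float = 2e-5) -> float:
    """Cosine decay from base_lr to 0 (no warmup; warmup settings are unknown)."""
    progress = min(step, total_steps) / total_steps
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# Hypothetical step count: 37395 samples at batch size 14 for 5 epochs,
# assuming a single device and no gradient accumulation.
steps_per_epoch = math.ceil(37395 / 14)
total_steps = steps_per_epoch * 5

# Compare cross-entropy and focal loss for increasingly confident predictions.
for p in (0.6, 0.9, 0.99):
    cross_entropy = -math.log(p)
    print(f"p={p:.2f}  cross-entropy={cross_entropy:.4f}  focal={focal_loss(p):.4f}")

# Learning rate at the start, midpoint, and end of the schedule.
for step in (0, total_steps // 2, total_steps):
    print(f"step={step:5d}  lr={cosine_lr(step, total_steps):.2e}")
```

The comparison loop shows the intent of focal loss: confidently correct examples contribute far less than under cross-entropy, so training effort concentrates on hard examples.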


## Additional Training Parameters

- **Augmentation**: Disabled

## Evaluation Results

### Test Set Metrics

- **Loss**: 0.0044
- **Accuracy**: 0.9966
- **F1**: 0.9970
- **Precision**: 0.9975
- **Recall**: 0.9965
- **ROC AUC**: 1.0000
- **MCC**: 0.9930
- **True Positives**: 2836
- **True Negatives**: 2133
- **False Positives**: 7
- **False Negatives**: 10
- **Runtime**: 94.4784 s
- **Samples per Second**: 52.7740
- **Steps per Second**: 3.7790
- **Epoch**: 5

### Validation Set Metrics

- **Loss**: 0.0050
- **Accuracy**: 0.9965
- **F1**: 0.9969
- **Precision**: 0.9986
- **Recall**: 0.9953
- **ROC AUC**: 0.9999
- **MCC**: 0.9929
- **True Positives**: 4248
- **True Negatives**: 3205
- **False Positives**: 6
- **False Negatives**: 20
- **Runtime**: 112.4284 s
- **Samples per Second**: 66.5220
- **Steps per Second**: 4.7590
- **Epoch**: 5
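
The accuracy, precision, recall, F1, and MCC values in both lists can be recomputed from the reported confusion-matrix counts, treating phishing (1) as the positive class. The helper below is an illustrative sketch using the standard definitions; `metrics_from_counts` is a hypothetical name, not part of the training code.

```python
import math


def metrics_from_counts(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Derive standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Matthews correlation coefficient: balanced even when classes differ in size.
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "samples": total,
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Counts copied from the test and validation lists in this section.
test_metrics = metrics_from_counts(tp=2836, tn=2133, fp=7, fn=10)
val_metrics = metrics_from_counts(tp=4248, tn=3205, fp=6, fn=20)

for name, metrics in (("test", test_metrics), ("validation", val_metrics)):
    summary = "  ".join(
        f"{key}={value:.4f}" for key, value in metrics.items() if key != "samples"
    )
    print(f"{name} (n={metrics['samples']}): {summary}")
```

The sample totals also match the split sizes listed under Training Details (4,986 test and 7,479 validation), which is a quick consistency check on the counts.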


## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "nhellyercreek/email-phishing-detector-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example inference
text = "Your email or URL text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():  # gradients are not needed for inference
    outputs = model(**inputs)
predictions = torch.softmax(outputs.logits, dim=-1)

# Get the predicted label and its probability for this single input
predicted_class = predictions.argmax(dim=-1).item()
confidence = predictions[0, predicted_class].item()

print(f"Predicted class: {predicted_class} (phishing=1, safe=0)")
print(f"Confidence: {confidence:.4f}")
```
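
The example above picks the most probable label, which is equivalent to a 0.5 threshold on the phishing probability. In a mail pipeline it may be useful to act on the probability directly. The sketch below shows one way to do that in plain Python; the function names and the thresholds `0.9` and `0.5` are hypothetical and would need tuning on your own data. The logits are assumed to be ordered `[safe, phishing]`, matching the label convention above, and could be obtained with `outputs.logits[0].tolist()`.

```python
import math


def phishing_probability(logits: list) -> float:
    """Softmax over [safe_logit, phishing_logit]; returns P(phishing)."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    return exps[1] / sum(exps)


def triage(prob: float, block_at: float = 0.9, review_at: float = 0.5) -> str:
    """Map a phishing probability to an action (thresholds are illustrative)."""
    if prob >= block_at:
        return "block"
    if prob >= review_at:
        return "review"
    return "deliver"


# Made-up logits for three emails, in [safe, phishing] order.
for logits in ([4.0, -3.5], [0.2, 0.6], [-2.0, 5.0]):
    prob = phishing_probability(logits)
    print(f"logits={logits}  P(phishing)={prob:.4f}  action={triage(prob)}")
```

Lowering `review_at` trades more manual review for fewer missed phishing emails; the right balance depends on the cost of each error type in your setting.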

## Limitations

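The model was trained on a single custom dataset, so it may not generalize to phishing styles or sender populations that are absent from that data. Inputs are truncated to 512 tokens, so any content beyond that point in a long email is not seen by the model. Treat its predictions as one signal among several and keep other security controls in place in production environments.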
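
Because the usage example truncates inputs to 512 tokens, the tail of a long email is never scored. One possible workaround, which is not something this model was evaluated with, is to split the token ids into overlapping windows, score each window separately, and keep the highest phishing probability. The sketch below covers only the windowing step. `sliding_windows` is a hypothetical helper, and the default of 510 content tokens assumes the tokenizer adds two special tokens per sequence.

```python
def sliding_windows(token_ids: list, max_len: int = 510, stride: int = 255) -> list:
    """Split a token id sequence into overlapping windows of at most max_len ids."""
    if max_len <= 0 or not 0 < stride <= max_len:
        raise ValueError("require max_len > 0 and 0 < stride <= max_len")
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_len])
        # Stop once the current window reaches the end of the sequence.
        if start + max_len >= len(token_ids):
            break
        start += stride
    return windows


# Stand-in for the token ids of a long email (1200 content tokens).
fake_token_ids = list(range(1200))
windows = sliding_windows(fake_token_ids)
print([len(window) for window in windows])

# Each window would be scored by the model; taking the maximum flags the
# email if any part of it looks like phishing (an assumed aggregation rule).
window_probs = [0.02, 0.01, 0.97, 0.40]  # made-up per-window probabilities
email_score = max(window_probs)
print(f"email-level phishing score: {email_score:.2f}")
```

Taking the maximum is a conservative choice that favors recall; averaging the window scores would be less sensitive to a single suspicious passage.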

## Citation

If you use this model, please cite:

```bibtex
@misc{nhellyercreek_email_phishing_detector_v2,
  title={Email Phishing Detector V2},
  author={Your Name},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/nhellyercreek/email-phishing-detector-v2}}
}
```