---
license: mit
tags:
- phishing-detection
- email-security
- text-classification
- deberta
task: text-classification
datasets:
- custom
---

# Email Phishing Detector V2

This model is fine-tuned for email phishing detection. It classifies each email as phishing (1) or safe (0).

## Model Description

The model is a fine-tuned version of **microsoft/deberta-v3-base** with a binary sequence-classification head for phishing detection.

## Training Details

- **Base Model**: microsoft/deberta-v3-base
- **Training Samples**: 37395
- **Validation Samples**: 7479
- **Test Samples**: 4986
- **Epochs**: 5
- **Batch Size**: 14
- **Learning Rate**: 2e-05
- **Max Length**: 512
- **Loss Type**: focal
- **LR Scheduler**: cosine

## Additional Training Parameters

- **Augmentation**: False

## Evaluation Results

### Test Set Metrics

- **Loss**: 0.0044
- **Accuracy**: 0.9966
- **F1**: 0.9970
- **Precision**: 0.9975
- **Recall**: 0.9965
- **ROC AUC**: 1.0000
- **MCC**: 0.9930
- **True Positives**: 2836
- **True Negatives**: 2133
- **False Positives**: 7
- **False Negatives**: 10
- **Runtime (s)**: 94.4784
- **Samples Per Second**: 52.7740
- **Steps Per Second**: 3.7790
- **Epoch**: 5

### Validation Set Metrics

- **Loss**: 0.0050
- **Accuracy**: 0.9965
- **F1**: 0.9969
- **Precision**: 0.9986
- **Recall**: 0.9953
- **ROC AUC**: 0.9999
- **MCC**: 0.9929
- **True Positives**: 4248
- **True Negatives**: 3205
- **False Positives**: 6
- **False Negatives**: 20
- **Runtime (s)**: 112.4284
- **Samples Per Second**: 66.5220
- **Steps Per Second**: 4.7590
- **Epoch**: 5

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "nhellyercreek/email-phishing-detector-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # disable dropout for deterministic inference

# Example inference
text = "Your email or URL text here"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)

# Convert logits to class probabilities
predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Get prediction
predicted_class = predictions.argmax().item()
confidence = predictions[0][predicted_class].item()

print(f"Predicted class: {predicted_class} (phishing=1, safe=0)")
print(f"Confidence: {confidence:.4f}")
```

## Limitations

This model was trained on a custom dataset and may not generalize to types of phishing attempts that the dataset does not cover. Inputs longer than 512 tokens are truncated, so content beyond that limit does not influence the prediction. Always use additional security measures in production environments.

## Citation

If you use this model, please cite:

```bibtex
@misc{nhellyercreek_email_phishing_detector_v2,
  title={Email Phishing Detector V2},
  author={Your Name},
  year={2024},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/nhellyercreek/email-phishing-detector-v2}}
}
```
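
## Verifying the Reported Metrics

The accuracy, precision, recall, F1, and MCC values listed under Evaluation Results can be recomputed from the reported confusion-matrix counts. The snippet below is a minimal sketch using the standard textbook definitions of these metrics; the helper name `binary_metrics` is hypothetical and is not part of this model's training or evaluation code.

```python
from math import sqrt


def binary_metrics(tp, tn, fp, fn):
    """Derive standard binary-classification metrics from confusion counts.

    Hypothetical helper for illustration; phishing (1) is the positive class.
    """
    total = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Matthews correlation coefficient denominator
    mcc_denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
    }


# Confusion counts copied from the test set metrics in this model card
test_metrics = binary_metrics(tp=2836, tn=2133, fp=7, fn=10)
for name, value in test_metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimal places, the output matches the reported test set values (accuracy 0.9966, precision 0.9975, recall 0.9965, F1 0.9970, MCC 0.9930). Passing the validation counts (4248, 3205, 6, 20) allows the same check for the validation set.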