---
license: mit
base_model: alger-ia/dziribert
tags:
- generated_from_trainer
- text-classification
- arabic
- algerian
- darija
- french
datasets:
- algerian-telecom-comments
language:
- ar
- fr
metrics:
- accuracy
- f1
model-index:
- name: dziribert-algerie-telecom-v1
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: Algerian Telecom Comments
      type: custom
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.87
    - name: F1 Score
      type: f1
      value: 0.85
---

# DziriBERT Algérie Télécom Classifier

This model is a fine-tuned version of [alger-ia/dziribert](https://huggingface.co/alger-ia/dziribert) that classifies Algerian telecom customer comments into urgency levels.

## Model Description

- **Base Model**: DziriBERT (BERT for Algerian Arabic)
- **Task**: Text Classification
- **Languages**: Arabic, French, Algerian Darija
- **Classes**: 5 urgency levels
  - `High_Urgency`: عاجل جداً (very urgent)
  - `Medium_Urgency`: متوسط الأهمية (medium importance)
  - `Low_Medium_Urgency`: منخفض-متوسط (low to medium)
  - `Low_Urgency`: منخفض الأهمية (low importance)
  - `No_Urgency`: لا توجد أهمية (no importance)

## Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned tokenizer and classification model from the Hub
tokenizer = AutoTokenizer.from_pretrained("tarekAeb/dziribert-algerie-telecom-v1")
model = AutoModelForSequenceClassification.from_pretrained("tarekAeb/dziribert-algerie-telecom-v1")

# Wrap both in a text-classification pipeline
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

# Example (Darija: "The internet has not been working since yesterday")
result = classifier("الانترنت ما يخدم من البارح")
print(result)
# Example output: [{'label': 'High_Urgency', 'score': 0.89}]
```

## Training Data

- **Dataset**: 9,771 customer comments from Algerian telecom social media
- **Sources**: Facebook, Instagram, and Twitter posts
- **Preprocessing**: Text cleaning, normalization, augmentation

## Performance

- **Accuracy**: 87%
- **F1-Score**: 85%
- **Training Strategy**: Layer freezing, hyperparameter optimization

## Demo

Try the live demo: [Comment Genie](https://your-demo-url.com)
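
## Example: Sorting Comments by Urgency

The five urgency classes listed under Model Description lend themselves to a simple triage step: classify a batch of comments, then handle the most urgent ones first. The following is a minimal sketch of that idea, not part of the released model. The names `URGENCY_ORDER` and `triage` are hypothetical, the ranking of the labels from most to least urgent is an assumption based on the label names, and the predictions are hand-written stand-ins so the block runs without downloading the model.

```python
# Assumed ordering, most urgent first. Label names are taken from the model card.
URGENCY_ORDER = [
    "High_Urgency",
    "Medium_Urgency",
    "Low_Medium_Urgency",
    "Low_Urgency",
    "No_Urgency",
]
RANK = {label: position for position, label in enumerate(URGENCY_ORDER)}


def triage(comments, predictions):
    """Pair each comment with its prediction and sort most urgent first.

    `predictions` is expected in the shape the text-classification pipeline
    returns for a list of inputs: one {"label": ..., "score": ...} dict per
    comment. Within one urgency level, higher scores come first. Labels that
    are not in URGENCY_ORDER are placed last.
    """
    if len(comments) != len(predictions):
        raise ValueError("comments and predictions must have the same length")
    rows = [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(comments, predictions)
    ]
    return sorted(
        rows,
        key=lambda row: (RANK.get(row["label"], len(URGENCY_ORDER)), -row["score"]),
    )


# Made-up comments and predictions, for illustration only.
sample_comments = ["billing question", "line is down", "thanks for the help"]
sample_predictions = [
    {"label": "Low_Medium_Urgency", "score": 0.71},
    {"label": "High_Urgency", "score": 0.93},
    {"label": "No_Urgency", "score": 0.88},
]

for row in triage(sample_comments, sample_predictions):
    print(f"{row['label']:<20} {row['score']:.2f}  {row['text']}")
```

With the real model, the stand-in predictions could be replaced by `classifier(sample_comments)`, since the pipeline from the Usage section accepts a list of strings and returns one label and score per input.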