
ArgueBot: Argument Scheme & Fallacy Classifier

This model is a fine-tuned version of roberta-large trained on a combined dataset of argument schemes and logical fallacies. It classifies any input text into one of 24 categories, either a valid argument scheme type or a logical fallacy type, in a single inference pass.


Model Details

| Property | Value |
|---|---|
| Base Model | roberta-large |
| Task | 24-class text classification |
| Argument Scheme Classes | 11 |
| Fallacy Classes | 13 |
| Total Classes | 24 |

Training Parameters

These are the exact parameters used to train this model:

| Parameter | Value | Description |
|---|---|---|
| Learning Rate | 2e-5 | Step size for the AdamW optimiser |
| Batch Size | 8 | Samples per training step |
| Max Token Length | 128 | Maximum input tokens per sample; longer inputs are truncated |
| Weight Decay | 0.01 | Decoupled weight decay applied by AdamW |
| Max Epochs | 20 | Training ceiling; early stopping cuts this short |
| Gradient Clip | 1.0 | Max gradient norm, to prevent exploding gradients |
| Train / Val / Test Split | 70 / 15 / 15 | Stratified split |
| Optimiser | AdamW | Adaptive learning rate with weight decay |
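
As a rough illustration of the 70 / 15 / 15 stratified split, the sketch below splits each class separately so that every subset keeps roughly the same label proportions. It is not the training script: the helper name `stratified_split` and the seed are assumptions made for this example, and the original pipeline may have used a library helper instead.

```python
import random
from collections import defaultdict

def stratified_split(labels, ratios=(0.70, 0.15, 0.15), seed=42):
    """Return (train, val, test) index lists that keep per-class proportions.

    Hypothetical helper for illustration; not taken from the model's code.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * ratios[0])
        n_val = round(len(indices) * ratios[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

# Toy label list: 20 samples of one class, 40 of another.
labels = ["Ad Hominem"] * 20 + ["Argument from Sign"] * 40
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```

Splitting per class matters here because the 24 classes are unevenly sized; a plain random split could leave a small class almost absent from validation or test.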

Early Stopping

| Parameter | Value | Description |
|---|---|---|
| Patience | 3 | Stops after 3 consecutive epochs with no improvement |
| Min Delta | 0.001 | Minimum change that counts as an improvement |
| Monitor | val_loss | Watches validation loss to detect overfitting |

The best model checkpoint is saved automatically whenever validation loss improves. When early stopping triggers, the best weights are restored before evaluation.
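
The early-stopping rule can be sketched as a small tracker. This is a minimal sketch under stated assumptions, not the original training code: the class name `EarlyStopping` is hypothetical, the `state` dictionary stands in for something like `model.state_dict()`, and it assumes the checkpoint is saved only when the loss improves by more than Min Delta.

```python
import copy

class EarlyStopping:
    """Track validation loss and keep the best weights seen so far."""

    def __init__(self, patience=3, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best_loss = float("inf")
        self.best_state = None
        self.bad_epochs = 0

    def step(self, val_loss, state):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best_loss - self.min_delta:
            self.best_loss = val_loss
            self.best_state = copy.deepcopy(state)  # checkpoint the best weights
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Toy run with made-up losses; the dict stands in for the model weights.
stopper = EarlyStopping(patience=3, min_delta=0.001)
val_losses = [1.90, 1.60, 1.45, 1.4495, 1.47, 1.50, 1.40]
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss, state={"epoch": epoch}):
        break

stopped_at = epoch
restored = stopper.best_state  # weights to load before evaluation
print(stopped_at, restored, stopper.best_loss)
```

In this toy run the drop from 1.45 to 1.4495 is smaller than Min Delta, so it counts as a non-improving epoch, and training stops after three such epochs in a row.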

Data Preprocessing

| Step | What it does |
|---|---|
| Exact deduplication | Removes rows where both text and label are identical |
| Near-duplicate removal | Removes rows where the same text appears with different labels (conflicting annotations) |
| Short text removal | Drops texts with fewer than 5 words |
| Class weight balancing | Computes sklearn.utils.class_weight.compute_class_weight("balanced", ...) across all 24 classes |
| Shuffle | Fixed random_state=42 for reproducibility |
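
The steps above can be sketched in plain Python. This is an illustration only: the function names `preprocess` and `balanced_class_weights` and the toy rows are assumptions made for this example, and the original pipeline may differ in detail (for instance in how words are counted). The weight formula is the one scikit-learn documents for its "balanced" mode, n_samples / (n_classes * class_count).

```python
import random
from collections import Counter, defaultdict

def preprocess(rows, min_words=5, seed=42):
    """rows: list of (text, label) pairs. Returns cleaned, shuffled rows."""
    # 1. Exact deduplication: keep one copy of each (text, label) pair.
    unique = list(dict.fromkeys(rows))

    # 2. Drop every text that appears with more than one label.
    labels_per_text = defaultdict(set)
    for text, label in unique:
        labels_per_text[text].add(label)
    unique = [(t, l) for t, l in unique if len(labels_per_text[t]) == 1]

    # 3. Drop texts with fewer than `min_words` whitespace-separated words.
    unique = [(t, l) for t, l in unique if len(t.split()) >= min_words]

    # 4. Shuffle with a fixed seed for reproducibility.
    random.Random(seed).shuffle(unique)
    return unique

def balanced_class_weights(labels):
    """Weight per class: n_samples / (n_classes * count_of_class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}

# Made-up rows that exercise each filter.
rows = [
    ("Everyone is buying it so it must be good", "Ad Populum"),
    ("Everyone is buying it so it must be good", "Ad Populum"),      # exact duplicate
    ("Smoke is rising so there is a fire", "Argument from Sign"),
    ("Smoke is rising so there is a fire", "False Causality"),       # conflicting label
    ("He is wrong", "Ad Hominem"),                                   # too short
    ("Experts agree that sleep improves memory recall", "Argument from Expert Opinion"),
    ("Nine in ten people chose this brand already", "Ad Populum"),
]
clean = preprocess(rows)
weights = balanced_class_weights([label for _, label in clean])
print(len(clean), weights)
```

Rarer classes receive larger weights, which is the usual way such weights are fed into a weighted cross-entropy loss.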

Supported Labels

✅ Argument Schemes (valid arguments)

  1. Argument from Analogy
  2. Argument from Alternatives
  3. Argument from Cause to Effect
  4. Argument from Commitment
  5. Argument from Example
  6. Argument from Expert Opinion
  7. Argument from Negative Consequences
  8. Argument from Positive Consequences
  9. Argument from Practical Reasoning
  10. Argument from Sign
  11. Argument from Values

⚡ Fallacy Types

  1. Ad Hominem
  2. Ad Populum
  3. Appeal to Emotion
  4. Circular Reasoning
  5. Equivocation
  6. Fallacy of Credibility
  7. Fallacy of Extension
  8. Fallacy of Logic
  9. Fallacy of Relevance
  10. False Causality
  11. False Dilemma
  12. Faulty Generalization
  13. Intentional

Dataset

| Source | Type | Classes | Samples |
|---|---|---|---|
| EthiX + Macagno | Argument schemes | 11 | 1829 |
| LOGIC dataset | Fallacy types | 13 | 4124 |

  • Text column: Argument
  • Label column: Label
  • Both datasets are combined into a single CSV before training
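
Combining the two sources into one CSV with the Argument and Label columns might look like the sketch below. The in-memory CSV contents are made up for illustration, the helper name `combine` is hypothetical, and it assumes both sources already use the same two column names.

```python
import csv
import io

def combine(sources):
    """Merge CSV sources that share the Argument / Label columns."""
    combined = []
    for handle in sources:
        for row in csv.DictReader(handle):
            combined.append({"Argument": row["Argument"].strip(),
                             "Label": row["Label"].strip()})
    return combined

# In-memory stand-ins for the two source files (contents are made up).
schemes_csv = io.StringIO(
    "Argument,Label\n"
    '"Smoke is rising, so there is probably a fire.",Argument from Sign\n'
)
fallacies_csv = io.StringIO(
    "Argument,Label\n"
    '"Everyone believes it, so it must be true.",Ad Populum\n'
)

rows = combine([schemes_csv, fallacies_csv])

# Write the merged rows back out as a single CSV.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["Argument", "Label"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```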

Results

| Metric | Value |
|---|---|
| Accuracy | 0.54 |
| Macro F1-score | 0.49 |
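
To make the two metrics concrete, here is a small sketch of how accuracy and macro F1 are conventionally defined. It assumes the standard definitions and is not the evaluation script used for the numbers above; the function name and the toy labels are invented for the example.

```python
from collections import Counter

def accuracy_and_macro_f1(y_true, y_pred):
    """Accuracy plus the unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    f1_scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        f1_scores.append(2 * tp[c] / denom if denom else 0.0)

    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, sum(f1_scores) / len(classes)

# Toy case: a classifier that always predicts the majority class.
y_true = ["Ad Hominem"] * 8 + ["Equivocation"] * 2
y_pred = ["Ad Hominem"] * 10
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={macro_f1:.2f}")
```

Because macro F1 gives every class equal weight regardless of size, it tends to fall below accuracy when smaller classes are predicted poorly, which is one possible reading of the gap between 0.54 and 0.49.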

How to Load and Use the Model

Step 1: Install

pip install transformers torch requests

Step 2: Load from Hugging Face



import requests
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_PATH = "Isabelbinu/roberta-large-argumentscheme-fallacy-classifier"

# Load the tokenizer and the fine-tuned classifier, then switch to inference mode.
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model     = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH)
model.eval()

# metadata.json maps class ids to label names and lists which ids are argument schemes.
resp = requests.get(f"https://huggingface.co/{MODEL_PATH}/resolve/main/metadata.json", timeout=30)
resp.raise_for_status()
meta       = resp.json()
label_map  = {int(k): v for k, v in meta["label_map"].items()}
scheme_ids = set(meta["scheme_ids"])

def predict(text):
    """Classify one text and return the verdict, label, and softmax confidence."""
    # Inputs longer than 128 tokens are truncated, matching the training setup.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze()
    pred_id = int(probs.argmax())
    return {
        "verdict":    "✅ Valid Argument" if pred_id in scheme_ids else "⚡ Fallacy",
        "label":      label_map[pred_id],
        "confidence": f"{probs[pred_id].item():.1%}",
    }

# Test 1
print(predict("Introducing a four-day work week will boost employee wellbeing, reduce burnout, and ultimately increase overall productivity."))

# Test 2
print(predict("Don't trust him - he was caught lying before, so everything he says is wrong."))

Applications

  • Education: Teach critical thinking by identifying argument types in real texts
  • Debate Analysis: Evaluate the quality of reasoning in speeches and essays
  • Fact-checking: Flag logically flawed reasoning in news and social media
  • Media Literacy: Help readers identify manipulation tactics in persuasive content
  • AI Assistants: Add argumentation reasoning to conversational AI systems

License

This model is licensed under the MIT License.