---
language: en
license: mit
library_name: scikit-learn
tags:
  - tabular-classification
  - legal-ai
  - msme
  - dispute-resolution
  - lightgbm
  - calibrated-probabilities
metrics:
  - auc
  - f1
  - balanced_accuracy
model-index:
  - name: MSME Payment Outcome Predictor (LightGBM)
    results:
      - task:
          type: tabular-classification
        dataset:
          type: legal-disputes
          name: MSME Payment Dispute Dataset
        metrics:
          - type: auc
            value: 0.72
            name: AUC
          - type: f1
            value: 0.61
            name: F1 Score
          - type: balanced_accuracy
            value: 0.63
            name: Balanced Accuracy
---
# MSME Payment Outcome Predictor (LightGBM)

## Overview
This model predicts the outcome of an MSME (Micro, Small and Medium Enterprises) payment dispute as a probability distribution over three classes:

- **Win**
- **Settlement**
- **Escalation to MSEFC** (Micro and Small Enterprises Facilitation Council)

The model outputs **calibrated probabilities** for each outcome.

## Model Architecture

- **Algorithm**: LightGBM (Gradient Boosted Decision Trees)
- **Calibration**: Isotonic Regression (`CalibratedClassifierCV`)
- **Preprocessing**:
  - One-hot encoding of categorical features
  - Numeric features passed through unchanged
- **Class imbalance**: class balancing enabled during training

## Input Features

| Feature            | Type                               |
|--------------------|------------------------------------|
| claim_amount       | float                              |
| delay_days         | float                              |
| buyer_type         | categorical (govt/private)         |
| contract_present   | binary                             |
| industry_sector    | categorical                        |
| claim_imputed      | binary                             |
| delay_imputed      | binary                             |

## Output Format

```json
{
  "predicted_label": "win",
  "probabilities": {
    "win": 0.59,
    "settlement": 0.05,
    "escalation": 0.35
  }
}
```
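
A payload of this shape could be assembled from per-class probabilities along the following lines. This is a sketch under assumptions: `format_prediction` is a hypothetical helper, the predicted label is taken to be the highest-probability class, and the input values are illustrative. Because values are rounded for display, the reported probabilities may not sum to exactly 1.

```python
import json


def format_prediction(probabilities, ndigits=2):
    """Build the response dict from a {label: probability} mapping."""
    # Assumption: the predicted label is the class with the highest probability.
    predicted_label = max(probabilities, key=probabilities.get)
    return {
        "predicted_label": predicted_label,
        "probabilities": {
            label: round(float(p), ndigits) for label, p in probabilities.items()
        },
    }


# Illustrative unrounded probabilities (not real model output).
raw = {"win": 0.594, "settlement": 0.052, "escalation": 0.354}
print(json.dumps(format_prediction(raw), indent=2))
```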

## Performance Metrics

- Macro AUC-ROC (primary metric) ≈ 0.72
- Balanced accuracy ≈ 0.63
- Macro F1 ≈ 0.61
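
For reference, these three metrics can be computed with scikit-learn as sketched below. The card does not state how the multi-class AUC was aggregated beyond "macro", so the one-vs-rest setting here is an assumption, as are the helper name `evaluate` and the choice of argmax for hard predictions.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score


def evaluate(y_true, proba, labels):
    """Compute macro AUC-ROC, balanced accuracy, and macro F1.

    `labels` must be sorted and match the column order of `proba`
    (this is the order scikit-learn stores in `classes_`).
    """
    proba = np.asarray(proba, dtype=float)
    # Hard predictions: the highest-probability class for each row.
    y_pred = np.asarray(labels)[proba.argmax(axis=1)]
    return {
        # Assumption: one-vs-rest AUC, macro-averaged over classes.
        "auc_roc_macro": roc_auc_score(
            y_true, proba, multi_class="ovr", average="macro", labels=labels
        ),
        "balanced_accuracy": balanced_accuracy_score(y_true, y_pred),
        "f1_macro": f1_score(y_true, y_pred, average="macro"),
    }
```

Here `proba` would typically be the output of `predict_proba` on a held-out split, with `labels` set to the fitted model's `classes_`.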

## Intended Use

- Legal risk scoring
- MSME advisory tools
- Research prototype
- Decision support systems

## Limitations

- Uses only structured, extracted features
- Does not use the full text of legal documents
- Not intended for judicial automation