Supply Chain Delay Prediction (DataCo Dataset)

This repository contains the trained classification model, feature engineering pipelines, exploratory notebooks, and database integration schemas for predicting supply chain shipment delays.

📊 Model Performance

Both models were trained on the DataCo Supply Chain dataset, with post-shipment features excluded to avoid data leakage:

  • XGBoost Classifier (best model by accuracy):
    • Accuracy: 73.43%
    • ROC-AUC: 83.62%
    • Late Delivery Class Precision: 86% (most shipments flagged as late are in fact late, which keeps false alarms low for operations)
  • Random Forest Classifier:
    • Accuracy: 72.23%
    • ROC-AUC: 83.68%
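To make the three reported metrics concrete, here is a minimal pure-Python sketch of how accuracy, positive-class precision, and ROC-AUC are defined. The labels and scores are made-up toy values, not results from the DataCo dataset, and the function names are illustrative. In practice scikit-learn's `accuracy_score`, `precision_score`, and `roc_auc_score` compute the same quantities.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred, positive=1):
    """Of the items predicted positive, the fraction that truly are positive."""
    flagged = [t for t, p in zip(y_true, y_pred) if p == positive]
    if not flagged:
        return 0.0
    return sum(t == positive for t in flagged) / len(flagged)


def roc_auc(y_true, y_score, positive=1):
    """Probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n^2) and is meant
    only to illustrate the definition on small inputs.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == positive]
    neg = [s for t, s in zip(y_true, y_score) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (ASSUMPTION: 1 = late delivery, 0 = on time; values are invented).
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.7, 0.2, 0.1]
y_pred = [int(s >= 0.5) for s in y_score]  # 0.5 decision threshold

print(f"accuracy:  {accuracy(y_true, y_pred):.4f}")
print(f"precision: {precision(y_true, y_pred):.4f}")
print(f"roc_auc:   {roc_auc(y_true, y_score):.4f}")
```

Accuracy and precision depend on the decision threshold, while ROC-AUC depends only on how the scores rank the examples. That is why two models can swap places depending on which metric is used.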

Top Feature Importances (XGBoost)

  1. days_for_shipping_scheduled (53.4%) - The scheduled shipping window constraint.
  2. shipping_mode (23.5%) - Shipping level (First Class, Second Class, Same Day, Standard).
  3. order_hour (6.0%) - Time of day the order was placed.
  4. type (3.7%) - Transaction/payment type (Debit, Transfer, etc.).
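As an illustration of how a feature such as order_hour can be derived, the sketch below parses an order timestamp and keeps only the hour. The timestamp format and the function name `extract_order_hour` are assumptions made for this example; the actual logic lives in the repository's scripts and may differ.

```python
from datetime import datetime

# ASSUMPTION: order timestamps look like "1/31/2018 22:56"
# (month/day/year, 24-hour clock). Adjust the format if the data differs.
ORDER_DATE_FORMAT = "%m/%d/%Y %H:%M"


def extract_order_hour(order_date: str, fmt: str = ORDER_DATE_FORMAT) -> int:
    """Return the hour of day (0-23) at which the order was placed."""
    return datetime.strptime(order_date, fmt).hour


print(extract_order_hour("1/31/2018 22:56"))
```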

📁 Repository Structure

├── Models/
│   ├── best_xgb_model.json           (XGBoost model structure)
│   ├── best_xgb_model.pkl            (XGBoost model pickle)
│   ├── label_encoders.pkl            (Categorical feature encoders)
│   └── scaler.pkl                    (Feature scaling parameters)
├── Scripts/
│   ├── data_cleaning.py              (Preprocessing logic)
│   ├── database_loader.py            (MySQL database bulk uploader)
│   └── model_training.py             (Machine learning pipeline)
├── Notebooks/
│   ├── 1_Data_Cleaning_EDA.ipynb     (EDA & SQL queries)
│   └── 2_Delay_Prediction_ML.ipynb   (Machine learning experiments)
├── SQL/
│   ├── schema.sql                    (MySQL 3NF schema definitions)
│   └── analysis_queries.sql          (Analytical queries)
├── requirements.txt                  (Python dependencies)
└── PowerBI_Design_Guide.md           (Power BI visualization layout design blueprint)

🛠️ How to Use

1. Requirements

Ensure you have the required libraries installed:

pip install -r requirements.txt

2. Loading the Model and Predicting in Python

import pickle
import pandas as pd

# Load encoders, scaler, and model
with open("Models/label_encoders.pkl", "rb") as f:
    encoders = pickle.load(f)
with open("Models/scaler.pkl", "rb") as f:
    scaler = pickle.load(f)
with open("Models/best_xgb_model.pkl", "rb") as f:
    model = pickle.load(f)

# Example: make a prediction. X_scaled must be built with the same feature
# engineering, encoding, and scaling steps as Scripts/model_training.py.
# predictions = model.predict(X_scaled)
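The commented-out prediction call above leaves open how X_scaled is produced. The following is a minimal sketch of one way to go from a single order to a prediction, under several assumptions that are not confirmed by this repository: that `encoders` is a dict mapping column names to fitted encoders with a `transform()` method (for example scikit-learn's `LabelEncoder`), that the scaler was fitted on all features, and that the hypothetical `FEATURE_ORDER` list matches the training columns. The real column list and order must be taken from Scripts/model_training.py.

```python
from typing import Any, Mapping, Sequence

# ASSUMPTION: illustrative feature names and order only. The real list is
# longer and must match the columns used in Scripts/model_training.py.
FEATURE_ORDER = ["days_for_shipping_scheduled", "shipping_mode", "order_hour", "type"]


def build_feature_row(order: Mapping[str, Any],
                      encoders: Mapping[str, Any],
                      feature_order: Sequence[str] = FEATURE_ORDER) -> list:
    """Encode categorical fields and return values in training column order."""
    row = []
    for name in feature_order:
        value = order[name]
        if name in encoders:
            # ASSUMPTION: encoders maps column name -> fitted encoder
            # whose transform() accepts a list and returns integer codes.
            value = encoders[name].transform([value])[0]
        row.append(float(value))
    return row


def predict_late(order, encoders, scaler, model, feature_order=FEATURE_ORDER) -> int:
    """Return the model's predicted class for a single order."""
    row = build_feature_row(order, encoders, feature_order)
    X_scaled = scaler.transform([row])  # scalers expect a 2-D array-like
    return int(model.predict(X_scaled)[0])
```

With the objects loaded as shown earlier, a call might look like `predict_late({"days_for_shipping_scheduled": 2, "shipping_mode": "First Class", "order_hour": 14, "type": "TRANSFER"}, encoders, scaler, model)`. Category values that were not seen during training will typically make the encoder raise an error, so inputs should be validated first.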