Part of the collection "Scikit-Learn Industry Models - South Africa": four sklearn GradientBoostingClassifier pipelines for banking, insurance, retail, and mining use cases, trained on South African data.
How to use ThabangTheActuaryCoder/retail-customer-churn-model with Scikit-learn:
from huggingface_hub import hf_hub_download
import joblib
model = joblib.load(
    hf_hub_download("ThabangTheActuaryCoder/retail-customer-churn-model", "customer_churn_model.joblib")
)
# Only load pickle files from sources you trust.
# Read more: https://skops.readthedocs.io/en/stable/persistence.html

A GradientBoostingClassifier pipeline for predicting customer churn, trained on South African retail data.
This model is intended for educational and demonstration purposes as part of an end-to-end ML pipeline showcasing Databricks, MLflow, Azure ML, and Hugging Face Hub integration.

Model details:

| Property | Value |
|---|---|
| Classifier | GradientBoostingClassifier |
| Pipeline steps | preprocessor -> classifier |
| Training samples | 9,600 |
| Test samples | 2,400 |
| Target column | target |
| Created | 2026-06-16T15:37:49.920979+00:00 |

Evaluation metrics:

| Metric | Score |
|---|---|
| Accuracy | 0.8421 |
| Precision | 0.7084 |
| Recall | 0.5928 |
| F1 | 0.6455 |
| ROC AUC | 0.8817 |
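
The reported F1 score can be cross-checked against the precision and recall in the table, since F1 is their harmonic mean. A minimal sketch using only the figures above (the variable names are illustrative):

```python
# Values copied from the metrics table.
precision = 0.7084
recall = 0.5928

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"F1 = {f1:.4f}")  # F1 = 0.6455
```

The result agrees with the reported F1 to four decimal places. Recall is noticeably lower than precision, so at the default decision threshold the model misses roughly four in ten churners.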

Numeric features: monthly_spend_zar, days_since_last_purchase, num_support_tickets, loyalty_points, account_age_months, num_returns_last_year, avg_order_value_zar, num_orders_last_6m, discount_usage_rate

Categorical features: membership_tier, preferred_channel, province
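
The card names the pipeline steps (preprocessor -> classifier) but not what the preprocessor does. The sketch below shows one plausible way such a pipeline could be assembled from the features listed above. The `StandardScaler` and `OneHotEncoder` choices, the `build_pipeline` helper, the synthetic data, and the category labels are all assumptions for illustration, not the author's actual training code.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = [
    "monthly_spend_zar", "days_since_last_purchase", "num_support_tickets",
    "loyalty_points", "account_age_months", "num_returns_last_year",
    "avg_order_value_zar", "num_orders_last_6m", "discount_usage_rate",
]
CATEGORICAL = ["membership_tier", "preferred_channel", "province"]


def build_pipeline() -> Pipeline:
    """Hypothetical reconstruction of a preprocessor -> classifier pipeline."""
    preprocessor = ColumnTransformer(
        transformers=[
            # Assumed: scale numeric columns, one-hot encode categorical ones.
            ("num", StandardScaler(), NUMERIC),
            ("cat", OneHotEncoder(handle_unknown="ignore"), CATEGORICAL),
        ]
    )
    return Pipeline(
        steps=[
            ("preprocessor", preprocessor),
            ("classifier", GradientBoostingClassifier(random_state=0)),
        ]
    )


# Synthetic stand-in data; the real training set is not part of this card.
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({name: rng.random(n) for name in NUMERIC})
X["membership_tier"] = rng.choice(["Bronze", "Silver", "Gold"], n)      # made-up labels
X["preferred_channel"] = rng.choice(["Online", "In-store"], n)          # made-up labels
X["province"] = rng.choice(["Gauteng", "Western Cape", "Limpopo"], n)
y = rng.integers(0, 2, n)

pipeline = build_pipeline().fit(X, y)
proba = pipeline.predict_proba(X.head(3))
print(list(pipeline.named_steps), proba.shape)
```

Bundling preprocessing and the classifier in a single `Pipeline` is what lets the published model accept a raw DataFrame with the original column names at prediction time.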
import joblib
from huggingface_hub import hf_hub_download
import pandas as pd
# Download and load the model
model_path = hf_hub_download(
repo_id="ThabangTheActuaryCoder/retail-customer-churn-model",
filename="customer_churn_model.joblib",
)
model = joblib.load(model_path)
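# Create a sample input. All values are illustrative placeholders; the category
# labels are assumptions and should be replaced with labels the pipeline was trained on.
sample = pd.DataFrame([{
    "monthly_spend_zar": 1500.0, "days_since_last_purchase": 30, "num_support_tickets": 1,
    "loyalty_points": 250, "account_age_months": 24, "num_returns_last_year": 2,
    "avg_order_value_zar": 450.0, "num_orders_last_6m": 6, "discount_usage_rate": 0.2,
    "membership_tier": "Gold", "preferred_channel": "Online", "province": "Gauteng",
}])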
# Predict
prediction = model.predict(sample)
probabilities = model.predict_proba(sample)
print(f"Prediction: {prediction}, Probabilities: {probabilities}")
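
Because the pipeline selects inputs by column name, a misspelled or missing column typically surfaces as an error inside `predict`. A small pre-flight check, sketched below, can make such mistakes easier to spot. The `check_columns` helper is hypothetical and not part of the published model; it relies only on the feature names listed in this card.

```python
NUMERIC_FEATURES = [
    "monthly_spend_zar", "days_since_last_purchase", "num_support_tickets",
    "loyalty_points", "account_age_months", "num_returns_last_year",
    "avg_order_value_zar", "num_orders_last_6m", "discount_usage_rate",
]
CATEGORICAL_FEATURES = ["membership_tier", "preferred_channel", "province"]
EXPECTED_FEATURES = NUMERIC_FEATURES + CATEGORICAL_FEATURES


def check_columns(columns):
    """Return (missing, unexpected) feature names for an iterable of column names."""
    present = set(columns)
    missing = [name for name in EXPECTED_FEATURES if name not in present]
    unexpected = sorted(present - set(EXPECTED_FEATURES))
    return missing, unexpected


# Example: a record with a misspelled column ("provnce") instead of "province".
record = dict.fromkeys(EXPECTED_FEATURES, 0)
del record["province"]
record["provnce"] = "Gauteng"

missing, unexpected = check_columns(record)
print(missing, unexpected)  # ['province'] ['provnce']
```

With a pandas DataFrame, the same check can be run as `check_columns(sample.columns)` before calling `model.predict(sample)`.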