inria-soda/tabular-benchmark
How to use richeechabhadiya/house-price-predictor with Scikit-learn:
from huggingface_hub import hf_hub_download
import joblib
model = joblib.load(
hf_hub_download("richeechabhadiya/house-price-predictor", "sklearn_model.joblib")
)
# only load pickle files from sources you trust
# read more about it here https://skops.readthedocs.io/en/stable/persistence.html

An ensemble of XGBoost + LightGBM + sklearn GradientBoosting for predicting house prices in the King County (Seattle) area.
- Dataset config: reg_num_house_sales
- Target: log(price). Exponentiate predictions with np.exp() for dollar amounts.

| Model | RMSE (log) | R² | MAE (log) | RMSE ($) | MAE ($) |
|---|---|---|---|---|---|
| XGBoost | 0.1789 | 0.8890 | 0.1242 | $138,503 | $71,551 |
| LightGBM | 0.1792 | 0.8886 | 0.1250 | $139,210 | $72,513 |
| sklearn GB | 0.1783 | 0.8897 | 0.1248 | $137,950 | $71,936 |
| Ensemble | 0.1769 | 0.8915 | 0.1228 | $136,893 | $70,936 |

Best Model: Ensemble (R² = 0.8915)

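The table reports errors both in log space and in dollars. As a rough sketch of how such columns can be computed from log(price) predictions, the helper below uses only NumPy. The function name `regression_report` and the sample values are hypothetical, and it assumes R² is computed on the log target, which the card does not state.

```python
import numpy as np


def regression_report(y_log_true, y_log_pred):
    """Return log-space and dollar-space error metrics for log(price) predictions."""
    y_log_true = np.asarray(y_log_true, dtype=float)
    y_log_pred = np.asarray(y_log_pred, dtype=float)

    # Residuals in log space (the space the models are trained in).
    resid = y_log_true - y_log_pred
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_log_true - y_log_true.mean()) ** 2)

    # Residuals in dollars, after undoing the log transform.
    dollar_resid = np.exp(y_log_true) - np.exp(y_log_pred)

    return {
        "rmse_log": float(np.sqrt(np.mean(resid ** 2))),
        "r2": float(1.0 - ss_res / ss_tot),  # assumption: R² on the log target
        "mae_log": float(np.mean(np.abs(resid))),
        "rmse_usd": float(np.sqrt(np.mean(dollar_resid ** 2))),
        "mae_usd": float(np.mean(np.abs(dollar_resid))),
    }


# Hypothetical prices, not taken from the benchmark data.
y_true = np.log([300_000.0, 450_000.0, 700_000.0])
y_pred = np.log([310_000.0, 430_000.0, 720_000.0])
report = regression_report(y_true, y_pred)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Dollar errors are dominated by expensive houses, which is one reason a model can rank slightly differently on RMSE (log) and RMSE ($).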
| Feature | Importance |
|---|---|
| grade | 0.5047 |
| sqft_living | 0.1845 |
| lat | 0.1537 |
| long | 0.0248 |
| sqft_living15 | 0.0247 |
| yr_built | 0.0212 |
| sqft_above | 0.0130 |
| sqft_lot15 | 0.0129 |
| bathrooms | 0.0125 |
| sqft_lot | 0.0119 |
| yr_renovated | 0.0112 |
| bedrooms | 0.0073 |
| date_month | 0.0069 |
| sqft_basement | 0.0063 |
| date_day | 0.0043 |
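To read the importance table at a glance, it can help to look at cumulative shares. The sketch below copies the scores from the table above into a plain dictionary (it does not read `feature_importance.json`, whose layout is not documented here) and accumulates them in descending order.

```python
# Scores copied from the feature importance table above.
importances = {
    "grade": 0.5047, "sqft_living": 0.1845, "lat": 0.1537, "long": 0.0248,
    "sqft_living15": 0.0247, "yr_built": 0.0212, "sqft_above": 0.0130,
    "sqft_lot15": 0.0129, "bathrooms": 0.0125, "sqft_lot": 0.0119,
    "yr_renovated": 0.0112, "bedrooms": 0.0073, "date_month": 0.0069,
    "sqft_basement": 0.0063, "date_day": 0.0043,
}

# Rank features from most to least important.
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
total = sum(importances.values())

cumulative = 0.0
for name, score in ranked:
    cumulative += score
    print(f"{name:<14} {score:.4f}  cumulative {cumulative:.4f}")

# Combined share of the three highest-ranked features.
top3_share = sum(score for _, score in ranked[:3])
print(f"top 3 share: {top3_share:.4f} of {total:.4f}")
```

With these numbers, grade, sqft_living, and lat together account for about 0.84 of the total importance, and the scores sum to roughly 1.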
import joblib
import numpy as np
from huggingface_hub import hf_hub_download
# Download and load model
model_path = hf_hub_download("richeechabhadiya/house-price-predictor", "xgboost_model.joblib")
model = joblib.load(model_path)
# Predict (input: 15 features as numpy array)
# Features: bedrooms, bathrooms, sqft_living, sqft_lot, grade, sqft_above,
# sqft_basement, yr_built, yr_renovated, lat, long,
# sqft_living15, sqft_lot15, date_month, date_day
sample = np.array([[3, 2.0, 1800, 7500, 7, 1800, 0, 1990, 0, 47.5, -122.2, 1700, 7500, 6, 15]])
log_price = model.predict(sample)
price_dollars = np.exp(log_price)
print(f"Predicted price: ${price_dollars[0]:,.0f}")
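Because the model takes a positional array, a swapped column silently produces a wrong price. One way to guard against that, sketched here with a hypothetical helper `to_feature_row`, is to build the row from a dictionary using the feature order listed in the comments above.

```python
import numpy as np

# Feature order as listed in the usage comments above.
FEATURE_ORDER = [
    "bedrooms", "bathrooms", "sqft_living", "sqft_lot", "grade", "sqft_above",
    "sqft_basement", "yr_built", "yr_renovated", "lat", "long",
    "sqft_living15", "sqft_lot15", "date_month", "date_day",
]


def to_feature_row(house):
    """Convert a {feature_name: value} dict into a (1, 15) array in model order."""
    missing = [name for name in FEATURE_ORDER if name not in house]
    extra = [name for name in house if name not in FEATURE_ORDER]
    if missing or extra:
        raise ValueError(f"missing features: {missing}; unexpected features: {extra}")
    return np.array([[float(house[name]) for name in FEATURE_ORDER]])


# Same example house as above, written with explicit names.
house = {
    "bedrooms": 3, "bathrooms": 2.0, "sqft_living": 1800, "sqft_lot": 7500,
    "grade": 7, "sqft_above": 1800, "sqft_basement": 0, "yr_built": 1990,
    "yr_renovated": 0, "lat": 47.5, "long": -122.2,
    "sqft_living15": 1700, "sqft_lot15": 7500, "date_month": 6, "date_day": 15,
}
row = to_feature_row(house)
print(row.shape)
```

The resulting `row` can be passed to `model.predict` in place of the hand-written `sample` array.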
import joblib
import numpy as np
from huggingface_hub import hf_hub_download
# Load all 3 models
xgb = joblib.load(hf_hub_download("richeechabhadiya/house-price-predictor", "xgboost_model.joblib"))
lgbm = joblib.load(hf_hub_download("richeechabhadiya/house-price-predictor", "lightgbm_model.joblib"))
skgb = joblib.load(hf_hub_download("richeechabhadiya/house-price-predictor", "sklearn_gb_model.joblib"))
# Ensemble prediction (average)
sample = np.array([[3, 2.0, 1800, 7500, 7, 1800, 0, 1990, 0, 47.5, -122.2, 1700, 7500, 6, 15]])
pred = (xgb.predict(sample) + lgbm.predict(sample) + skgb.predict(sample)) / 3
price = np.exp(pred)
print(f"Ensemble predicted price: ${price[0]:,.0f}")
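Since the three models predict log(price), averaging their outputs and then exponentiating gives the geometric mean of their dollar predictions, not the arithmetic mean. The short check below illustrates this with made-up log predictions; the numbers are hypothetical and no model files are loaded.

```python
import numpy as np

# Hypothetical log(price) predictions: one row per model, one column per house.
log_preds = np.array([
    [12.90, 13.40],
    [12.95, 13.35],
    [12.85, 13.45],
])

# What the ensemble snippet does: average in log space, then exponentiate.
price_from_log_mean = np.exp(log_preds.mean(axis=0))

# Equivalent view in dollar space: the geometric mean of per-model prices.
dollar_preds = np.exp(log_preds)
geometric_mean = np.prod(dollar_preds, axis=0) ** (1.0 / len(dollar_preds))

# The arithmetic mean of dollar predictions is never smaller (AM-GM inequality).
arithmetic_mean = dollar_preds.mean(axis=0)

print(price_from_log_mean)
print(geometric_mean)
print(arithmetic_mean)
```

In practice the difference is small when the models agree closely, but it means the ensemble's dollar estimate sits at or slightly below a plain average of the three dollar estimates.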
Trained with hyperparameters from NeurIPS 2022 benchmark research.
- xgboost_model.joblib: XGBoost model (2.4 MB)
- lightgbm_model.joblib: LightGBM model (2.1 MB)
- sklearn_gb_model.joblib: sklearn GradientBoosting model (1.9 MB)
- model_metadata.json: Full training metadata, results, and feature names
- feature_importance.json: Feature importance scores
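The layout of the two JSON files is not described here, so a cautious first step is to inspect their structure before relying on any particular key. The helper below is a minimal sketch: `summarize` is a hypothetical name, and the `example` document is a made-up stand-in, not the real contents of `model_metadata.json`.

```python
import json


def summarize(obj, depth=0, max_depth=2):
    """Return indented lines describing the keys and value types of parsed JSON."""
    lines = []
    if isinstance(obj, dict) and depth < max_depth:
        for key, value in obj.items():
            lines.append(f"{'  ' * depth}{key}: {type(value).__name__}")
            lines.extend(summarize(value, depth + 1, max_depth))
    return lines


# Hypothetical stand-in document; the real file may be organized differently.
example = json.loads(
    '{"feature_names": ["bedrooms", "bathrooms"],'
    ' "results": {"ensemble": {"r2": 0.8915}}}'
)
summary = summarize(example)
print("\n".join(summary))
```

For the real files, the same function can be applied to the result of `json.load` on a path returned by `hf_hub_download("richeechabhadiya/house-price-predictor", "model_metadata.json")`.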