🏠 Egypt Real Estate Price Predictor

A machine learning model trained on 18,838 Egyptian property listings to predict listing prices in Egyptian pounds (EGP).

Model Details

| Property | Value |
|---|---|
| Algorithm | XGBoost (gradient boosting) |
| Target | log1p(price in EGP) |
| R² score (log scale) | 0.647 |
| Median absolute percentage error (APE) | 27.5% |
| Training samples | 18,838 |
| Cities covered | 16 Egyptian governorates and regions |
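
The two metrics above are measured on different scales: R² on the log-transformed target, and the median absolute percentage error on prices converted back to EGP. A minimal sketch of how they could be computed, assuming an array of actual prices and an array of model outputs in log space. The function name `evaluate_log_model` and the sample numbers are illustrative and not taken from the training code:

```python
import numpy as np

def evaluate_log_model(price_true, log_pred):
    """Return (R2 on the log1p scale, median APE on the EGP scale)."""
    price_true = np.asarray(price_true, dtype=float)
    log_pred = np.asarray(log_pred, dtype=float)

    # R2 is computed against the log1p-transformed target the model was fit on.
    log_true = np.log1p(price_true)
    ss_res = np.sum((log_true - log_pred) ** 2)
    ss_tot = np.sum((log_true - log_true.mean()) ** 2)
    r2_log = 1.0 - ss_res / ss_tot

    # Median APE is computed after inverting the transform with expm1.
    price_pred = np.expm1(log_pred)
    median_ape = np.median(np.abs(price_pred - price_true) / price_true)
    return r2_log, median_ape

# Made-up values, purely for illustration.
actual = [2_000_000, 5_500_000, 12_000_000]
predicted_log = np.log1p([2_400_000, 5_000_000, 9_000_000])
r2, ape = evaluate_log_model(actual, predicted_log)
print(f"R2 (log scale): {r2:.3f}, median APE: {ape:.1%}")
```

A median APE of 27.5% means that half of the predictions are within 27.5% of the listed price and half are further off.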

Features Used

| Feature | Description |
|---|---|
| size_sqm | Property area in square meters |
| bedrooms_clean | Number of bedrooms |
| bathrooms | Number of bathrooms |
| city | Governorate or region (Cairo, Giza, North Coast, etc.) |
| property_type | Apartment, Villa, Chalet, Duplex, etc. |
| is_installment | Payment method (0 = cash, 1 = installments) |
| city_median_price | Median listing price in that city (median-based target encoding) |
| type_median_price | Median listing price for that property type |
| bed_bath_ratio | Bedrooms divided by bathrooms |
| rooms_total | Bedrooms plus bathrooms |
| size_per_room | Square meters per room (size_sqm / rooms_total) |
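
The derived features can be sketched with pandas as follows. This is an illustration rather than the original training code: the function name `add_engineered_features`, the `price` column name, and the sample rows are assumptions. The formulas mirror the ones used in the Usage section.

```python
import pandas as pd

def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Add the derived feature columns to a frame of listings.

    Assumes df has: price, size_sqm, bedrooms_clean, bathrooms,
    city, property_type.
    """
    out = df.copy()
    out["rooms_total"] = out["bedrooms_clean"] + out["bathrooms"]
    # clip(lower=1) avoids division by zero, like max(x, 1) in the usage code.
    out["bed_bath_ratio"] = out["bedrooms_clean"] / out["bathrooms"].clip(lower=1)
    out["size_per_room"] = out["size_sqm"] / out["rooms_total"].clip(lower=1)
    # Median encodings: each row gets the median price of its group.
    out["city_median_price"] = out.groupby("city")["price"].transform("median")
    out["type_median_price"] = out.groupby("property_type")["price"].transform("median")
    return out

# Made-up listings, purely for illustration.
listings = pd.DataFrame({
    "price": [3_000_000, 5_000_000, 9_000_000],
    "size_sqm": [120, 160, 300],
    "bedrooms_clean": [2, 3, 4],
    "bathrooms": [1, 2, 4],
    "city": ["Cairo", "Cairo", "Giza"],
    "property_type": ["Apartment", "Apartment", "Villa"],
})
features = add_engineered_features(listings)
print(features[["rooms_total", "size_per_room", "city_median_price"]])
```

When training, the medians should be computed on the training split only and then mapped onto validation rows. Otherwise the encoded columns leak target information.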

Cities Covered

Cairo · Giza · North Coast · Red Sea · Alexandria · Suez · Qalyubia · South Sinai · Matrouh · Al Daqahlya · Aswan · Asyut · Damietta · Luxor · Sharqia · Kafr El Sheikh

Property Types

Apartment · Villa · Chalet · Duplex · Townhouse · Twin House · Penthouse · Hotel Apartment · Other

Pipeline

The model is a sklearn.pipeline.Pipeline with two steps:

  1. ColumnTransformer: StandardScaler for numeric features, OrdinalEncoder for categorical features
  2. XGBRegressor: n_estimators=1000, learning_rate=0.02, max_depth=8

Usage

import joblib
import numpy as np
import pandas as pd
import json

# Load model and metadata
model = joblib.load("real_estate_model.pkl")
with open("model_metadata.json") as f:
    meta = json.load(f)

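def predict_price(area_sqm, bedrooms, bathrooms, city, property_type, is_installment=0):
    """Estimate the listing price (EGP) of a single property."""
    # Median lookups; unknown cities or types fall back to the global median.
    cm = meta["city_median"]
    tm = meta["type_median"]
    global_med = meta["global_median"]
    rooms = bedrooms + bathrooms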
    
    row = pd.DataFrame([{
        "size_sqm":         area_sqm,
        "bedrooms_clean":   bedrooms,
        "bathrooms":        bathrooms,
        "is_installment":   is_installment,
        "bed_bath_ratio":   bedrooms / max(bathrooms, 1),
        "rooms_total":      rooms,
        "city_median_price": cm.get(city, global_med),
        "type_median_price": tm.get(property_type, global_med),
        "size_per_room":    area_sqm / max(rooms, 1),
        "city":             city,
        "property_type":    property_type,
    }])
    
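    # The model predicts log1p(price); invert with expm1 to get EGP.
    log_price = model.predict(row)[0]
    price = float(np.expm1(log_price))  # plain float, so results are JSON-serializable
    return {
        "estimated_price_egp": round(price),
        # Fixed heuristic band (-10% / +15%), not a statistical interval.
        "price_low_egp":       round(price * 0.90),
        "price_high_egp":      round(price * 1.15),
        "price_per_sqm_egp":   round(price / area_sqm),
    }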

# Example
result = predict_price(
    area_sqm=150,
    bedrooms=3,
    bathrooms=2,
    city="Cairo",
    property_type="Apartment"
)
print(result)
# {'estimated_price_egp': 9800000, 'price_low_egp': 8820000, ...}
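
Because predict_price looks up the medians with a dictionary `.get`, a misspelled or unsupported city or property type silently falls back to the global median. A small pre-check can surface this. The sketch below assumes the metadata keys match the names listed under Cities Covered and Property Types, which has not been verified against model_metadata.json. The helper name `check_inputs` is hypothetical.

```python
# Names copied from the "Cities Covered" and "Property Types" sections.
KNOWN_CITIES = {
    "Cairo", "Giza", "North Coast", "Red Sea", "Alexandria", "Suez",
    "Qalyubia", "South Sinai", "Matrouh", "Al Daqahlya", "Aswan", "Asyut",
    "Damietta", "Luxor", "Sharqia", "Kafr El Sheikh",
}
KNOWN_TYPES = {
    "Apartment", "Villa", "Chalet", "Duplex", "Townhouse", "Twin House",
    "Penthouse", "Hotel Apartment", "Other",
}

def check_inputs(city, property_type):
    """Return warnings for values that are not in the documented lists."""
    warnings = []
    # Lookups are case-sensitive, so "cairo" does not match "Cairo".
    if city not in KNOWN_CITIES:
        warnings.append(f"Unknown city {city!r}: the global median will be used.")
    if property_type not in KNOWN_TYPES:
        warnings.append(
            f"Unknown property type {property_type!r}: the global median will be used."
        )
    return warnings

for message in check_inputs("cairo", "Apartment"):
    print(message)
```

How the encoder inside the pickled pipeline treats unseen categories is not documented here, so rejecting unknown values before calling the model may be the safer choice.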

Data Source

Listings were scraped from PropertyFinder Egypt. The data was then cleaned, outliers were removed, and the derived features listed under Features Used were added.

Limitations

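  • Prices are asking prices from market listings, not final sale prices
  • High variance is inherent to real estate: view, floor, and finish quality are not captured by the features
  • North Coast and Red Sea listings are seasonal resort properties, so their prices may behave differently from year-round residential markets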