HousePrice Analytics
Live Australian residential property market intelligence. Growth hotspot detection, XGBoost + LightGBM price predictions, and composite investment scoring — powered by ABS 6416.0 data.
What It Does
Growth Hotspot Detection
Statistical outlier analysis across the 8 Australian capital cities. Flags cities whose YoY growth is above the market average and whose 3-quarter momentum is accelerating. Scatter plots show acceleration against year-on-year movement at a glance.
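As a rough illustration of the flagging rule above, the sketch below computes YoY growth and the last three QoQ changes from a quarterly index series. It is a minimal sketch, not the project's code: the function name `flag_hotspots`, the reading of "accelerating" as three strictly increasing QoQ changes, and the index values are all assumptions made for the example.

```python
from statistics import mean


def pct_change(new, old):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def flag_hotspots(index_by_city):
    """Return cities with above-average YoY growth and accelerating momentum.

    index_by_city maps a city name to at least 5 quarterly index values,
    oldest first. "Accelerating" is an assumed definition: each of the last
    three QoQ changes is larger than the one before it.
    """
    yoy = {}
    accelerating = {}
    for city, series in index_by_city.items():
        if len(series) < 5:
            raise ValueError(f"{city}: need at least 5 quarters")
        # YoY compares the latest quarter with the same quarter a year earlier.
        yoy[city] = pct_change(series[-1], series[-5])
        # QoQ changes for the three most recent quarters, oldest first.
        qoq = [pct_change(series[i], series[i - 1]) for i in (-3, -2, -1)]
        accelerating[city] = qoq[0] < qoq[1] < qoq[2]
    market_avg = mean(yoy.values())
    return sorted(c for c in yoy if yoy[c] > market_avg and accelerating[c])


# Synthetic index values, not ABS figures.
sample_indexes = {
    "City A": [100, 101, 102.5, 104.5, 107],   # strong and speeding up
    "City B": [100, 103, 105, 106, 106.5],     # strong but slowing down
    "City C": [100, 100.5, 101, 101.5, 102],   # below the average
}
hotspots = flag_hotspots(sample_indexes)
print(hotspots)
```

City B shows why both conditions are needed: its YoY growth is above the average, but its quarterly gains are shrinking, so it is not flagged.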
ML Price Prediction
XGBoost + LightGBM ensemble trained on 5 years of ABS quarterly index data. Features include 4-quarter lag values, rolling averages, and acceleration metrics. The models are evaluated with time-series cross-validation, so the reported accuracy reflects predictions on later, unseen quarters rather than in-sample fit.
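The feature families named above (lags, rolling averages, acceleration) could be built along these lines with pandas. This is a sketch under assumptions: the column names, the 4-quarter rolling window, and the helper `build_features` are hypothetical, the sample values are synthetic, and it produces only a subset of features rather than the full 14-feature set.

```python
import pandas as pd


def build_features(index: pd.Series) -> pd.DataFrame:
    """Build lag-style features for one city's quarterly index series.

    Every feature for a given quarter uses only earlier quarters, so the
    "index" column can be used as the prediction target without leakage.
    """
    df = pd.DataFrame({"index": index})
    # Index value 1 to 4 quarters back.
    for k in range(1, 5):
        df[f"lag_{k}"] = df["index"].shift(k)
    # Quarter-on-quarter percentage change.
    qoq = (df["index"] / df["index"].shift(1) - 1) * 100
    # Shift by one quarter so each feature is known before the target quarter.
    df["qoq_lag_1"] = qoq.shift(1)
    df["roll_mean_4"] = df["index"].shift(1).rolling(4).mean()
    # Acceleration: change in QoQ growth between consecutive quarters.
    df["accel_lag_1"] = qoq.diff().shift(1)
    # Early rows lack enough history and are dropped.
    return df.dropna()


# Synthetic quarterly index values, not ABS figures.
sample = pd.Series([100.0, 101.0, 102.0, 104.0, 105.0,
                    107.0, 108.0, 110.0, 113.0, 115.0])
features = build_features(sample)
print(features.round(2))
```

The first four quarters are consumed as history, which matters with only 5 years of quarterly data: each added lag reduces the number of usable training rows.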
Investment Scoring
Composite 0–100 score per city combining: growth momentum (40%), ML prediction confidence (30%), volatility-adjusted stability (15%), and market position signals (15%). Ranked table with radar chart breakdown for top cities.
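The weighting above amounts to a weighted average of four component scores. The sketch below assumes each component has already been normalised to a 0 to 100 scale. The key names, the function `composite_score`, and the example values are hypothetical, and how each component is normalised is not covered here.

```python
# Weights in percent, taken from the description above.
WEIGHTS = {
    "growth_momentum": 40,
    "ml_confidence": 30,
    "stability": 15,
    "market_position": 15,
}


def composite_score(components):
    """Weighted average of component scores, each expected in 0..100."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    for name in WEIGHTS:
        if not 0 <= components[name] <= 100:
            raise ValueError(f"{name} must be within 0..100")
    # Weights sum to 100, so dividing by 100 keeps the result in 0..100.
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS) / 100


# Made-up component scores for one city.
example = {
    "growth_momentum": 80,
    "ml_confidence": 70,
    "stability": 60,
    "market_position": 50,
}
score = composite_score(example)
print(score)  # 69.5
```

Keeping the weights in one table that must sum to 100 makes any later change to them visible in a single place.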
Trend Analysis
Interactive price index trends with city comparison, QoQ change bars, and YoY rankings. Powered by ABS 6416.0 Residential Property Price Indexes, the same source the RBA and Treasury use.
Stack
- Docker (containerized app)
- DigitalOcean droplet
- Nginx reverse proxy
- Daily ETL cron (6am UTC)
- scikit-learn (preprocessing, CV)
- XGBoost + LightGBM ensemble
- TimeSeriesSplit cross-validation
- 14-feature lag engineering
- ABS Cat. 6416.0 (primary)
- Domain API (suburb stats)
- SQLite database
- Plotly interactive charts
What I Learned
- Time-series CV matters. Standard k-fold leaks future data into training — TimeSeriesSplit prevents this and gives a realistic performance estimate.
- ABS data is surprisingly usable. The 6416.0 RPPI dataset is clean, consistent, and goes back decades. More reliable than scraped sources for trend analysis.
- Ensemble prediction improves stability. XGBoost and LightGBM tend to diverge on edge cases, and averaging their predictions dampens the effect of either model's individual errors.
- Composite scoring needs weighting discipline. Easy to game a score by tweaking weights. Each weight needs a defensible reason — documented in the methodology.
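The time-series CV point in the list above can be checked directly with scikit-learn. The sketch below counts, for each splitter, how many folds train on rows that come after the start of the test fold. The helper `folds_training_on_future` and the 12-quarter length are assumptions for illustration. `KFold` and `TimeSeriesSplit` are the standard scikit-learn classes.

```python
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit


def folds_training_on_future(splitter, n_samples):
    """Count folds whose training rows include any row later than the test start."""
    X = np.arange(n_samples).reshape(-1, 1)  # rows are in chronological order
    leaks = 0
    for train_idx, test_idx in splitter.split(X):
        if train_idx.max() > test_idx.min():
            leaks += 1
    return leaks


n_quarters = 12  # illustrative length only
kfold_leaks = folds_training_on_future(KFold(n_splits=4), n_quarters)
tss_leaks = folds_training_on_future(TimeSeriesSplit(n_splits=4), n_quarters)
print("k-fold folds that train on future quarters:", kfold_leaks)
print("TimeSeriesSplit folds that train on future quarters:", tss_leaks)
```

With `TimeSeriesSplit`, every training window ends before its test window begins, so the count is zero. The trade-off is that the earliest folds train on very few quarters.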