eXplainable Machine Learning for Real Estate: XGBoost and Shapley Values in Price Prediction

Tris Kee, Winky K. O. Ho

Abstract


This study applies eXplainable Artificial Intelligence (XAI) to property market research, using housing transaction data from Quarry Bay, Hong Kong. The XGBoost algorithm is used to predict property prices, and SHapley Additive exPlanations (SHAP) values are then computed to quantify feature importance. A beeswarm plot visualizes the distribution of SHAP values and reveals complex relationships between prices and property characteristics. The findings show how features such as square footage and property age contribute to average price predictions, offering insights for urban planning and real estate decision-making. In contrast to traditional black-box models, this study integrates XAI methods to make the model interpretable, thereby fostering trust in AI-driven market analyses. The novelty of this research lies in its combination of machine learning and explainability techniques, which bridges the gap between predictive accuracy and interpretability in property valuation. By advancing data-driven decision-making, the study underscores the potential of XAI to promote transparency and support informed policymaking in the property market.
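The attribution step described in the abstract rests on Shapley values. The sketch below is only an illustration of that idea, not the study's pipeline. It assumes a hypothetical hand-written pricing function (`toy_price`) in place of a fitted XGBoost model, made-up feature values, and brute-force enumeration of feature coalitions in place of the tree-specific algorithm used by the SHAP library. It shows how a single prediction can be split into per-feature contributions measured against a baseline flat.

```python
from itertools import combinations
from math import factorial

FEATURES = ("area_sqft", "age_years", "floor")


def toy_price(area_sqft, age_years, floor):
    """Hypothetical pricing rule (an assumption, not the paper's model)."""
    price = 15000.0 * area_sqft             # value per square foot
    price -= 20.0 * area_sqft * age_years   # age discount grows with size (interaction)
    price += 10000.0 * floor                # premium per floor level
    return price


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(FEATURES)

    def value(coalition):
        # Features in the coalition take the instance's value;
        # all other features stay at the baseline value.
        args = {f: (x[f] if f in coalition else baseline[f]) for f in FEATURES}
        return model(**args)

    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        for k in range(len(others) + 1):
            # Shapley weight for coalitions of size k that exclude f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                without_f = set(subset)
                total += weight * (value(without_f | {f}) - value(without_f))
        phi[f] = total
    return phi


# Made-up reference flat and flat to explain (illustrative values only).
baseline = {"area_sqft": 500, "age_years": 30, "floor": 10}
flat = {"area_sqft": 700, "age_years": 10, "floor": 25}

phi = shapley_values(toy_price, flat, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+,.0f}")
```

The contributions sum to the difference between the explained prediction and the baseline prediction, which is the additivity property that makes SHAP values readable as a decomposition of a price. Enumerating every coalition costs exponential time in the number of features, so this approach only suits toy cases; for tree ensembles, the SHAP library provides a dedicated tree explainer.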

 

DOI: 10.28991/CEJ-2025-011-05-022



Keywords


Property Prices; XGBoost; Shapley Value.







Copyright (c) 2025 Winky K.O. Ho

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.