Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/92375
Title: A data analytic approach to property price prediction, influenced by geographic elements
Authors: Scicluna Calleja, Michael (2021)
Keywords: Housing -- Prices -- Malta -- Forecasting -- Mathematical models
Dwellings -- Valuation -- Malta
Business intelligence -- Malta
Machine learning
Neural networks (Computer science) -- Malta
Geographic information systems -- Malta
Issue Date: 2021
Citation: Scicluna Calleja, M. (2021). A data analytic approach to property price prediction, influenced by geographic elements (Bachelor's dissertation).
Abstract: Property sales in Malta during the COVID-19 pandemic topped €3 billion in 2020, surpassing 2019 figures. Despite this surge in property sales, interviews with local real estate executives revealed that the majority of local real estate agencies value property listings manually, without the help of any machine learning (ML) technologies. It also emerged that the value of a property is heavily influenced by its location, which is in turn characterised by its amenities. The search for the best predictive model is a popular line of research, though few studies explore the influence of external amenities on the prediction. This study set out to explore the influence of amenities on property valuation by testing whether predictive accuracy improves when proximal amenities are considered. Real estate data for the period 2015 to 2020 was sourced from a local real estate agency. Records containing blank attributes, location outliers and property types in limited supply, such as farmhouses, were removed. Prices were adjusted to mitigate the effect of price increases over the period. An online map service was used to obtain latitude and longitude values for all property listings (geocoding), as well as to extract amenities around the Maltese islands and their respective coordinates. Four types of amenities were considered: bus stops, shops, natural features and other amenities such as restaurants, bars and cafes. A tier system, shown in Figure 1, was used: for each listing, the number of amenities falling within each of the proximity thresholds (100 m, 200 m and 400 m) was stored. Two types of predictive models were developed: multi-layer perceptron (MLP) neural networks and multiple linear regression (MLR) models. For each type, several configurations were built, considering property data with no amenities, individual groups of amenities, or all amenities.
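The tier system described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the haversine distance, the cumulative counting (an amenity within 100 m also counts towards 200 m and 400 m), the function names and the coordinates are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000      # mean Earth radius in metres
THRESHOLDS_M = (100, 200, 400)  # proximity tiers named in the abstract


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def tier_counts(listing, amenities, thresholds=THRESHOLDS_M):
    """Count amenities within each threshold of a listing.

    Assumption: counts are cumulative, so an amenity 50 m away is
    counted in the 100 m, 200 m and 400 m tiers alike.
    """
    distances = [haversine_m(listing[0], listing[1], lat, lon) for lat, lon in amenities]
    return {t: sum(d <= t for d in distances) for t in thresholds}


# Hypothetical listing and bus stops (coordinates invented for illustration).
listing = (35.9000, 14.5000)
bus_stops = [
    (35.9005, 14.5000),  # roughly 56 m north
    (35.9015, 14.5000),  # roughly 167 m north
    (35.9030, 14.5000),  # roughly 334 m north
    (35.9100, 14.5000),  # roughly 1.1 km north, outside every tier
]
counts = tier_counts(listing, bus_stops)  # {100: 1, 200: 2, 400: 3}
```

Running the same function once per amenity group (bus stops, shops, natural features, other amenities) would yield twelve count features per listing under these assumptions.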
The models’ performance was assessed using the mean absolute percentage error (MAPE) and root mean squared error (RMSE), which reflect the magnitude and standard deviation of errors respectively. The MLR models tended to fare better the fewer attributes they were given. The base model, which considered solely property-specific data such as property type, locality, number of bedrooms, bathrooms, coordinates and square area, performed best with a 22.81% MAPE; all other MLR models produced higher MAPE readings. The MLP base model, on the other hand, registered a 19.21% MAPE, whilst the best performing model developed, which considered a number of amenities at different proximity measures, scored 11.69%. Since the MAPE fell by 7.52 percentage points and the RMSE by around 50% when proximal amenities were considered, this may suggest that such consideration contributes towards a more accurate prediction. The optimised MLP model was also the better overall performer, registering around 11 percentage points less error than the best performing MLR model.
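For readers unfamiliar with the two metrics, the following minimal sketch shows how MAPE and RMSE are conventionally computed, and checks the differences between the MAPE figures quoted above. The function names and the toy prices are assumptions for illustration only; the dissertation's own evaluation code is not available in this record.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    return 100 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the prices."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Toy prices in euro (invented for illustration, not taken from the study).
actual = [200_000, 300_000, 400_000]
predicted = [220_000, 270_000, 400_000]

toy_mape = mape(actual, predicted)  # about 6.67 (%)
toy_rmse = rmse(actual, predicted)  # about 20,816.66 (euro)

# Differences between the MAPE values reported in the abstract,
# expressed in percentage points.
mlp_gain = round(19.21 - 11.69, 2)    # MLP base vs best MLP: 7.52
mlp_vs_mlr = round(22.81 - 11.69, 2)  # best MLR vs best MLP: 11.12
```

Because MAPE is already a percentage, the 7.52 and roughly 11 figures are differences in percentage points rather than relative reductions.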
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/92375
Appears in Collections: Dissertations - FacICT - 2021
Dissertations - FacICTCIS - 2021

Files in This Item:
File: 21BITCB007.pdf (Restricted Access)
Size: 5.45 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.