Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/107857
Title: Crime analysis
Authors: Attard, Karl (2022)
Keywords: Crime analysis
Machine learning
Data mining
Issue Date: 2022
Citation: Attard, K. (2022). Crime analysis (Bachelor's dissertation).
Abstract: This dissertation applies data mining techniques to crime analysis, a domain that is growing in popularity as more datasets become publicly available to researchers. We focus on three main objectives: predicting the type of crime, predicting the future crime rate, and investigating how much data is required to train our models. Two distinct American public datasets are chosen for this study, one covering the city of San Francisco (SF) and the other New York City (NYC). Our first objective concerns crime type prediction, where we implemented several models, ranging from classical Machine Learning (ML) to Deep Learning (DL) methods, that classify the crime category based on the input data, mainly the time of the crime's occurrence, its location and how the crime was resolved. The results obtained are then directly compared to [31], where our main goal was to replicate and improve on their findings. The second objective covers crime rate prediction. We implemented both ML and DL regression-based models, meaning that we predict a continuous value (the expected number of future crimes), as opposed to the classification-based approach above (which predicts a category). Additionally, we integrated external data (population, unemployment rate and median income) to help us create more accurate models. As with the first objective, we replicate and improve on the results obtained by [9]. In our last objective, we investigated how much data the models need to learn from while still producing decent results. Hence, for each prior objective, we reduced the size of the training set whilst keeping the test set unchanged, to check how much data all the developed models need to retain their optimal performance. Our classification and regression models managed to produce better results than [31] and [9] respectively.
We inferred that the DL classification model was no more effective than the ML models built. We also observed that the unemployment rate is the most effective external feature when predicting the future crime rate. Finally, reducing the training sets of Objectives 1 and 2 by approximately half still allowed the models to retain decent predictive performance.
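The training-set reduction experiment described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the data are synthetic toy records (not the SF or NYC datasets), the majority-label lookup table is a hypothetical stand-in for the ML and DL models, and every function name is an assumption made for illustration. The only idea carried over from the abstract is that the training set shrinks while the test set stays fixed.

```python
import random
from collections import Counter, defaultdict


def make_synthetic_records(n, seed=0):
    """Generate toy ((time_bucket, district), category) records.

    Synthetic data for illustration only, NOT real crime data.
    """
    rng = random.Random(seed)
    categories = ["A", "B", "C"]
    records = []
    for _ in range(n):
        time_bucket = rng.randrange(24) // 8   # three 8-hour buckets
        district = rng.randrange(5)
        # The label depends on the features, with 20% of labels re-drawn at random.
        label = categories[(time_bucket + district) % 3]
        if rng.random() < 0.2:
            label = rng.choice(categories)
        records.append(((time_bucket, district), label))
    return records


def fit_majority_table(train):
    """Hypothetical stand-in model: most frequent label per feature key."""
    counts = defaultdict(Counter)
    for key, label in train:
        counts[key][label] += 1
    fallback = Counter(label for _, label in train).most_common(1)[0][0]
    table = {key: c.most_common(1)[0][0] for key, c in counts.items()}
    return table, fallback


def accuracy(model, test):
    """Fraction of test records whose label the model predicts correctly."""
    table, fallback = model
    hits = sum(table.get(key, fallback) == label for key, label in test)
    return hits / len(test)


def training_size_curve(train, test, fractions, seed=0):
    """Retrain on shrinking random subsets of `train`; `test` never changes."""
    rng = random.Random(seed)
    shuffled = train[:]
    rng.shuffle(shuffled)
    results = {}
    for frac in fractions:
        subset = shuffled[: max(1, int(len(shuffled) * frac))]
        results[frac] = accuracy(fit_majority_table(subset), test)
    return results


if __name__ == "__main__":
    data = make_synthetic_records(5000)
    train, test = data[:4000], data[4000:]
    curve = training_size_curve(train, test, [1.0, 0.5, 0.25, 0.1])
    for frac, acc in curve.items():
        print(f"{frac:.2f} of training set -> accuracy {acc:.3f}")
```

Keeping the test set fixed means the scores at different training sizes are measured on the same records, so any change in accuracy comes from the amount of training data and not from a different evaluation sample.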
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/107857
Appears in Collections:Dissertations - FacICT - 2022
Dissertations - FacICTAI - 2022

Files in This Item:
File: 2208ICTICT390900013564_1.PDF
Description: Restricted Access
Size: 1 MB
Format: Adobe PDF

