Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/76927
Full metadata record
DC Field | Value | Language
dc.date.accessioned | 2021-06-08T13:46:38Z | -
dc.date.available | 2021-06-08T13:46:38Z | -
dc.date.issued | 2020 | -
dc.identifier.citation | Spiteri, J. (2020). Automatic crime information gathering and data analytics from online news reports (Bachelor's dissertation). | en_GB
dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/76927 | -
dc.description | B.Sc. IT (Hons)(Melit.) | en_GB
dc.description.abstract | One of the major challenges faced by law enforcement is the prioritisation and rostering of resources, maximising the chances of having the right resources at the right place and at the right time. This research proposes a hybrid machine learning technology which uses a set of customised crawlers to gather data on a daily basis from newspaper articles. Articles that deal with criminal offences are identified and analysed, and their inherent details are extracted using Natural Language Processing (NLP) technology. Articles coming from different sources are converged using a standardised format that allows the details of the criminal act (such as crime, location, time, criminal, etc.) to be easily accessed. Related data, such as population and literacy figures, are also extracted from other sources using dedicated web crawlers and cross-referenced with the criminal events themselves. Web crawling is automated using a special bot designed to initiate the crawling processes regularly. A visualisation engine is proposed to allow users to quickly and effectively browse the criminal event database using a feature-rich search engine that enables specific parameters to be easily identified and depicted. Representations include geographical/calendar heat maps, graphs, etc. Previous research in similar areas has utilised various machine learning techniques with different success rates. This research aims to study the effectiveness of K-Means- and DBSCAN-based [87] technologies when applied to crime prediction. K-Means uses a purely statistical model based on past data to attempt to predict the incidence of crime, while DBSCAN uses clustering techniques which could include other datasets in addition to past criminal event data. Various datasets have been used to evaluate the performance of the proposed technology, with encouraging results. The Precision/Recall/F-Measure technique used in previous studies [85], [96] has been utilised to compute the F-Measure of both techniques. Moreover, geographically different regions (Malta and Boston) were used to evaluate different crime patterns. While the large number of possible prediction configurations makes it very difficult to cover all the possible scenarios, both techniques performed quite well, with the K-Means-based one being slightly more accurate when predicting recurring crimes. Predictions of monthly instances of specific crimes were achieved with a combined (NLP + Prediction) F-Measure of 0.78, which compares very favourably with other studies, even those that covered only prediction on a ready-made dataset without any NLP-related inaccuracies. | en_GB
dc.language.iso | en | en_GB
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB
dc.subject | Online journalism | en_GB
dc.subject | Criminal statistics | en_GB
dc.subject | Crime | en_GB
dc.subject | Natural language processing (Computer science) | en_GB
dc.subject | Machine learning | en_GB
dc.title | Automatic crime information gathering and data analytics from online news reports | en_GB
dc.type | bachelorThesis | en_GB
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB
dc.publisher.institution | University of Malta | en_GB
dc.publisher.department | Faculty of Information and Communication Technology. Department of Computer Information Systems | en_GB
dc.description.reviewed | N/A | en_GB
dc.contributor.creator | Spiteri, Janica (2020) | -
Appears in Collections: Dissertations - FacICT - 2020
Dissertations - FacICTCIS - 2020

Files in This Item:
20BITSD019.pdf (Restricted Access), 3.16 MB, Adobe PDF

