Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/52904
Title: | Mining disaster impact |
Authors: | Camilleri, Stephen |
Keywords: | Earthquakes; Earthquakes -- Press coverage; Text data mining |
Issue Date: | 2019 |
Citation: | Camilleri, S. (2019). Mining disaster impact (Master’s dissertation). |
Abstract: | News agencies play the leading role in deciding what coverage earthquakes receive. So far, very few studies discuss how news agencies react when reporting such events, and none of them specifies that data is harvested and processed on the fly, automatically and without human intervention. This study uses text mining tools to automatically map earthquake events to newspaper articles in near real time, paving the way for temporal and spatial correlation analysis. Results are presented from a software application that automatically harvests, identifies, clusters (using the No-K-Means technique) and extracts earthquake features on the fly from multilingual online news reports published by 23 leading international news agencies. A corpus of 253,129 news articles was harvested over 11 months. The earthquake-related features mined include the magnitude, location, event date, number of casualties, number of injured and structural damage caused. Cluster feature extraction is also carried out for cross-referencing purposes and alleviates challenges related to text parsing, language/translation and differences in coverage of earthquake events. Each cluster is then mapped against earthquake events aggregated from seismic readings stored by the United States Geological Survey (USGS). The quality of the data clusters was evaluated in five experiments, in which the near-optimal similarity threshold by CVI was found to be 0.2, yielding precision of 0.918, recall of 0.845, F-measure of 0.843, accuracy of 0.797, purity of 0.348 and NMI of 0.663. Fifteen experiments evaluated feature extraction in terms of mean absolute error and accuracy. The accuracy between automatically extracted and manually extracted features was 92.9% for casualties, 95.7% for injured, 94.4% for damage, 92.5% for magnitude and 81.1% for extracted dates. 
An accuracy of 84.46% was achieved when the system was evaluated as a whole, and 90% of respondents gave a good rating when evaluating the overall experience of the front-end application. This research also provides a modest impact analysis, conclusions and recommendations for future work in this domain. |
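The abstract describes clustering articles against a similarity threshold (0.2) rather than a fixed number of clusters. The sketch below illustrates that general idea only; it is not the dissertation's No-K-Means implementation, which is not given in this record. The function names `cosine` and `threshold_cluster`, the bag-of-words representation, and the greedy single-pass assignment are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def threshold_cluster(docs, threshold=0.2):
    """Greedy single-pass clustering (hypothetical stand-in, not No-K-Means).

    Each document joins the most similar existing cluster if the similarity
    to that cluster's summed term vector is at least `threshold`; otherwise
    it starts a new cluster. The number of clusters is therefore not fixed
    in advance.
    """
    clusters = []  # each entry: {"centroid": Counter, "members": [doc index]}
    for i, text in enumerate(docs):
        vec = Counter(text.lower().split())
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            best["centroid"].update(vec)  # accumulate term counts
            best["members"].append(i)
    return [c["members"] for c in clusters]


# Invented example headlines, not data from the study.
articles = [
    "earthquake hits chile coast",
    "chile earthquake coast damage",
    "football final tonight",
]
print(threshold_cluster(articles, threshold=0.2))
```

With a low threshold such as 0.2, articles sharing only a few terms are grouped together; raising the threshold splits them into more, smaller clusters.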
Description: | M.SC. ARTIFICIAL INTELLIGENCE |
URI: | https://www.um.edu.mt/library/oar/handle/123456789/52904 |
Appears in Collections: | Dissertations - FacICT - 2019; Dissertations - FacICTAI - 2019 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
19MAIPT003.pdf (Restricted Access) | | 2.04 MB | Adobe PDF | |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.