Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/29528
Title: EXCITE : extracting geographic information from text
Authors: Farrugia, James
Keywords: Natural language processing (Computer science)
Computational linguistics
Geography -- Data processing
Issue Date: 2017
Abstract: Geotagging is an automatic process that detects geographic locations in unstructured text and maps the detected locations to latitude-longitude pairs. EXCITE is a geotagging system that is applied to a number of online local news articles and other official notices, such as police road closures and scheduled power cuts. It detects references to localities and streets of the Maltese Islands within collections of unstructured text (e.g. news reports). In addition, EXCITE visualises the detected locations on a map of the Maltese Islands, accessed via a simple, interactive interface. The system accepts news articles written in either English or Maltese. EXCITE is available online at http://ec2-54-244-103-196.us-west-2.compute.amazonaws.com/. Geotagging is performed using a gazetteer-based lookup method. Our gazetteer contains references to localities and streets mapped to latitude-longitude pairs, organised as a hierarchy. It has been constructed for the Maltese Islands from freely available OpenStreetMap data. The geotagging system applies Named Entity Recognition (NER) to English articles and, given the absence of NLP tools for Maltese, applies a number of rules to extract locations from Maltese articles. The system must resolve ambiguities, such as when a detection can refer either to a location or to another entity, or when a detection can refer to two distinct locations. EXCITE uses the gazetteer to resolve these ambiguities and to ground each detection to a latitude-longitude pair. EXCITE also introduces the geotagging process to the Maltese language and lays groundwork for the detection of location entities despite the lack of language tools for Maltese. EXCITE achieved satisfactory results when evaluated. The results show that the gazetteer contains references to all localities in Malta and Gozo and to approximately 75% of streets.
EXCITE also detects 91% and 86% of the referenced locations in English and Maltese news articles respectively. These results are very encouraging. The achieved results for the Maltese language are particularly satisfying since the implemented methods were constructed from scratch and make no use of language tools such as NERs.
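The gazetteer-based lookup and disambiguation described in the abstract can be illustrated with a minimal sketch. This is not EXCITE's implementation: the `GAZETTEER` structure, the `geotag` function, and the rule of preferring a street whose parent locality is mentioned in the same text are all assumptions made for illustration, and the coordinates are rough placeholder values rather than data from the dissertation or from OpenStreetMap.

```python
import re

# Hypothetical two-level gazetteer: locality -> coordinates and its streets.
# Coordinates are rough illustrative values, not data taken from EXCITE.
GAZETTEER = {
    "Valletta": {"coords": (35.899, 14.514),
                 "streets": {"Republic Street": (35.898, 14.513)}},
    "Victoria": {"coords": (36.044, 14.240),
                 "streets": {"Republic Street": (36.045, 14.243)}},
    "Sliema": {"coords": (35.912, 14.504), "streets": {}},
}


def geotag(text):
    """Return a list of (name, locality, (lat, lon)) tuples found in text."""

    def mentioned(name):
        # Whole-word, case-insensitive match of a gazetteer name.
        pattern = r"\b" + re.escape(name) + r"\b"
        return re.search(pattern, text, re.IGNORECASE) is not None

    # Step 1: ground every locality that appears in the text.
    localities = [loc for loc in GAZETTEER if mentioned(loc)]
    results = [(loc, loc, GAZETTEER[loc]["coords"]) for loc in localities]

    # Step 2: ground streets, using the hierarchy to resolve ambiguity.
    street_names = {s for entry in GAZETTEER.values() for s in entry["streets"]}
    for street in sorted(street_names):
        if not mentioned(street):
            continue
        candidates = [loc for loc, e in GAZETTEER.items() if street in e["streets"]]
        # Assumed rule: prefer a candidate whose locality is also mentioned.
        preferred = [loc for loc in candidates if loc in localities]
        if len(preferred) == 1 or (not preferred and len(candidates) == 1):
            loc = (preferred or candidates)[0]
            results.append((street, loc, GAZETTEER[loc]["streets"][street]))
        # Otherwise the street stays ungrounded because it is still ambiguous.
    return results
```

In this sketch, a notice such as "Police closed Republic Street in Valletta this morning." grounds the street to the Valletta entry, while "Republic Street is closed." alone returns nothing because two localities share that street name.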
Description: B.SC.IT(HONS)
URI: https://www.um.edu.mt/library/oar//handle/123456789/29528
Appears in Collections: Dissertations - FacICT - 2017
Dissertations - FacICTAI - 2017

Files in This Item:
File: 17BITAI013.pdf (Restricted Access)
Size: 4.41 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.