Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/95508
Title: Cross document coreference resolution and disambiguation for named entities in user web history documents
Authors: Zammit, Matthew Joseph (2012)
Keywords: Computational linguistics
Search engines
Natural language processing (Computer science)
Issue Date: 2012
Citation: Zammit, M. J. (2012). Cross document coreference resolution and disambiguation for named entities in user web history documents (Bachelor's dissertation).
Abstract: At present, search engine technology measures relevance against the submitted query rather than against the information needs of the user. This is not ideal, since different users submit identical queries for different information needs. One reason is ambiguity between named entities such as persons, organisations and locations. This dissertation attempts to solve the problem on the client side, using a baseline streaming cross-document coreference resolution approach to discover and disambiguate named entities in the user's web history. Several orthographic and contextual similarity measures are used for this task, including tests involving the Dice score and topic features. A cosine similarity measure is then used to calculate the similarity between a named entity and the existing clusters. The final score determines whether the named entity is merged into an existing cluster or forms a new one. Queries submitted to the search engine are then expanded using coreferences from the cluster most similar to the query. The system is evaluated for relevance and accuracy on the WePS-2007 test corpus.
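The merge-or-create step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: the function names (`dice`, `cosine`, `assign`), the character-bigram form of the Dice score, the bag-of-words context vectors, the equal weighting of the two scores and the 0.6 threshold are all hypothetical choices made for the example.

```python
from collections import Counter
from math import sqrt


def bigrams(text):
    """Character bigrams of a lower-cased name, as a set."""
    s = text.lower()
    return {s[i:i + 2] for i in range(len(s) - 1)}


def dice(a, b):
    """Dice coefficient between the bigram sets of two names."""
    x, y = bigrams(a), bigrams(b)
    if not x or not y:
        return 0.0
    return 2 * len(x & y) / (len(x) + len(y))


def cosine(u, v):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


def assign(mention, context, clusters, threshold=0.6, weight=0.5):
    """Merge a mention into its best-scoring cluster, or start a new cluster.

    threshold and weight are illustrative values (assumptions), not figures
    taken from the dissertation.
    """
    ctx = Counter(context)
    best, best_score = None, 0.0
    for cluster in clusters:
        # Orthographic evidence: best Dice match against any name in the cluster.
        name_sim = max(dice(mention, n) for n in cluster["names"])
        # Contextual evidence: cosine between the mention and cluster contexts.
        score = weight * name_sim + (1 - weight) * cosine(ctx, cluster["context"])
        if score > best_score:
            best, best_score = cluster, score
    if best is not None and best_score >= threshold:
        best["names"].add(mention)
        best["context"].update(ctx)
        return best
    new = {"names": {mention}, "context": ctx}
    clusters.append(new)
    return new


# Mentions arrive one at a time, as in a streaming setting.
clusters = []
assign("John Smith", ["football", "goal", "league"], clusters)
assign("J. Smith", ["football", "league", "transfer"], clusters)
assign("John Smith", ["physics", "quantum", "lecture"], clusters)
```

With these assumed settings, the second mention joins the first cluster because both its spelling and its context are similar, while the third mention shares the name but not the context and so starts a new cluster.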
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/95508
Appears in Collections: Dissertations - FacICT - 2012
Dissertations - FacICTCS - 2010-2015

Files in This Item:
BSC(HONS)ICT_Zammit, Matthew Joseph_2012.PDF (Restricted Access; 4.63 MB; Adobe PDF)

