Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/95508
Full metadata record
DC Field | Value | Language
dc.date.accessioned | 2022-05-11T09:57:12Z | -
dc.date.available | 2022-05-11T09:57:12Z | -
dc.date.issued | 2012 | -
dc.identifier.citation | Zammit, M. J. (2012). Cross document coreference resolution and disambiguation for named entities in user web history documents (Bachelor's dissertation). | en_GB
dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/95508 | -
dc.description | B.Sc. IT (Hons)(Melit.) | en_GB
dc.description.abstract | At present, search engine technology does not measure relevance according to the information needs of the user, but rather to the query searched. This is not an ideal approach since different users use identical queries for different information needs. One of the reasons this may happen is because of ambiguity between named entities such as persons, organisations, locations, etc. This dissertation attempts to solve the problem from the client's side by using a baseline streaming cross document coreference resolution approach to discover and disambiguate named entities from the user's web history. Several orthographic and contextual similarity measures are used for this task, including tests involving dice score and topic features. Cosine similarity measure is then used to calculate the similarity between the named entity and the clusters. The final score dictates whether the named entity is to be merged into a cluster or to be formed into a new one. Queries submitted to the search engine are then expanded by using coreference from the most similar cluster to that query. In order to evaluate the system, the WePS-2007 testing corpus is used for relevancy and accuracy. | en_GB
dc.language.iso | en | en_GB
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB
dc.subject | Computational linguistics | en_GB
dc.subject | Search engines | en_GB
dc.subject | Natural language processing (Computer science) | en_GB
dc.title | Cross document coreference resolution and disambiguation for named entities in user web history documents | en_GB
dc.type | bachelorThesis | en_GB
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB
dc.publisher.institution | University of Malta | en_GB
dc.publisher.department | Faculty of Information and Communication Technology. Department of Computer Science | en_GB
dc.description.reviewed | N/A | en_GB
dc.contributor.creator | Zammit, Matthew Joseph (2012) | -
Appears in Collections: Dissertations - FacICT - 2012
Dissertations - FacICTCS - 2010-2015
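
The abstract in the record above describes a streaming merge-or-create clustering step driven by orthographic and contextual similarity. The following is a minimal illustrative sketch of that general idea, not the dissertation's implementation (the full text is restricted). Everything specific here is an assumption made for illustration: the character-bigram form of the Dice score, the bag-of-words context vectors, the equal weighting (`w_name`), the `threshold` value, and the names `Cluster` and `assign`. Topic features and query expansion are not modelled.

```python
from collections import Counter
from math import sqrt


def dice(a: str, b: str) -> float:
    """Dice coefficient over character bigrams of two names (assumed form)."""
    def bigrams(s: str) -> Counter:
        s = s.lower()
        return Counter(s[i:i + 2] for i in range(len(s) - 1))

    x, y = bigrams(a), bigrams(b)
    total = sum(x.values()) + sum(y.values())
    overlap = sum((x & y).values())  # multiset intersection
    return 2 * overlap / total if total else 0.0


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = sqrt(sum(c * c for c in u.values())) * sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0


class Cluster:
    """Hypothetical cluster: the name variants seen so far plus pooled context counts."""

    def __init__(self, name, context):
        self.names = [name]
        self.context = Counter(context)

    def add(self, name, context):
        self.names.append(name)
        self.context.update(context)


def assign(clusters, name, context, threshold=0.6, w_name=0.5):
    """Merge the mention into its best-scoring cluster, or start a new one.

    threshold and w_name are illustrative values, not taken from the source.
    Returns the index of the cluster the mention ended up in.
    """
    ctx = Counter(context)
    best_idx, best_score = -1, 0.0
    for i, c in enumerate(clusters):
        name_sim = max(dice(name, n) for n in c.names)
        score = w_name * name_sim + (1 - w_name) * cosine(ctx, c.context)
        if score > best_score:
            best_idx, best_score = i, score
    if best_idx >= 0 and best_score >= threshold:
        clusters[best_idx].add(name, context)
        return best_idx
    clusters.append(Cluster(name, context))
    return len(clusters) - 1


clusters = []
first = assign(clusters, "John Smith", ["football", "goal", "league"])
second = assign(clusters, "J. Smith", ["football", "league", "match"])
third = assign(clusters, "John Smith", ["guitar", "album", "tour"])
print(first, second, third)
```

Under these assumed settings, the "J. Smith" mention joins the first cluster because both the name and the context overlap, while the second "John Smith" mention starts a new cluster: the identical name alone does not reach the threshold when the contexts share no terms.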

Files in This Item:
File: BSC(HONS)ICT_Zammit, Matthew Joseph_2012.PDF (Restricted Access) | Size: 4.63 MB | Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.