Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/36735
Full metadata record
DC Field | Value | Language
dc.date.accessioned | 2018-11-27T08:30:43Z | -
dc.date.available | 2018-11-27T08:30:43Z | -
dc.date.issued | 2018 | -
dc.identifier.citation | Camilleri, C. (2018). Extracting semantically similar words : building a distributional semantic model for Maltese (Bachelor's dissertation). | en_GB
dc.identifier.uri | https://www.um.edu.mt/library/oar//handle/123456789/36735 | -
dc.description | B.SC.(HONS)HUMAN LANGUAGE TECH. | en_GB
dc.description.abstract | This dissertation describes the process of building a Distributional Semantic Model (DSM) for Maltese, with the aim of automatically generating semantically similar words for a large list of target words. The input to the system is a sub-corpus of the MLRS corpus consisting of news articles. A program extracts the neighbouring words and their frequencies from this corpus, keeping the top 110,000 context words and the top 15,000 nouns among those context words. The model uses the bag-of-words approach and selects contexts from a window of one word preceding and one word following the target word. To evaluate the quality of the DSM, a well-known evaluation data set for English, the WordSim-353 dataset, was translated from English to Maltese, and similarity scores were given by 8 participants. The DSM is evaluated using the Spearman and Pearson correlation tests to measure how closely its semantic similarity scores for word pairs match the human judgements. Results show that the performance of the DSM lies within the range of performances obtained with DSMs for the English language using different setups. Its performance is, admittedly, closer to the lower end of the scale. The rich morphology of the Maltese language, the nature of the evaluation set, and the fact that we only experimented with a limited number of settings are factors that probably played a major role in this outcome. | en_GB
dc.language.iso | en | en_GB
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB
dc.subject | Maltese language -- Semantics | en_GB
dc.subject | Maltese language -- Morphology | en_GB
dc.subject | Maltese language -- Synonyms and antonyms | en_GB
dc.title | Extracting semantically similar words : building a distributional semantic model for Maltese | en_GB
dc.type | bachelorThesis | en_GB
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB
dc.publisher.institution | University of Malta | en_GB
dc.publisher.department | Institute of Linguistics and Language Technology | en_GB
dc.description.reviewed | N/A | en_GB
dc.contributor.creator | Camilleri, Christabelle | -
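
Illustrative note on the method summarised in the abstract: the dissertation itself is under restricted access, so the sketch below is not the author's implementation. It only illustrates the general technique the abstract names, namely counting bag-of-words contexts in a window of one word on each side of a target, comparing words by the cosine of their count vectors, and scoring the model against human ratings with the Pearson and Spearman correlations. The toy English sentences, the human scores, and all function names are hypothetical placeholders, and the choice of cosine as the similarity measure is an assumption.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(sentences, window=1):
    """Count context words within `window` tokens on each side of every target."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = sqrt(sum(c * c for c in u.values()))
    norm_v = sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation; returns NaN when either input has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation, computed as Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical toy corpus (English placeholders, not MLRS data).
sentences = [
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "eats"],
    ["a", "car", "drives"],
]
vectors = build_vectors(sentences, window=1)

# Hypothetical human similarity ratings for three word pairs.
pairs = [("cat", "dog"), ("cat", "car"), ("dog", "car")]
human_scores = [8.0, 2.0, 1.5]
model_scores = [cosine(vectors[a], vectors[b]) for a, b in pairs]

print("Pearson:", pearson(model_scores, human_scores))
print("Spearman:", spearman(model_scores, human_scores))
```

A real setup would differ in scale and preprocessing: the abstract mentions restricting the vocabulary to the top 110,000 context words and 15,000 nouns, which this sketch does not attempt.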
Appears in Collections: Dissertations - InsLin - 2018

Files in This Item:
File: 18BSCHLT002.pdf (Restricted Access)
Size: 1.84 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.