Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/36735
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.date.accessioned | 2018-11-27T08:30:43Z | - |
dc.date.available | 2018-11-27T08:30:43Z | - |
dc.date.issued | 2018 | - |
dc.identifier.citation | Camilleri, C. (2018). Extracting semantically similar words : building a distributional semantic model for Maltese (Bachelor's dissertation). | en_GB |
dc.identifier.uri | https://www.um.edu.mt/library/oar//handle/123456789/36735 | - |
dc.description | B.SC.(HONS)HUMAN LANGUAGE TECH. | en_GB |
dc.description.abstract | This dissertation describes the process of building a Distributional Semantic Model (DSM) for Maltese, with the aim of automatically generating semantically similar words for a large list of target words. Input to the system is a sub-corpus of the MLRS corpus, which consists of news articles. A program is created to extract the neighbouring words, as well as their frequencies, from this corpus, extracting the top 110,000 context words and the top 15,000 nouns from those context words. The model uses the bag-of-words approach and selects contexts from a window of one word preceding and one word following the target word. To evaluate the quality of the DSM, a well-known evaluation data set for English, the WordSim-353 dataset, was translated from English to Maltese, and similarity scores were given by 8 participants. The DSM is evaluated using the Spearman and Pearson tests to see how closely it matches the human judgements when generating the semantic similarity between two words. Results show that the performance of the DSM lies within the range of performances obtained with DSMs for the English language using different setups. Its performance is, admittedly, closer to the lower end of the scale. The rich morphology of the Maltese language, the nature of the evaluation set, and the fact that we only experimented with a limited number of settings are factors that probably played a major role in this outcome. | en_GB |
dc.language.iso | en | en_GB |
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB |
dc.subject | Maltese language -- Semantics | en_GB |
dc.subject | Maltese language -- Morphology | en_GB |
dc.subject | Maltese language -- Synonyms and antonyms | en_GB |
dc.title | Extracting semantically similar words : building a distributional semantic model for Maltese | en_GB |
dc.type | bachelorThesis | en_GB |
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB |
dc.publisher.institution | University of Malta | en_GB |
dc.publisher.department | Institute of Linguistics and Language Technology | en_GB |
dc.description.reviewed | N/A | en_GB |
dc.contributor.creator | Camilleri, Christabelle | - |
Appears in Collections: | Dissertations - InsLin - 2018 |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
18BSCHLT002.pdf | Restricted Access | 1.84 MB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
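The abstract in the record above outlines a pipeline: count the words that occur one position before and after each target word, compare the resulting count vectors, and correlate the model's similarity scores with human ratings using Spearman and Pearson tests. The sketch below illustrates that general idea in Python under several assumptions that the record does not confirm: raw co-occurrence counts are used as weights, cosine is used as the similarity measure, and the function names (`build_vectors`, `cosine`, `ranks`, `pearson`, `spearman`) and the toy sentences are purely illustrative. It is not the dissertation's implementation.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_vectors(sentences, window=1):
    """Count context words within `window` positions of each target word."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, target in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[target][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse count vectors (an assumed measure)."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Toy data for illustration only; not taken from the MLRS corpus.
sentences = [
    ["the", "cat", "eats", "fish"],
    ["the", "dog", "eats", "meat"],
    ["a", "car", "needs", "fuel"],
]
vectors = build_vectors(sentences, window=1)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_car = cosine(vectors["cat"], vectors["car"])
```

In this toy corpus "cat" and "dog" share both neighbours, so their similarity is close to 1, while "cat" and "car" share none and score 0. With real data, one list would hold the model's scores for each translated WordSim-353 pair and another the mean human ratings, and `spearman` and `pearson` would be applied to those two lists.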