Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/119150
Title: | Towards content accessibility through lexical simplification for Maltese as a low-resource language |
Authors: | Meli, Martina; Tanti, Marc; Porter, Chris |
Keywords: | Natural language processing (Computer science); Computational linguistics -- Malta; Maltese language -- Lexicology; Semantic computing; Computer architecture -- Evaluation |
Issue Date: | 2024-03 |
Publisher: | European Chapter of the Association for Computational Linguistics (EACL) |
Citation: | Meli, M., Tanti, M., & Porter, C. (2024, March). Towards Content Accessibility Through Lexical Simplification for Maltese as a Low-Resource Language. Fourth Workshop on Language Technology for Equality, Diversity, Inclusion (LT-EDI 2024) at the 18th Conference of the European Chapter of the Association for Computational Linguistics, St. Julian's, Malta. 41-51. |
Abstract: | Natural Language Processing techniques have been developed to assist in simplifying online content while preserving meaning. However, for low-resource languages like Maltese, numerous challenges and limitations remain. Lexical Simplification (LS) is a core technique typically adopted to improve content accessibility, and has been widely studied for high-resource languages such as English and French. Motivated by the need to improve access to Maltese content and the limitations in this context, this work set out to develop and evaluate an LS system for Maltese text. An LS pipeline was developed consisting of (1) potential complex word identification, (2) substitute generation, (3) substitute selection, and (4) substitute ranking. An evaluation data set was developed to assess the performance of each step. Results are encouraging and point to several directions for future work. Finally, a single-blind study was carried out with over 200 participants, where the system’s perceived quality in text simplification was evaluated. Results suggest that meaning is retained about 50% of the time, and when meaning is retained, about 70% of system-generated sentences are perceived as either simpler than or equally simple as the original. Challenges remain, and this study proposes a number of areas that may benefit from further research. |
URI: | https://www.um.edu.mt/library/oar/handle/123456789/119150; https://aclanthology.org/2024.ltedi-1.5 |
Appears in Collections: | Scholarly Works - FacICTCIS |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Towards content accessibility through lexical simplification for Maltese as a low resource language 2024.pdf | | 420.95 kB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.