Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/119150
Title: Towards content accessibility through lexical simplification for Maltese as a low-resource language
Authors: Meli, Martina
Tanti, Marc
Porter, Chris
Keywords: Natural language processing (Computer science)
Computational linguistics -- Malta
Maltese language -- Lexicology
Semantic computing
Computer architecture -- Evaluation
Issue Date: 2024-03
Publisher: European Chapter of the Association for Computational Linguistics (EACL)
Citation: Meli, M., Tanti, M., & Porter, C. (2024, March). Towards Content Accessibility Through Lexical Simplification for Maltese as a Low-Resource Language. Fourth Workshop on Language Technology for Equality, Diversity, Inclusion (LT-EDI 2024) at the 18th Conference of the European Chapter of the Association for Computational Linguistics, St. Julian's, Malta. 41-51.
Abstract: Natural Language Processing techniques have been developed to assist in simplifying online content while preserving meaning. However, for low-resource languages, like Maltese, there are still numerous challenges and limitations. Lexical Simplification (LS) is a core technique typically adopted to improve content accessibility, and has been widely studied for high-resource languages such as English and French. Motivated by the need to improve access to Maltese content and the limitations in this context, this work set out to develop and evaluate an LS system for Maltese text. An LS pipeline was developed consisting of (1) potential complex word identification, (2) substitute generation, (3) substitute selection, and (4) substitute ranking. An evaluation data set was developed to assess the performance of each step. Results are encouraging and open up numerous avenues for future work. Finally, a single-blind study was carried out with over 200 participants, where the system’s perceived quality in text simplification was evaluated. Results suggest that meaning is retained about 50% of the time, and when meaning is retained, about 70% of system-generated sentences are either perceived as simpler or of equal simplicity to the original. Challenges remain, and this study proposes a number of areas that may benefit from further research.
URI: https://www.um.edu.mt/library/oar/handle/123456789/119150
https://aclanthology.org/2024.ltedi-1.5
Appears in Collections: Scholarly Works - FacICTCIS


