Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/126776
Title: | The applicability of Wav2Vec2 and Whisper for low-resource Maltese ASR |
Authors: | Williams, Aiden; DeMarco, Andrea; Borg, Claudia |
Keywords: | Automatic speech recognition; Low-resource languages; Natural language processing (Computer science); Computational linguistics; Speech processing systems |
Issue Date: | 2023 |
Publisher: | SIGUL |
Citation: | Williams, A., DeMarco, A., & Borg, C. (2023). The Applicability of Wav2Vec2 and Whisper for Low-Resource Maltese ASR. 2nd Annual Meeting of the ELRA/ISCA SIG on Under-resourced Languages (SIGUL 2023), Dublin. 39-43. |
Abstract: | Maltese is a low-resource language with limited digital tools, including automatic speech recognition. With very limited datasets of Maltese speech available, a recent project, MASRI, developed further speech datasets and produced an initial prototype trained using the Jasper architecture. The best system achieved 55.05% WER on the MASRI test set. Our work builds upon this, producing a further two-and-a-half-hour annotated speech corpus from a domain in which no data was previously available (Parliament of Malta). Moreover, we experiment with existing pre-trained self-supervised models (Wav2Vec2.0 and Whisper) and further fine-tune these models on Maltese annotated data. A total of 30 Maltese ASR models are trained and evaluated using the WER and the CER. The results indicate that the performance of the models scales with the quantity of data, although not linearly. The best model achieves state-of-the-art results of 8.53% WER and 1.93% CER on a test set extracted from the CommonVoice project and 24.98% WER and 8.37% CER on the MASRI test set. |
URI: | https://www.um.edu.mt/library/oar/handle/123456789/126776 |
Appears in Collections: | Scholarly Works - InsSSA |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
The_applicability_of_Wav2Vec2_and_whisper_for_low_resource_Maltese_ASR.pdf | | 333.52 kB | Adobe PDF |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.