Please use this identifier to cite or link to this item:
https://www.um.edu.mt/library/oar/handle/123456789/120551
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.date.accessioned | 2024-04-09T05:50:52Z | - |
dc.date.available | 2024-04-09T05:50:52Z | - |
dc.date.issued | 2023 | - |
dc.identifier.citation | Samin, A.M. (2023). Exploring parameter-efficient adapters for low-resource automatic speech recognition (Master's dissertation). | en_GB |
dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/120551 | - |
dc.description | M.Sc. (HLST)(Melit.) | en_GB |
dc.description.abstract | Parameter-efficient adapter modules have been leveraged in pre-trained speech models for speech processing tasks such as automatic speech recognition (ASR) in recent years. An adapter, integrated into these pre-trained speech models, typically consists of two feed-forward layers that are trained while the pre-trained backbone is kept frozen. Despite their growing use in ASR, a comprehensive exploration of adapters remains lacking, leaving several research questions unanswered. In this thesis, we employ adapter-based tuning on two state-of-the-art pre-trained models, XLS-R and MMS, and compare it with the complete fine-tuning approach. Our study investigates the data requirements for adapter-tuning and reveals that while adapters are unsuited for few-shot learning, they exhibit competitive performance compared to full fine-tuning when at least 10 hours of labeled speech data are available. We also demonstrate that adapter-tuning the larger XLS-R model with 2 billion parameters outperforms fine-tuning the entire XLS-R 2B model. This likely arises because larger models are susceptible to overfitting during full fine-tuning, a challenge effectively circumvented by training only the adapters while leveraging the pre-trained knowledge. Moreover, our experiments suggest that more pre-training data may help adapter-tuning work well. Additionally, we perform separate experiments on transfer learning with adapters and on scaling the adapter modules with more feed-forward layers, yielding valuable insights. To the best of our knowledge, this exhaustive study is pioneering in its exploration of adapters for ASR, contributing significant insights to this evolving technology. | en_GB |
dc.language.iso | en | en_GB |
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB |
dc.subject | Automatic speech recognition | en_GB |
dc.subject | Neural networks (Computer science) | en_GB |
dc.subject | Feedforward control systems | en_GB |
dc.title | Exploring parameter-efficient adapters for low-resource automatic speech recognition | en_GB |
dc.type | masterThesis | en_GB |
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB |
dc.publisher.institution | University of Malta | en_GB |
dc.publisher.department | Faculty of Information and Communication Technology. Department of Artificial Intelligence | en_GB |
dc.description.reviewed | N/A | en_GB |
dc.contributor.creator | Samin, Ahnaf Mozib (2023) | - |
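
The abstract above describes an adapter as two feed-forward layers inserted into a frozen pre-trained speech encoder such as XLS-R or MMS, with only the adapter weights being trained. The following is a minimal PyTorch sketch of that idea, not the author's exact implementation: the hidden size, bottleneck dimension, residual connection, and placement inside the encoder block are illustrative assumptions.

```python
# Minimal sketch of a bottleneck adapter (two feed-forward layers) added to a
# frozen backbone layer. Dimensions and placement are assumptions, not the
# dissertation's exact configuration.

import torch
import torch.nn as nn


class Adapter(nn.Module):
    """Down-project, non-linearity, up-project; only these weights are trained."""

    def __init__(self, hidden_dim: int = 1024, bottleneck_dim: int = 256):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # first feed-forward layer
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # second feed-forward layer
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the pre-trained representation.
        return x + self.up(self.act(self.down(x)))


class AdaptedEncoderLayer(nn.Module):
    """Wraps one pre-trained encoder layer: the wrapped layer stays frozen,
    and only the adapter parameters receive gradients."""

    def __init__(self, pretrained_layer: nn.Module, hidden_dim: int = 1024):
        super().__init__()
        self.layer = pretrained_layer
        for p in self.layer.parameters():
            p.requires_grad = False  # freeze the pre-trained backbone
        self.adapter = Adapter(hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.adapter(self.layer(x))


if __name__ == "__main__":
    # Stand-in for one frozen transformer block of a model like XLS-R or MMS.
    backbone_block = nn.Sequential(nn.Linear(1024, 1024), nn.GELU())
    layer = AdaptedEncoderLayer(backbone_block)
    out = layer(torch.randn(2, 50, 1024))  # (batch, frames, hidden)
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    print(out.shape, f"trainable params: {trainable}")
```

In this setup the trainable parameter count is dominated by the two adapter projections, which is what makes the approach parameter-efficient compared with updating the full backbone.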
Appears in Collections: | Dissertations - FacICT - 2023; Dissertations - FacICTAI - 2023 |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
2318ICTCSA531005079269_1.PDF | Restricted Access | 1.12 MB | Adobe PDF |