Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/120551
Full metadata record
DC Field | Value | Language
dc.date.accessioned | 2024-04-09T05:50:52Z | -
dc.date.available | 2024-04-09T05:50:52Z | -
dc.date.issued | 2023 | -
dc.identifier.citation | Samin, A.M. (2023). Exploring parameter-efficient adapters for low-resource automatic speech recognition (Master's dissertation). | en_GB
dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/120551 | -
dc.description | M.Sc. (HLST)(Melit.) | en_GB
dc.description.abstract | Parameter-efficient adapter modules have been leveraged in pre-trained speech models for speech processing tasks such as automatic speech recognition (ASR) in recent years. An adapter, integrated into these pre-trained speech models, typically consists of two feed-forward layers that are trained while the pre-trained backbone is kept frozen. Despite their emergence for ASR, a comprehensive exploration of adapters remains lacking, leaving several research questions unanswered. In this thesis, we employ adapter-based tuning on two state-of-the-art pre-trained models, XLS-R and MMS, and compare it with full fine-tuning. Our study investigates the data requirements for adapter-tuning and reveals that, while adapters are unsuited for few-shot learning, they perform competitively with full fine-tuning when at least 10 hours of labeled speech data are available. We also demonstrate that adapter-tuning the larger XLS-R model with 2 billion parameters outperforms fine-tuning the entire XLS-R 2B model. This likely arises from the susceptibility of larger models to overfitting during full fine-tuning, a problem avoided by training only the adapters while leveraging the pre-trained knowledge. Moreover, our experiments reveal that more pre-training data might help adapter-tuning work well. Additionally, we perform separate experiments on transfer learning with adapters and on scaling the adapter modules with more feed-forward layers, yielding valuable insights. To the best of our knowledge, this exhaustive study is pioneering in its exploration of adapters for ASR, contributing significant insights to this evolving technology. | en_GB
dc.language.iso | en | en_GB
dc.rights | info:eu-repo/semantics/restrictedAccess | en_GB
dc.subject | Automatic speech recognition | en_GB
dc.subject | Neural networks (Computer science) | en_GB
dc.subject | Feedforward control systems | en_GB
dc.title | Exploring parameter-efficient adapters for low-resource automatic speech recognition | en_GB
dc.type | masterThesis | en_GB
dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB
dc.publisher.institution | University of Malta | en_GB
dc.publisher.department | Faculty of Information and Communication Technology. Department of Artificial Intelligence | en_GB
dc.description.reviewed | N/A | en_GB
dc.contributor.creator | Samin, Ahnaf Mozib (2023) | -
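The abstract above describes an adapter as two feed-forward layers trained on top of a frozen pre-trained backbone such as XLS-R or MMS. The sketch below is a minimal, illustrative PyTorch bottleneck adapter under those assumptions; it is not taken from the dissertation, and the module names, the bottleneck width of 256, and the freeze_backbone helper are hypothetical choices for illustration only.

    import torch
    import torch.nn as nn

    class Adapter(nn.Module):
        """Bottleneck adapter: two feed-forward layers (down- and up-projection) with a residual connection."""
        def __init__(self, hidden_dim: int, bottleneck_dim: int = 256):
            super().__init__()
            self.down = nn.Linear(hidden_dim, bottleneck_dim)  # first feed-forward layer (down-projection)
            self.act = nn.GELU()
            self.up = nn.Linear(bottleneck_dim, hidden_dim)    # second feed-forward layer (up-projection)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Residual connection: the adapter output is added to the frozen backbone's representation.
            return x + self.up(self.act(self.down(x)))

    def freeze_backbone(backbone: nn.Module, adapters: nn.ModuleList) -> None:
        # Freeze every backbone parameter; only the adapter parameters stay trainable.
        for p in backbone.parameters():
            p.requires_grad = False
        for p in adapters.parameters():
            p.requires_grad = True

In adapter-tuning of this kind, one such module is typically inserted after each transformer block of the pre-trained encoder, and only the adapter (plus output-layer) parameters are updated during ASR training, which is what keeps the approach parameter-efficient.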
Appears in Collections:
Dissertations - FacICT - 2023
Dissertations - FacICTAI - 2023

Files in This Item:
File | Description | Size | Format
2318ICTCSA531005079269_1.PDF (Restricted Access) | - | 1.12 MB | Adobe PDF
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.