Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/101544
Title: Implementation of an automatic phoneme recognition system
Authors: Cutajar, Michelle (2014)
Keywords: Automatic speech recognition
Phonemics
Application-specific integrated circuits
Issue Date: 2014
Citation: Cutajar, M. (2014). Implementation of an automatic phoneme recognition system (Doctoral dissertation).
Abstract: Automatic Speech Recognition (ASR) is increasingly used in today's technologies. Over the past years, a lot of research has been carried out, with the current trend being speaker-independent, large-vocabulary continuous speech recognition. However, most of this research has focused on software applications, and little work has been carried out on the design of hardware recognisers, which is also necessary for further improvement in the field of ASR. In the past, ASR was not widely used in potential applications because of a number of limitations, such as processing power and limited hardware resources [1]. With today's advances in customised hardware, the design of on-chip ASR systems has become more feasible. In this research, different phoneme recognition systems for multi-speaker continuous speech environments are proposed. The feature extraction stage is based on the Discrete Wavelet Transform (DWT), and for the classification stage, different Artificial Neural Networks (ANNs) and Support Vector Machines (SVMs) were analysed. Of the methods considered, the one-against-one SVM method provided the highest recognition rates. Furthermore, a priorities scheme was added so that the three most likely phoneme representations are obtained at the output. The software implementation of this phoneme recognition system has the potential to achieve an accuracy of 75.41% for the recognition of 42 phoneme classes from the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus (TIMIT). The results obtained were comparable to, or slightly better than, the best results found in the literature [2] - [7] for systems evaluated on the TIMIT corpus.
On average, the potential recognition rate of the proposed phoneme recognition system represents an increase in accuracy of approximately 6% over the phoneme recognition systems presented in [2] - [7]. The proposed system is also better suited to hardware implementation. The phoneme recognition system was then designed on a dedicated chip, in order to evaluate its potential to become a portable and efficient system that can be employed in battery-powered devices. The final design is approximately 4 times faster than the software-based approach and consumes only 12.5 mW, making it appealing for mobile devices. The performance results obtained from the hardware design demonstrate that this system is a promising basis for future hardware ASR systems.
Description: PH.D.
URI: https://www.um.edu.mt/library/oar/handle/123456789/101544
Appears in Collections: Dissertations - FacICT - 2014
Dissertations - FacICTMN - 2010-2014

Files in This Item:
File: PH.D._Cutajar_Michelle_2014.pdf (Restricted Access)
Size: 34.47 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.