Analysing diverse algorithms performing music genre recognition

Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/107188

Title:	Analysing diverse algorithms performing music genre recognition
Authors:	Buttigieg Vella, Jamie (2022)
Keywords:	Machine learning; Algorithms; Neural networks (Computer science); Music; Application software
Issue Date:	2022
Citation:	Buttigieg Vella, J. (2022). Analysing diverse algorithms performing music genre recognition (Bachelor’s dissertation).
Abstract:	To discover music that matches our tastes, we now rely heavily on the same applications we use to listen to it, and in the process allow algorithms to introduce us to new genres that may interest us. It is therefore especially important that these applications gain a good understanding of our listening habits, genre tastes and connections with performers, among other factors. A musical piece can be described by various characteristics, so songs with similar characteristics can be grouped into a single class, referred to as a musical genre. The challenge is that the definition of a musical genre is itself subjective: the boundaries between one genre and another are not standardised and depend on listener perception. This project seeks to determine the extent to which Music Genre Recognition can be performed by evaluating different Machine Learning algorithms used in the industry. To this end, Artificial Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks and Gradient Boosting Machines were built, experimented with and compared. The algorithms were applied to a curated benchmark of audio tracks with corresponding genre labels and compared with similar models documented in the literature. Audio tracks were chosen as the dataset content to simulate real-world applications as closely as possible. From these tracks, 13 Mel Frequency Cepstral Coefficients were extracted as inputs for the algorithms, since these coefficients were found to be among the best features for approximating the human auditory system. The results show that the Convolutional Neural Network and the relatively new Gradient Boosting Machine, XGBoost, performed best among the algorithms compared. It was also found that small input samples of features are not only capable of training a classification algorithm, but actually provide the best results.
Description:	B.Sc. IT (Hons)(Melit.)
URI:	https://www.um.edu.mt/library/oar/handle/123456789/107188
Appears in Collections:	Dissertations - FacICT - 2022; Dissertations - FacICTCIS - 2022
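
The abstract describes extracting 13 Mel Frequency Cepstral Coefficients from audio as classifier inputs. The record does not say how the extraction was done, so the following is only a minimal standard-library sketch of the usual MFCC pipeline (Hamming window, power spectrum, mel filterbank, log, DCT-II). The frame length, hop size, filter count, sample rate and the synthetic test tone are all assumptions made for illustration, not values taken from the dissertation.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one Hamming-windowed frame (bins 0..N/2)."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mel_points]
    bank = []
    for j in range(1, n_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge
            row[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # peak and falling edge
            row[k] = (right - k) / (right - centre)
        bank.append(row)
    return bank


def mfcc(signal, sample_rate, n_mfcc=13, n_filters=26, frame_len=256, hop=128):
    """Return one list of n_mfcc coefficients per frame (assumed parameters)."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        power = power_spectrum(signal[start:start + frame_len])
        # Small constant avoids log(0) for empty filters.
        log_energy = [math.log(sum(w * p for w, p in zip(row, power)) + 1e-10)
                      for row in bank]
        # Unnormalised DCT-II of the log mel energies; keep the first n_mfcc.
        features.append([
            sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_energy))
            for c in range(n_mfcc)
        ])
    return features


# Hypothetical input: a 440 Hz tone standing in for a decoded audio track.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(1024)]
frames = mfcc(tone, SAMPLE_RATE)
print(len(frames), len(frames[0]))
```

Each frame yields a 13-value vector, and a sequence of such vectors is the kind of input that could be fed to the classifiers named in the abstract. The naive DFT here is chosen for readability only; practical pipelines typically rely on an FFT-based audio library.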

Files in This Item:

File	Description	Size	Format
2208ICTICT391205068884_1.PDF Restricted Access		1.98 MB	Adobe PDF
