Please use this identifier to cite or link to this item: https://www.um.edu.mt/library/oar/handle/123456789/92180
Title: Speech based psychological distress detection
Authors: Gatt, Corinne (2021)
Keywords: Neural networks (Computer science)
Depression, Mental -- Diagnosis
Speech disorders
Data sets
Issue Date: 2021
Citation: Gatt, C. (2021). Speech based psychological distress detection (Bachelor’s dissertation).
Abstract: This project explores Automatic Depression Detection on speech samples from different participants. The study uses the DAIC-WOZ Depression Dataset, which contains interviews conducted by a virtual interviewer with participants who have different levels of depression. PHQ-8 questionnaires attempt to quantify a person's level of depression. Four different CNN architectures are implemented and tested, mainly to compare 1D and 2D CNNs. The inputs to these models are speech spectrograms: frequency/time graphs that plot speech signal strength at different timestamps. Patterns in these spectrograms are used to train the various models. The most popular approach to depression classification treats the problem as binary, with only two possible labels: depressed or not depressed. In reality, recent studies on depression have revealed that, rather than being a black-or-white problem, mental illnesses tend to lie on a spectrum. The advantage of viewing mental illnesses on a spectrum is greater accuracy when administering treatment or prescribing medication. This view raises a new problem that AI could help to solve: using AI models to classify a patient along a spectrum of mental illness. In this project, CNN models trained with a binary approach are tested on how well they can distinguish different levels of depression. Of the four implemented architectures, the best performing were Architecture 1, a 1D CNN, and Architecture 2, a 2D CNN. For binary classification, an accuracy of 0.769 was achieved for depressed individuals, while an accuracy of 0.379 was reached for non-depressed individuals. In the multi-class evaluation, Architecture 1 classified individuals more accurately into the 24 PHQ-8 classes assigned.
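The two ideas the abstract relies on, a spectrogram as model input and a binary depressed/not-depressed label derived from a PHQ-8 score, can be illustrated with a minimal sketch. This is not the dissertation's pipeline, which the record does not describe. The function names, the frame length, the hop size, and the cutoff of 10 (a commonly used PHQ-8 threshold) are all assumptions made for illustration.

```python
import numpy as np

# Assumed cutoff: a PHQ-8 score of 10 or more is often treated as "depressed".
# The record does not say which threshold the dissertation used.
PHQ8_CUTOFF = 10


def spectrogram(signal, frame_len=256, hop=128):
    """Return a log-power spectrogram with shape (freq_bins, n_frames).

    frame_len and hop are illustrative values, not taken from the source.
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    # Power of the one-sided FFT of each frame.
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Convert to decibels; the small constant avoids log(0).
    return 10.0 * np.log10(power.T + 1e-10)


def binary_label(phq8_score, cutoff=PHQ8_CUTOFF):
    """Map a PHQ-8 score to 1 (depressed) or 0 (not depressed)."""
    return int(phq8_score >= cutoff)


# Usage: a one-second 1 kHz tone sampled at 8 kHz stands in for speech.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = spectrogram(tone)
print(spec.shape)  # (129, 61)
```

A 2D CNN could take an array like `spec` directly as a single-channel image, while a 1D CNN could treat each frequency bin as a channel over time. Both readings are assumptions about how such inputs are typically arranged.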
Description: B.Sc. IT (Hons)(Melit.)
URI: https://www.um.edu.mt/library/oar/handle/123456789/92180
Appears in Collections: Dissertations - FacICT - 2021
Dissertations - FacICTAI - 2021

Files in This Item:
File: 21BITAI024.pdf (Restricted Access)
Size: 2.95 MB
Format: Adobe PDF


Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.