Deep Learning-Based Predictive Modeling for Assessment of Depression in Male Speakers
Conference proceedings article
Publication Details
Author list: Hugo Goncalves, Thaweewong Akkaralaertsest, Thaweesak Yingthawornsuk
Publication year: 2024
URL: https://gcmm2024.rmutk.ac.th/
Languages: English-United States (EN-US)
Abstract
This research investigates the use of deep learning techniques to develop predictive models for detecting depression in male speakers from audio recordings. The dataset comprises speech samples categorized into three classes: Depressed (DPR), High Risk of Suicide (HRK), and Remitted (RMT). We extracted Mel-frequency cepstral coefficients (MFCCs) as features and compared several classifiers: three deep learning models, namely 1D and 2D Convolutional Neural Networks (CNNs) and a Long Short-Term Memory (LSTM) network, as well as a Support Vector Machine (SVM). The study analyzed the models' performance with different MFCC configurations and data segment overlaps. Our results show that the 2D CNN model performed best, with an F1-score of 0.94 in distinguishing between the DPR and RMT classes. The 1D CNN model also performed well, with an F1-score of 0.91. The LSTM model achieved an F1-score of 0.88, while the SVM model scored lowest, with an F1-score of 0.85. These findings indicate that deep learning models, particularly CNNs, are effective for the automatic detection of depression in male speakers from audio data. Future work will focus on optimizing these models and incorporating additional features to improve predictive performance.
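The abstract mentions varying the overlap between data segments but gives no parameters. As an illustration only, the following minimal sketch shows one common way to cut a recording into fixed-length, overlapping segments before feature extraction. The function name `segment_with_overlap`, the segment length, and the overlap value are assumptions made for this example and are not taken from the paper; in a full pipeline, each segment would then be passed to an MFCC extractor.

```python
import numpy as np


def segment_with_overlap(signal, segment_len, overlap):
    """Split a 1-D signal into fixed-length segments with fractional overlap.

    Hypothetical helper for illustration; the paper's actual segmentation
    settings are not given in the abstract.
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    signal = np.asarray(signal)
    # Hop size: how far the window advances between consecutive segments.
    hop = max(1, int(round(segment_len * (1.0 - overlap))))
    starts = range(0, len(signal) - segment_len + 1, hop)
    segments = [signal[s:s + segment_len] for s in starts]
    if not segments:
        # Signal shorter than one segment: return an empty (0, segment_len) array.
        return np.empty((0, segment_len), dtype=signal.dtype)
    return np.stack(segments)


# Toy example: 10 samples, segments of 4 samples, 50% overlap (hop of 2).
segments = segment_with_overlap(np.arange(10), segment_len=4, overlap=0.5)
print(segments.shape)  # (4, 4)
```

A higher overlap yields more (but more strongly correlated) segments from the same recording. When segments from one speaker can land in both training and test sets, reported scores may be inflated, so splitting by speaker before segmenting is a common precaution.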