Deep Learning-Based Predictive Modeling for Assessment of Depression in Male Speakers

Conference proceedings article


Publication Details

Author list: Hugo Goncalves, Thaweewong Akkaralaertsest, Thaweesak Yingthawornsuk

Publication year: 2024

URL: https://gcmm2024.rmutk.ac.th/

Languages: English-United States (EN-US)


Abstract

This research investigates the use of deep learning techniques to build predictive models for detecting depression in male speakers from audio recordings. The dataset comprises speech samples in three classes: Depressed (DPR), High Risk of Suicide (HRK), and Remitted (RMT). We extracted Mel-frequency cepstral coefficients (MFCCs) as features and employed several models: the deep learning models 1D and 2D Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, as well as Support Vector Machines (SVMs). The study analyzed model performance across different MFCC configurations and data segment overlaps. The 2D CNN model performed best, with an F1-score of 0.94 in distinguishing between the DPR and RMT classes. The 1D CNN model also performed well, with an F1-score of 0.91. The LSTM model was moderately successful, with an F1-score of 0.88, while the SVM model scored lowest, with an F1-score of 0.85. These findings indicate that deep learning models, particularly CNNs, are effective for automatic detection of depression in male speakers from audio data. Future work will focus on optimizing these models and incorporating additional features to improve predictive accuracy.
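
The two evaluation ideas mentioned in the abstract, overlapping data segments and the F1-score, can be illustrated with a minimal sketch. This is not the authors' code: the function names `segment_with_overlap` and `f1_score`, the segment length, the overlap value, and the choice of DPR as the positive class are all assumptions made for illustration, since the abstract does not state them.

```python
from typing import List, Sequence


def segment_with_overlap(frames: Sequence, segment_len: int, overlap: float) -> List[list]:
    """Split a sequence of feature frames (e.g. MFCC vectors) into fixed-length segments.

    `overlap` is the fraction (0 <= overlap < 1) shared by consecutive segments.
    Trailing frames that do not fill a whole segment are dropped.
    Hypothetical helper; the paper's actual segmentation is not specified.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    # Step between segment starts; at least one frame so the loop advances.
    hop = max(1, int(round(segment_len * (1.0 - overlap))))
    return [
        list(frames[start:start + segment_len])
        for start in range(0, len(frames) - segment_len + 1, hop)
    ]


def f1_score(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Binary F1 for one class: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy usage: integers stand in for MFCC frames, labels are made up.
segments = segment_with_overlap(list(range(10)), segment_len=4, overlap=0.5)
score = f1_score(["DPR", "DPR", "RMT", "RMT"], ["DPR", "RMT", "RMT", "RMT"], positive="DPR")
```

A larger overlap yields more, but more strongly correlated, training segments from the same recording, which is presumably why the study varies it alongside the MFCC configuration.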




Last updated on 2025-06-03 at 00:00