Alhamdulillah, our paper has been accepted for a satellite workshop of ICASSP 2025. The paper is titled "Pathological Voice Detection From Sustained Vowels: Handcrafted vs. Self-supervised Learning". In it, we examine pathological voice detection from sustained vowels (/a/, /i/, /u/) using both handcrafted acoustic features and self-supervised learning (SSL) models. We also evaluated early fusion (feature concatenation) and decision-level ensemble learning for both types of features.
We hope this work benefits society by helping to improve the performance of pathological voice detection.
We evaluated several aspects in this research project: which vowel leads to better results, how the different acoustic and SSL features compare, and how ensemble learning performs.
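To make the difference between early fusion and decision-level ensembling concrete, here is a minimal sketch in plain Python. It is not the code from the paper: the function names, the toy feature values, and the probability-averaging rule with a 0.5 threshold are all assumptions chosen for illustration.

```python
from statistics import mean


def early_fusion(acoustic, ssl):
    """Early fusion: concatenate two feature vectors of one utterance."""
    return list(acoustic) + list(ssl)


def late_fusion(probabilities, threshold=0.5):
    """Decision-level fusion: average the per-model probabilities of the
    'pathological' class, then threshold the average (assumed rule)."""
    avg = mean(probabilities)
    return avg, int(avg >= threshold)


# Hypothetical toy values, not taken from the paper.
acoustic_features = [0.12, 0.03, 21.5]   # e.g. three handcrafted measures
ssl_embedding = [0.4, -1.1, 0.7, 0.05]   # e.g. a tiny SSL embedding

# Early fusion yields one longer vector that a single classifier consumes.
fused = early_fusion(acoustic_features, ssl_embedding)

# Late fusion combines the outputs of separately trained classifiers.
model_probs = [0.8, 0.6, 0.4]
avg_prob, label = late_fusion(model_probs)

print(len(fused), round(avg_prob, 3), label)
```

In early fusion a single classifier sees all features at once, while in decision-level fusion each feature type (or vowel) keeps its own classifier and only the predictions are combined.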
Future work could tackle the limitations of the F1 score and AUC by also reporting metrics such as the Matthews correlation coefficient (MCC), which takes all four confusion-matrix entries into account: true and false positives and negatives.
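As a rough illustration of why MCC can be more informative than F1, the sketch below computes both from the same confusion counts. The labels are a made-up imbalanced toy set (1 = pathological, 0 = healthy), not results from the paper, and the helper names are hypothetical.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); true negatives never appear."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy data: 9 pathological samples, 1 healthy sample, and a
# degenerate classifier that always predicts "pathological".
y_true = [1] * 9 + [0]
y_pred = [1] * 10

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
f1_value = f1_score(tp, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)

print(f"F1 = {f1_value:.3f}, MCC = {mcc_value:.3f}")
```

On this toy set, the always-positive classifier gets a high F1 even though it never identifies a healthy speaker, while MCC stays at zero because the true negatives enter the formula.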
Because pathological voice detection can be framed as an anomaly detection problem, future work could also examine how effective anomaly detection methods are for this task.
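One way to picture the anomaly detection framing is to model only healthy voices and flag samples that deviate from that model. The sketch below does this with a simple per-feature z-score. It is an assumption for illustration only, not a method evaluated in the paper; the feature values, the function names, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, stdev


def fit_healthy_profile(healthy_vectors):
    """Estimate the per-feature mean and standard deviation from healthy samples only."""
    columns = list(zip(*healthy_vectors))
    means = [mean(col) for col in columns]
    stds = [max(stdev(col), 1e-8) for col in columns]  # guard against zero spread
    return means, stds


def anomaly_score(vector, means, stds):
    """Mean absolute z-score across features; larger means further from healthy."""
    return mean(abs(x - m) / s for x, m, s in zip(vector, means, stds))


# Hypothetical two-dimensional features for four healthy speakers.
healthy = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [1.0, 10.0]]
means, stds = fit_healthy_profile(healthy)

THRESHOLD = 3.0  # assumed cut-off; in practice tuned on validation data

score_typical = anomaly_score([1.05, 10.2], means, stds)
score_unusual = anomaly_score([2.0, 15.0], means, stds)

print(score_typical > THRESHOLD, score_unusual > THRESHOLD)
```

Unlike a binary classifier, this setup needs no pathological samples at training time, which is the main appeal of the anomaly detection view.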
We extend our gratitude to AIST for their full support of our research, and to NEDO and JST for research funding.
Happy reading. We welcome your feedback. See you in Hyderabad!
URL for downloading the paper: to be added once it is available (in the meantime, contact me to get the accepted version).
Project repository: https://github.com/bagustris/svd-exploration