Speech & language for mental health
Broader context
Mental health is probably the only medical domain where speech outperforms all other modalities in terms of predictive power. What people say and how they say it reveals a lot about their current emotional state, but also about how they view themselves and their relationships with others. Thus, analyzing speech can be incredibly powerful for detecting mental health disorders (i.e., classifying a disorder vs. a "healthy" control) or for tracking the symptoms of a particular disorder over time.
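To make the classification framing concrete, here is a minimal sketch of a speech-based detection pipeline: utterance-level embeddings from a pretrained speech encoder feed a simple linear classifier. This is a generic illustration under assumed defaults (a public wav2vec 2.0 checkpoint, 16 kHz mono audio); the random waveforms and labels are placeholders, not data from any of the projects referenced below.

```python
# Minimal sketch of speech-based disorder-vs-control classification.
# Assumptions: 16 kHz mono recordings, binary labels (0 = control, 1 = disorder).
import numpy as np
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model
from sklearn.linear_model import LogisticRegression

extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base")
encoder = Wav2Vec2Model.from_pretrained("facebook/wav2vec2-base").eval()

def embed(waveform: np.ndarray) -> np.ndarray:
    """Mean-pool frame-level wav2vec 2.0 features into one utterance vector."""
    inputs = extractor(waveform, sampling_rate=16_000, return_tensors="pt")
    with torch.no_grad():
        frames = encoder(**inputs).last_hidden_state  # shape (1, T, 768)
    return frames.mean(dim=1).squeeze(0).numpy()

# Placeholder data: replace with real recordings and clinical labels.
waveforms = [np.random.randn(16_000 * 5).astype(np.float32) for _ in range(8)]
labels = [0, 1, 0, 1, 0, 1, 0, 1]

features = np.stack([embed(w) for w in waveforms])
classifier = LogisticRegression(max_iter=1000).fit(features, labels)
print(classifier.predict(features[:2]))
```

In practice, the embedding step and the classifier would of course be trained and evaluated on properly collected clinical corpora with speaker-independent splits; the sketch only shows the overall shape of such a pipeline.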
Similarly, speech is central to mental health treatment. For instance, the practice of psychotherapy is largely centered on patient-therapist conversations. Given recent advances in generative methods, we can adapt them to deliver targeted interventions.
Current status
I am involved in plenty of ongoing collaborations in this area; these are reflected in the list of (self-)references below. Most of those projects are accompanied by existing code and analyses.
Open research questions
The main topics I am working on right now are the following:
- Major depressive disorder (known colloquially simply as "depression")
- Stress
- ADHD
I am also interested in:
- Schizophrenia spectrum disorders
- PTSD
Within those application domains, I investigate the following questions from an AI perspective:
- Can we improve the predictive power of speech-based models (architectural innovation)?
- Can we better explain speech-based models (explainability/interpretability)?
- Can we adapt models to individual speakers (personalization)? (See the sketch after this list.)
- Can we ensure robust and fair outcomes?
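As a toy illustration of the personalization question, the sketch below keeps a shared, frozen feature extractor and attaches one lightweight head per speaker, adapted on that speaker's data only. This is a generic pattern, not the method from the personalization papers cited below; every class name, variable, and dimension here is a placeholder.

```python
# Generic sketch of speaker personalization: shared frozen encoder + per-speaker heads.
import torch
import torch.nn as nn

class PersonalizedClassifier(nn.Module):
    def __init__(self, shared_encoder: nn.Module, feat_dim: int, speaker_ids: list[str]):
        super().__init__()
        self.shared = shared_encoder  # trained on all speakers, then frozen
        for p in self.shared.parameters():
            p.requires_grad = False
        # One lightweight linear head per speaker, tuned on that speaker's data only.
        self.heads = nn.ModuleDict({sid: nn.Linear(feat_dim, 1) for sid in speaker_ids})

    def forward(self, features: torch.Tensor, speaker_id: str) -> torch.Tensor:
        embedding = self.shared(features)
        return self.heads[speaker_id](embedding)

# Placeholder shared encoder; in practice a pretrained speech model.
shared = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 32))
model = PersonalizedClassifier(shared, feat_dim=32, speaker_ids=["spk01", "spk02"])
logits = model(torch.randn(4, 40), speaker_id="spk01")  # per-speaker prediction
print(logits.shape)  # torch.Size([4, 1])
```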
Even though all my other topics deal with audio, I make an exception for mental health and will also consider work that focuses solely on text.
References
Triantafyllopoulos, A., Terhorst, Y., Tsangko, I., Pokorny, F. B., Bartl-Pokorny, K. D., Seizer, L., … & Schuller, B. (2024). Large language models for mental health. arXiv preprint arXiv:2411.11880.
Gerczuk, M., Triantafyllopoulos, A., Amiriparian, S., Kathan, A., Bauer, J., Berking, M., & Schuller, B. (2022, November). Personalised deep learning for monitoring depressed mood from speech. In 2022 E-Health and Bioengineering Conference (EHB) (pp. 1-5). IEEE.
Gerczuk, M., Triantafyllopoulos, A., Amiriparian, S., Kathan, A., Bauer, J., Berking, M., & Schuller, B. W. (2023). Zero-shot personalization of speech foundation models for depressed mood monitoring. Patterns, 4(11).
