Presentation Information
[1Yin-B-08] A Multimodal Emotion Recognition System Applicable to Both Individuals with Depressive Tendencies and Healthy Controls
〇Kaname Nakayama1, Kazuyuki Matumoto2, Minoru Yoshida2, Xin Kang2, Takaki Fukumori2, Munenaga Koda2, Keita Kiuchi3, Hidehiro Umehara4, Tomohiko Nakayama5, Masahito Nakataki5 (1. Graduate School of Sciences and Technology for Innovation, Tokushima University, 2. Graduate School of Technology, Industrial and Social Sciences, Tokushima University, 3. National Institute of Occupational Safety and Health, Japan Organization of Occupational Health and Safety, 4. Accessibility Support Department, 5. Department of Psychiatry, Graduate School of Biomedical Science, Tokushima University, Tokushima, Japan)
Keywords:
Emotion Estimation, Mental Health, Multimodal, Feature Fusion, Deep Learning
This study proposes a multimodal emotion estimation system applicable to both depressed and healthy individuals. Conventional models often fail for depressed subjects because their blunted affect is treated as noise. We address this by focusing on the "emotional inconsistency" between verbal context (topic) and physical expression. Our method integrates audio, text, and video features with topic and depression indicators, trained on a mixed dataset of depressed and healthy participants. This approach establishes a healthy "standard" against which depressive "distortions" are detected as significant features. Experiments demonstrate robust estimation across mental states, identifying context and vocal prosody as critical indicators.
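The abstract does not specify the fusion mechanism, so the following is only a minimal PyTorch sketch assuming simple concatenation-based (late) fusion of per-modality feature vectors with the topic and depression-indicator features. All class names, feature dimensions, and the classifier head are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn


class MultimodalFusionClassifier(nn.Module):
    """Hypothetical sketch: late fusion of audio, text, and video features
    with topic and depression-indicator features. Dimensions are assumed."""

    def __init__(self, audio_dim=128, text_dim=768, video_dim=512,
                 topic_dim=16, depression_dim=1, hidden_dim=256,
                 num_emotions=6):
        super().__init__()
        fused_dim = audio_dim + text_dim + video_dim + topic_dim + depression_dim
        self.classifier = nn.Sequential(
            nn.Linear(fused_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(hidden_dim, num_emotions),
        )

    def forward(self, audio_feat, text_feat, video_feat,
                topic_feat, depression_feat):
        # Concatenate per-modality features with the contextual
        # indicators, then map to emotion-class logits.
        fused = torch.cat(
            [audio_feat, text_feat, video_feat, topic_feat, depression_feat],
            dim=-1,
        )
        return self.classifier(fused)


# Example usage with a batch of 4 samples and dummy feature vectors.
model = MultimodalFusionClassifier()
logits = model(
    torch.randn(4, 128),  # audio (e.g., prosodic features)
    torch.randn(4, 768),  # text (e.g., sentence embeddings)
    torch.randn(4, 512),  # video (e.g., facial features)
    torch.randn(4, 16),   # topic indicator
    torch.randn(4, 1),    # depression indicator
)
print(logits.shape)  # torch.Size([4, 6])
```

Feeding the topic and depression indicators into the fused representation is one plausible way to let the model weigh "emotional inconsistency" between what is said and how it is expressed; the actual fusion strategy used in the study may differ.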
