Keywords

deep learning, multimodal systems, emotion recognition, sign language, facial expressions, deaf individuals

How to Cite

Deep learning for multimodal gesture and emotion recognition. (2025). SMART TECHNOLOGIES JOURNAL, 1(8). https://doi.org/10.62687/STJ.8.1.2025.1

Abstract

The application of deep learning in multimodal systems has shown significant progress, particularly in gesture recognition and in sign language interpretation for deaf individuals. This paper explores the integration of gesture and emotion analysis, using convolutional neural networks (CNNs) for facial expression recognition and long short-term memory (LSTM) networks for temporal gesture analysis. To evaluate the effectiveness of these algorithms, the multimodal systems were tested on specialized datasets such as iMiGUE, which contains accurately annotated emotion videos. These datasets enabled evaluation of model performance on real-life tasks as well as comparison across different models.
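The CNN-plus-LSTM architecture described in the abstract can be illustrated with a minimal sketch. This is a hypothetical PyTorch implementation, not the authors' actual model: the input sizes (48×48 grayscale face crops, 42-dimensional hand-keypoint vectors per frame), layer widths, and the late-fusion design are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class MultimodalNet(nn.Module):
    """Hypothetical sketch: a CNN branch extracts facial-expression features
    from a face crop, an LSTM branch models the temporal dynamics of a
    gesture keypoint sequence, and the two are fused for classification."""

    def __init__(self, num_classes=7, keypoint_dim=42, lstm_hidden=64):
        super().__init__()
        # CNN branch: 48x48 grayscale face crop -> 64-d feature vector
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 48 -> 24
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 24 -> 12
            nn.Flatten(),
            nn.Linear(32 * 12 * 12, 64), nn.ReLU(),
        )
        # LSTM branch: sequence of per-frame keypoint vectors -> hidden state
        self.lstm = nn.LSTM(keypoint_dim, lstm_hidden, batch_first=True)
        # Late fusion: concatenate branch features, map to class logits
        self.head = nn.Linear(64 + lstm_hidden, num_classes)

    def forward(self, face_img, gesture_seq):
        # face_img: (batch, 1, 48, 48); gesture_seq: (batch, time, keypoint_dim)
        face_feat = self.cnn(face_img)
        _, (h_n, _) = self.lstm(gesture_seq)          # final hidden state
        fused = torch.cat([face_feat, h_n[-1]], dim=1)
        return self.head(fused)

model = MultimodalNet()
logits = model(torch.randn(2, 1, 48, 48), torch.randn(2, 10, 42))
print(tuple(logits.shape))  # (batch, num_classes)
```

Late fusion (concatenating branch outputs before the classifier) is only one of several possible fusion strategies; attention-based or intermediate fusion are common alternatives in multimodal emotion recognition.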