Research on an expression recognition model based on multimodal hierarchical graph contrastive learning

By: Xiaoyao Mo1, Hairui Wang1, Guifu Zhu2
1Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan, 650500, China
2Information Technology Construction Management Center, Kunming University of Science and Technology, Kunming, Yunnan, 650500, China

Abstract

To address the limitations of existing methods in dynamically modeling complex expressions, optimizing multimodal data quality, and fusing hierarchical features, this paper proposes a hierarchical graph contrastive learning model based on local and global features. The model integrates graph neural networks with contrastive learning: it captures fine-grained expression details by constructing local graphs, models cross-modal semantic collaboration through a global graph, and introduces an automatic graph augmentation strategy to improve generalization. In the multimodal feature extraction stage, key features are extracted from the video, audio, and text modalities, and then integrated through intra-modal attention and a multimodal fusion mechanism. Experiments on the CMU-MOSI and CMU-MOSEI datasets show that, compared with multiple baseline models, the proposed model achieves better accuracy, recall, and F1 score, and its mean squared error is at a competitive level. The model effectively integrates multimodal information, performs well on the expression recognition task, and offers new ideas and methods for the development of this field.
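The abstract does not specify the exact training objective, but the local-versus-global contrastive setup it describes is commonly instantiated with an InfoNCE-style loss between paired embeddings. The sketch below is a minimal, NumPy-only illustration under that assumption: `global_readout` stands in for the global-graph summary built from per-modality local-graph node features, and `info_nce` pulls each local embedding toward its matching global embedding while pushing it away from the others. All function names, shapes, and the temperature value are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def global_readout(node_feats_per_modality):
    # Hypothetical stand-in for the global graph: mean-pool each modality's
    # local-graph node features and concatenate into one summary vector.
    return np.concatenate([f.mean(axis=0) for f in node_feats_per_modality])

def info_nce(local_emb, global_emb, temperature=0.5):
    # InfoNCE-style contrastive loss over a batch of N paired embeddings:
    # row i of local_emb is the positive pair of row i of global_emb.
    # L2-normalise so the dot product is cosine similarity.
    l = local_emb / np.linalg.norm(local_emb, axis=1, keepdims=True)
    g = global_emb / np.linalg.norm(global_emb, axis=1, keepdims=True)
    logits = l @ g.T / temperature  # (N, N) similarity matrix
    # Log-softmax over each row; the positive pair sits on the diagonal.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))
```

As a sanity check, the loss should be lower when the pairing is correct than when the global embeddings are shuffled against the local ones, which is the property the contrastive objective exploits.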