Human Skeletal Joint Pose-Action Bimodal Recognition Model Based on Graph Neural Networks

By: Ge Zhang¹
¹Network Information Service Center, Henan University of Economics and Law, Zhengzhou, Henan, 450000, China

Abstract

Human action recognition is a core and increasingly widely adopted task in computer vision. Graph neural networks are currently the mainstream method for processing unstructured skeleton sequences, yet skeleton-based action recognition still faces several key challenges. This paper builds a graph convolutional network (GCN) model on the theoretical framework of GCNs and on key pose estimation/reduction for human skeletons. The model extracts features from human skeleton data and introduces a 3D formulation (3D-GCN) that performs action recognition on the extracted features along both the temporal and spatial dimensions. Both models were tested on the NTU-RGBD dataset under the cross-subject (CS) and cross-view (CV) evaluation standards. Under the CS standard, the GCN and 3D-GCN models reached recognition accuracies of 75.853% and 78.251%, respectively; under the CV standard, they reached 82.294% and 86.381%, respectively. The proposed 3D-GCN model is therefore the more accurate of the two under both standards. It also achieved an accuracy of 91.941% when recognizing four actions (falling, running, kicking, and squatting), demonstrating good performance in human action recognition.
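
To make the spatial and temporal treatment of skeleton data concrete, the following is a minimal NumPy sketch of one graph-convolution step over joints followed by a sliding weighted sum over frames. It is an illustration under stated assumptions, not the paper's implementation: the five-joint chain skeleton, the symmetric normalization of the adjacency matrix, the random weights, and the function names `normalized_adjacency`, `spatial_graph_conv`, and `temporal_filter` are all hypothetical choices made here.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = np.eye(num_joints)  # self-loops keep each joint's own features
    for i, j in edges:
        a[i, j] = 1.0
        a[j, i] = 1.0
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def spatial_graph_conv(x, a_hat, weight):
    """Aggregate neighbouring joints per frame, then project channels.

    x: (T, V, C_in) frames x joints x channels
    a_hat: (V, V) normalized adjacency
    weight: (C_in, C_out)
    """
    aggregated = np.einsum("uv,tvc->tuc", a_hat, x)
    return np.maximum(aggregated @ weight, 0.0)  # ReLU


def temporal_filter(x, kernel):
    """Sliding weighted sum along the frame axis with zero padding.

    Assumes an odd kernel length so the output keeps T frames.
    """
    k = len(kernel)
    pad = k // 2
    padded = np.pad(x, ((pad, pad), (0, 0), (0, 0)))
    return sum(kernel[i] * padded[i:i + x.shape[0]] for i in range(k))


# Hypothetical toy skeleton: 5 joints connected in a chain.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
a_hat = normalized_adjacency(edges, num_joints=5)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 5, 3))   # 8 frames, 5 joints, 3D coordinates
w = rng.normal(size=(3, 4))      # 3 input channels -> 4 feature channels

features = temporal_filter(spatial_graph_conv(x, a_hat, w),
                           kernel=np.array([1 / 3, 1 / 3, 1 / 3]))
print(features.shape)  # (8, 5, 4)
```

In a trained model the weight matrix and temporal kernel would be learned, and the resulting per-joint features would be pooled and passed to a classifier over action labels.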