Multi-stream Convolutional Human Action Recognition Based on the Fusion of Spatio-Temporal Domain Attention Module

WU Ziyi; CHEN Minrong

doi:10.6054/j.jscnun.2023043

WU Ziyi, CHEN Minrong. Multi-stream Convolutional Human Action Recognition Based on the Fusion of Spatio-Temporal Domain Attention Module[J]. Journal of South China Normal University (Natural Science Edition), 2023, 55(3): 119-128. DOI: 10.6054/j.jscnun.2023043

Abstract

To better extract and fuse the temporal and spatial features of the human skeleton, this paper constructs a multi-stream convolutional neural network (AE-MCN) that integrates a spatio-temporal domain attention module. Most methods ignore the characteristics of human motion when modeling the correlation of skeleton sequences, so the scale of the action is not properly modeled. To address this problem, an adaptive motion-scale selection module is introduced, which automatically extracts key temporal features from the original-scale action features. To better model features in the temporal and spatial dimensions, an attention module that integrates the spatio-temporal domain is designed; it helps the network extract more effective action information by assigning weights to high-dimensional spatio-temporal features. Finally, comparative experiments were conducted on three commonly used human action recognition datasets (NTU60, JHMDB and UT-Kinect) to verify the effectiveness of AE-MCN. The experimental results show that AE-MCN achieves better recognition results than ST-GCN, SR-TSL and other networks, which indicates that it can effectively extract and model action information and thus obtain better action recognition performance.
