Citation: | WU Ziyi, CHEN Minrong. Multi-stream Convolutional Human Action Recognition Based on the Fusion of Spatio-Temporal Domain Attention Module[J]. Journal of South China Normal University (Natural Science Edition), 2023, 55(3): 119-128. DOI: 10.6054/j.jscnun.2023043 |
[1] |
BACCOUCHE M, MAMALET F, WOLF C, et al. Sequential deep learning for human action recognition[C]//International Workshop on Human Behavior Understanding. Berlin: Springer, 2011: 29-39.
|
[2] |
FEICHTENHOFER C, PINZ A, ZISSERMAN A. Convolutional two-stream network fusion for video action recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1933-1941.
|
[3] |
SUN L, JIA K, YEUNG D Y, et al. Human action recognition using factorized spatio-temporal convolutional networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Santiago: IEEE Computer Society, 2015: 4597-4605.
|
[4] |
LIU Z, ZHANG C, TIAN Y. 3D-based deep convolutional neural network for action recognition with depth sequences[J]. Image and Vision Computing, 2016, 55: 93-100. doi: 10.1016/j.imavis.2016.04.004
|
[5] |
KIM T S, REITER A. Interpretable 3D human action analysis with temporal convolutional networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Honolulu: IEEE, 2017: 1623-1631.
|
[6] |
MOON G, CHANG J Y, LEE K M. PoseFix: model-agnostic general human pose refinement network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 7773-7781.
|
[7] |
CAO Z, HIDALGO G, SIMON T, et al. OpenPose: realtime multi-person 2D pose estimation using part affinity fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 43(1): 172-186.
|
[8] |
CHEN Y L, WANG Z C, PENG Y X, et al. Cascaded pyramid network for multi-person pose estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7103-7112.
|
[9] |
CAO Z, SIMON T, WEI S E, et al. Realtime multi-person 2D pose estimation using part affinity fields[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 7291-7299.
|
[10] |
GREFF K, SRIVASTAVA R K, KOUTNÍK J, et al. LSTM: a search space odyssey[J]. IEEE Transactions on Neural Networks and Learning Systems, 2016, 28(10): 2222-2232.
|
[11] |
LEE I, KIM D, KANG S, et al. Ensemble deep learning for skeleton-based action recognition using temporal sliding LSTM networks[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 1012-1020.
|
[12] |
YAN S J, XIONG Y J, LIN D H. Spatial temporal graph convolutional networks for skeleton-based action recognition[C]//Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence. New Orleans, Louisiana: AAAI Press, 2018: 7444-7452.
|
[13] |
LI S J, YI J H, FARHA Y A, et al. Pose refinement graph convolutional network for skeleton-based action recognition[J]. IEEE Robotics and Automation Letters, 2021, 6(2): 1028-1035. doi: 10.1109/LRA.2021.3056361
|
[14] |
LIU F, QIAO J Z, DAI Q, et al. Skeleton-based action recognition method with two-stream multi-relational GCNs[J]. Journal of Northeastern University (Natural Science), 2021, 42(6): 768-774. https://www.cnki.com.cn/Article/CJFDTOTAL-DBDX202106002.htm
|
[15] |
LAN H, HE F, ZHANG P F. Skeleton recognition model based on enhanced graph convolution[J/OL]. Application Research of Computers, 2021, 38(12): 3791-3795;3825.
|
[16] |
ZHANG P F, LAN C L, ZENG W J, et al. Semantics-guided neural networks for efficient skeleton-based human action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle: IEEE, 2020: 1109-1118.
|
[17] |
SI C Y, JING Y, WANG W, et al. Skeleton-based action recognition with hierarchical spatial reasoning and temporal stack learning network[J]. Pattern Recognition, 2020, 107: 107511/1-16. doi: 10.1016/j.patcog.2020.107511
|
[18] |
LUDL D, GULDE T, CURIO C. Simple yet efficient real-time pose-based action recognition[C]//Proceedings of the IEEE Intelligent Transportation Systems Conference. Auckland: IEEE, 2019: 581-588.
|
[19] |
PAVLLO D, FEICHTENHOFER C, GRANGIER D, et al. 3D human pose estimation in video with temporal convolutions and semi-supervised training[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 7753-7762.
|
[20] |
LI C, ZHONG Q Y, XIE D, et al. Co-occurrence feature learning from skeleton data for action recognition and detection with hierarchical aggregation[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. New Orleans: AAAI Press, 2018: 786-792.
|
[21] |
YANG F, WU Y, SAKTI S, et al. Make skeleton-based action recognition model smaller, faster and better[C]//Proceedings of the ACM Multimedia Asia. New York: ACM, 2019: 1-6.
|
[22] |
HU J, SHEN L, SUN G, et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(8): 2011-2023. doi: 10.1109/TPAMI.2019.2913372
|
[23] |
HEIDARI N, IOSIFIDIS A. Temporal attention-augmented graph convolutional network for efficient skeleton-based human action recognition[C]//Proceedings of the 25th International Conference on Pattern Recognition. Milan: IEEE, 2021: 7907-7914.
|
[24] |
FAN Y B, WENG S C, ZHANG Y, et al. Context-aware cross-attention for skeleton-based human action recognition[J]. IEEE Access, 2020, 8: 15280-15290. doi: 10.1109/ACCESS.2020.2968054
|
[25] |
SI C Y, CHEN W T, WANG W, et al. An attention enhanced graph convolutional LSTM network for skeleton-based action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 1227-1236.
|
[26] |
ZHANG S, LIU X, XIAO J. On geometric features for skeleton-based action recognition using multilayer LSTM networks[C]//Proceedings of the IEEE Winter Conference on Applications of Computer Vision. Santa Rosa: IEEE, 2017: 148-157.
|
[27] |
ZHANG S, YANG Y, XIAO J, et al. Fusing geometric features for skeleton-based action recognition using multilayer LSTM networks[J]. IEEE Transactions on Multimedia, 2018, 20(9): 2330-2343. doi: 10.1109/TMM.2018.2802648
|
[28] |
WANG H, WANG L. Modeling temporal dynamics and spatial configurations of actions using two-stream recurrent neural networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Honolulu: IEEE, 2017: 499-508.
|
[29] |
SONG S, LAN C, XING J, et al. An end-to-end spatio-temporal attention model for human action recognition from skeleton data[C]//Proceedings of the AAAI Conference on Artificial Intelligence. San Francisco: AAAI, 2017: 4263-4270.
|
[30] |
HOU J, WANG G, CHEN X, et al. Spatial-temporal attention res-TCN for skeleton-based dynamic hand gesture recognition[C]//Proceedings of the European Conference on Computer Vision (ECCV) Workshops. Berlin: Springer, 2018: 273-286.
|
[31] |
SHAHROUDY A, LIU J, NG T T, et al. NTU RGB+D: a large scale dataset for 3D human activity analysis[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas: IEEE, 2016: 1010-1019.
|
[32] |
JHUANG H, GALL J, ZUFFI S, et al. Towards understanding action recognition[C]//Proceedings of the IEEE International Conference on Computer Vision. Sydney: IEEE, 2013: 3192-3199.
|
[33] |
XIA L, CHEN C C, AGGARWAL J K. View invariant human action recognition using histograms of 3D joints[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops. Providence: IEEE, 2012: 20-27.
|
[34] |
PASZKE A, GROSS S, CHINTALA S, et al. Automatic differentiation in PyTorch[J/OL]. NIPS-W 2017 Workshop Autodiff Submission, (2017-10-29)[2022-03-20]. https://openreview.net/forum?id=BJJsrmfCZ&noteId=rkK3fzZJz.
|
[35] |
KINGMA D, BA J. Adam: A method for stochastic optimization[J]. Computer Science, 2015, 5: 7-9.
|
[36] |
HE T, ZHANG Z, ZHANG H, et al. Bag of tricks for image classification with convolutional neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach: IEEE, 2019: 558-567.
|
[37] |
ZHANG P F, LAN C L, XING J L, et al. View adaptive recurrent neural networks for high performance human action recognition from skeleton data[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2136-2145.
|
[38] |
ZHANG P F, XUE J R, LAN C L, et al. Adding attentiveness to the neurons in recurrent neural networks[C]//Proceedings of the 15th Computer Vision-ECCV European Conference. Berlin: Springer, 2018: 136-152.
|
[39] |
TANG Y S, TIAN Y, LU J W, et al. Deep progressive reinforcement learning for skeleton-based action recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 5323-5332.
|
[40] |
ZOLFAGHARI M, OLIVEIRA G L, SEDAGHAT N, et al. Chained multi-stream networks exploiting pose, motion, and appearance for action classification and detection[C]//Proceedings of the IEEE International Conference on Computer Vision. Venice: IEEE, 2017: 2904-2913.
|
[41] |
CHOUTAS V, WEINZAEPFEL P, REVAUD J, et al. PoTion: pose motion representation for action recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Salt Lake City: IEEE, 2018: 7024-7033.
|
[42] |
ZHU Y, CHEN W, GUO G. Fusing spatiotemporal features and joints for 3D action recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. Portland: IEEE, 2013: 486-491.
|
[43] |
ANIRUDH R, TURAGA P, SU J, et al. Elastic functional coding of human actions: from vector-fields to latent variables[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 3147-3155.
|
[44] |
KAO J Y, ORTEGA A, TIAN D, et al. Graph based skeleton modeling for human activity analysis[C]//Proceedings of the IEEE International Conference on Image Processing. Taipei: IEEE, 2019: 2025-2029.
|
1. |
CHEN Wei, GE Shishun. Intelligent recognition method for difficult movements in competitive Wushu routines. Journal of Xinxiang University. 2024(06): 72-76.
|
2. |
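XU Jing. Cheerleading movement style recognition and performance analysis based on a perceptual learning algorithm. Journal of Jingdezhen University. 2024(03): 48-52.
|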
3. |
LIAO Minling. Multi-view human action image recognition based on saliency features. Modern Electronics Technique. 2024(24): 143-147.
|