Author Affiliation: Xidian Univ, Sch Elect Engn, Xian 710071, Shaanxi, Peoples R China
Publication: NEUROCOMPUTING
Year/Volume: 2019, Vol. 325
Pages: 90-100
Core Indexing:
Subject Classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees conferrable in Engineering or Science)]
Funding: National Natural Science Foundation of China; International Cooperation Project of Shaanxi Province [2016KW-042]; Shaanxi Province Key Project of Research and Development Plan [S2018-YF-ZDGY-0187]; International Cooperation Project of Shaanxi Province [S2018-YF-GHMS-0061]
Keywords: Action recognition; Two-stream framework; Feature fusion; 3D convolution; Spatio-temporal feature
Abstract: To improve recognition accuracy, a two-stream framework that incorporates a deep-learned stream and a hand-crafted stream is proposed. Firstly, a discriminant nonlinear feature fusion method is proposed, which introduces category structure information and captures the nonlinear relationships between features. Secondly, global and local features are extracted from the deep network and fused by the proposed method to obtain a discriminant deep descriptor. Thirdly, to capture the spatio-temporal characteristics of video, the temporal derivatives of gradient, optical flow and motion boundary are extracted in the space-time cube centered at each trajectory and taken as low-level features. Subsequently, the covariance and kernelized covariance of the low-level features are computed to obtain the Covariance Matrix based on Dense Trajectory (CMDT) and Kernelized Covariance Matrix based on Dense Trajectory (KCMDT) descriptors, respectively. Finally, a two-stream framework with the discriminant deep descriptor and the linear CMDT and nonlinear KCMDT descriptors (D-3-LND) is presented, which shares the benefits of both deep-learned and hand-crafted features and further improves recognition accuracy. Experiments on the challenging HMDB51 and UCF101 datasets verify the effectiveness of our method. (C) 2018 Elsevier B.V. All rights reserved.
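The CMDT descriptor described above is, in essence, the sample covariance of the low-level features collected in a trajectory-centered space-time cube, and KCMDT replaces the linear inner product with a kernel. A minimal sketch of this idea is below; the function names, the feature layout, and the choice of an RBF kernel for the kernelized variant are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def covariance_descriptor(features):
    """CMDT-style descriptor: sample covariance of low-level features.

    features: (n, d) array, one row per sampled point in the space-time
    cube around a trajectory (e.g. gradient, optical-flow and
    motion-boundary temporal derivatives stacked into a d-dim vector).
    Returns a (d, d) symmetric covariance matrix.
    """
    centered = features - features.mean(axis=0)
    return centered.T @ centered / (features.shape[0] - 1)

def rbf_kernelized_covariance(features, gamma=0.5):
    """KCMDT-style illustrative variant: a centered RBF Gram matrix,
    i.e. covariance computed in a kernel-induced feature space.
    (The RBF kernel here is an assumption for illustration.)"""
    sq = np.sum(features ** 2, axis=1)
    # Pairwise squared distances, then the RBF (Gaussian) kernel.
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * features @ features.T))
    n = features.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H / (n - 1)
```

Because the covariance matrix discards the ordering of the sampled points, the descriptor is compact (d x d regardless of cube size) and encodes second-order correlations between feature channels, which is what makes it useful as a hand-crafted complement to deep features.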