Author affiliations: Tianjin Univ Technol Tianjin Key Lab Intelligence Comp & Novel Softwar Key Lab Comp Vis & Syst Minist Educ Tianjin 300384 Peoples R China; Qilu Univ Technol Shandong Artif Intelligence Inst Shandong Acad Sci Jinan 250014 Peoples R China; Jiangxi Vocat Tech Coll Ind Trade Nanchang 330038 Jiangxi Peoples R China; China Unicorn Yantai Branch Yantai 264006 Peoples R China
Publication: MULTIMEDIA TOOLS AND APPLICATIONS
Year, volume, issue: 2020, Vol. 79, Issue 45-46
Pages: 34011-34027
Core indexing:
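Subject classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (degrees may be conferred in Engineering or Science)]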
Funding: National Natural Science Foundation of China [61872270, 61572357, 61971309]; National Key R&D Program of China [2019YFBB1404700]; Young creative team in universities of Shandong Province [2020KJN012]; Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, China; Tianjin Municipal Natural Science Foundation [18JCYBJC85500]; NSF project of Tianjin [17JCYBJC15600]; Jinan 20 projects in universities [2018GXRC014]
Topics: 3D object retrieval; Non-local graph neural network; 3D shape descriptors
Abstract: 3D object retrieval is an active research field in computer vision and multimedia analysis. Because the appearance features and viewpoints of 3D objects vary widely, the distributions of the training set and test set differ, which makes the task well suited to transfer learning or cross-domain learning. In transfer learning or cross-domain learning, feature extraction is critical and should be robust across domains. This work therefore focuses on feature extraction for 3D objects. Many feature representations and object retrieval approaches have been proposed so far. Among them, view-based deep learning retrieval methods achieve state-of-the-art performance, but existing methods simply use a deep neural network to extract features from each view and directly obtain view-level shape descriptors, without exploiting the spatial relationships between the views. To mine the spatial relationships among different views and obtain more discriminative 3D shape descriptors, this work proposes 3D object retrieval based on non-local graph neural networks (NGNN). Specifically, a residual network is first used as the backbone, and a non-local structure is then embedded in the ResNet to learn the intrinsic relationships between the views. Finally, a view pooling layer is employed to further fuse the information from different views and obtain a discriminative feature for the 3D object. Experimental results on two public datasets, MVRED and NTU 3D, show that the non-local graph network is very effective at exploring the latent relationships among different views, and that NGNN significantly outperforms state-of-the-art approaches, with improvements of 12.4%-22.7% on ANMRR.
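The pipeline outlined in the abstract (per-view features, a non-local step that relates the views, then view pooling) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the record does not specify the form of the non-local structure or the pooling operator, so the embedded-Gaussian (softmax) affinity, the residual connection, and max pooling over views are all assumptions made here. The function names, the weight matrices, and the sizes `V`, `D`, `d` are hypothetical, and random vectors stand in for the ResNet features of each rendered view.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def non_local_over_views(x, w_theta, w_phi, w_g, w_out):
    """Assumed non-local step across the views of one object.

    x: (V, D) array, one feature vector per view.
    Returns the refined (V, D) features and the (V, V) view affinity.
    """
    theta = x @ w_theta                         # (V, d) query embedding
    phi = x @ w_phi                             # (V, d) key embedding
    g = x @ w_g                                 # (V, d) value embedding
    affinity = softmax(theta @ phi.T, axis=-1)  # each row sums to 1
    y = affinity @ g                            # aggregate over all views
    return x + y @ w_out, affinity              # residual connection


def view_pool(x):
    """Assumed view pooling: element-wise max over the view axis."""
    return x.max(axis=0)


def describe_object(view_features, params):
    """Fuse per-view features into a single object-level descriptor."""
    refined, affinity = non_local_over_views(view_features, *params)
    return view_pool(refined), affinity


# Hypothetical sizes: 12 views, 64-dim features, 32-dim embeddings.
rng = np.random.default_rng(0)
V, D, d = 12, 64, 32
view_features = rng.standard_normal((V, D))  # stand-in for ResNet outputs
params = (
    0.1 * rng.standard_normal((D, d)),  # w_theta
    0.1 * rng.standard_normal((D, d)),  # w_phi
    0.1 * rng.standard_normal((D, d)),  # w_g
    0.1 * rng.standard_normal((d, D)),  # w_out
)
descriptor, affinity = describe_object(view_features, params)
```

In this sketch every view attends to every other view, so the fused descriptor does not depend on the order in which the views are supplied. Whether NGNN shares that property depends on architectural details that the abstract does not give.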