Search Results - Inner Mongolia University Library

Dual Interactive Graph Convolutional Networks for Hyperspectral Image Classification

IEEE Transactions on Geoscience and Remote Sensing, 2022, vol. 60, p. 1

Authors: Wan, Sheng; Pan, Shirui; Zhong, Ping; Chang, Xiaojun; Yang, Jian; Gong, Chen. Affiliations: PCA Laboratory, Key Lab. of Intelligent Percept. and Syst. for High-Dimensional Information of Ministry of Education, Jiangsu Key Laboratory of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; Faculty of Information Technology, Monash University, Clayton, VIC 3800, Australia; National Key Laboratory of Science and Technology on ATR, National University of Defense Technology, Changsha 410073, China; PCA Laboratory, Key Lab. of Intelligent Percept. and Syst. for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094, China; Department of Computing, Hong Kong Polytechnic University, Hong Kong, Hong Kong

Recently, graph convolutional networks (GCNs) have progressed significantly and gained increasing attention in hyperspectral image (HSI) classification due to their impressive representation power. However, existing GCN-based methods do not give full consideration to multiscale spatial information, since the convolution operations are governed by a fixed neighborhood. As a result, their performance can be limited, particularly in regions with diverse land cover appearances. In this article, we develop a new dual interactive GCN (DIGCN), which introduces dual GCN branches to capture spatial information at different scales. More significantly, a dual interactive module is embedded across the GCN branches, so that the correlation of multiscale spatial information can be leveraged to refine the graph information. Concretely, the edge information contained in one GCN branch can be refined by incorporating the feature representations from the other branch. Analogously, improved feature representations can be generated in one GCN branch by fusing the edge information from the other branch. As such, the refined graph information can help enhance the representation power of the model. Furthermore, to avoid the negative effects of a manually constructed graph, our proposed model adaptively learns a discriminative region-induced graph, which also accelerates the convolution operation. We comprehensively evaluate the proposed method on four commonly used HSI benchmark data sets, and it achieves state-of-the-art results when compared with several typical HSI classification methods. © 1980-2012 IEEE.

Keywords: Convolution
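
As a rough illustration of the graph convolution this record builds on, the sketch below applies one weight-free propagation step with the commonly used symmetric normalization D^{-1/2}(A + I)D^{-1/2} at two neighborhood scales (1-hop and 2-hop). It is not the DIGCN architecture: the toy path graph, the 2-hop construction, and the names `normalize_adjacency`, `two_hop`, and `propagate` are all assumptions made for illustration.

```python
import math

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense, symmetric 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so every node keeps part of its own feature.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def two_hop(adj):
    """Connect distinct nodes that are reachable within two steps (a coarser scale)."""
    n = len(adj)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and (adj[i][j] or any(adj[i][k] and adj[k][j] for k in range(n))):
                out[i][j] = 1.0
    return out

def propagate(norm_adj, features):
    """One graph convolution step without learned weights: H' = A_norm @ H."""
    dim = len(features[0])
    return [[sum(w * h[d] for w, h in zip(row, features)) for d in range(dim)]
            for row in norm_adj]

# Hypothetical toy graph: a path 0 - 1 - 2 - 3 with a signal only on node 0.
adjacency = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
features = [[1.0], [0.0], [0.0], [0.0]]

fine = propagate(normalize_adjacency(adjacency), features)
coarse = propagate(normalize_adjacency(two_hop(adjacency)), features)
print("1-hop scale:", fine)
print("2-hop scale:", coarse)
```

With the fixed 1-hop neighborhood, node 2 receives nothing from node 0 in a single step, whereas the 2-hop graph lets the signal reach it; this is the kind of scale dependence that motivates using more than one neighborhood size.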

Loss Decomposition and Centroid Estimation for Positive and Unlabeled Learning

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, vol. 43, no. 3, pp. 918-932

Authors: Chen Gong; Hong Shi; Tongliang Liu; Chuang Zhang; Jian Yang; Dacheng Tao. Affiliations: PCA Lab, the Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing, P.R. China; PCA Lab, the Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Laboratory of Image and Video Understanding for Social Security, the School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, P.R. China; UBTECH Sydney Artificial Intelligence Centre, School of Computer Science, Faculty of Engineering, University of Sydney, Darlington, NSW, Australia

This paper studies Positive and Unlabeled learning (PU learning), whose goal is to build a binary classifier when only positive data and unlabeled data are available for training. To deal with the absence of negative training data, we first regard all unlabeled data as negative examples with false negative labels, and then convert PU learning into a risk minimization problem in the presence of such one-sided label noise. Specifically, we propose a novel PU learning algorithm dubbed "Loss Decomposition and Centroid Estimation" (LDCE). By decomposing the loss function of corrupted negative examples into two parts, we show that only the second part is affected by the noisy labels. We can therefore estimate the centroid of the corrupted negative set in an unbiased way to reduce the adverse impact of such label noise. Furthermore, we propose the "Kernelized LDCE" (KLDCE) by introducing the kernel trick, and show that KLDCE can be easily solved by combining Alternative Convex Search (ACS) and Sequential Minimal Optimization (SMO). Theoretically, we derive a generalization error bound which suggests that the generalization risk of our model converges to the empirical risk at the order of $O(1/\sqrt{k} + 1/\sqrt{n-k} + 1/\sqrt{n})$, where $n$ and $k$ are the amounts of training data and positive data, respectively. Experimentally, we conduct intensive experiments on a synthetic dataset, UCI benchmark datasets, and real-world datasets, and the results demonstrate that our approaches (LDCE and KLDCE) achieve top-level performance when compared with both classic and state-of-the-art PU learning methods.

Keywords: Learning Artificial Intelligence Minimisation Pattern Classification Positive Data KLDCE Loss Decomposition Centroid Estimation Positive Learning Unlabeled Learning Binary Classifier Unlabeled Data Classifier Training Negative Training Data False Negative labels Risk Minimization Problem PU Learning Algorithm Loss Function Corrupted Negative Examples Noisy labels Corrupted Negative Set Generalization Risk Empirical Risk Kernelized LDCE One Side label Noise Alternative Convex Search ACS Sequential Minimal Optimization SMO Generalization Error Bound Estimation Training Supervised Learning Noise Measurement Analytical Models Risk Management Kernel PU Learning Loss Decomposition Centroid Estimation Kernel Extension Generalization Bound
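
To get a feel for the convergence order quoted in this abstract, the sketch below evaluates the three terms 1/√k + 1/√(n−k) + 1/√n for a few sample sizes. The constants hidden by the O(·) notation are ignored, and the function name `bound_rate` and the example values of n and k are assumptions made for illustration only.

```python
import math

def bound_rate(n, k):
    """Evaluate 1/sqrt(k) + 1/sqrt(n - k) + 1/sqrt(n), ignoring hidden constants.

    n: total number of training examples; k: number of positive examples.
    """
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    return 1.0 / math.sqrt(k) + 1.0 / math.sqrt(n - k) + 1.0 / math.sqrt(n)

# Hypothetical sample sizes: the rate shrinks as both k and n - k grow.
for n, k in [(100, 1), (100, 50), (10_000, 5_000)]:
    print(f"n={n:>6}, k={k:>5}: rate term = {bound_rate(n, k):.4f}")
```

The first term dominates when very few positive examples are available, which matches the intuition that the bound tightens only when both the positive set and the remaining unlabeled set are reasonably large.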

Dynamic thresholding networks for schizophrenia diagnosis

Artificial Intelligence in Medicine, 2019, vol. 96, pp. 25-32

Authors: Zou, Hongliang; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Background and objective: Functional connectivity (FC) based on resting-state functional magnetic resonance imaging (rs-fMRI) is an effective approach to describe the neural interaction between distributed brain regions. Recent neuroimaging studies have reported that the connection between regions is time-varying, which may enhance understanding of normal cognition and of alterations that result from brain disorders. However, conventional sliding-window-based dynamic FC (DFC) analysis has several drawbacks, including an arbitrary choice of window length, inaccurate descriptors of FC, and the inclusion of many spurious connections in the fully connected networks due to noise. This study aims to develop an effective dynamic thresholding brain network method to diagnose schizophrenia. Methods: In this study, we proposed a time-varying window length DFC method based on dynamic time warping to construct brain functional networks. To further eliminate the influence of spurious connections caused by noise, an orthogonal minimum spanning tree was applied to these networks to generate time-varying window length dynamic thresholding FC (TVWDTFC) networks. To validate the effectiveness of our proposed method, experiments were conducted on a dataset that included 56 individuals with schizophrenia and 74 healthy controls. Results: We achieved a classification accuracy of 0.8077 (p < 0.001, permutation test) using a support vector machine. Experimental results demonstrated that the proposed method outperforms several state-of-the-art approaches, which verified the effectiveness of our proposed TVWDTFC method in schizophrenia diagnosis. We also found that the selected discriminative features were mostly distributed in frontal, parietal, and limbic areas. Conclusions: The results suggest that our approach may be a promising tool for computer-aided diagnosis of schizophrenia. © 2019 Elsevier B.V.

Keywords: Dynamic time warping; Orthogonal minimum spanning tree; rs-fMRI; Schizophrenia; Time-varying window length DFC
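
The method in this record relies on dynamic time warping (DTW). As a minimal sketch of that building block only, the code below computes the classic DTW distance between two one-dimensional signals with an absolute-difference cost. It does not reproduce the TVWDTFC pipeline; the function name `dtw_distance` and the toy signals are assumptions.

```python
def dtw_distance(x, y):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] is the cheapest alignment cost of x[:i] with y[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            # Extend the best of: advance x only, advance y only, advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical signals: the second is a time-stretched copy of the first.
signal_a = [1.0, 2.0, 3.0]
signal_b = [1.0, 2.0, 2.0, 3.0]
print(dtw_distance(signal_a, signal_b))
```

Because DTW may align one sample with several samples of the other signal, a time-stretched copy is matched at zero cost, which is why it is attractive when the relevant time scale is not fixed in advance.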

RTM3D: Real-time monocular 3D detection from object keypoints for autonomous driving

arXiv, 2020

Authors: Li, Peixuan; Zhao, Huaici; Liu, Pengfei; Cao, Feidao. Affiliations: Shenyang Institute of Automation, Chinese Academy of Sciences; Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences; University of Chinese Academy of Sciences; Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences; Key Lab of Image Understanding and Computer Vision, Liaoning Province

In this work, we propose an efficient and accurate single-shot monocular 3D detection framework. Most successful 3D detectors take the projection constraint from the 3D bounding box to the 2D box as an important component. The four edges of a 2D box provide only four constraints, and performance deteriorates dramatically with even a small error in the 2D detector. Different from these approaches, our method predicts the nine perspective keypoints of a 3D bounding box in image space, and then utilizes the geometric relationship between the 3D and 2D perspectives to recover the dimension, location, and orientation in 3D space. In this method, the properties of the object can be predicted stably even when the estimation of keypoints is very noisy, which enables us to obtain fast detection speed with a small architecture. Training our method uses only the 3D properties of the object, without the need for external networks or supervision data. Our method is the first real-time system for monocular image 3D detection while achieving state-of-the-art performance on the KITTI benchmark. Code will be released at https://***/Banconxuan/RTM3D. Copyright © 2020, The Authors. All rights reserved.

Keywords: Real time systems
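
The geometric relationship this abstract refers to can be illustrated in the forward direction: given a 3D box, compute where its keypoints land in the image. The sketch below assumes the nine keypoints are the eight box corners plus the box center, a pinhole camera with made-up intrinsics, and a yaw rotation about the camera's vertical axis; none of these details are stated in the abstract, and the function names are hypothetical. The detection method itself solves the inverse problem, recovering the box from predicted 2D keypoints.

```python
import math

def box_keypoints_3d(center, dims, yaw):
    """Return the box center followed by its eight corners in camera coordinates.

    center: (x, y, z) of the box center; dims: (height, width, length);
    yaw: rotation about the vertical (y) axis in radians.
    """
    cx, cy, cz = center
    h, w, l = dims
    c, s = math.cos(yaw), math.sin(yaw)
    offsets = [(0.0, 0.0, 0.0)]
    for dx in (-l / 2, l / 2):
        for dy in (-h / 2, h / 2):
            for dz in (-w / 2, w / 2):
                offsets.append((dx, dy, dz))
    # Rotate each offset about the y axis, then translate to the box center.
    return [(cx + c * dx + s * dz, cy + dy, cz - s * dx + c * dz)
            for dx, dy, dz in offsets]

def project(points, fx, fy, u0, v0):
    """Pinhole projection of 3D camera-frame points to pixel coordinates."""
    return [(fx * x / z + u0, fy * y / z + v0) for x, y, z in points]

# Hypothetical box 10 m in front of the camera and hypothetical intrinsics.
keypoints_3d = box_keypoints_3d(center=(0.0, 0.0, 10.0), dims=(2.0, 2.0, 4.0), yaw=0.0)
keypoints_2d = project(keypoints_3d, fx=100.0, fy=100.0, u0=50.0, v0=40.0)
for u, v in keypoints_2d:
    print(f"({u:.2f}, {v:.2f})")
```

Nine keypoints give eighteen image-space measurements, compared with the four constraints supplied by the edges of a 2D box, which is one way to read the abstract's claim about robustness to noisy keypoint estimates.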

Learning the redundancy-free features for generalized zero-shot object recognition

arXiv, 2020

Authors: Han, Zongyan; Fu, Zhenyong; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology

Zero-shot object recognition, or zero-shot learning, aims to transfer the object recognition ability among semantically related categories, such as fine-grained animal or bird species. However, the images of different fine-grained objects tend to exhibit only subtle differences in appearance, which severely degrades zero-shot object recognition. To reduce the superfluous information in the fine-grained objects, in this paper we propose to learn redundancy-free features for generalized zero-shot learning. We achieve this by projecting the original visual features into a new (redundancy-free) feature space and then restricting the statistical dependence between these two feature spaces. Furthermore, we require the projected features to keep and even strengthen the category relationship in the redundancy-free feature space. In this way, we can remove the redundant information from the visual features without losing the discriminative information. We extensively evaluate the performance on four benchmark datasets. The results show that our redundancy-free feature based generalized zero-shot learning (RFF-GZSL) approach can achieve competitive results compared with the state of the art. Copyright © 2020, The Authors. All rights reserved.

Keywords: Object recognition

Progressive point cloud deconvolution generation network

arXiv, 2020

Authors: Hui, Le; Xu, Rui; Xie, Jin; Qian, Jianjun; Yang, Jian. Affiliations: Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, PCA Lab, School of Computer Science and Engineering, Nanjing University of Science and Technology, China

In this paper, we propose an effective point cloud generation method, which can generate multi-resolution point clouds of the same shape from a latent vector. Specifically, we develop a novel progressive deconvolution network with the learning-based bilateral interpolation. The learning-based bilateral interpolation is performed in the spatial and feature spaces of point clouds so that local geometric structure information of point clouds can be exploited. Starting from the low-resolution point clouds, with the bilateral interpolation and max-pooling operations, the deconvolution network can progressively output high-resolution local and global feature maps. By concatenating different resolutions of local and global feature maps, we employ the multi-layer perceptron as the generation network to generate multi-resolution point clouds. In order to keep the shapes of different resolutions of point clouds consistent, we propose a shape-preserving adversarial loss to train the point cloud deconvolution generation network. Experimental results demonstrate the effectiveness of our proposed method. Copyright © 2020, The Authors. All rights reserved.

Keywords: Interpolation

Sequential 3D Human Pose and Shape Estimation From Point Clouds

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Kangkan Wang; Jin Xie; Guofeng Zhang; Lei Liu; Jian Yang. Affiliations: Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, China; State Key Laboratory of CAD&CG, Zhejiang University, China; ZJU-SenseTime Joint Lab of 3D Vision

ISBN (digital): 9781728171685

ISBN (print): 9781728171692

This work addresses the problem of 3D human pose and shape estimation from a sequence of point clouds. Existing sequential 3D human shape estimation methods mainly focus on the template model fitting from a sequence of depth images or the parametric model regression from a sequence of RGB images. In this paper, we propose a novel sequential 3D human pose and shape estimation framework from a sequence of point clouds. Specifically, the proposed framework can regress 3D coordinates of mesh vertices at different resolutions from the latent features of point clouds. Based on the estimated 3D coordinates and features at the low resolution, we develop a spatial-temporal mesh attention convolution (MAC) to predict the 3D coordinates of mesh vertices at the high resolution. By assigning specific attentional weights to different neighboring points in the spatial and temporal domains, our spatial-temporal MAC can capture structured spatial and temporal features of point clouds. We further generalize our framework to the real data of human bodies with a weakly supervised fine-tuning method. The experimental results on SURREAL, Human3.6M, DFAUST and the real detailed data demonstrate that the proposed approach can accurately recover the 3D body model sequence from a sequence of point clouds.

Keywords: Three-dimensional displays; Solid modeling; Shape; Biological system modeling; Estimation; Spatial resolution; Feature extraction

Instance-aware graph convolutional network for multi-label classification

arXiv, 2020

Authors: Wang, Yun; Zhang, Tong; Cui, Zhen; Xu, Chunyan; Yang, Jian. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China

Graph convolutional neural networks (GCNs) have effectively boosted the multi-label image recognition task by introducing label dependencies based on the statistical label co-occurrence of data. However, in previous methods, label correlation is computed from statistical information of the data and is therefore the same for all samples, which makes graph inference on labels insufficient to handle the huge variations among numerous image instances. In this paper, we propose an instance-aware graph convolutional neural network (IA-GCN) framework for multi-label classification. As a whole, two fused branches of sub-networks are involved in the framework: a global branch modeling the whole image and a region-based branch exploring dependencies among regions of interest (ROIs). For label diffusion of instance-awareness in graph convolution, rather than using the statistical label correlation alone, an image-dependent label correlation matrix (LCM), fusing both the statistical LCM and an individual one of each image instance, is constructed for graph inference on labels to inject adaptive information of label-awareness into the learned features of the model. Specifically, the individual LCM of each image is obtained by mining the label dependencies based on the scores of labels about detected ROIs. In this process, considering the contribution differences of ROIs to multi-label classification, variational inference is introduced to learn adaptive scaling factors for those ROIs by considering their complex distribution. Finally, extensive experiments on MS-COCO and VOC datasets show that our proposed approach outperforms existing state-of-the-art methods. Copyright © 2020, The Authors. All rights reserved.

Keywords: Convolution
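
The statistical label correlation mentioned in this abstract can be illustrated with a common construction: estimate the conditional probability that label j appears given that label i appears, using co-occurrence counts over the training set. The sketch below shows only that statistical part under this assumption; the abstract does not spell out the exact formula, and the function name `label_correlation` and the toy annotations are hypothetical. The per-image LCM and the fusion step are not reproduced here.

```python
def label_correlation(label_sets, num_labels):
    """Estimate P(label j present | label i present) from co-occurrence counts."""
    occurrences = [0] * num_labels
    co_occurrences = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        unique = set(labels)
        for i in unique:
            occurrences[i] += 1
            for j in unique:
                if i != j:
                    co_occurrences[i][j] += 1
    # Row i is normalized by how often label i occurs; unseen labels get zeros.
    return [
        [co_occurrences[i][j] / occurrences[i] if occurrences[i] else 0.0
         for j in range(num_labels)]
        for i in range(num_labels)
    ]

# Hypothetical annotations for four images over three labels (0, 1, 2).
annotations = [{0, 1}, {0, 1}, {0}, {1, 2}]
lcm = label_correlation(annotations, num_labels=3)
for row in lcm:
    print([round(value, 3) for value in row])
```

The resulting matrix is generally asymmetric and, as the abstract points out, identical for every image, which is the limitation that an image-dependent correlation matrix is meant to address.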

Cascaded Non-local Neural Network for Point Cloud Semantic Segmentation

2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Authors: Mingmei Cheng; Le Hui; Jin Xie; Jian Yang; Hui Kong. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China

ISBN (digital): 9781728162126

ISBN (print): 9781728162133

In this paper, we propose a cascaded non-local neural network for point cloud segmentation. The proposed network aims to build the long-range dependencies of point clouds for the accurate segmentation. Specifically, we develop a novel cascaded non-local module, which consists of the neighborhood-level, superpoint-level and global-level non-local blocks. First, in the neighborhood-level block, we extract the local features of the centroid points of point clouds by assigning different weights to the neighboring points. The extracted local features of the centroid points are then used to encode the superpoint-level block with the non-local operation. Finally, the global-level block aggregates the non-local features of the superpoints for semantic segmentation in an encoder-decoder framework. Benefiting from the cascaded structure, geometric structure information of different neighborhoods with the same label can be propagated. In addition, the cascaded structure can largely reduce the computational cost of the original non-local operation on point clouds. Experiments on different indoor and outdoor datasets show that our method achieves state-of-the-art performance and effectively reduces the time consumption and memory occupation.

Keywords: Three-dimensional displays; Semantics; Neural networks; Stacking; Memory management; Feature extraction; Computational efficiency
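
For orientation, the sketch below implements a plain non-local operation over a small set of point features: each output is a softmax-weighted average of all input features, with weights given by dot-product similarity. It omits learned embeddings and is not the cascaded module described in this record; the function name `non_local` and the toy features are assumptions.

```python
import math

def non_local(features):
    """Replace each feature by a softmax-weighted average of all features."""
    outputs = []
    for query in features:
        # Dot-product similarity between this point and every point.
        scores = [sum(a * b for a, b in zip(query, other)) for other in features]
        top = max(scores)  # subtract the maximum for numerical stability
        weights = [math.exp(score - top) for score in scores]
        total = sum(weights)
        outputs.append([
            sum(weight * other[d] for weight, other in zip(weights, features)) / total
            for d in range(len(query))
        ])
    return outputs

# Hypothetical one-dimensional features for two points.
point_features = [[0.0], [2.0]]
print(non_local(point_features))
```

Every point attends to every other point, so the cost grows quadratically with the number of points. That quadratic cost is the reason the abstract gives for applying the operation in a cascade over neighborhoods, superpoints, and the global level rather than over all raw points at once.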

Multi-task learning for object keypoints detection and classification

Pattern Recognition Letters, 2020, vol. 130, pp. 182-188

Authors: Jie Xu; Lin Zhao; Shanshan Zhang; Chen Gong; Jian Yang. Affiliations: PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education and Jiangsu Key Lab of Image and Video Understanding for Social Security, School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China

Object keypoint detection and classification are both central research topics in computer vision. Due to their wide range of potential applications in the real world, substantial efforts have been made to advance their performance. However, these two related tasks are mainly treated separately in previous works. We argue that keypoint detection and classification can be complementary tasks and beneficial to each other. Knowing the category of an object can reduce the search space of keypoint detection models and facilitate more precise localization. On the other hand, knowledge of object keypoints can make classification models pay more attention to areas that are more closely associated with the object, which can improve classification accuracy. Embracing this observation, we propose to model keypoint detection and classification in a multi-task learning framework. Specifically, a multi-task deep network is designed and trained to conduct both tasks, with a model structure carefully devised to allow sufficient training of both. Extensive experiments are conducted on the AIFASHION dataset and the Human3.6M dataset to validate our proposal, and we show that our algorithm outperforms separate models trained individually on each task.
