ISBN: 9789082797039 (print)
Deep learning methods have been successfully applied to various computer vision tasks. However, existing neural network architectures do not per se incorporate domain knowledge about the addressed problem, so understanding what the model has learned remains an open research topic. In this paper, we rely on the unfolding of an iterative algorithm for sparse approximation with side information, and design a deep learning architecture for multimodal image super-resolution that incorporates sparse priors and effectively utilizes information from another image modality. We develop two deep models that reconstruct a high-resolution image of a target modality from its low-resolution variant with the aid of a high-resolution image from a second modality. We apply the proposed models to super-resolve near-infrared images, using high-resolution RGB images as side information. Experimental results demonstrate the superior performance of the proposed models against state-of-the-art methods, including unimodal and multimodal approaches.
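The abstract does not state the optimization problem being unfolded. As a rough illustration only, one formulation of sparse approximation with side information found in the literature is l1-l1 minimization, solved by a proximal gradient iteration. The symbols below (y, D, s, lambda, mu) are assumptions introduced for this sketch and are not taken from the paper.

```latex
% Assumed objective (l1-l1 minimisation); not stated in the abstract.
% y: low-resolution observation, D: dictionary, s: side-information code,
% \lambda: sparsity weight, \mu: gradient step size.
\begin{align}
\hat{\alpha} &= \arg\min_{\alpha} \tfrac{1}{2}\lVert y - D\alpha \rVert_2^2
  + \lambda \bigl( \lVert \alpha \rVert_1 + \lVert \alpha - s \rVert_1 \bigr), \\
\alpha^{(t+1)} &= \operatorname{prox}_{\mu\lambda g}\!\bigl( \alpha^{(t)} - \mu D^{\top}(D\alpha^{(t)} - y) \bigr),
\qquad g(\alpha) = \lVert \alpha \rVert_1 + \lVert \alpha - s \rVert_1 .
\end{align}
```

Unfolding, in general, maps each iteration t to one network layer and replaces the fixed quantities (here D and the thresholds) with learned parameters.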
ISBN: 9781538681787 (print)
At present, dynamic facial expression feature learning methods are mainly divided into traditional hand-crafted features and self-learned features. Feature extraction in hand-crafted methods is cumbersome, and recognition accuracy depends mainly on the quality of the features. Although deep learning overcomes this problem, it is computationally intensive and poorly interpretable. To reduce the complexity of LSTM for sequence processing, this paper introduces a sparse representation method for feature learning; the resulting network structure, termed SLSTM, has its theoretical basis in ISTA. SLSTM was applied to several facial expression recognition tasks, such as pain detection on the Biovid database and facial micro-expression recognition on CASME II and SMIC. The experimental results were compared with the classic LSTM network and most of the popular methods. They show that the proposed method achieves superior overall results with low computational complexity.
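For readers unfamiliar with ISTA (the iterative shrinkage-thresholding algorithm), the following is a minimal standard-library sketch of the plain algorithm for the lasso objective. It is not the SLSTM network itself; the function names `soft_threshold` and `ista` and all parameter names are chosen here for illustration.

```python
def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * |.|."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def ista(D, y, lam, step, n_iter=200):
    """Minimise 0.5 * ||y - D a||^2 + lam * ||a||_1 with ISTA.

    D is a list of m rows of length n, y has length m.
    step should not exceed 1 / L, where L is the largest eigenvalue of D^T D.
    """
    m, n = len(D), len(D[0])
    a = [0.0] * n
    for _ in range(n_iter):
        # Residual r = D a - y.
        r = [sum(D[i][j] * a[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the quadratic term: g = D^T r.
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by shrinkage.
        a = soft_threshold([a[j] - step * g[j] for j in range(n)], step * lam)
    return a


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(ista(identity, [3.0, 0.2], lam=0.5, step=1.0))
```

With an identity dictionary the solution is simply the soft-thresholded input, which makes the shrinkage effect easy to see. In unfolded networks each loop iteration typically becomes one layer with learned weights; how SLSTM does this is not described in the abstract.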
ISBN: 9781728140940 (print)
A palmprint recognition method based on local sparse representation is proposed. The method addresses two problems: palmprint feature extraction and palmprint recognition. For feature extraction, the weighted shape index feature is adopted to describe the three-dimensional surface. For recognition, a two-stage classification method based on local sparse representation is proposed. First, similarity is used to construct a sample subset that retains the candidate classes for the test data. Second, a sparse coding classifier is used to obtain the palmprint category. Experimental results and comparisons on the Hong Kong Polytechnic University palm data set verify that the proposed approach is more effective than traditional methods.
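The two-stage idea can be sketched as follows, under several assumptions that the abstract does not confirm: cosine similarity is used for the first stage, ISTA is used as the sparse coder, and the class with the smallest reconstruction residual wins. All function names and default parameters are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def sparse_code(atoms, y, lam=0.01, step=0.1, n_iter=500):
    """ISTA for 0.5 * ||y - sum_k c[k] * atoms[k]||^2 + lam * ||c||_1.

    step is assumed small enough for the given atoms (at most 1 / L).
    """
    c = [0.0] * len(atoms)
    dims = range(len(y))
    for _ in range(n_iter):
        recon = [sum(c[k] * atoms[k][d] for k in range(len(atoms))) for d in dims]
        resid = [recon[d] - y[d] for d in dims]
        grad = [sum(atom[d] * resid[d] for d in dims) for atom in atoms]
        z = [c[k] - step * grad[k] for k in range(len(atoms))]
        c = [max(abs(x) - step * lam, 0.0) * (1.0 if x > 0 else -1.0) for x in z]
    return c


def classify(train, labels, y, top_k=3):
    """Stage 1: candidate classes by similarity. Stage 2: residual-based choice."""
    ranked = sorted(range(len(train)), key=lambda i: cosine(train[i], y), reverse=True)
    candidates = {labels[i] for i in ranked[:top_k]}
    idx = [i for i in range(len(train)) if labels[i] in candidates]
    atoms = [train[i] for i in idx]
    c = sparse_code(atoms, y)
    best, best_res = None, float("inf")
    for cls in sorted(candidates):
        # Reconstruct y using only the coefficients that belong to this class.
        recon = [
            sum(c[k] * atoms[k][d] for k in range(len(atoms)) if labels[idx[k]] == cls)
            for d in range(len(y))
        ]
        res = math.sqrt(sum((recon[d] - y[d]) ** 2 for d in range(len(y))))
        if res < best_res:
            best, best_res = cls, res
    return best
```

Restricting the dictionary to the candidate subset keeps the second stage small, which is the usual motivation for a coarse-to-fine scheme of this kind.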
ISBN: 9781538662496 (print)
With the prevalence of depth sensors and 3D scanning devices, point clouds have attracted increasing attention as a format for 3D object representation, with applications in various fields such as tele-presence, navigation for autonomous driving and heritage reconstruction. However, point clouds usually exhibit holes of missing data, mainly due to the limitations of acquisition techniques and complicated structures. Hence, we propose an efficient inpainting method for the attributes (e.g., color) of point clouds, exploiting non-local self-similarity in the graph spectral domain. Specifically, we represent irregular point clouds naturally on graphs, and split a point cloud into fixed-size cubes as the processing unit. We then globally search for the cubes most similar to the target cube with holes inside, and compute the graph Fourier transform (GFT) basis from the similar cubes, which is then used for the GFT representation of the target cube. We then formulate attribute inpainting as a sparse coding problem, imposing sparsity on the GFT representation of the attribute for hole filling. Experimental results demonstrate the superiority of our method.
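For reference, the standard definition of the GFT and one plausible form of the sparse inpainting objective are written out below. The first two lines are textbook definitions; the third line, including the mask M that selects the known points and the weight lambda, is an assumption made for illustration and is not given in the abstract.

```latex
% W: edge-weight (adjacency) matrix of the graph built on the similar cubes,
% \mathrm{Deg}: diagonal degree matrix, x: attribute signal on the target cube,
% M: assumed diagonal mask selecting the points whose attribute is known.
\begin{align}
L &= \mathrm{Deg} - W, \qquad L = U \Lambda U^{\top}, \\
\hat{x} &= U^{\top} x, \qquad x = U \hat{x}, \\
\hat{a} &= \arg\min_{a} \lVert M (x - U a) \rVert_2^2 + \lambda \lVert a \rVert_1 .
\end{align}
```

Under this reading, the filled-in attribute is U times the recovered coefficient vector, evaluated at the missing points.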
ISBN: 9781479906505 (print)
This paper proposes to learn a discriminative dictionary for saliency detection. In addition to the conventional sparse coding mechanism that learns a representational dictionary of natural images for saliency prediction, this work uses supervised information from eye-tracking experiments during training to enhance the discriminative power of the learned dictionary. Furthermore, we explicitly model saliency at multiple scales by formulating it as a multi-class problem, and a label consistency term is incorporated into the framework to encourage class (salient vs. non-salient) and scale consistency in the learned sparse codes. K-SVD is employed as the central computational module to efficiently obtain the optimal solution. Experiments demonstrate the superior performance of the proposed algorithm compared with the state of the art in saliency prediction.
Background: Brain tissue segmentation plays an important role in biomedical research and clinical applications. Traditional segmentation is performed on T1-weighted and/or T2-weighted MRI images. Recently, brain segmentation based on diffusion weighted imaging (DWI) has attracted research interest due to its advantage in diffusion MRI image processing and anatomically-constrained tractography. New method: We propose a fully automated brain segmentation method based on sparse representation of DWI signals and apply it to nine healthy subjects, aged 25-35 years, from the Human Connectome Project. After learning a dictionary from the DWI signals of each subject, brain voxels are classified into gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) according to their sparse representation over clustered dictionary atoms, achieving good agreement with the segmentation of T1-weighted images using SPM12, as assessed by the DICE score. Results: The average DICE score for all nine subjects was 0.814 for CSF, 0.850 for GM, and 0.890 for WM. The proposed method is very fast and robust across a wide range of sparse coding parameter choices. It also works well on DWI data with fewer shells or gradient directions. Comparison with existing methods: On average, our segmentation results are superior to previous methods for all three brain tissue classes in terms of DICE scores. Conclusion: The proposed method demonstrates the feasibility of segmenting the brain solely based on the tissue response to diffusion encoding.
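The DICE score used for evaluation is twice the overlap between two label sets divided by the sum of their sizes. A minimal sketch for per-tissue scores on flat voxel label lists is given below; the function name `dice_score` and the toy labels are illustrative and unrelated to the study's data.

```python
def dice_score(pred, ref, label):
    """Dice overlap 2 * |A and B| / (|A| + |B|) for one tissue label.

    pred and ref are equal-length sequences of per-voxel labels.
    Returns 1.0 when the label is absent from both segmentations.
    """
    a = [p == label for p in pred]
    b = [r == label for r in ref]
    inter = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    return 2.0 * inter / total if total else 1.0


if __name__ == "__main__":
    pred = ["GM", "GM", "WM", "CSF", "WM", "GM"]
    ref = ["GM", "WM", "WM", "CSF", "WM", "GM"]
    for tissue in ("CSF", "GM", "WM"):
        print(tissue, dice_score(pred, ref, tissue))
```

A score of 1.0 means perfect agreement for that tissue class and 0.0 means no overlap.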
Recovering high-quality images from microscopic observations is an essential technology in biological imaging. Existing recovery methods require solving an optimization problem with iterative algorithms, which are computationally expensive and time consuming. The focus of this study is to accelerate image recovery by using deep neural networks (DNNs). In our approach, we first train a certain type of DNN on observations from microscopes, so that it approximates the image recovery process well. The recovery of a new observation is then computed through a single forward propagation in the trained DNN. In this study, we specifically focus on observations obtained by SPoD (Super-resolution by Polarization Demodulation), a recently developed microscopic technique, and accelerate the image recovery for SPoD by using DNNs. To this end, we propose SPoD-Net, a specifically tailored DNN for fast recovery of SPoD images. Unlike general DNNs, SPoD-Net can be parameterized with a small number of parameters, which is helpful in two ways: (i) it can be stored in little memory, and (ii) it can be trained efficiently. We also propose a method to stabilize the training of SPoD-Net. In experiments with real SPoD observations, we confirmed the effectiveness of SPoD-Net over existing recovery methods. Specifically, we observed that SPoD-Net could recover images more than a hundred times faster than the existing method.
ISBN: 9781728108247 (print)
This paper investigates and compares two different transfer learning methods for classifying underwater objects in sonar data from different environments and operating conditions. The popular efficient lifelong learning algorithm (ELLA) is used to perform this classification task. Two different learning strategies are then proposed for ELLA. As a benchmark, the Matched Subspace Classifier (MSC) was also used together with incremental dictionary learning and sparse coding. The comparison is carried out on low-frequency sonar spectral features extracted from different underwater unexploded ordnance (UXO) and non-UXO objects in two different environmental conditions.
ISBN: 9783030208707; 9783030208691 (print)
Deep neural networks have led to a series of breakthroughs in computer vision given sufficient annotated training datasets. For novel tasks with limited labeled data, the prevalent approach is to transfer the knowledge learned in the pre-trained models to the new tasks by fine-tuning. Classic model fine-tuning utilizes the fact that well-trained neural networks appear to learn cross-domain features. These features are treated equally during transfer learning. In this paper, we explore the impact of feature selection in model fine-tuning by introducing a transfer module, which assigns weights to features extracted from pre-trained models. The proposed transfer module demonstrates the importance of feature selection for transferring models from source to target domains. It is shown to significantly improve upon fine-tuning results with only marginal extra computational cost. We also incorporate an auxiliary classifier as an extra regularizer to avoid over-fitting. Finally, we build a Gated Transfer Network (GTN) based on our transfer module and achieve state-of-the-art results on six different tasks.
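The abstract says only that the transfer module assigns weights to pre-trained features. One simple way such a gate could look is sketched below, assuming a sigmoid gate computed from the features themselves and applied elementwise. The gate form, the function names and the parameters `W` and `b` are all assumptions and may differ from the actual GTN design.

```python
import math


def sigmoid(z):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gated_features(features, W, b):
    """Re-weight features with a gate: g = sigmoid(W f + b), out = g * f.

    features: list of n floats from a (hypothetical) pre-trained extractor.
    W: n rows of n floats, b: n floats; both would be learned on the target task.
    """
    gate = [
        sigmoid(sum(w * f for w, f in zip(row, features)) + bias)
        for row, bias in zip(W, b)
    ]
    return [g * f for g, f in zip(gate, features)]


if __name__ == "__main__":
    f = [2.0, -4.0]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    print(gated_features(f, zeros, [0.0, 0.0]))      # every gate is 0.5
    print(gated_features(f, zeros, [100.0, -100.0]))  # keep first, suppress second
```

A gate near 1 passes a feature through unchanged and a gate near 0 suppresses it, so the module acts as a soft feature selector rather than treating all features equally.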