This work devises a locality-constrained max-margin sparse coding (LC-MMSC) framework, which jointly considers reconstruction loss and hinge loss simultaneously. Traditional sparse coding algorithms use l(1) constrain...
详细信息
This work devises a locality-constrained max-margin sparse coding (LC-MMSC) framework, which jointly considers reconstruction loss and hinge loss simultaneously. Traditional sparse coding algorithms use l(1) constraint to force the representation to be sparse, leading to computational expensive process to optimize the objective function. This work uses locality constraint in the framework to preserve information of data locality and avoid the optimization of l(1). The obtained representation can achieve the goal of data locality and sparsity. Additionally, this work optimizes coefficients, dictionaries and classification parameters simultaneously, and uses block coordinate descent to learn all the components of the proposed model. This work uses semi supervised learning approach in the proposed framework, and the goal is to use both labeled data and unlabeled data to achieve accurate classification performance and improve the generalization of the model. We provide theoretical analysis on the convergence of the proposed LC-MMSC algorithm based on Zangwill's global convergence theorem. This work conducts experiments on three real datasets, including Extended YaleB dataset, AR face dataset and Caltech101 dataset. The experimental results indicate that the proposed algorithm outperforms other comparison algorithms.
In an underdetermined mixture system with n unknown sources, it is a challenging task to separate these sources from their m observed mixture signals, where m < n. By exploiting the technique of sparse coding, we p...
详细信息
In an underdetermined mixture system with n unknown sources, it is a challenging task to separate these sources from their m observed mixture signals, where m < n. By exploiting the technique of sparse coding, we propose an effective approach to discover some 1-D subspaces from the set consisting of all the time-frequency (TF) representation vectors of observed mixture signals. We show that these 1-D subspaces are associated with TF points where only single source possesses dominant energy. By grouping the vectors in these subspaces via hierarchical clustering algorithm, we obtain the estimation of the mixing matrix. Finally, the source signals could be recovered by solving a series of least squares problems. Since the sparse coding strategy considers the linear representation relations among all the TF representation vectors of mixing signals, the proposed algorithm can provide an accurate estimation of the mixing matrix and is robust to the noises compared with the existing underdetermined blind source separation approaches. Theoretical analysis and experimental results demonstrate the effectiveness of the proposed method.
sparse coding (SC) has recently become a widely used tool in signal and image processing. The sparse linear combination of elements from an appropriately chosen over-complete dictionary can represent many signal patch...
详细信息
sparse coding (SC) has recently become a widely used tool in signal and image processing. The sparse linear combination of elements from an appropriately chosen over-complete dictionary can represent many signal patches. SC applications have been explored in many fields such as image super resolution (SR), image-feature extraction, image reconstruction, and segmentation. In most of these applications, learning-based SC has provided an excellent image quality. SC involves two steps: dictionary construction and searching the dictionary using quadratic programming. This study focuses on the searching step and a new adaptive variation of genetic algorithm is proposed to search and find the optimum closest match in the dictionary. Also, inspired by the proposed evolutionary SC (ESC), a single-image SR algorithm is proposed. A sparse representation for each patch of the low-resolution input image is obtained by ESC and it is used to generate the high-resolution output image. Experimental results show that the proposed ESC-based method would lead to a better SR image quality.
RGB-D human action recognition is a very active research topic in computer vision and robotics. In this paper, an action recognition method that combines gradient information and sparse coding is proposed. First of al...
详细信息
RGB-D human action recognition is a very active research topic in computer vision and robotics. In this paper, an action recognition method that combines gradient information and sparse coding is proposed. First of all, we leverage depth gradient information and distance of skeleton joints to extract coarse Depth-Skeleton (DS) feature. Then, the sparse coding and max pooling are combined to refine the coarse DS feature. Finally, the Random Decision Forests (RDF) is utilized to perform action recognition. Experimental results on three public datasets show the superior performance of our method.
This paper introduces a deep model called Deep sparse-coding Network (DeepSCNet) to combine the advantages of Convolutional Neural Network (CNN) and sparse-coding techniques for image feature representation. DeepSCNet...
详细信息
This paper introduces a deep model called Deep sparse-coding Network (DeepSCNet) to combine the advantages of Convolutional Neural Network (CNN) and sparse-coding techniques for image feature representation. DeepSCNet consists of four type of basic layers:" The sparse-coding layer performs generalized linear coding for local patch within the receptive field by replacing the convolution operation in CNN into sparse-coding. The Pooling layer and the Normalization layer perform identical operations as that in CNN. And finally the Map reduction layer reduces CPU/memory consumption by reducing the number of feature maps before stacking with the following layers. These four type of layers can be easily stacked to construct a deep model for image feature learning. The paper further discusses the multi-scale, multi-locality extension to the basic DeepSCNet, and the overall approach is fully unsupervised. Compared to CNN, training DeepSCNet is relatively easier even with training set of moderate size. Experiments show that DeepSCNet can automatically discover highly discriminative feature directly from raw image pixels.
Human gait recognition is a behavioral biometrics method that aims to determine the identity of individuals through the manner and style of their distinctive walk. It is still a very challenging problem because natura...
详细信息
Human gait recognition is a behavioral biometrics method that aims to determine the identity of individuals through the manner and style of their distinctive walk. It is still a very challenging problem because natural human gait is affected by many covariate factors such as changes in the clothing, variations in viewing angle, and changes in carrying condition. This paper evaluates the most important features of gait under the carrying and clothing conditions. We find that the intra-class variations of the features that remain static during the gait cycle affect the recognition accuracy adversely. Thus, we introduce an effective and robust feature selection method based on the gait energy image. The new gait representation is less sensitive to these covariate factors. We also propose an augmentation technique to overcome some of the problems associated with the intra-class gait fluctuations, as well as if the amount of the training data is relatively small. Finally, we use dictionary learning with sparse coding and linear discriminant analysis to seek the best discriminative data representation before feeding it to the Nearest Centroid classifier. When our method is applied on the large CASIA-B gait data set, we are able to outperform existing gait methods by achieving the highest average result.
In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on Riemannian M...
详细信息
In this paper, we present a novel bottom-up saliency detection algorithm from the perspective of covariance matrices on a Riemannian manifold. Each superpixel is described by a region covariance matrix on Riemannian Manifolds. We carry out a two-stage sparse coding scheme via Log-Euclidean kernels to extract salient objects efficiently. In the first stage, given background dictionary on image borders, sparse coding of each region covariance via Log-Euclidean kernels is performed. The reconstruction error on the background dictionary is regarded as the initial saliency of each superpixel. In the second stage, an improvement of the initial result is achieved by calculating reconstruction errors of the superpixels on foreground dictionary, which is extracted from the first stage saliency map. The sparse coding in the second stage is similar to the first stage, but is able to effectively highlight the salient objects uniformly from the background. Finally, three post-processing methods - highlight-inhibition function, contextbased saliency weighting, and the graph cut - are adopted to further refine the saliency map. Experiments on four public benchmark datasets show that the proposed algorithm outperforms the state-of-the-art methods in terms of precision, recall and mean absolute error, and demonstrate the robustness and efficiency of the proposed method. (C) 2017 Elsevier Ltd. All rights reserved.
This letter presents sparse coding for interferometric synthetic aperture radar (InSAR) patch categorization. Motivated by the fact that an optimal dual based l(1) analysis can achieve better recognition rates, this l...
详细信息
This letter presents sparse coding for interferometric synthetic aperture radar (InSAR) patch categorization. Motivated by the fact that an optimal dual based l(1) analysis can achieve better recognition rates, this letter proposes sparse coding with optimal dual-based l(1) analysis, which is applied to the amplitude and phase of the InSAR patches. The minimization of cost functions for amplitude and phase was designed and solved differently. The cost function for the amplitude part of InSAR data was modeled using the optimal dual-based l(1) analysis, and the minimization of cost function was solved using the forward-backward splitting algorithm. The phase was coded sparsely using the l(1) minimization approach and it was solved using the gradient descent algorithm. The experimental results showed that the proposed method outperforms the complex-valued methods for SAR patch categorization and outperforms the bag of visual words method as well.
Gene coexpression patterns carry rich information regarding enormously complex brain structures and functions. Characterization of these patterns in an unbiased, integrated, and anatomically comprehensive manner will ...
详细信息
Gene coexpression patterns carry rich information regarding enormously complex brain structures and functions. Characterization of these patterns in an unbiased, integrated, and anatomically comprehensive manner will illuminate the higher-order transcriptome organization and offer genetic foundations of functional circuitry. Here using dictionary learning and sparse coding, we derived coexpression networks from the space-resolved anatomical comprehensive in situ hybridization data from Allen Mouse Brain Atlas dataset. The key idea is that if two genes use the same dictionary to represent their original signals, then their gene expressions must share similar patterns, thereby considering them as "coexpressed." For each network, we have simultaneous knowledge of spatial distributions, the genes in the network and the extent a particular gene conforms to the coexpression pattern. Gene ontologies and the comparisons with published gene lists reveal biologically identified coexpression networks, some of which correspond to major cell types, biological pathways, and/or anatomical regions.
The single sample per person (SSPP) face recognition is a major problem and it is also an important challenge for practical face recognition systems due to the lack of sample data information. To solve SSPP problem, s...
详细信息
The single sample per person (SSPP) face recognition is a major problem and it is also an important challenge for practical face recognition systems due to the lack of sample data information. To solve SSPP problem, some existing methods have been proposed to overcome the effect of variances to test samples in illumination, expression and pose. However, they are not robust when the test samples are with different kinds of occlusions. In this paper, we propose a discriminative multi-scale sparse coding (DMSC) model to address this problem. We model the possible occlusion variations via the learned dictionary from the subjects not of interest. Together with the single training sample per person, most of types of occlusion variations can be effectively tackled. In order to detect and disregard outlier pixels due to occlusion, we develop a multi-scale error measurements strategy, which produces sparse, robust and highly discriminative coding. Extensive experiments on the benchmark databases show that our DMSC is more robust and has higher breakdown point in dealing with the SSPP problem for face recognition with occlusion as compared to the related state-of-the-art methods.
暂无评论