Authors: Sun, Yankui; Li, Shan; Sun, Zhongyang
Tsinghua Univ, Dept Comp Sci & Technol, 30 Shuangqing Rd, Beijing 100084, Peoples R China
Beihang Univ, Sch Software, 37 Xueyuan Rd, Beijing 100191, Peoples R China
Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou Higher Educ Mega Ctr, Univ Town, 132 East Waihuan Rd, Guangzhou 510006, Guangdong, Peoples R China
We propose a framework for automated detection of dry age-related macular degeneration (AMD) and diabetic macular edema (DME) from retina optical coherence tomography (OCT) images, based on sparse coding and dictionary learning. The study aims to improve on the classification performance of state-of-the-art methods. First, our method presents a general approach to automatically align and crop retina regions; then it obtains global representations of images by using sparse coding and a spatial pyramid; finally, a multiclass linear support vector machine classifier is employed for classification. We validate our algorithm on two datasets: the Duke spectral domain OCT (SD-OCT) dataset, consisting of volumetric scans acquired from 45 subjects (15 normal subjects, 15 AMD patients, and 15 DME patients); and a clinical SD-OCT dataset, consisting of 678 OCT retina scans acquired from clinics in Beijing (168, 297, and 213 OCT images for AMD, DME, and normal retinas, respectively). For the former dataset, our classifier correctly identifies 100%, 100%, and 93.33% of the volumes with DME, AMD, and normal subjects, respectively, and thus performs much better than the conventional method; for the latter dataset, our classifier leads to a correct classification rate of 99.67%, 99.67%, and 100.00% for DME, AMD, and normal images, respectively. (C) 2017 Society of Photo-Optical Instrumentation Engineers (SPIE)
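The step that turns per-patch sparse codes into a global image representation with a spatial pyramid can be illustrated as follows. This is only a minimal sketch under assumptions the abstract does not state: the pooling operator (max of absolute values), the pyramid levels (1x1, 2x2, 4x4), and the function name `spatial_pyramid_max_pool` are all hypothetical choices, and the random `patch_codes` array stands in for real sparse codes.

```python
import numpy as np

def spatial_pyramid_max_pool(codes, levels=(1, 2, 4)):
    """Pool a (rows, cols, K) grid of patch codes into one global vector.

    For each pyramid level n, the grid is split into n x n cells and the
    largest absolute code value per dictionary atom is kept in each cell.
    The pooling rule and the levels are assumptions made for illustration.
    """
    rows, cols, _ = codes.shape
    pooled = []
    for n in levels:
        row_edges = np.linspace(0, rows, n + 1).astype(int)
        col_edges = np.linspace(0, cols, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = codes[row_edges[i]:row_edges[i + 1],
                             col_edges[j]:col_edges[j + 1]]
                pooled.append(np.abs(cell).max(axis=(0, 1)))
    return np.concatenate(pooled)

# Placeholder data: an 8 x 8 grid of patches, each with a 5-dimensional code.
rng = np.random.default_rng(0)
patch_codes = rng.standard_normal((8, 8, 5))
feature = spatial_pyramid_max_pool(patch_codes)
```

With these assumed levels the output has K * (1 + 4 + 16) entries, and a vector of this kind is what a linear classifier such as the multiclass SVM mentioned above would consume.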
Human action recognition (HAR) is a challenging problem because of the complexity of actions and the similarity between different actions. In recent years, many methods have been proposed for HAR. Sparse coding-based approaches have been widely used in this field, and many works have also been based on manifold learning theory. When videos are similar but belong to different classes, their sparse codes may be similar and the actions may be misclassified. In this paper, a multi-modal affine graph regularized sparse coding approach is proposed to solve this problem in HAR. First, HOG3D, HOG/HOF and SURF3D descriptors are extracted from the action datasets; then sparse codes are obtained for each descriptor using the proposed method. The dictionary learning method used in this step has more discriminative power than traditional methods. These codes are then scored separately using an SVM classifier, and finally a Naive Bayes classifier makes the final decision. Experiments on the KTH, Weizmann and UCF Sport action datasets show that the proposed method can significantly outperform several previous methods in human action classification, especially on real-world data.
We present the learning algorithm orthogonal sparse coding (OSC) to find an orthogonal basis in which a given data set has a maximally sparse representation. OSC is based on stochastic descent by Hebbian-like updates and Gram-Schmidt orthogonalizations, and is motivated by an algorithm that we introduce as the canonical approach (CA). First, we evaluate how well OSC can recover a generating basis from synthetic data. We show that, in contrast to competing methods, OSC can recover the generating basis for quite low and, remarkably, unknown sparsity levels. Moreover, on natural image patches and on images of handwritten digits, OSC learns orthogonal bases that attain significantly sparser representations compared to alternative orthogonal transforms. Furthermore, we demonstrate an application of OSC for image compression by showing that the rate-distortion performance can be improved relative to the JPEG standard. Finally, we demonstrate the state-of-the-art image denoising performance of OSC dictionaries. Our results demonstrate the potential of OSC for feature extraction, data compression, and image denoising, which is due to two important aspects: 1) the learned bases are adapted to the signal class, and 2) the sparse approximation problem can be solved efficiently and exactly.
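The remark that the sparse approximation problem can be solved efficiently and exactly in an orthogonal basis can be made concrete with a small example. The following is a sketch, not the OSC learning algorithm itself: the basis here is a random orthogonal matrix obtained by QR factorization rather than a learned one, and the helper names are hypothetical. For an orthogonal basis, the coefficients are plain inner products, and keeping the k largest magnitudes gives the best k-term approximation.

```python
import numpy as np

def random_orthogonal_basis(dim, seed=0):
    """Return a random orthogonal matrix (columns are basis vectors) via QR.

    This is a stand-in for a learned basis, used only for illustration.
    """
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q

def k_sparse_code(basis, x, k):
    """Best k-term approximation of x in an orthogonal basis.

    The coefficients are basis.T @ x; keeping the k largest magnitudes
    minimizes the residual norm among all k-sparse codes.
    """
    coeffs = basis.T @ x
    keep = np.argsort(np.abs(coeffs))[-k:]
    code = np.zeros_like(coeffs)
    code[keep] = coeffs[keep]
    return code

basis = random_orthogonal_basis(8)
x = np.arange(8, dtype=float)
code = k_sparse_code(basis, x, k=3)
residual = np.linalg.norm(x - basis @ code)
```

The squared residual equals the energy of the discarded coefficients, which is why no iterative solver is needed in this setting.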
Local spatio-temporal features are popular in the human action recognition task. In practice, they are usually coupled with a feature encoding approach, which helps to obtain the video-level vector representations that can be used in learning and recognition. In this paper, we present an efficient local feature encoding approach called Approximate Sparse Coding (ASC). In the off-line learning phase, ASC computes the sparse codes for a large collection of prototype local feature descriptors using sparse coding (SC); in the encoding phase, it looks up the nearest prototype's precomputed sparse code for each to-be-encoded local feature using Approximate Nearest Neighbour (ANN) search. It shares the low dimensionality of SC and the high speed of ANN, which are both desired properties for a local feature encoding approach. ASC has been extensively evaluated on the KTH dataset and the HMDB51 dataset. We confirmed that it is able to efficiently encode a large quantity of local video features into discriminative low-dimensional representations.
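The two phases described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: a simple matching pursuit stands in for the sparse coding solver, brute-force nearest-neighbour search stands in for an ANN index, and all names (`matching_pursuit`, `build_code_table`, `encode`) and the random data are hypothetical.

```python
import numpy as np

def matching_pursuit(dictionary, x, n_nonzero):
    """Greedy sparse code of x over unit-norm dictionary columns.

    Used here only as a simple stand-in for a sparse coding solver.
    """
    residual = x.astype(float).copy()
    code = np.zeros(dictionary.shape[1])
    for _ in range(n_nonzero):
        correlations = dictionary.T @ residual
        j = int(np.argmax(np.abs(correlations)))
        code[j] += correlations[j]
        residual -= correlations[j] * dictionary[:, j]
    return code

def build_code_table(dictionary, prototypes, n_nonzero):
    """Off-line phase: precompute one sparse code per prototype descriptor."""
    return np.stack([matching_pursuit(dictionary, p, n_nonzero)
                     for p in prototypes])

def encode(features, prototypes, code_table):
    """Encoding phase: copy the code of each feature's nearest prototype.

    Brute-force search is used for clarity; an ANN index would replace it
    when the prototype collection is large.
    """
    d2 = ((features[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return code_table[nearest]

rng = np.random.default_rng(0)
dictionary = rng.standard_normal((16, 32))
dictionary /= np.linalg.norm(dictionary, axis=0)  # unit-norm atoms
prototypes = rng.standard_normal((100, 16))
code_table = build_code_table(dictionary, prototypes, n_nonzero=4)

# Features that are slight perturbations of the first five prototypes.
features = prototypes[:5] + 1e-3 * rng.standard_normal((5, 16))
codes = encode(features, prototypes, code_table)
```

The design point is that the costly solver runs only once per prototype; encoding a new feature reduces to a nearest-neighbour lookup.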
As a promising technique, sparse coding has been widely used for the analysis, representation, compression, denoising and separation of speech. This technique needs a good dictionary which contains atoms to represent speech signals. Although many methods have been proposed to learn such a dictionary, two problems remain. First, unimportant atoms bring a heavy computational load to sparse decomposition and reconstruction, which prevents the real-time application of sparse coding. Second, in speech denoising and separation, harmful atoms contribute little or nothing to reducing the sparsity degree but increase the source confusion, resulting in severe distortions. To solve these two problems, we first analyze the inherent assumptions of sparse coding and show that distortion can be caused if the assumptions do not hold true. Next, we propose two methods to optimize a given dictionary by removing unimportant atoms and harmful atoms, respectively. Experiments show that the proposed methods can further improve the performance of dictionaries. (C) 2015 Elsevier B.V. All rights reserved.
Human activity analysis in videos has increasingly attracted attention in computer vision research with the massive number of videos now accessible online. Although many recognition algorithms have been reported recently, activity representation remains challenging. Recently, manifold regularized sparse coding has obtained promising performance in action recognition, because it simultaneously learns the sparse representation and preserves the manifold structure. In this paper, we propose a generalized version of Laplacian regularized sparse coding for human activity recognition called p-Laplacian regularized sparse coding (pLSC). The proposed method exploits p-Laplacian regularization to preserve the local geometry. The p-Laplacian is a nonlinear generalization of the standard graph Laplacian and has a tighter isoperimetric inequality. As a result, with a proper p, pLSC has stronger theoretical support than standard Laplacian regularized sparse coding. We also provide a fast iterative shrinkage-thresholding algorithm for the optimization of pLSC. Finally, we input the sparse codes learned by the pLSC algorithm into support vector machines and conduct extensive experiments on the unstructured social activity attribute dataset and the human motion database (HMDB51) for human activity recognition. The experimental results demonstrate that, with a proper p, the proposed pLSC algorithm outperforms manifold regularized sparse coding algorithms, including the standard Laplacian regularized sparse coding algorithm.
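To illustrate the kind of solver mentioned above, here is a minimal sketch of the fast iterative shrinkage-thresholding algorithm (FISTA) applied to the plain l1-regularized sparse coding problem, minimizing 0.5*||x - D a||^2 + lam*||a||_1. It deliberately omits the p-Laplacian regularization term, whose exact form is not given in the abstract, so it should be read as a simplified stand-in rather than the pLSC optimizer. The function names, the random dictionary `D_demo`, and the value of `lam` are assumptions.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (element-wise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fista_lasso(D, x, lam, n_iter=200):
    """Minimize 0.5*||x - D a||^2 + lam*||a||_1 with FISTA.

    Plain l1 problem only; no graph or p-Laplacian term is included.
    """
    lipschitz = np.linalg.norm(D, 2) ** 2  # squared spectral norm of D
    a = np.zeros(D.shape[1])
    y = a.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ y - x)
        a_next = soft_threshold(y - grad / lipschitz, lam / lipschitz)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = a_next + ((t - 1.0) / t_next) * (a_next - a)  # momentum step
        a, t = a_next, t_next
    return a

def objective(D, x, a, lam):
    """Value of the l1-regularized least-squares objective."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))

# Placeholder problem: a signal built from three atoms of a random dictionary.
rng = np.random.default_rng(0)
D_demo = rng.standard_normal((20, 40))
a_true = np.zeros(40)
a_true[[3, 17, 29]] = [1.5, -2.0, 1.0]
x_demo = D_demo @ a_true
a_hat = fista_lasso(D_demo, x_demo, lam=0.1)
```

When the dictionary is the identity, the minimizer is the soft-thresholded signal itself, which gives a quick sanity check for the routine.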
Image registration is a basic task in medical image processing applications such as group analysis and atlas construction. The similarity measure is a critical ingredient of image registration. Most previous similarity measures do not consider the intensity distortion of medical images; therefore, in the presence of bias field distortions, they do not produce an acceptable registration. In this paper, we propose a sparse-based similarity measure for mono-modal images that accounts for non-stationary intensity and spatially-varying distortions. The main idea behind this measure is that the aligned image is constructed by an analysis dictionary trained using the image patches. For this purpose, we use "Analysis K-SVD" to train the dictionary and find the sparse coefficients. We utilize image patches to construct the analysis dictionary and then we employ the proposed sparse similarity measure to find a non-rigid transformation using free form deformation (FFD). Experimental results show that the proposed approach is able to robustly register 2D and 3D images in both simulated and real cases. The proposed method outperforms other state-of-the-art similarity measures and decreases the transformation error compared to previous methods. Even in the presence of bias field distortion, the proposed method aligns images without any preprocessing. (C) 2016 Elsevier Ltd. All rights reserved.
ISBN: 9781509001668 (print)
Spike sorting is in fact clustering analysis, which will face a problem in high dimensions: the "curse of dimensionality". This paper proposes a new dimension-reduction approach for sorting spikes from multiunit *** new approach applies the principle of sparse coding to analyzing high-dimensional spike data within the frameworks of clustering, classification and dimension ***, the sparse representation adopts fast parallel active-set optimization to speed up the convergence of the *** our experiment, we compare the proposed approach with traditional *** is shown that our approach has fewer extracted features and better classification accuracy, especially in similar spike waveforms.
We consider the problem of image representation for visual analysis. When images are represented as vectors, the feature space is of very high dimensionality, which makes it difficult to apply statistical techniques for visual analysis. One then hopes to apply matrix factorization techniques, such as Singular Value Decomposition (SVD), to learn the low-dimensional hidden concept space. Among various matrix factorization techniques, sparse coding has received considerable interest in recent years because its sparse representation leads to an elegant interpretation. However, most of the existing sparse coding algorithms are computationally expensive since they compute the basis vectors and the representations iteratively. In this paper, we propose a novel method, called Orthogonal Projective Sparse Coding (OPSC), for efficient and effective image representation and analysis. Integrating techniques from manifold learning and sparse coding, OPSC provides a sparse representation which can capture the intrinsic geometric structure of the image space. Extensive experimental results on real-world applications demonstrate the effectiveness and efficiency of the proposed approach. (C) 2015 Elsevier B.V. All rights reserved.
Autoimmune Diseases (AD) are among the top 10 leading causes of death in female children and women in all age groups up to 64 years. They are widely diagnosed by various antibody tests that typically apply Indirect Immunofluorescence (IIF) to Human Epithelial Type-2 (HEp-2) cells. Automated classification of HEp-2 cells has attracted much research interest in recent years. Many of these approaches employ patch-based models and the Bag of Words (BoW) scheme, but they often face several typical constraints, such as the need to process a huge number of overlapped image patches and the tuning of various parameters. We propose a superpixel-based HEp-2 cell classification technique that calculates the sparse codes of image patches which are prepared in a more intelligent way. In particular, the superpixel approach guides the determination of the right image patches while aggregating the neighboring pixels of similar patterns. In addition, we introduce "extended superpixels", which are able to capture the most discriminative gradient information across the boundary of the HEp-2 cell images. The proposed technique has been evaluated on two public datasets (ICPR2012 and ICIP2013), and experiments show superior performance in both classification accuracy and the speed of model training and cell classification. (C) 2016 Elsevier B.V. All rights reserved.