Blood vessel images can provide considerable information of many diseases, which are widely used by ophthalmologists for disease diagnosis and surgical planning. In this paper, we propose a novel method for the blood ...
详细信息
Blood vessel images can provide considerable information of many diseases, which are widely used by ophthalmologists for disease diagnosis and surgical planning. In this paper, we propose a novel method for the blood Vessel Enhancement via Multi-dictionary and sparse coding (VE-MSC). In the proposed method, two dictionaries are utilized to gain the vascular structures and details, including the Representation Dictionary (RD) generated from the original vascular images and the Enhancement Dictionary (ED) extracted from the corresponding label images. The sparse coding technology is utilized to represent the original target vessel image with RD. After that, the enhanced target vessel image can be reconstructed using the obtained sparse coefficients and ED. The proposed method has been evaluated for the retinal vessel enhancement on the DRIVE and STARE databases. Experimental results indicate that the proposed method can not only effectively improve the image contrast but also enhance the retinal vascular structures and details. (C) 2016 Elsevier B.V. All rights reserved.
Diverse studies have shown the efficiency of sparse coding in feature quantization. However, its major drawback is that it neglects the relationships among features. To reach the spatial context, we proposed in this p...
详细信息
Diverse studies have shown the efficiency of sparse coding in feature quantization. However, its major drawback is that it neglects the relationships among features. To reach the spatial context, we proposed in this paper, a novel sparse coding method called Extended Laplacian sparse coding. Two successive stages are required in this method. In the first stage, the sparse visual phrases based on Laplacian sparse coding are generated from the local regions in order to represent the geometric information in the image space. The second stage aims to incorporate the spatial relationships among local features in the image space into the objective function of the Laplacian sparse coding. It takes into account the similarity among local regions in the Laplacian sparse coding process. The matching between the local regions is based on the Hungarian method as well as the histogram intersection measure between sparse visual phrases already assigned to the local regions in the first stage. Furthermore, we suggested to improve the pooling step that succeeds the encoding step by introducing the discretized max pooling method that estimates the distribution of the responses of each local feature to the dictionary of basis vectors. Our experimental results prove that our method outperforms the existing background results. (C) 2015 Elsevier B.V. All rights reserved.
RGB-D human action recognition is a very active research topic in computer vision and robotics. In this paper, an action recognition method that combines gradient information and sparse coding is proposed. First of al...
详细信息
RGB-D human action recognition is a very active research topic in computer vision and robotics. In this paper, an action recognition method that combines gradient information and sparse coding is proposed. First of all, we leverage depth gradient information and distance of skeleton joints to extract coarse Depth-Skeleton (DS) feature. Then, the sparse coding and max pooling are combined to refine the coarse DS feature. Finally, the Random Decision Forests (RDF) is utilized to perform action recognition. Experimental results on three public datasets show the superior performance of our method.
We present a multi-layer group sparse coding framework for concurrent single-label image classification and annotation. By leveraging the dependency between image class label and tags, we introduce a multi-layer group...
详细信息
We present a multi-layer group sparse coding framework for concurrent single-label image classification and annotation. By leveraging the dependency between image class label and tags, we introduce a multi-layer group sparse structure of the reconstruction coefficients. Such structure fully encodes the mutual dependency between the class label, which describes image content as a whole, and tags, which describe the components of the image content. Therefore we propose a multi-layer group based tag propagation method, which combines the class label and subgroups of instances with similar tag distribution to annotate test images. To make our model more suitable for nonlinear separable features, we also extend our multi-layer group sparse coding in the Reproducing Kernel Hilbert Space (RKHS), which further improves performances of image classification and annotation. Moreover, we also integrate our multi-layer group sparse coding with kNN strategy, which greatly improves the computational efficiency. Experimental results on the LabelMe, UIUC-Sports and NUS-WIDE-Object databases show that our method outperforms the baseline methods, and achieves excellent performances in both image classification and annotation tasks.
Recognizing traffic signs is a challenging problem;and it has captured the attention of the computer vision community for several decades. Essentially, traffic sign recognition is a multi-class classification problem ...
详细信息
Recognizing traffic signs is a challenging problem;and it has captured the attention of the computer vision community for several decades. Essentially, traffic sign recognition is a multi-class classification problem that has become a real challenge for computer vision and machine learning techniques. Although many machine learning approaches are used for traffic sign recognition, they are primarily used for classification, not feature design. Identifying rich features using modern machine learning methods has recently attracted attention and has achieved success in many benchmarks. However these approaches have not been fully implemented in the traffic sign recognition problem. In this paper, we propose a new approach to tackle the traffic sign recognition problem. First, we introduce a new feature learning approach using group sparse coding. The primary goal is to exploit the intrinsic structure of the pre-learned visual codebook. This new coding strategy preserves locality and encourages similar descriptors to share similar sparse representation patterns. Second, we use a non-uniform quantization approach based on log-polar mapping. Using the log-polar mapping of the traffic sign image, rotated and scaled patterns are converted into shifted patterns in the new space. We extract the local descriptors from these patterns to learn the features. Finally, by evaluating the proposed approach using the German Traffic Sign Recognition Benchmark dataset, we show that the proposed coding strategy outperforms existing coding methods and the obtained results are comparable to the state-of-the-art. (C) 2014 Elsevier Inc. All rights reserved.
3D object retrieval from user-drawn (sketch) queries is one of the important research issues in the areas of pattern recognition and computer graphics for simulation, visualization, and Computer Aided Design. The perf...
详细信息
3D object retrieval from user-drawn (sketch) queries is one of the important research issues in the areas of pattern recognition and computer graphics for simulation, visualization, and Computer Aided Design. The performance of any content-based 3D object retrieval system crucially depends on the availability of effective descriptors and similarity measures for this kind of data. We present a sketch-based approach for improving 3D object retrieval effectiveness by optimizing the representation of one particular type of features (oriented gradients) using a sparse coding approach. We perform experiments, the results of which show that the retrieval quality improves over alternative features and codings. Based our findings, the coding can be proposed for sketch-based 3D object retrieval systems relying on oriented gradient features.
A configurable neuroinspired inference accelerator is designed as an array of neurons, each operating in an independent clock domain. The accelerator implements a recurrent network using a novel sparse convolution for...
详细信息
A configurable neuroinspired inference accelerator is designed as an array of neurons, each operating in an independent clock domain. The accelerator implements a recurrent network using a novel sparse convolution for feedforward operations and sparse spike-driven reconstruction for feedback operations. The proposed sparse convolution efficiently skips zero-patches, and can be made to support practically any image and kernel size. A globally asynchronous locally synchronous architecture enables scalable design and load balancing to achieve 22% reduction in power. Fabricated in 40-nm CMOS, the 2.56-mm(2) inference accelerator integrates 48 neurons, a hub, and an OpenRISC processor. The chip achieves 718GOPS at 380 MHz, and demonstrates applications in feature extraction from images and depth extraction from stereo images.
Structural health monitoring has received remarkable attention due to the arising structural safety problems. Most of these structural health problems are accumulative damages such as slight changes in structural defo...
详细信息
Structural health monitoring has received remarkable attention due to the arising structural safety problems. Most of these structural health problems are accumulative damages such as slight changes in structural deformations which are very hard to be detected. In addition, the complexity of real structure and environmental noises make structural health monitoring more difficult. Existing methods largely use various types of sensors to collect useful parameters and then train a machine learning model to diagnose damage level and location, in which a large amount of training data are needed for the model training, while the labeled data are rare in the real world. To overcome this problem, sparse coding is employed in this paper to achieve structural health monitoring of a bridge equipped with a wireless sensor network, so that a large amount of unlabeled examples can be used to train a feature extractor based on the sparse coding algorithm. Features learned from sparse coding are then used to train a neural network classifier to distinguish different statuses of the bridge. Experimental results show the sparse coding-based deep learning algorithm achieves higher accuracy for structural health monitoring under the same level of environmental noises, compared with some existing methods.
Land cover segmentation can be viewed as topic assignment that the pixels are grouped into homogeneous regions according to different semantic topics in topic model. In this paper, we propose a novel topic model based...
详细信息
ISBN:
(纸本)9781479911141
Land cover segmentation can be viewed as topic assignment that the pixels are grouped into homogeneous regions according to different semantic topics in topic model. In this paper, we propose a novel topic model based on sparse coding for segmenting different kinds of land covers. Different from conventional topic models which generally assume each local feature descriptor is related to only one visual word of the codebook, our method utilizes sparse coding to characterize the potential correlation between the descriptor and multiple words. Therefore each descriptor can be represented by a small set of words. Furthermore, in this paper probabilistic Latent Semantic Analysis (pLSA) is applied to learn the latent relation among word, topic and document due to its simplicity and low computational cost. Experimental results on remote sensing image segmentation demonstrate the excellent superiority of our method over k-means clustering and conventional pLSA model.
In this paper, we address the problem of human action recognition by representing image sequences as a sparse collection of patch-level spatiotemporal events that are salient in both space and time domain. Our method ...
详细信息
ISBN:
(纸本)9781479983391
In this paper, we address the problem of human action recognition by representing image sequences as a sparse collection of patch-level spatiotemporal events that are salient in both space and time domain. Our method uses a multi-scale volumetric representation of video and adaptively selects an optimal space-time scale under which the saliency of a patch is most significant. The input image sequences are first partitioned into non-overlapping patches. Then, each patch is represented by a vector of coefficients that can linearly reconstruct the patch from a learned dictionary of basis patches. We propose to measure the spatiotemporal saliency of patches using Shannon's self-information entropy, where a patch's saliency is determined by information variation in the contents of the patch's spatiotemporal neighborhood. Experimental results on two benchmark datasets demonstrate the effectiveness of our proposed method.
暂无评论