Pedestrian detection is a problem of considerable practical interest. Adding to the list of successful applications of deep learning methods to vision, we report state-of-the-art and competitive results on all major p...
详细信息
ISBN:
(纸本)9781467364102
Pedestrian detection is a problem of considerable practical interest. Adding to the list of successful applications of deep learning methods to vision, we report state-of-the-art and competitive results on all major pedestrian datasets with a convolutional network model. The model uses a few new twists, such as multi-stage features, connections that skip layers to integrate global shape information with local distinctive motif information, and an unsupervised method based on convolutional sparse coding to pre-train the filters at each stage.
Dictionary learning aims at finding a frame (called dictionary) in which some training data admits a sparse representation. Traditional dictionary learning is limited to relatively small-scale problems, because high-d...
详细信息
ISBN:
(纸本)9781467369985
Dictionary learning aims at finding a frame (called dictionary) in which some training data admits a sparse representation. Traditional dictionary learning is limited to relatively small-scale problems, because high-dimensional dense dictionaries can be costly to manipulate, both at the learning stage and when used for tasks such as sparse coding. In this paper, inspired by usual fast transforms, we consider a multi-layer sparse dictionary structure allowing cheaper manipulation, and propose a learning algorithm imposing this structure. The approach is demonstrated experimentally with a factorization of the Hadamard matrix and on image denoising.
In this work we present a sparse dictionary learning method, specifically tuned to solve the tracking problem. Recently, sparse representation has drawn much attention because of its genuineness and strong mathematica...
详细信息
ISBN:
(纸本)9781467356053
In this work we present a sparse dictionary learning method, specifically tuned to solve the tracking problem. Recently, sparse representation has drawn much attention because of its genuineness and strong mathematical background. In this paper we present an online method for dictionary learning which is desirable for problems such as tracking. Online learning methods are preferable because the whole data are not available at the current time. The presented method tries to use the advantages of the generative and discriminative models to achieve better performance. The experimental results show our method can overcome many tracking challenges.
In this paper, the key-frame extraction problem for user-generated-videos which are captured by smart phones is investigated. A collaborative sparse coding model which incorporates the L_(2,1) and L_(1,2) regularizati...
详细信息
ISBN:
(纸本)9781479947607
In this paper, the key-frame extraction problem for user-generated-videos which are captured by smart phones is investigated. A collaborative sparse coding model which incorporates the L_(2,1) and L_(1,2) regularization terms are proposed to select few key-frames while attenuating the influences of the outlier frames. Further, the sensors embedded in the smart phone is used to collect the acceleration values, which can be used to improve the performance of outlier-attenuations. Finally, a real dataset is constructed to test the proposed method and the experimental validation shows promising results.
To satisfy the requirement of image classification in the application of image retrieval, a novel method of image representation based on bag-of-visual-words model is proposed in the paper to describe the spatial sema...
详细信息
ISBN:
(纸本)9781728105529
To satisfy the requirement of image classification in the application of image retrieval, a novel method of image representation based on bag-of-visual-words model is proposed in the paper to describe the spatial semantic distribution of associated features. Firstly, the extracted SIFT features are mapped into visual words including certain semantic information. According to spatial pyramid hierarchy, the specific region is divided with local features, and the spatial distribution of associated features is analyzed from different aspects and in various regions. In this way, the semantic phrases are established with local features. Next, the spatial semantic lexicon is constructed with sparse encoding of spatial semantic phrases, and the images are described with the form of sparse statistical histogram vectors. Finally, the vectors of images are classified with the classifier embedded with the improved bag-of-visual-words model. The experimental results show that the accuracy of image classification is significantly enhanced which is benefited from the Bag-of-visual-words model with spatial semantic distribution.
We study the application of a strongly non-linear generative model to image patches. As in standard approaches such as sparse coding or Independent Component Analysis, the model assumes a sparse prior with independent...
详细信息
ISBN:
(纸本)9781617823800
We study the application of a strongly non-linear generative model to image patches. As in standard approaches such as sparse coding or Independent Component Analysis, the model assumes a sparse prior with independent hidden variables. However, in the place where standard approaches use the sum to combine basis functions we use the maximum. To derive tractable approximations for parameter estimation we apply a novel approach based on variational Expectation Maximization. The derived learning algorithm can be applied to large-scale problems with hundreds of observed and hidden variables. Furthermore, we can infer all model parameters including observation noise and the degree of sparseness. In applications to image patches we find that Gabor-Hke basis functions are obtained. Gabor-like functions are thus not a feature exclusive to approaches assuming linear superposition. Quantitatively, the inferred basis functions show a large diversity of shapes with many strongly elongated and many circular symmetric functions. The distribution of basis function shapes reflects properties of simple cell receptive fields that are not reproduced by standard linear approaches. In the study of natural image statistics, the implications of using different superposition assumptions have so far not been investigated systematically because models with strong non-linearities have been found analytically and computationally challenging. The presented algorithm represents the first large-scale application of such an approach.
Recently, the literature has witnessed an increasing interest in the study of medical image watermarking and recovery techniques. In this article, a novel image tamper localization and recovery technique for medical i...
详细信息
ISBN:
(纸本)9781424479276
Recently, the literature has witnessed an increasing interest in the study of medical image watermarking and recovery techniques. In this article, a novel image tamper localization and recovery technique for medical image authentication is proposed. The sparse coding of the Electronic Patient Record (EPR) and the reshaped region of Interest (ROI) is embedded in the transform domain of the Region of Non-Interest (RONI). The first part of the sparse coded watermark is use for saving the patient information along with the image, whereas the second part is used for authentication purpose. When the watermarked image is tampered during transmission between hospitals and medical clinics, the embedded sparse coded ROI can be extracted to recover the tampered image. The experimental results demonstrate the efficiency of the proposed technique in term of tamper correction capability, robustness to attacks, and imperceptibility.
We propose a method and a prototype imaging system for real-time reconstruction of volumetric piecewise-smooth scattering media. The volume is illuminated by a sequence of structured binary patterns emitted from a fan...
详细信息
ISBN:
(纸本)9781479957521
We propose a method and a prototype imaging system for real-time reconstruction of volumetric piecewise-smooth scattering media. The volume is illuminated by a sequence of structured binary patterns emitted from a fan beam projector, and the scattered light is collected by a two-dimensional sensor, thus creating an under-complete set of compressed measurements. We show a fixed-complexity and latency reconstruction algorithm capable of estimating the scattering coefficients in real-time. We also show a simple greedy algorithm for learning the optimal illumination patterns. Our results demonstrate faithful reconstruction from highly compressed measurements. Furthermore, a method for compressed registration of the measured volume to a known template is presented, showing excellent alignment with just a single projection. Though our prototype system operates in visible light, the presented methodology is suitable for fast x-ray scattering imaging, in particular in real-time vascular medical imaging.
This paper proposes a dictionary learning framework that combines the proposed block/group (BGSC) or reconstructed block/group (R-BGSC) sparse coding schemes with the novel Intra-block Coherence Suppression Dictionary...
详细信息
ISBN:
(纸本)9781467364102
This paper proposes a dictionary learning framework that combines the proposed block/group (BGSC) or reconstructed block/group (R-BGSC) sparse coding schemes with the novel Intra-block Coherence Suppression Dictionary Learning (ICS-DL) algorithm. An important and distinguishing feature of the proposed framework is that all dictionary blocks are trained simultaneously with respect to each data group while the intra-block coherence being explicitly minimized as an important objective. We provide both empirical evidence and heuristic support for this feature that can be considered as a direct consequence of incorporating both the group structure for the input data and the block structure for the dictionary in the learning process. The optimization problems for both the dictionary learning and sparse coding can be solved efficiently using block-gradient descent, and the details of the optimization algorithms are presented. We evaluate the proposed methods using well-known datasets, and favorable comparisons with state-of-the-art dictionary learning methods demonstrate the viability and validity of the proposed framework.
Computer aided diagnosis (CAD) has been important more than ever for accurate diagnosis of liver tumors. The paper presents a novel image representation method for classifying normal livers and livers with tumors. It ...
详细信息
Computer aided diagnosis (CAD) has been important more than ever for accurate diagnosis of liver tumors. The paper presents a novel image representation method for classifying normal livers and livers with tumors. It starts by capturing region of interesting (ROI) for individual livers, on which patches are extracted densely. Histogram of oriented gradients (HOG) and intensity are then extracted as patch features. Taking the feature clustering centers in the training images as coding dictionary, sparse coding is used as a coding scheme for the patch extracted from both train and test images. And an effective image representation is then generated based on bag of features (BOF). In this study, an optimized coding method based on the dictionary elements nearby are utilized, which accelerate the coding procedure. The experimental results demonstrate that the proposed image representation method achieves higher classification rate.
暂无评论