Given an image region of pixels, second-order statistics can be used to construct a descriptor for object representation. One example is the covariance matrix descriptor, which shows high discriminative power and good robustness in many computer vision applications. However, operations on covariance matrices, which lie on a Riemannian manifold, are usually computationally demanding. This paper proposes a novel region descriptor based on second-order statistics, named "Sigma Set", in the form of a small set of vectors that can be uniquely constructed through Cholesky decomposition of the covariance matrix. Sigma Set is low-dimensional, powerful, and robust. Moreover, compared with the covariance matrix, Sigma Set is not only more efficient in distance evaluation and average calculation, but also easier to enrich with first-order statistics. Experimental results in texture classification and object tracking verify the effectiveness and efficiency of this novel object descriptor.
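The construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the scaling or the layout of the set, so the factor `alpha = sqrt(d)`, the symmetric plus/minus columns, the optional mean shift, and the function names `cholesky`, `sigma_set`, `sample_covariance`, and `set_distance` are all hypothetical choices. With that scaling, the second moment of the 2d points reproduces the original covariance matrix, and two sets can be compared with an ordinary Euclidean distance.

```python
import math

def cholesky(c):
    """Lower-triangular L with L * L^T == c, for a symmetric positive-definite c."""
    d = len(c)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def sigma_set(c, mean=None):
    """Build 2d points from the scaled columns of L (assumed layout, see lead-in).

    If a mean vector is given, every point is shifted by it, which is one
    possible way to fold in first-order statistics.
    """
    d = len(c)
    low = cholesky(c)
    alpha = math.sqrt(d)  # assumed scaling, not stated in the abstract
    mu = mean if mean is not None else [0.0] * d
    points = []
    for sign in (1.0, -1.0):
        for col in range(d):
            points.append([mu[r] + sign * alpha * low[r][col] for r in range(d)])
    return points

def sample_covariance(points):
    """Second moment of a zero-mean point set: (1/n) * sum of p p^T."""
    n, d = len(points), len(points[0])
    return [[sum(p[i] * p[j] for p in points) / n for j in range(d)] for i in range(d)]

def set_distance(a, b):
    """Euclidean distance between two sets, compared point by point."""
    return math.sqrt(sum((x - y) ** 2 for p, q in zip(a, b) for x, y in zip(p, q)))
```

For example, `sigma_set([[4.0, 2.0], [2.0, 3.0]])` yields four 2-D points, and `sample_covariance` applied to them recovers the input matrix up to rounding. The distance here is a plain vector norm, with no matrix logarithm or eigendecomposition involved.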
In this paper we discuss recent progress in semi-supervised learning, with an emphasis on semi-supervised classification. First, we introduce the history, basic concepts, and methods of semi-supervised learning, then...
The paper presents a novel approach for contour-based shape retrieval. The contour is first sampled and smoothed by a Gaussian filter. Then, the sampled points are classified by their internal angles into salient (convex or concave) points or smooth points. In addition to distance histograms (DH), two new descriptors named relative address distribution (RAD) and relative unit entropy (RUE) are introduced to describe the correlation of each type of contour point. These descriptors capture more of the contour's spatial information and therefore describe it more powerfully. The proposed method is compared with several other feature descriptors. The results show that the new method is efficient and noticeably improves shape retrieval performance.
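The point classification step can be illustrated with a minimal sketch. It assumes a closed contour given as (x, y) samples in counterclockwise order and a hypothetical tolerance of 15 degrees around a straight angle for "smooth" points; the abstract specifies neither, and the names `internal_angle` and `classify_points` are illustrative. Gaussian smoothing and the DH, RAD, and RUE descriptors are not covered here.

```python
import math

def internal_angle(prev_pt, pt, next_pt):
    """Interior angle at pt, in radians, for a counterclockwise closed contour."""
    v1 = (pt[0] - prev_pt[0], pt[1] - prev_pt[1])
    v2 = (next_pt[0] - pt[0], next_pt[1] - pt[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    turn = math.atan2(cross, dot)  # signed turning angle, positive for a left turn
    return math.pi - turn

def classify_points(contour, tol=math.radians(15)):
    """Label each sample as 'convex', 'concave', or 'smooth' (tol is an assumed threshold)."""
    n = len(contour)
    labels = []
    for i in range(n):
        angle = internal_angle(contour[i - 1], contour[i], contour[(i + 1) % n])
        if abs(angle - math.pi) <= tol:
            labels.append("smooth")
        elif angle < math.pi:
            labels.append("convex")
        else:
            labels.append("concave")
    return labels
```

On an L-shaped contour such as `[(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (1, 2), (0, 2)]`, the point `(1, 0)` lies on a straight edge and is labeled smooth, the inner corner `(1, 1)` is labeled concave, and the remaining corners are labeled convex.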
A statistical parametric approach to speech synthesis based on HMMs has grown in popularity over the last few years. In this approach, the spectrum, excitation, and duration of speech are simultaneously modeled by context-dependent HMMs, and speech waveforms are generated from the HMMs themselves. Since December 2002, we have publicly released an open-source software toolkit named "HMM-based speech synthesis system (HTS)" to provide a research and development toolkit for statistical parametric speech synthesis. This paper describes recent developments of HTS in detail, as well as future release plans.
Sparse coding offers high-performance encoding and a strong ability to represent images, and its basis vectors play a crucial role. The computational complexity of most existing methods for computing sparse coding basis vectors is relatively high. To reduce this complexity and shorten the time needed to train the basis vectors, this paper proposes a new method, based on Hebbian rules, for computing sparse coding basis vectors. A two-layer neural network is constructed to implement the task. The main idea of our work is to learn basis vectors by removing the redundancy of all initial vectors using Hebbian rules. Experiments on natural images show that the proposed method is effective for learning sparse coding bases and has lower computational complexity than previous work.
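The abstract does not state which Hebbian update is used, so the following is not the paper's algorithm. As a rough illustration of what a Hebbian weight update looks like, the sketch below uses Oja's rule, a standard normalized Hebbian rule for a single linear unit, on a tiny hand-made data set. The function names, learning rate, and data are all assumptions.

```python
import math

def oja_train(samples, w, rate=0.01, cycles=500):
    """Train one linear unit with Oja's rule: w += rate * y * (x - y * w), y = w . x."""
    w = list(w)
    for _ in range(cycles):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))  # unit output
            # Hebbian term y * x plus a decay term that keeps |w| near 1
            w = [wi + rate * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

def norm(v):
    """Euclidean length of a vector."""
    return math.sqrt(sum(vi * vi for vi in v))
```

With samples such as `[(2.0, 2.0), (-2.0, -2.0), (0.5, -0.5), (-0.5, 0.5)]`, whose variance is largest along the diagonal, and an initial weight of `[1.0, 0.0]`, the weight vector settles on a unit vector along that diagonal. Learning several mutually non-redundant basis vectors would need an extension such as lateral decorrelation between units, which this sketch does not attempt.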
We present a computational model of human eye movements based on a genetic algorithm (GA). The model can generate elemental raw eye movement data in a four-second viewing window at a 25 Hz sampling rate. Based on the physiological and psychological characteristics of the human visual system, the fitness function of the GA model is constructed by taking five factors into account: the saliency map, short-term memory, saccade distribution, the Region of Interest (ROI) map, and a retina model. Our model can produce the scan path of a subject viewing an image, not just several fixation points or artificial ROIs as in other models. We have also developed both subjective and objective methods to evaluate the model by comparing its behavior with real eye movement data collected from an eye tracker. Tested on 18 (9 × 2) images from an obvious-object image group and a non-obvious-object image group, the subjective evaluation shows very close scores between the scan paths generated by the GA model and the real scan paths; for the objective evaluation, experimental results show that the distance between the GA's scan paths and human scan paths of the same image has no significant difference, with a probability of 78.9% on average.
Computing is entering a new phase in which CPU improvements are driven by the addition of multiple cores on a single chip rather than by higher frequencies. Parallel processing on these systems is in a primitive s...
This paper proposes a Sample-Consensus method for viewpoint-independent sign language recognition under data deficiency (matched features may be insufficient for some frame pairs). The proposed method is based on epipolar geometry and inspired by RANSAC. The basic idea is that all corresponding frames between two sequences of the same sign can be roughly considered as captured synchronously by a virtual stereo vision system, and thus they satisfy the same fundamental matrix. In addition, the fundamental matrix can be estimated from point correspondences contained in a subset of the corresponding frames. Experimental results demonstrate the efficiency of the proposed method. Moreover, this Sample-Consensus method can be easily extended to similar problems, such as viewpoint-independent activity analysis and rigid-motion analysis.
With the fast development of the World Wide Web, the quantity of web information is increasing at an unprecedented pace. Much of it is generated dynamically from back-end databases and cannot be indexed by traditional search engines; we call this content the Deep Web. Because Deep Web sources are heterogeneous and dynamic, classifying them effectively by domain is a significant precondition for integrating them. In this paper, we consider the visible features of the Deep Web together with the Maximum Entropy approach, and, building on binary classification, we propose a new multivariate classification approach for Deep Web sources based on Maximum Entropy. In addition, we propose a Feedback algorithm to improve classification accuracy. An experimental evaluation over real Web data shows that our approach can provide an effective and general solution to the multivariate classification of Deep Web sources.
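One common way to build a multi-class classifier on top of binary ones is a one-vs-rest scheme. The sketch below assumes that scheme, uses a binary logistic model (the two-class form of a maximum entropy model) per domain, and treats the visible labels of a query form as bag-of-words features. The abstract confirms none of these details, the Feedback algorithm is not modeled, and the function names and toy data are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_binary(samples, labels, epochs=200, rate=0.5):
    """Fit a binary logistic (two-class maximum entropy) model by stochastic gradient ascent."""
    weights = {tok: 0.0 for tokens in samples for tok in tokens}
    bias = 0.0
    for _ in range(epochs):
        for tokens, y in zip(samples, labels):
            p = sigmoid(bias + sum(weights[t] for t in tokens))
            err = y - p  # gradient of the log-likelihood for this sample
            bias += rate * err
            for t in tokens:
                weights[t] += rate * err
    return weights, bias

def train_one_vs_rest(samples, domains):
    """Train one binary model per domain: that domain versus all the others."""
    models = {}
    for domain in sorted(set(domains)):
        labels = [1 if d == domain else 0 for d in domains]
        models[domain] = train_binary(samples, labels)
    return models

def classify(models, tokens):
    """Return the domain whose binary model gives the highest probability."""
    best_domain, best_score = None, -1.0
    for domain, (weights, bias) in models.items():
        score = sigmoid(bias + sum(weights.get(t, 0.0) for t in tokens))
        if score > best_score:
            best_domain, best_score = domain, score
    return best_domain

# Toy training data: visible form labels of hypothetical query interfaces.
FORMS = [
    ["title", "author", "isbn"], ["author", "publisher", "title"],
    ["from", "to", "departure"], ["departure", "return", "to"],
    ["keyword", "salary", "location"], ["location", "company", "keyword"],
]
DOMAINS = ["books", "books", "flights", "flights", "jobs", "jobs"]
```

Calling `classify(train_one_vs_rest(FORMS, DOMAINS), ["author", "isbn"])` returns `"books"` on this toy data. Tokens never seen in training contribute a weight of zero, so a form made up only of unknown labels is decided by the bias terms alone.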