ISBN: (print) 9781424411795
Many computer vision applications rely on the efficient optimization of challenging, so-called non-submodular, binary pairwise MRFs. A promising graph-cut-based approach for optimizing such MRFs, known as "roof duality", was recently introduced into computer vision. We study two methods which extend this approach. First, we discuss an efficient implementation of the "probing" technique introduced recently by Boros et al. [5]. It simplifies the MRF while preserving the global optimum. Our code is 400-700 times faster on some graphs than the implementation of [5]. Second, we present a new technique which takes an arbitrary input labeling and tries to improve its energy. We give theoretical characterizations of local minima of this procedure. We applied both techniques to many applications, including image segmentation, new view synthesis, superresolution, diagram recognition, parameter learning, texture restoration, and image deconvolution. For several applications we see that we are able to find the global minimum very efficiently, and considerably outperform the original roof duality approach. In comparison to existing techniques, such as graph cut, TRW, BP, ICM, and simulated annealing, we nearly always find a lower energy.
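To make the term "non-submodular" concrete, the following is a minimal sketch and not the roof duality or probing code of the abstract. It uses the standard condition that a binary pairwise term θ is submodular when θ(0,0) + θ(1,1) ≤ θ(0,1) + θ(1,0), and finds the optimum of a tiny MRF by exhaustive search. The function names and the toy energies are illustrative assumptions.

```python
from itertools import product


def is_submodular(theta):
    """theta[a][b] is the cost of giving labels (a, b) to the two ends of an edge."""
    return theta[0][0] + theta[1][1] <= theta[0][1] + theta[1][0]


def energy(labels, unary, pairwise):
    """Sum of unary costs plus pairwise costs for one binary labeling."""
    total = sum(unary[i][label] for i, label in enumerate(labels))
    for (i, j), theta in pairwise.items():
        total += theta[labels[i]][labels[j]]
    return total


def brute_force_minimum(unary, pairwise):
    """Exhaustive search over all 2^n labelings; only feasible for tiny n."""
    return min(product((0, 1), repeat=len(unary)),
               key=lambda labels: energy(labels, unary, pairwise))


# Hypothetical three-node chain with one edge of each kind.
unary = [[0, 1], [1, 0], [0, 1]]
pairwise = {
    (0, 1): [[0, 2], [2, 0]],  # submodular: prefers equal labels
    (1, 2): [[2, 0], [0, 2]],  # non-submodular: prefers different labels
}
best = brute_force_minimum(unary, pairwise)
print(best, energy(best, unary, pairwise))
```

Exhaustive search grows as 2^n, which is why graph-cut-style methods matter for realistic problem sizes.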
ISBN: (print) 9781424411795
Many computer vision and pattern recognition problems involve the use of finite Gaussian mixture models. The finite mixture model based on the generalized Dirichlet distribution has been shown to be a robust alternative to normal mixtures. In this paper, we adopt a Bayesian approach for generalized Dirichlet mixture estimation and selection. This approach offers a solid theoretical framework for combining statistical model learning and knowledge acquisition. The estimation of the parameters is based on the Monte Carlo simulation technique of Gibbs sampling mixed with a Metropolis-Hastings step. For the selection of the number of clusters, we use Bayes factors. We have successfully applied the proposed Bayesian framework to model IR eyes. Experimental results are shown to demonstrate the robustness, efficiency, and accuracy of the algorithm.
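The combination of Gibbs sampling with a Metropolis-Hastings step can be sketched on a toy target. The example below is an assumption-laden illustration, not the generalized Dirichlet sampler of the abstract: it targets a zero-mean bivariate normal with unit variances and correlation `rho`, draws `x` exactly from its conditional (the Gibbs step), and updates `y` with a random-walk Metropolis-Hastings step. All names and parameter values are hypothetical.

```python
import math
import random


def log_normal_pdf(value, mean, variance):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * variance) - (value - mean) ** 2 / (2 * variance)


def metropolis_within_gibbs(num_samples, rho=0.5, step=1.0, seed=0):
    """Sample (x, y) from a bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_var = 1.0 - rho ** 2
    x, y = 0.0, 0.0
    samples = []
    for _ in range(num_samples):
        # Gibbs step: x | y is N(rho * y, 1 - rho^2), so draw it exactly.
        x = rng.gauss(rho * y, math.sqrt(cond_var))
        # Metropolis-Hastings step: symmetric random-walk proposal for y | x.
        proposal = y + rng.gauss(0.0, step)
        log_ratio = (log_normal_pdf(proposal, rho * x, cond_var)
                     - log_normal_pdf(y, rho * x, cond_var))
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            y = proposal
        samples.append((x, y))
    return samples


samples = metropolis_within_gibbs(20000)
mean_x = sum(s[0] for s in samples) / len(samples)
mean_y = sum(s[1] for s in samples) / len(samples)
print(round(mean_x, 2), round(mean_y, 2))
```

The Metropolis-Hastings step is useful when a conditional cannot be sampled directly; here it is applied to `y` purely for illustration, since the conditional is actually known.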
ISBN: (print) 9781510617223; 9781510617216
This paper proposes a fast, low-cost method of acquiring 3D point cloud data. Using only a pair of inexpensive cameras and a projector, it addresses the lack of texture information and the low efficiency of point cloud acquisition. First, we put forward a scene-adaptive design method for a random encoding pattern: a coding pattern is projected onto the target surface to form texture information, which is favorable for image matching. Next, we design an efficient dense matching algorithm that fits the projected texture. After optimization of the global algorithm and multi-kernel parallel development combining hardware and software, a fast point cloud acquisition system is realized. An evaluation of point cloud accuracy shows that the point cloud acquired by the proposed method has higher precision. Moreover, the scanning speed meets the demands of dynamic scenes, and the method has better practical application value.
Building recognition is an important field in computer vision. Line features of building targets, which represent the target's geometry, are stable features in infrared images. In this paper, the stable building l...
ISBN: (print) 9781424411795
Gait is an attractive biometric for vision-based human identification. Previous work on existing public data sets has shown that shape cues yield improved recognition rates compared to pure motion cues. However, shape cues are sensitive to gross appearance variations of an individual, for example, walking while carrying a ball or a backpack. We introduce a novel spatiotemporal Shape Variation-Based Frieze pattern (SVB frieze pattern) representation for gait, which captures motion information over time. The SVB frieze pattern represents the normalized frame difference over gait cycles. Rows/columns of the vertical/horizontal SVB frieze pattern contain motion variation information augmented by key frame information with body shape. A temporal symmetry map of gait patterns is also constructed and combined with vertical/horizontal SVB frieze patterns for measuring the dissimilarity between gait sequences. Experimental results show that our algorithm improves gait recognition performance on sequences with and without gross differences in silhouette shape. We demonstrate superior performance of this computational framework over previous algorithms using shape cues alone on both the CMU MoBo and UoS HumanID gait databases.
In this paper, we present a novel method for learning complex concepts/hypotheses directly from raw training data. The task addressed here concerns data-driven synthesis of recognition procedures for real-world object recognition. The method uses linear genetic programming to encode potential solutions expressed in terms of elementary operations, and handles the complexity of the learning task by applying cooperative coevolution to decompose the problem automatically at the genotype level. The training coevolves feature extraction procedures, each being a sequence of elementary image processing and computer vision operations applied to input images. Extensive experimental results show that the approach attains competitive performance for three-dimensional object recognition in real synthetic aperture radar imagery.
ISBN: (print) 9781424411795
The active appearance model (AAM) is a powerful method for modeling deformable visual objects. One of the major drawbacks of the AAM is that it requires a training set of pseudo-dense correspondences over the whole database. In this work, we investigate the utility of stereo constraints for automatic model building from video. First, we propose a new method for automatic correspondence finding in monocular images which is based on an adaptive template tracking paradigm. We then extend this method to take the scene geometry into account, proposing three approaches, each accounting for the availability of the fundamental matrix and calibration parameters or the lack thereof. The performance of the monocular method was first evaluated on a pre-annotated database of a talking face. We then compared the monocular method against its three stereo extensions using a stereo database.
ISBN: (print) 9781424411795
Belief propagation over pairwise connected Markov Random Fields has become a widely used approach, and has been successfully applied to several important computer vision problems. However, pairwise interactions are often insufficient to capture the full statistics of the problem. Higher-order interactions are sometimes required. Unfortunately, the complexity of belief propagation is exponential in the size of the largest clique. In this paper, we introduce a new technique to compute belief propagation messages in time linear with respect to clique size for a large class of potential functions over real-valued variables. We demonstrate this technique in two applications. First, we perform efficient inference in graphical models where the spatial prior of natural images is captured by 2 x 2 cliques. This approach shows significant improvement over the commonly used pairwise-connected models, and may benefit a variety of applications using belief propagation to infer images or range images. Second, we apply these techniques to shape-from-shading and demonstrate significant improvement over previous methods, both in quality and in flexibility.
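The exponential cost mentioned above can be seen in a naive sum-product message computation. The sketch below is the brute-force baseline, not the linear-time technique of the abstract, and it assumes discrete states rather than real-valued variables. The function name, argument layout, and evaluation counter are illustrative assumptions.

```python
from itertools import product


def factor_to_variable_message(potential, incoming, target, num_states):
    """Naive sum-product message from one clique factor to one of its variables.

    potential: function of a tuple of states, one per clique variable.
    incoming:  one message (list of num_states weights) per clique variable;
               the entry at index `target` is ignored.
    Returns the message and the number of potential evaluations performed.
    """
    clique_size = len(incoming)
    others = [i for i in range(clique_size) if i != target]
    message = [0.0] * num_states
    evaluations = 0
    for target_state in range(num_states):
        # Enumerate every joint state of the remaining clique variables.
        for other_states in product(range(num_states), repeat=len(others)):
            assignment = [0] * clique_size
            assignment[target] = target_state
            weight = 1.0
            for index, state in zip(others, other_states):
                assignment[index] = state
                weight *= incoming[index][state]
            message[target_state] += potential(tuple(assignment)) * weight
            evaluations += 1
    return message, evaluations


# A 2 x 2 clique has four variables; three states each is a toy choice.
uniform = [[1.0, 1.0, 1.0] for _ in range(4)]
message, evaluations = factor_to_variable_message(lambda states: 1.0, uniform, 0, 3)
print(message, evaluations)
```

With S states and a clique of k variables, the loop performs S^k potential evaluations per message, which is the growth the abstract's technique is designed to avoid.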
ISBN: (print) 9781424411795
Markov Random Field (MRF) models are a popular tool for vision and image processing. Gaussian MRF models are particularly convenient to work with because they can be implemented using matrix and linear algebra routines. However, recent research has focused on discrete-valued and non-convex MRF models because Gaussian models tend to over-smooth images and blur edges. In this paper, we show how to train a Gaussian Conditional Random Field (GCRF) model that overcomes this weakness and can outperform the non-convex Field of Experts model on the task of denoising images. A key advantage of the GCRF model is that the parameters of the model can be optimized efficiently on relatively large images. The competitive performance of the GCRF model and the ease of optimizing its parameters make the GCRF model an attractive option for vision and image processing applications.
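Both points about Gaussian MRFs, that inference reduces to linear algebra and that a fixed quadratic prior blurs edges, can be illustrated on a 1D signal. The sketch below is a plain Gaussian MRF with a hand-picked smoothness weight, not the trained GCRF of the abstract. It minimizes sum_i (x_i - y_i)^2 + smoothness * sum_i (x_{i+1} - x_i)^2, whose minimizer solves a tridiagonal linear system. The function name and the test signal are assumptions.

```python
def gaussian_mrf_denoise(noisy, smoothness):
    """MAP estimate of a 1D Gaussian MRF via the Thomas tridiagonal solver.

    Solves (I + smoothness * L) x = noisy, where L is the Laplacian of a
    chain graph (degree 1 at the two ends, degree 2 elsewhere).
    """
    n = len(noisy)
    if n == 0:
        return []
    diag = [1.0 + smoothness * ((i > 0) + (i < n - 1)) for i in range(n)]
    off = -smoothness  # value of every sub- and super-diagonal entry

    # Forward elimination.
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = off / diag[0]
    d_prime[0] = noisy[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c_prime[i - 1]
        c_prime[i] = off / denom
        d_prime[i] = (noisy[i] - off * d_prime[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


step_edge = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
smoothed = gaussian_mrf_denoise(step_edge, smoothness=1.0)
print([round(value, 3) for value in smoothed])
```

The sharp step in the input comes out as a gradual ramp, which is the edge blurring that motivates the discrete and non-convex alternatives mentioned in the abstract.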
ISBN: (print) 9781424411795
We introduce a new framework, Tensor Canonical Correlation Analysis (TCCA), which extends classical Canonical Correlation Analysis (CCA) to multidimensional data arrays (or tensors), and apply it to action/gesture classification in videos. With Tensor CCA, joint space-time linear relationships of two video volumes are inspected to yield flexible and descriptive similarity features of the two videos. The TCCA features are combined with a discriminative feature selection scheme and a Nearest Neighbor classifier for action classification. In addition, we propose a time-efficient action detection method based on dynamic learning of subspaces for Tensor CCA for the case where actions are not aligned in the space-time domain. The proposed method delivered significantly better accuracy than, and detection speed comparable to, state-of-the-art methods on the KTH action data set as well as self-recorded hand gesture data sets.