ISBN: 9781424411795 (print)
Recent results on stereo indicate that an accurate segmentation is crucial for obtaining faithful depth maps. Variational methods have successfully been applied to both image segmentation and computational stereo. In this paper we propose a combination in a unified framework. In particular, we use a Mumford-Shah-like functional to compute a piecewise smooth depth map of a stereo pair. Our approach has two novel features: First, the regularization term of the functional combines edge information obtained from the color segmentation with flow-driven depth discontinuities emerging during the optimization procedure. Second, we propose a robust data term which adaptively selects the best matches obtained from different weak stereo algorithms. We integrate these features in a theoretically consistent framework. The final depth map is the minimizer of the energy functional, which is obtained by solving the equations given by the associated functional derivatives. The underlying numerical scheme allows an efficient implementation on modern graphics hardware. We illustrate the performance of our algorithm on the Middlebury database as well as on real imagery.
ISBN: 9781424411795 (print)
In this paper we propose a technique to detect anomalies in individual and interactive event sequences. We categorize anomalies into two classes, abnormal events and abnormal contexts, and model them in the Sequential Monte Carlo framework, which is extended by a Markov Random Field for tracking interactive events. First, we propose a novel pixel-wise event representation method to construct feature images, in which each blob corresponds to a visual event. Then we transform the original blob-level features into subspaces to model probabilistic appearance manifolds for each event class. Given the probability of an observation under each event class (or state), derived from the probabilistic manifolds, and the state transition probabilities, the prior and posterior state distributions can be estimated. We demonstrate in experiments that the approach can reliably detect such anomalies with low false alarm rates.
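The prior and posterior estimation described above can be illustrated with a small sketch. It is a plain discrete Bayes recursion over a finite set of states, not the Sequential Monte Carlo and Markov Random Field machinery of the paper; the function name `bayes_step`, the two-state example, and all numbers are assumptions made for illustration only.

```python
def bayes_step(prev_posterior, transition, likelihood):
    """One predict/update step of a discrete Bayes filter (illustrative sketch).

    prev_posterior[i]  : P(state i at time t-1 | observations so far)
    transition[i][j]   : P(state j at time t | state i at time t-1)
    likelihood[j]      : P(current observation | state j)
    Returns (prior, posterior) over the states at time t.
    """
    n = len(prev_posterior)
    # Prediction: push the previous posterior through the transition model.
    prior = [
        sum(prev_posterior[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]
    # Update: weight the prior by the observation likelihood, then normalize.
    unnormalized = [likelihood[j] * prior[j] for j in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under every state")
    posterior = [u / total for u in unnormalized]
    return prior, posterior


# Hypothetical two-state example (state 0 and state 1 are placeholders).
prior, posterior = bayes_step(
    prev_posterior=[0.5, 0.5],
    transition=[[0.9, 0.1], [0.2, 0.8]],
    likelihood=[0.7, 0.1],
)
```

In such a scheme, an observation whose likelihood is low under every state, or a state sequence with low transition probability, would be a natural candidate for an anomaly; how the paper actually scores anomalies is not specified in the abstract.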
ISBN: 9781424411795 (print)
We formulate single-image multi-label segmentation into regions coherent in texture and color as a MAX-SUM problem, for which efficient linear-programming-based solvers have recently appeared. By handling more than two labels, we go beyond widespread binary segmentation methods, e.g., MIN-CUT or normalized-cut-based approaches. We show that the MAX-SUM solver is a very powerful tool for obtaining the MAP estimate of a Markov random field (MRF). We build the MRF on superpixels to speed up the segmentation while preserving color and texture. We propose new quality functions for setting the MRF, exploiting priors from small representative image seeds, provided either manually or automatically. We show that the proposed automatic segmentation method outperforms previous techniques in terms of the Global Consistency Error evaluated on the Berkeley segmentation database.
Generalized correlation filters are proposed to improve recognition of a linearly distorted object embedded in a nonoverlapping background when the input scene is degraded with a linear system and additive noise. Several performance criteria defined for the nonoverlapping signal model are used for the design of filters. The derived filters take into account information about an object to be recognized, disjoint background, noise, and linear degradations of the target and the input scene. Computer simulation results obtained with the proposed filters are discussed and compared with those of various correlation filters in terms of discrimination capability, location errors, and tolerance to input noise. (c) 2007 Optical Society of America.
ISBN: 9781424411795 (print)
New-view synthesis (NVS) using texture priors (as opposed to surface-smoothness priors) can yield high quality results, but the standard formulation is in terms of large-clique Markov Random Fields (MRFs). Only local optimization methods such as iterated conditional modes, which are prone to fall into local minima close to the initial estimate, are practical for solving these problems. In this paper we replace the large-clique energies with pairwise potentials, by restricting the patch dictionary for each clique to image regions suitable for that clique. This enables for the first time the use of a global optimization method, such as tree-reweighted message passing, to solve the NVS problem with image-based priors. We employ a robust, truncated quadratic kernel to reject outliers caused by occlusions, specularities and moving objects, within our global optimization. Because the MRF optimization is thus fast, computing the unary potentials becomes the new performance bottleneck. An additional contribution of this paper is a novel, fast method for enumerating color modes of the per-pixel unary potentials, despite the non-convex nature of our robust kernel. We compare the results of our technique with other rendering methods, and discuss the relative merits and flaws of regularizing color, and of local versus global dictionaries.
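As a brief illustration of the robust kernel mentioned above, the following sketch shows one common form of a truncated quadratic: the cost grows quadratically for small residuals and is capped beyond a threshold, so outliers contribute a bounded penalty. The exact form and threshold used in the paper are not given in the abstract; the function name `truncated_quadratic` and the parameter `tau` are assumptions.

```python
def truncated_quadratic(residual, tau):
    """Truncated quadratic cost: residual**2, capped at tau**2 (illustrative form).

    Residuals larger than tau in magnitude all receive the same cost, which
    limits the influence of outliers such as occluded or specular pixels.
    """
    return min(residual * residual, tau * tau)


# Small residuals are penalized quadratically; large ones are capped.
inlier_cost = truncated_quadratic(0.5, tau=2.0)
outlier_cost = truncated_quadratic(10.0, tau=2.0)
```

The cap is what makes the kernel non-convex: a sum of such terms over several candidate colors can have multiple local minima, which is consistent with the abstract's remark about enumerating color modes.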
We present a novel approach for analyzing two-dimensional (2D) flow field data based on the idea of invariant moments. Moment invariants have traditionally been used in computer vision applications, and we have adapted them for the purpose of interactive exploration of flow field data. The new class of moment invariants we have developed allows us to extract and visualize 2D flow patterns, invariant under translation, scaling, and rotation. With our approach one can study arbitrary flow patterns by searching a given 2D flow data set for any type of pattern as specified by a user. Further, our approach supports the computation of moments at multiple scales, facilitating fast pattern extraction and recognition. This can be done for critical point classification, but also for patterns with greater complexity. This multi-scale moment representation is also valuable for the comparative visualization of flow field data. The specific novel contributions of the work presented are the mathematical derivation of the new class of moment invariants, their analysis regarding critical point features, the efficient computation of a novel feature space representation, and based upon this the development of a fast pattern recognition algorithm for complex flow structures.
ISBN: 9780819469502 (print)
Information from a single-waveband imaging sensor does not meet battlefield needs. Many useful characteristics of biological vision are gradually being applied to the design of intelligent imaging-guided missiles. Drawing on the function of the lateral inhibition network in vision, a schematic design of an intelligent infrared imaging guidance head is proposed. Inspired by the large-field (LF) and small-field (SF) systems of the fly, a parallel implementation scheme with two spatial modes is proposed. A physical model of the infrared imaging guidance head is given. The guidance head simulates the whole imaging process of the fly's compound eye (ommateum), and its field of view is 360 degrees. An integrated application flow for imaging guidance based on the fly's vision system is then proposed. This exploratory study may serve as a reference for the design of future imaging-guided systems.
Recently, a number of empirical studies have compared the performance of PCA and ICA as feature extraction methods in appearance-based object recognition systems, with mixed and seemingly contradictory results. In this paper, we briefly describe the connection between the two methods and argue that whitened PCA may yield identical results to ICA in some cases. Furthermore, we describe the specific situations in which ICA might significantly improve on PCA.
ISBN: 9781424411795 (print)
A critical function in both machine vision and biological vision systems is attentional selection of scene regions worthy of further analysis by higher-level processes such as object recognition. Here we present the first model of spatial attention that (1) can be applied to arbitrary static and dynamic image sequences with interactive tasks and (2) combines a general computational implementation of both bottom-up (BU) saliency and dynamic top-down (TD) task relevance; the claimed novelty lies in the combination of these elements and in the fully computational nature of the model. The BU component computes a saliency map from 12 low-level multi-scale visual features. The TD component computes a low-level signature of the entire image, and learns to associate different classes of signatures with the different gaze patterns recorded from human subjects performing a task of interest. We measured the ability of this model to predict the eye movements of people playing contemporary video games. We found that the TD model alone predicts where humans look about twice as well as does the BU model alone; in addition, a combined BU*TD model performs significantly better than either individual component. Qualitatively, the combined model predicts some easy-to-describe but hard-to-compute aspects of attentional selection, such as shifting attention leftward when approaching a left turn along a racing track. Thus, our study demonstrates the advantages of integrating BU factors derived from a saliency map and TD factors learned from image and task contexts in predicting where humans look while performing complex visually-guided behavior.
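One way to read the combined BU*TD model is as a pointwise product of the two maps, so that a location scores highly only when it is both salient and task-relevant. The sketch below assumes that reading; the abstract does not spell out the combination rule, and the function name `combine_maps`, the normalization step, and the toy 2x2 maps are all hypothetical.

```python
def combine_maps(bu_map, td_map):
    """Pointwise product of a bottom-up and a top-down map (assumed reading of BU*TD).

    Both maps are lists of rows with identical shape and non-negative values.
    The result is normalized to sum to 1 when the product is not all zero.
    """
    if len(bu_map) != len(td_map) or any(
        len(b) != len(t) for b, t in zip(bu_map, td_map)
    ):
        raise ValueError("maps must have the same shape")
    product = [
        [b * t for b, t in zip(bu_row, td_row)]
        for bu_row, td_row in zip(bu_map, td_map)
    ]
    total = sum(sum(row) for row in product)
    if total == 0.0:
        return product  # nothing is both salient and relevant
    return [[value / total for value in row] for row in product]


# Toy maps: saliency favors the top-right cell, task relevance the diagonal.
bu = [[0.2, 0.8], [0.5, 0.5]]
td = [[1.0, 0.0], [0.0, 1.0]]
combined = combine_maps(bu, td)
```

In this toy case the most salient cell (top right) is suppressed because the task-relevance map assigns it zero weight, and the bottom-right cell becomes the predicted gaze location.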
ISBN: 9781424411795 (print)
In this paper we investigate the effect of substantial inter-image intensity changes and changes in modality on the performance of keypoint detection, description, and matching algorithms in the context of image registration. In doing so, we modify widely-used keypoint descriptors such as SIFT and shape contexts, attempting to capture the insight that some structural information is indeed preserved between images despite dramatic appearance changes. These extensions include (a) pairing opposite-direction gradients in the formation of orientation histograms and (b) focusing on edge structures only. We also compare the stability of MSER, Laplacian-of-Gaussian, and Harris corner keypoint location detection and the impact of detection errors on matching results. Our experiments on multimodal image pairs and on image pairs with significant intensity differences show that indexing based on our modified descriptors produces more correct matches on difficult pairs than current techniques at the cost of a small decrease in performance on easier pairs. This extends the applicability of image registration algorithms such as the Dual-Bootstrap which rely on correctly matching only a small number of keypoints.
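Extension (a) above, pairing opposite-direction gradients, can be sketched by folding gradient angles modulo 180 degrees before binning, so that a gradient and its negation (as occurs under contrast reversal between modalities) fall into the same histogram bin. This is a minimal illustration under that assumption, not the authors' descriptor; the function name `paired_orientation_histogram`, the magnitude weighting, and the bin count are hypothetical choices.

```python
import math


def paired_orientation_histogram(gradients, n_bins=8):
    """Magnitude-weighted orientation histogram with opposite directions paired.

    gradients : iterable of (gx, gy) pairs.
    Angles are folded into [0, pi), so (gx, gy) and (-gx, -gy) share a bin.
    """
    hist = [0.0] * n_bins
    for gx, gy in gradients:
        magnitude = math.hypot(gx, gy)
        if magnitude == 0.0:
            continue  # a zero gradient has no defined orientation
        angle = math.atan2(gy, gx) % math.pi  # fold theta and theta + pi together
        # The extra modulo guards against rounding pushing the index to n_bins.
        index = int(angle / math.pi * n_bins) % n_bins
        hist[index] += magnitude
    return hist


# A patch and its contrast-reversed version yield the same histogram.
patch = [(1.0, 0.0), (0.0, 2.0)]
reversed_patch = [(-gx, -gy) for gx, gy in patch]
h1 = paired_orientation_histogram(patch)
h2 = paired_orientation_histogram(reversed_patch)
```

The trade-off is that the folded histogram can no longer distinguish a dark-to-light edge from a light-to-dark one, which may explain the small performance decrease on easier pairs reported in the abstract.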