ISBN: (Print) 9781424411795
In foggy weather, the contrast of images grabbed by in-vehicle cameras in the visible light range is drastically degraded, which makes current applications very sensitive to weather conditions. An onboard vision system should take fog effects into account. The effects of fog vary across the scene and are exponential with respect to the depth of scene points. Because, unlike in fixed-camera surveillance, it is not possible in this context to compute the road scene structure beforehand, a new scheme is proposed. Weather conditions are first estimated and then used to restore the contrast according to a scene structure that is inferred a priori and refined during the restoration process. Depending on the target application, different algorithms of increasing complexity are proposed. Results are presented using sample road scenes under foggy weather and assessed by computing the contrast before and after restoration.
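The exponential dependence on depth described above is commonly written as Koschmieder's law, L = L0·exp(-k·d) + Ls·(1 - exp(-k·d)). The abstract does not name the model it uses, so the following is only a minimal sketch under that assumption. The function names, the extinction coefficient `k`, the sky luminance `sky`, and the sample values are hypothetical, and depth is taken as known rather than inferred.

```python
import math


def apply_fog(intensity, depth, k, sky):
    """Attenuate a scene intensity and add airlight (Koschmieder's law, assumed)."""
    t = math.exp(-k * depth)  # transmission decays exponentially with depth
    return intensity * t + sky * (1.0 - t)


def remove_fog(observed, depth, k, sky):
    """Invert the assumed fog model for a pixel whose depth is known."""
    t = math.exp(-k * depth)
    return (observed - sky * (1.0 - t)) / t


def michelson_contrast(a, b):
    """Contrast between two positive intensities."""
    return abs(a - b) / (a + b)


if __name__ == "__main__":
    # Hypothetical values: two scene points at 50 m, k = 0.03 per metre.
    dark, bright = 0.2, 0.6
    depth, k, sky = 50.0, 0.03, 1.0
    foggy = [apply_fog(v, depth, k, sky) for v in (dark, bright)]
    restored = [remove_fog(v, depth, k, sky) for v in foggy]
    print("contrast in fog:   ", michelson_contrast(*foggy))
    print("contrast restored: ", michelson_contrast(*restored))
```

Because the inversion divides by the transmission, errors in the assumed depth or in `k` are amplified for distant points, which is one reason a scene structure refined during restoration is useful.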
Extracting dynamical features is well known to be a key issue in video-based face analysis. In this paper, we present a novel approach to facial action unit (AU) and expression recognition based on coded dynamical features. To capture the dynamical characteristics of facial events, we design dynamic Haar-like features that represent the temporal variations of facial events. Inspired by binary pattern coding, we further encode the dynamic Haar-like features into binary pattern features, which are useful for constructing weak classifiers for boosting. Finally, AdaBoost is used to learn a set of discriminative coded dynamic features for facial action unit and expression recognition. Experiments on the CMU expression database and our own facial AU database show encouraging performance.
A major shortcoming of discriminative recognition and detection methods is their noise sensitivity, both during training and recognition. This may lead to very sensitive and brittle recognition systems focusing on irrelevant information. This paper proposes a method that selects generative and discriminative features. In particular, we boost classical Haar-like features and use the same features to approximate a generative model (i.e., eigenimages). A modified error function for boosting ensures that only features showing both good discrimination and good reconstruction are selected. This allows robust feature selection using boosting. Thus, we can handle problems where discriminant classifiers fail while still retaining the discriminative power. Our experiments show that we can significantly improve the recognition performance when learning from noisy data. Moreover, the feature type used allows efficient recognition and reconstruction.
A visual word lexicon can be constructed by clustering primitive visual features, and a visual object can be described by a set of visual words. Such a "bag-of-words" representation has led to many significant results in various vision tasks including object recognition and categorization. However, in practice, the clustering of primitive visual features tends to result in synonymous visual words that over-represent visual patterns, as well as polysemous visual words that bring large uncertainties and ambiguities in the representation. This paper aims at generating a higher-level lexicon, i.e., a visual phrase lexicon, where a visual phrase is a meaningful spatially co-occurrent pattern of visual words. This higher-level lexicon is much less ambiguous than the lower-level one. The contributions of this paper include: (1) a fast and principled solution to the discovery of significant spatial co-occurrent patterns using frequent itemset mining; (2) a pattern summarization method that deals with the compositional uncertainties in visual phrases; and (3) a top-down refinement scheme of the visual word lexicon by feeding back discovered phrases to tune the similarity measure through metric learning.
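To make the idea of mining co-occurrent visual words concrete, here is a minimal sketch of frequent itemset counting. It assumes each "transaction" is the set of visual word IDs found in one local spatial neighborhood. The function name, the `min_support` and `max_size` parameters, and the sample data are hypothetical, and this brute-force enumeration is not claimed to be the mining algorithm used in the paper.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return itemsets of 2..max_size words that occur in at least
    min_support transactions, mapped to their support counts."""
    counts = Counter()
    for transaction in transactions:
        words = sorted(set(transaction))  # ignore repeated words in a neighborhood
        for size in range(2, max_size + 1):
            for itemset in combinations(words, size):
                counts[itemset] += 1
    return {s: n for s, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Hypothetical neighborhoods, each listing the visual word IDs it contains.
    neighborhoods = [[1, 2, 3], [1, 2], [2, 3, 4], [1, 2, 3]]
    for itemset, support in sorted(frequent_itemsets(neighborhoods, 3).items()):
        print(itemset, support)
```

Enumerating every combination grows quickly with neighborhood size, so a practical miner would prune candidates whose subsets are already infrequent.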
Stereo correspondence methods rely on matching costs for computing the similarity of image locations. In this paper we evaluate the insensitivity of different matching costs with respect to radiometric variations of the input images. We consider both pixel-based and window-based variants and measure their performance in the presence of global intensity changes (e.g., due to gain and exposure differences), local intensity changes (e.g., due to vignetting, non-Lambertian surfaces, and varying lighting), and noise. Using existing stereo datasets with ground-truth disparities as well as six new datasets taken under controlled changes of exposure and lighting, we evaluate the different costs with a local, a semi-global, and a global stereo method.
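The abstract does not list the costs it compares, so as an illustration only, the sketch below contrasts two well-known window-based costs: the sum of absolute differences (SAD) and zero-mean normalized cross-correlation (ZNCC). The window values and the gain and offset applied to them are hypothetical. It shows why a normalized cost can be insensitive to a global intensity change while a plain difference is not.

```python
import math


def sad(a, b):
    """Sum of absolute differences between two windows (lower is better)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def zncc(a, b):
    """Zero-mean normalized cross-correlation (1.0 means a perfect match)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat window: correlation is undefined, report no match
    return sum(x * y for x, y in zip(da, db)) / denom


if __name__ == "__main__":
    left = [10.0, 20.0, 40.0, 30.0, 15.0]
    # Same window seen with a different gain and exposure offset (hypothetical).
    right = [1.5 * v + 12.0 for v in left]
    print("SAD :", sad(left, right))
    print("ZNCC:", zncc(left, right))
```

Subtracting the window mean cancels the offset and dividing by the norms cancels the gain, so ZNCC still reports a match here, whereas SAD reports a large cost for the same correct correspondence.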
We present a new formulation of multi-view stereo that treats the problem as probabilistic 3D segmentation. Previous work has used the stereo photo-consistency criterion as a detector of the boundary between the 3D scene and the surrounding empty space. Here we show how the same criterion can also provide a foreground/background model that can predict if a 3D location is inside or outside the scene. This model replaces the commonly used naive foreground model based on ballooning, which is known to perform poorly in concavities. We demonstrate how the probabilistic visibility is linked to previous work on depth-map fusion, and we present a multi-resolution graph-cut implementation using the new ballooning term that is very efficient both in terms of computation time and memory requirements.
Many computer vision and pattern recognition problems involve the use of finite Gaussian mixture models. Finite mixture models based on the generalized Dirichlet distribution have been shown to be a robust alternative to normal mixtures. In this paper, we adopt a Bayesian approach for generalized Dirichlet mixture estimation and selection. This approach offers a solid theoretical framework for combining both the statistical model learning and the knowledge acquisition. The estimation of the parameters is based on the Monte Carlo simulation technique of Gibbs sampling mixed with a Metropolis-Hastings step. For the selection of the number of clusters, we used Bayes factors. We have successfully applied the proposed Bayesian framework to model IR eyes. Experimental results are shown to demonstrate the robustness, efficiency, and accuracy of the algorithm.
Belief propagation over pairwise connected Markov Random Fields has become a widely used approach, and has been successfully applied to several important computer vision problems. However, pairwise interactions are often insufficient to capture the full statistics of the problem. Higher-order interactions are sometimes required. Unfortunately, the complexity of belief propagation is exponential in the size of the largest clique. In this paper, we introduce a new technique to compute belief propagation messages in time linear with respect to clique size for a large class of potential functions over real-valued variables. We demonstrate this technique in two applications. First, we perform efficient inference in graphical models where the spatial prior of natural images is captured by 2 x 2 cliques. This approach shows significant improvement over the commonly used pairwise-connected models, and may benefit a variety of applications using belief propagation to infer images or range images. Finally, we apply these techniques to shape-from-shading and demonstrate significant improvement over previous methods, both in quality and in flexibility.
Applications in computer vision involve statistically analyzing an important class of constrained, nonnegative functions, including probability density functions (in texture analysis), dynamic time-warping functions (in activity analysis), and re-parametrization or non-rigid registration functions (in shape analysis of curves). For this, one needs to impose a Riemannian structure on the spaces formed by these functions. We propose a "spherical" version of the Fisher-Rao metric that provides closed-form expressions for geodesics and distances, and allows fast computation of sample statistics. To demonstrate this approach, we present an application in planar shape classification.
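One way to see where the closed forms come from is the square-root representation: taking the elementwise square root of a probability vector places it on the unit sphere, where geodesics are great-circle arcs. The abstract does not spell out its construction, so the sketch below is an assumption-laden illustration for discrete distributions only. The function names are hypothetical, and some texts scale the distance by a factor of two.

```python
import math


def sphere_distance(p, q):
    """Great-circle distance between sqrt(p) and sqrt(q) on the unit sphere."""
    inner = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, inner)))  # clamp against rounding


def geodesic_point(p, q, t):
    """Distribution at fraction t (0..1) along the great-circle arc from p to q."""
    theta = sphere_distance(p, q)
    if theta < 1e-12:
        return list(p)  # the two distributions coincide
    wa = math.sin((1.0 - t) * theta) / math.sin(theta)
    wb = math.sin(t * theta) / math.sin(theta)
    psi = [wa * math.sqrt(a) + wb * math.sqrt(b) for a, b in zip(p, q)]
    return [v * v for v in psi]  # square to map back to a probability vector


if __name__ == "__main__":
    p = [0.5, 0.5, 0.0]
    q = [0.0, 0.5, 0.5]
    print("distance:", sphere_distance(p, q))
    print("midpoint:", geodesic_point(p, q, 0.5))
```

Since both the distance and the interpolation reduce to a few inner products and trigonometric calls, sample statistics such as means can be computed without numerical integration of geodesic equations.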
The ability of the human visual system to detect visual saliency is extraordinarily fast and reliable. However, computational modeling of this basic intelligent behavior still remains a challenge. This paper presents a simple method for visual saliency detection. Our model is independent of features, categories, or other forms of prior knowledge of the objects. By analyzing the log-spectrum of an input image, we extract the spectral residual of the image in the spectral domain, and propose a fast method to construct the corresponding saliency map in the spatial domain. We test this model on both natural pictures and artificial images such as psychological patterns. The results indicate fast and robust saliency detection by our method.
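The pipeline described above (log-spectrum, spectral residual, saliency map in the spatial domain) can be sketched with NumPy as follows. The abstract gives no parameters, so the details here are assumptions: the residual is taken as the log amplitude minus its local mean, the 3 x 3 filter sizes are arbitrary, a wrap-around box filter stands in for whatever smoothing the authors use, and the function names are hypothetical.

```python
import numpy as np


def box_filter(a, size=3):
    """Mean filter with wrap-around borders (an illustrative choice)."""
    r = size // 2
    out = np.zeros_like(a, dtype=float)
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            out += np.roll(a, (dy, dx), axis=(0, 1))
    return out / (size * size)


def spectral_residual_saliency(image, avg_size=3, smooth_size=3):
    """Saliency map from the residual of the log amplitude spectrum."""
    image = np.asarray(image, dtype=float)
    spectrum = np.fft.fft2(image)
    log_amp = np.log(np.abs(spectrum) + 1e-8)  # small constant avoids log(0)
    phase = np.angle(spectrum)
    # Residual: what remains after removing the locally averaged log-spectrum.
    residual = log_amp - box_filter(log_amp, avg_size)
    # Back to the spatial domain, keeping the original phase.
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = box_filter(saliency, smooth_size)
    return saliency / saliency.max()


if __name__ == "__main__":
    # Hypothetical test image: flat background with one bright pixel.
    img = np.full((32, 32), 0.1)
    img[10, 20] += 1.0
    sal = spectral_residual_saliency(img)
    print("most salient pixel:", np.unravel_index(np.argmax(sal), sal.shape))
```

The whole computation is two FFTs and two small filters, which is consistent with the speed the abstract claims, although the filter choices here are only placeholders.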