We present a new algorithm that does motion segmentation by tracking small textured patches and then clustering them using EM. A small patch has the advantage that its motion is well modeled by uniform flow and runs a...
We propose a novel approach to point matching under large viewpoint and illumination changes that is suitable for accurate object pose estimation at a much lower computational cost than state-of-the-art methods. Most of these methods rely either on ad hoc local descriptors or on estimating local affine deformations. By contrast, we treat wide-baseline matching of keypoints as a classification problem, in which each class corresponds to the set of all possible views of such a point. Given one or more images of a target object, we train the system by synthesizing a large number of views of individual keypoints and by using statistical classification tools to produce a compact description of this view set. At run-time, we rely on this description to decide to which class, if any, an observed feature belongs. This formulation allows us to use a classification method to reduce matching error rates, and to move some of the computational burden from matching to training, which can be performed beforehand. In the context of pose estimation, we present experimental results for both planar and non-planar objects in the presence of occlusions, illumination changes, and cluttered backgrounds. We show that our method is both reliable and suitable for initializing real-time applications.
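The idea of matching by classification can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the classifier or the view synthesis procedure, so the code below assumes a nearest-mean classifier with a rejection threshold, and a hypothetical `synthesize_views` helper that varies only illumination (gain and offset) rather than viewpoint. All function names and the `max_dist` parameter are assumptions made for illustration.

```python
import math

def normalize(patch):
    """Shift to zero mean and scale to unit norm so gain/offset changes cancel."""
    mean = sum(patch) / len(patch)
    centered = [v - mean for v in patch]
    norm = math.sqrt(sum(v * v for v in centered)) or 1.0
    return [v / norm for v in centered]

def synthesize_views(patch, gains=(0.5, 1.0, 1.5), offsets=(-10.0, 0.0, 10.0)):
    """Hypothetical view generator: illumination changes only (an assumption)."""
    return [[g * v + o for v in patch] for g in gains for o in offsets]

def train(keypoint_patches):
    """Build one mean descriptor per keypoint class from its synthesized views."""
    model = {}
    for label, patch in keypoint_patches.items():
        views = [normalize(v) for v in synthesize_views(patch)]
        model[label] = [sum(col) / len(views) for col in zip(*views)]
    return model

def classify(patch, model, max_dist=0.5):
    """Return the nearest class label, or None if no class is close enough."""
    x = normalize(patch)
    best_label, best_dist = None, float("inf")
    for label, mean in model.items():
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mean)))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= max_dist else None

# Toy 4-pixel "patches" standing in for image patches around two keypoints.
model = train({"corner": [0, 0, 10, 10], "edge": [0, 10, 0, 10]})
print(classify([5, 5, 25, 25], model))   # a brighter view of "corner"
print(classify([10, 0, 0, 10], model))   # resembles neither class
```

The `None` return value corresponds to the "if any" in the abstract: an observed feature may belong to no trained class. The expensive part (view synthesis and averaging) runs once in `train`, which mirrors the shift of computation from matching to training.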
Skin detection is an important preliminary process in human motion analysis. It is commonly performed in three steps: transforming the pixel color to a non-RGB colorspace, dropping the illuminance component of skin color, and classifying by modeling the skin color distribution. In this paper, we evaluate the effect of these three steps on skin detection performance. The contribution of this study is a comprehensive methodology for testing colorspaces and color models, which allows the best choices to be made for skin detection. Combinations of nine colorspaces, the presence or absence of the illuminance component, and the two color modeling approaches are compared. Performance is measured using a receiver operating characteristic (ROC) curve on a large dataset of 805 images with manual ground truth. The results reveal that (1) colorspace transformations can improve performance in certain instances, (2) the absence of the illuminance component decreases performance, and (3) skin color modeling has a greater impact than colorspace transformation. We found that the best performance was obtained by transforming the pixel color to the SCT or HSI colorspaces, keeping the illuminance component, and modeling the color with the histogram approach.
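A histogram-based color model of the kind mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it works directly on RGB triples (the colorspace transform is omitted), uses an assumed quantization of 8 bins per channel, and classifies with a likelihood ratio between a skin histogram and a non-skin histogram. The names `quantize`, `build_histogram`, `is_skin`, and the `threshold` and `eps` parameters are hypothetical.

```python
from collections import Counter

BINS = 8  # assumed quantization: 8 bins per channel

def quantize(pixel, bins=BINS):
    """Map a 3-channel pixel with 0-255 values to a histogram bin index."""
    step = 256 // bins
    return tuple(c // step for c in pixel)

def build_histogram(pixels, bins=BINS):
    """Return normalized bin frequencies for a list of labeled pixels."""
    counts = Counter(quantize(p, bins) for p in pixels)
    total = sum(counts.values())
    return {b: n / total for b, n in counts.items()}

def is_skin(pixel, skin_hist, nonskin_hist, threshold=1.0, eps=1e-6):
    """Label a pixel as skin when the likelihood ratio exceeds the threshold."""
    b = quantize(pixel)
    ratio = (skin_hist.get(b, 0.0) + eps) / (nonskin_hist.get(b, 0.0) + eps)
    return ratio > threshold

# Tiny made-up training sets; a real model would use manually labeled images.
skin = build_histogram([(220, 170, 140), (210, 160, 130), (225, 175, 150)])
nonskin = build_histogram([(20, 30, 200), (10, 200, 30), (0, 0, 0)])
print(is_skin((215, 165, 135), skin, nonskin))
print(is_skin((15, 25, 205), skin, nonskin))
```

Because all three channels are kept, this sketch retains the illuminance information, in line with finding (2). Sweeping `threshold` and recording true and false positive rates at each value is one way to trace an ROC curve for such a classifier.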
The appearance of surface texture as it varies with angular changes of view and illumination is becoming an increasingly important research topic. The bidirectional texture function (BTF) is used in surface modeling because it describes observed image texture as a function of imaging parameters. The BTF has no geometric information, as it is based solely on observed texture appearance. Computational tasks such as recognition or rendering typically require projecting a sampled BTF to a lower-dimensional subspace or clustering to extract representative textons. However, there is a serious drawback to this approach. Specifically, cast shadowing and occlusions are not fully captured. When recovering the full BTF from a sampled BTF with interpolation, the following two characteristics are difficult or impossible to reproduce: (1) the position and contrast of the shadow border, and (2) the movement of the shadow border when the imaging parameters are changed continuously. For a textured surface, the nonlinear effects of cast shadows and occlusions are not negligible. On the contrary, these effects occur throughout the surface and are important perceptual cues to infer surface type. In this paper we present a texture representation that integrates appearance-based information from the sampled BTF with concise geometric information inferred from the sampled BTF. The model is a hybrid of geometric and image-based models and has key advantages in a wide range of tasks, including texture prediction, recognition, and synthesis.
Selecting salient points from two or more images for computing correspondences is a fundamental problem in image analysis. Three methods originally proposed by Harris et al. in [A combined corner and edge detector], b...
In this paper, a novel system for real time face tracking and animation is presented. The system is composed of two major components: (1) real time infra-red (IR) based active facial feature tracking, and (2) real tim...
Face recognition is an important issue in video indexing and retrieval applications. Usually, supervised learning is used to build face models for various specific named individuals. However, a huge amount of labeling...
Manifold mosaicing is a fast and robust way to summarize video sequences captured by a moving camera. It is also useful for rendering compelling 3D visualizations from a video without estimating the 3D structure of th...
In this paper we present a method to recognize an object class by learning a statistical model of the class. The probabilistic model decomposes the appearance of an object class into a set of local parts and models th...
This paper presents a simple but robust visual tracking algorithm based on representing the appearances of objects using affine warps of learned linear subspaces of the image space. The tracker adaptively updates this subspace while tracking by finding a linear subspace that best approximates the observations made in the previous frames. Instead of the traditional L2-reconstruction error norm, which leads to subspace estimation using PCA or SVD, we argue that a variant of it, the uniform L2-reconstruction error norm, is the right one for tracking. Under this framework, we provide a simple and computationally inexpensive algorithm for finding a subspace whose uniform L2-reconstruction error norm for a given collection of data samples is below some threshold, and a simple tracking algorithm is an immediate consequence. We show experimental results on a variety of image sequences of people and man-made objects moving under challenging imaging conditions, which include drastic illumination variation, partial occlusion, and extreme pose variation.
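The difference between the two error norms can be made concrete with a small sketch. The abstract does not define the uniform L2-reconstruction error norm, so the code below rests on an assumption: it reads "uniform" as the worst-case (maximum) per-sample L2 residual, in contrast to the summed squared residual that PCA minimizes. The function names are hypothetical, and the basis is assumed to be orthonormal.

```python
import math

def residual_norm(x, basis):
    """L2 distance from x to the span of an orthonormal basis."""
    proj = [0.0] * len(x)
    for b in basis:
        coeff = sum(xi * bi for xi, bi in zip(x, b))
        proj = [p + coeff * bi for p, bi in zip(proj, b)]
    return math.sqrt(sum((xi - pi) ** 2 for xi, pi in zip(x, proj)))

def total_l2_error(samples, basis):
    """Sum of squared residuals: the quantity PCA minimizes."""
    return sum(residual_norm(x, basis) ** 2 for x in samples)

def uniform_l2_error(samples, basis):
    """Largest residual over all samples (assumed reading of 'uniform')."""
    return max(residual_norm(x, basis) for x in samples)

# Toy data: a 1-D subspace of R^3 and three samples.
basis = [[1.0, 0.0, 0.0]]
samples = [[1.0, 0.0, 0.0], [2.0, 3.0, 4.0], [0.0, 1.0, 0.0]]
print(total_l2_error(samples, basis))
print(uniform_l2_error(samples, basis))
```

Under this reading, requiring `uniform_l2_error(samples, basis)` to stay below a threshold guarantees that every individual frame is reconstructed well, whereas a small summed error can still hide one badly reconstructed sample.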