Hyperspectral cameras provide useful discriminants for human face recognition that cannot be obtained by other imaging methods. We examine the utility of using near-infrared hyperspectral images for the recognition of faces over a database of 200 subjects. The hyperspectral images were collected using a CCD camera equipped with a liquid crystal tunable filter. Spectral measurements over the near-infrared allow the sensing of subsurface tissue structure, which is significantly different from person to person but relatively stable over time. The local spectral properties of human tissue are nearly invariant to face orientation and expression, which allows hyperspectral discriminants to be used for recognition over a large range of poses and expressions. We describe a face recognition algorithm that exploits spectral measurements for multiple facial tissue types. We demonstrate experimentally that this algorithm can be used to recognize faces over time in the presence of changes in facial pose and expression.
The shock scaffold is a hierarchical organization of the medial axis in 3D consisting of special medial points and curves connecting these points, thereby forming a geometric directed graph, which is key in applications such as object recognition. In this paper we describe a method for computing the shock scaffold of realistic datasets, which involve tens or hundreds of thousands of points, in a practical time-frame. Our approach is based on propagation along the scaffold from initial sources of flow by considering pairs of input points. We present seven principles which avoid the consideration of those pairs of points which cannot possibly lead to a shock flow; they involve: (i) the "visibility" of a point from another, (ii) the clustering of points, (iii) the visibility of a cluster from another, (iv) the convex hull of a cluster, (v) the vertices of such convex hulls as "virtual" points, (vi) a multi-resolution framework, and, finally, (vii) a search strategy organized in layers.
This paper presents a method to perform 2D deformable object tracking using the boundary element method (BEM). BEM, like the finite element method (FEM), is a technique to model an elastic solid. BEM differs from FEM in that only the contour of an object needs to be meshed for BEM, whereas FEM requires the interior of the object to be meshed as well; this makes BEM attractive for computer vision problems. In order to track deformable objects, a deformable template is defined that uses BEM to model displacements. The template is registered to the image by applying a force field that deforms the template to match the image. This force field is found using an energy minimization approach. Even though the deformable template uses a linear elastic model, it can be used to track the deformations of objects with non-linear material properties or in cases where there are large deformations. We demonstrate the performance of this method on objects with linear and non-linear elastic properties. In addition, we discuss how this method can be readily extended to 3D deformable object tracking.
Many vision applications require precise measurement of scene radiance. The function relating scene radiance to image brightness is called the camera response. We analyze the properties that all camera responses share. This allows us to find the constraints that any response function must satisfy. These constraints determine the theoretical space of all possible camera responses. We have collected a diverse database of real-world camera response functions (DoRF). Using this database we show that real-world responses occupy a small part of the theoretical space of all possible responses. We combine the constraints from our theoretical space with the data from DoRF to create a low-parameter Empirical Model of Response (EMoR). This response model allows us to accurately interpolate the complete response function of a camera from a small number of measurements obtained using a standard chart. We also show that the model can be used to accurately estimate the camera response from images of an arbitrary scene taken using different exposures. The DoRF database and the EMoR model can be downloaded at http://***/CAVE.
Libraries and other institutions are interested in providing access to scanned versions of their large collections of handwritten historical manuscripts on electronic media. Convenient access to a collection requires an index, which is manually created at great labour and expense. Since current handwriting recognizers do not perform well on historical documents, a technique called word spotting has been developed: clusters with occurrences of the same word in a collection are established using image matching. By annotating "interesting" clusters, an index can be built automatically. We present an algorithm for matching handwritten words in noisy historical documents. The segmented word images are preprocessed to create sets of 1-dimensional features, which are then compared using dynamic time warping. We present experimental results on two different data sets from the George Washington collection. Our experiments show that this algorithm performs better and is faster than competing matching techniques.
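To illustrate the comparison step, here is a minimal sketch of dynamic time warping between two 1-dimensional feature sequences. It assumes a single scalar feature per image column and an absolute-difference local cost, whereas the abstract speaks of sets of features; the function name `dtw_distance` is hypothetical and not taken from the paper.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences of numbers."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b element is reused
                cost[i][j - 1],      # b advances, a element is reused
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]
```

Because the warping path may repeat elements, two profiles that differ only by local stretching (for example, the same word written wider) can align at zero cost, which is what makes this kind of matching tolerant of handwriting variation.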
A "graphics for vision" approach is proposed to address the problem of reconstruction from a large and imperfect data set: reconstruction on demand by tensor voting, or ROD-TV. ROD-TV simultaneously delivers good efficiency and robustness, by adapting to a continuum of primitive connectivity, view dependence, and levels of detail (LOD). Locally inferred surface elements are robust to noise and better capture local shapes. By inferring per-vertex normals at sub-voxel precision on the fly, we can achieve interpolative shading. Since these missing details can be recovered at the current level of detail, our result is not upper bounded by the scanning resolution. By relaxing the mesh connectivity requirement, we extend ROD-TV and propose a simple but effective multiscale feature extraction algorithm. ROD-TV consists of a hierarchical data structure that encodes different levels of detail. The local reconstruction algorithm is tensor voting. It is applied on demand to the visible subset of data at a desired level of detail, by traversing the data hierarchy and collecting tensorial support in a neighborhood. We compare our approach and present encouraging results.
We present a method for online rigid object tracking using an adaptive view-based appearance model. When the object's pose trajectory crosses itself, our tracker has bounded drift and can track objects undergoing large motion for long periods of time. Our tracker registers each incoming frame against the views of the appearance model using a two-frame registration algorithm. Using a linear Gaussian filter, we simultaneously estimate the pose of the object and adjust the view-based model as pose changes are recovered from the registration algorithm. The adaptive view-based model is populated online with views of the object as it undergoes different orientations in pose space, allowing us to capture non-Lambertian effects. We tested our approach on a real-time rigid object tracking task using stereo cameras and observed an RMS error within the accuracy limit of an attached inertial sensor.
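The role of a linear Gaussian filter can be sketched with the scalar Kalman equations below. This is only an illustration under strong assumptions: the state is a single number rather than the joint pose and view model of the paper, and the function names `kalman_predict` and `kalman_update` are hypothetical.

```python
def kalman_predict(mean, var, process_var):
    """Propagate a scalar Gaussian estimate under a static motion model."""
    # Assumed model: the state stays put, uncertainty grows by process_var.
    return mean, var + process_var


def kalman_update(mean, var, z, meas_var):
    """Fuse a scalar Gaussian estimate with one noisy measurement z."""
    gain = var / (var + meas_var)           # Kalman gain in [0, 1]
    new_mean = mean + gain * (z - mean)     # move toward the measurement
    new_var = (1.0 - gain) * var            # uncertainty always shrinks
    return new_mean, new_var
```

Each registration result would play the role of the measurement `z`; a measurement with small `meas_var` pulls the estimate strongly, while an unreliable one barely changes it.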
The likelihood models used in probabilistic visual tracking applications are often complex non-linear and/or non-Gaussian functions, leading to analytically intractable inference. Solutions then require numerical approximation techniques, of which the particle filter is a popular choice. Particle filters, however, degrade in performance as the dimensionality of the state space increases and the support of the likelihood decreases. As an alternative to particle filters this paper introduces a variational approximation to the tracking recursion. The variational inference is intractable in itself, and is combined with an efficient importance sampling procedure to obtain the required estimates. The algorithm is shown to compare favourably with particle filtering techniques on a synthetic example and two real tracking problems. The first involves the tracking of a designated object in a video sequence based on its colour properties, whereas the second involves contour extraction in a single image.
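As a generic illustration of importance sampling (not the variational algorithm of the paper), the sketch below estimates the mean of an unnormalised 1-D target density using samples from a broader proposal. The target, the proposal, and all names are assumptions chosen for the example.

```python
import math
import random


def importance_mean(log_target, sample_proposal, log_proposal, n, rng):
    """Self-normalised importance sampling estimate of the target mean."""
    xs = [sample_proposal(rng) for _ in range(n)]
    log_w = [log_target(x) - log_proposal(x) for x in xs]
    shift = max(log_w)  # subtract the maximum for numerical stability
    ws = [math.exp(lw - shift) for lw in log_w]
    total = sum(ws)
    return sum(w * x for w, x in zip(ws, xs)) / total


# Hypothetical example: unnormalised Gaussian target with mean 2.0, std 0.5.
def log_target(x):
    return -0.5 * ((x - 2.0) / 0.5) ** 2


# Broad zero-mean Gaussian proposal with std 2.0.
def sample_proposal(rng):
    return rng.gauss(0.0, 2.0)


def log_proposal(x):
    # Constant terms are omitted; they cancel when the weights are normalised.
    return -0.5 * (x / 2.0) ** 2


estimate = importance_mean(
    log_target, sample_proposal, log_proposal, 20000, random.Random(0)
)
```

Self-normalisation means neither density needs a known normalising constant, which is convenient when the target is only available up to scale.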
Scene content understanding facilitates a large number of applications, ranging from content-based image retrieval to other multimedia applications. Material detection refers to the problem of identifying key semantic material types (such as sky, grass, foliage, water, and snow) in images. In this paper, we present a holistic approach to determining scene content, based on a set of individual material detection algorithms, as well as probabilistic spatial context models. A major limitation of individual material detectors is the significant number of misclassifications that occur because of the similarities in color and texture characteristics of various material types. We have developed a spatial context-aware material detection system that reduces misclassification by constraining the beliefs to conform to the probabilistic spatial context models. Experimental results show that the accuracy of material detection is improved by 13% using the spatial context models over the individual material detectors themselves.
This paper addresses the problem of computing visual hulls from image contours. We propose a new hybrid approach which overcomes the precision-complexity trade-off inherent to voxel-based approaches by taking advantage of surface-based approaches. To this end, we introduce a space discretization which does not rely on a regular grid, where most cells are ineffective, but rather on an irregular grid where sample points lie on the surface of the visual hull. Such a grid is composed of tetrahedral cells obtained by applying a Delaunay triangulation on the sample points. These cells are carved afterward according to image silhouette information. The proposed approach keeps the robustness of volumetric approaches while drastically improving their precision and reducing their time and space complexities. It thus allows modeling of objects with complex geometry, and it also makes real-time computation feasible for precise models. Preliminary results with synthetic and real data are presented.
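The carving step can be sketched as follows, assuming the tetrahedral cells are already available. The abstract does not state the carving criterion, so the centroid test used here is an assumption, as are the orthographic views, the disk-shaped silhouettes, and every name in the code.

```python
def centroid(tet):
    """Centroid of a tetrahedron given as four (x, y, z) vertices."""
    return tuple(sum(v[k] for v in tet) / 4.0 for k in range(3))


def carve(tets, views):
    """Keep tetrahedra whose centroid projects inside every silhouette.

    Each view is a pair (project, inside): project maps a 3-D point to 2-D,
    and inside tests a 2-D point against that view's silhouette.
    """
    kept = []
    for tet in tets:
        c = centroid(tet)
        if all(inside(project(c)) for project, inside in views):
            kept.append(tet)
    return kept


# Hypothetical setup: three orthographic views, each seeing a unit disk.
def unit_disk(p):
    return p[0] ** 2 + p[1] ** 2 <= 1.0


views = [
    (lambda p: (p[1], p[2]), unit_disk),  # looking along the x axis
    (lambda p: (p[0], p[2]), unit_disk),  # looking along the y axis
    (lambda p: (p[0], p[1]), unit_disk),  # looking along the z axis
]

near = ((0, 0, 0), (0.2, 0, 0), (0, 0.2, 0), (0, 0, 0.2))
far = ((2, 2, 2), (2.2, 2, 2), (2, 2.2, 2), (2, 2, 2.2))
kept = carve([near, far], views)  # only the cell near the origin survives
```

In a full pipeline the cells would come from a Delaunay triangulation of the surface sample points, and the projections would use calibrated camera matrices instead of axis-aligned views.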