Orientation-based representations are well-suited to vision tasks including viewpoint-independent object recognition and 3D attitude determination. The key property that orientation-based representations share is that they rotate in the same way as the object rotates. Combinations of orientation-based representations of a model and of a sensed object determine inequalities that become equalities if and only if the object and model match both in identity and in attitude. This results in optimization problems that can be solved by standard numerical methods. The paper unifies and extends previous work based on the Extended Gaussian Image (EGI) representation. It provides the theoretical basis for new approaches to object recognition and attitude determination using dense surface data. It extends results on convex polyhedra to the domain of smooth, strictly convex surfaces. The class of shapes covered is also extended to include starshaped sets. The theoretical results lead to feasible algorithms that are both accurate and robust. A proof-of-concept system has been implemented and experiments conducted both on synthesized data and on data obtained from real objects.
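The rotation property mentioned in this abstract can be illustrated for the simplest case, the EGI of a convex polyhedron, taken here as one area-weighted outward normal per face. This is only a minimal sketch under stated assumptions, not the paper's method: the function names, the tetrahedron, and the choice of a rotation about the z axis are all hypothetical and chosen for illustration.

```python
import math

def area_normal(face):
    """Area-weighted normal of a planar polygon: 0.5 * sum of v_i x v_(i+1).
    Vertices must be listed counter-clockwise as seen from outside the solid."""
    nx = ny = nz = 0.0
    n = len(face)
    for i in range(n):
        x0, y0, z0 = face[i]
        x1, y1, z1 = face[(i + 1) % n]
        nx += y0 * z1 - z0 * y1
        ny += z0 * x1 - x0 * z1
        nz += x0 * y1 - y0 * x1
    return (nx / 2.0, ny / 2.0, nz / 2.0)

def egi(faces):
    """Discrete EGI: one vector per face, direction = normal, length = area."""
    return [area_normal(f) for f in faces]

def rotate_z(p, theta):
    """Rotate a 3D point or vector about the z axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    x, y, z = p
    return (c * x - s * y, s * x + c * y, z)

# Hypothetical test object: a tetrahedron with outward-oriented faces.
O, A, B, C = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
tetra_faces = [[O, B, A], [O, A, C], [O, C, B], [A, B, C]]

theta = 0.7
egi_model = egi(tetra_faces)
# EGI of the rotated object, computed from its rotated vertices.
egi_rotated_object = egi([[rotate_z(p, theta) for p in f] for f in tetra_faces])
# Rotating the EGI vectors of the original object gives the same result.
rotated_egi = [rotate_z(v, theta) for v in egi_model]
```

Comparing `egi_rotated_object` with `rotated_egi` shows the two agree up to rounding, which is the sense in which the representation rotates with the object. The area-weighted normals of a closed polyhedron also sum to zero, which is a convenient sanity check on the face orientations.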
This paper introduces a sensor placement measure called resolvability. The measure provides a technique for estimating the relative ability of various visual sensors, including monocular systems, stereo pairs, multi-baseline stereo systems, and 3D rangefinders, to accurately control visually manipulated objects. The resolvability ellipsoid illustrates the directional nature of resolvability, and can be used to direct camera motion and adjust camera intrinsic parameters in real time so that the accuracy of the visual servoing system improves with camera-lens motion. The Jacobian mapping from task space to sensor space is derived for a monocular system, a stereo pair with parallel optical axes, and a stereo pair with perpendicular optical axes. Resolvability ellipsoids based on these mappings for various sensor configurations are presented. Visual servoing experiments demonstrate that resolvability can be used to direct camera-lens motion in order to increase the ability of a visually servoed manipulator to precisely servo objects.
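To give a feel for how a Jacobian from task space to sensor space yields directional resolvability, the sketch below uses a plain pinhole model, u = fX/Z and v = fY/Z, for a single point feature and translational motion only. This is an assumption made for illustration; the paper's own Jacobians (and any pixel-size or rotational terms they include) are not reproduced here, and the function names are hypothetical. The singular values of the Jacobian are taken as the axis lengths of the ellipsoid.

```python
import math

def monocular_jacobian(f, X, Y, Z):
    """2x3 Jacobian of (u, v) = (f*X/Z, f*Y/Z) with respect to (X, Y, Z).
    Assumed pinhole model; f is the focal length in pixel units."""
    return [[f / Z, 0.0, -f * X / Z ** 2],
            [0.0, f / Z, -f * Y / Z ** 2]]

def singular_values_2x3(J):
    """Nonzero singular values of a 2x3 matrix, from the eigenvalues of J J^T."""
    p = sum(a * a for a in J[0])                 # (J J^T)[0][0]
    r = sum(a * a for a in J[1])                 # (J J^T)[1][1]
    q = sum(a * b for a, b in zip(J[0], J[1]))   # (J J^T)[0][1]
    mean = (p + r) / 2.0
    half = math.hypot((p - r) / 2.0, q)
    return (math.sqrt(mean + half), math.sqrt(max(mean - half, 0.0)))

# A point on the optical axis at depth 2 with unit focal length.
on_axis = singular_values_2x3(monocular_jacobian(1.0, 0.0, 0.0, 2.0))
# The same camera looking at an off-axis point at depth 1.
off_axis = singular_values_2x3(monocular_jacobian(1.0, 1.0, 0.0, 1.0))
```

Under this model the third singular value is always zero: motion of the point along its viewing ray (X, Y, Z) produces no image motion, so a single camera cannot resolve it. Both remaining values scale with f/Z, which is consistent with the abstract's remark that changing camera position or intrinsic parameters changes servoing accuracy.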
We previously presented (Sarkar and Boyer, 1993) the Perceptual Inference Network (PIN), a formalism based on Bayesian Networks, to reason among a set of object or feature hypotheses and to integrate multiple sources of information in the context of perceptual organization. The design of a PIN requires knowledge of the dependency structure among the organizations of interest and the specification of the conditional probabilities. This design was done manually with large doses of tedium and guesswork. In this paper we present an algorithm based on structural entropic measures and random parametric structural descriptions (RPSDs) to design a PIN automatically and in a (more) theoretically sound fashion. Experimental results present evidence of the robustness of the algorithm and make performance comparisons on real image data with a manually structured PIN. Since PINs are a form of Bayesian Network, we hope that this work will also prove useful towards structuring Bayesian Networks in other computer vision contexts.
Deriving 3D structure in a fixed object-centered coordinate system is an increasingly popular trend in shape from multiple views. For linear approximations to perspective projection (weak/para perspective), and for the case of image velocities, elegant linear methods have been devised for robust estimation. For reconstruction under arbitrary view transformations, linear projective methods using point correspondences have been suggested. In this paper, we formulate the problem of intrinsic 3D structure estimation through perspective projection using motion parallax, defined with respect to an arbitrary plane in the environment. It is shown that if an image coordinate system is warped using a plane projective transformation with respect to a reference view, the residual image motion is dependent only on the epipoles and has a simple relation to the 3D structure. Our computational scheme avoids point/line correspondence and is based on hierarchical estimation and image warping working directly with spatio-temporal image intensities.
We propose a geometric smoothing method based on local curvature in shapes and images which is governed by the geometric heat equation and is a special case of the reaction-diffusion framework proposed by Faugeras (1990). For shapes, the approach is analogous to the classical heat equation smoothing, but with a renormalization by arc-length at each infinitesimal step. For images, the smoothing is similar to anisotropic diffusion in that, since the component of diffusion in the direction of the brightness gradient is nil, edge location and sharpness are left intact. We present several properties of curvature deformation smoothing of shape: it preserves inclusion order, annihilates extrema and inflection points without creating new ones, decreases total curvature, satisfies the semi-group property allowing for local iterative computations, etc. Curvature deformation smoothing of an image is based on viewing it as a collection of iso-intensity level sets, each of which is smoothed by curvature and then reassembled. This is shown to be mathematically sound and applicable to medical, aerial and range images.
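For shapes, the geometric heat equation moves each boundary point along its normal at a speed equal to the curvature, which is the same as taking the second derivative of the curve with respect to arc length. The sketch below is one simple explicit discretization on a closed polygon, with the arc-length spacing recomputed at every step. It is an assumed illustration, not the authors' numerical scheme, and the function names and step sizes are hypothetical.

```python
import math

def curvature_flow_step(points, dt):
    """One explicit Euler step of dC/dt = C_ss on a closed polygon.
    C_ss is approximated with the current (non-uniform) edge lengths."""
    n = len(points)
    new_points = []
    for i in range(n):
        xp, yp = points[i - 1]
        x, y = points[i]
        xn, yn = points[(i + 1) % n]
        a = math.hypot(x - xp, y - yp)    # edge arriving at this vertex
        b = math.hypot(xn - x, yn - y)    # edge leaving this vertex
        # Second difference with respect to arc length.
        cx = 2.0 / (a + b) * ((xn - x) / b - (x - xp) / a)
        cy = 2.0 / (a + b) * ((yn - y) / b - (y - yp) / a)
        new_points.append((x + dt * cx, y + dt * cy))
    return new_points

def smooth(points, dt, steps):
    """Apply the curvature flow for a total time of dt * steps."""
    for _ in range(steps):
        points = curvature_flow_step(points, dt)
    return points

# Unit circle sampled at 40 points, evolved for total time t = 0.1.
circle = [(math.cos(2 * math.pi * k / 40), math.sin(2 * math.pi * k / 40))
          for k in range(40)]
shrunk = smooth(circle, dt=0.001, steps=100)
```

Under the continuous flow a circle of radius r shrinks as sqrt(r^2 - 2t), so the circle is a useful check that a discretization behaves sensibly. The explicit scheme needs a small time step relative to the squared edge length; an implicit scheme would relax that at the cost of solving a linear system per step.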
A method for computing the 3D camera motion (the ego-motion) in a static scene is introduced, which is based on computing the 2D image motion of a single image region directly from image intensities. The computed image motion of this image region is used to register the images so that the detected image region appears stationary. The resulting displacement field for the entire scene between the registered frames is affected only by the 3D translation of the camera. After canceling the effects of the camera rotation by using such 2D image registration, the 3D camera translation is computed by finding the focus-of-expansion in the translation-only set of registered frames. This step is followed by computing the camera rotation to complete the computation of the ego-motion. The presented method avoids the inherent problems in the computation of optical flow and of feature matching, and does not assume any prior feature detection or feature correspondence.
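Once rotation has been cancelled, every residual displacement vector points along a line through the focus-of-expansion (FOE). A common way to exploit this, shown below as a sketch, is a least-squares intersection of those lines. Note the assumptions: the sketch takes a sparse displacement field as input, whereas the paper works directly from image intensities, and the function name and input format are hypothetical.

```python
def focus_of_expansion(samples):
    """Least-squares FOE from (x, y, u, v) samples of a translation-only field.
    Each sample imposes v*x0 - u*y0 = v*x - u*y on the FOE (x0, y0)."""
    svv = suu = suv = c0 = c1 = 0.0
    for x, y, u, v in samples:
        b = v * x - u * y
        svv += v * v
        suu += u * u
        suv += u * v
        c0 += v * b
        c1 -= u * b
    # Solve the 2x2 normal equations [[svv, -suv], [-suv, suu]] z = [c0, c1].
    det = svv * suu - suv * suv
    if abs(det) < 1e-12:
        raise ValueError("displacement vectors are (nearly) parallel")
    x0 = (c0 * suu + suv * c1) / det
    y0 = (svv * c1 + suv * c0) / det
    return x0, y0

# Synthetic field expanding from (3, -2); the per-point scale mimics depth.
true_foe = (3.0, -2.0)
points = [(0.0, 0.0, 0.1), (5.0, 1.0, 0.3), (-4.0, 6.0, 0.2), (2.0, -7.0, 0.5)]
field = [(x, y, k * (x - true_foe[0]), k * (y - true_foe[1]))
         for x, y, k in points]
estimate = focus_of_expansion(field)
```

The magnitude of each vector depends on scene depth, but only its direction enters the constraint, which is why the FOE can be recovered without knowing the structure.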
A probe-based approach is presented for the recognition of targets in a cluttered background using an infrared imager. A probe is a simple mathematical function which operates locally on image grey levels and produces an output that is more directly usable by an algorithm. A directional probe image is calculated by taking the difference in grey levels between pixels a set distance apart in a given direction, centered on the probe image pixel. A parametric statistical image background model which describes the probe images is introduced. The parameters of the probe image model can be readily estimated from the image. Knowledge of these parameters, together with target signatures obtained from Computer Aided Design (CAD) models, allows the likelihood ratio for a given object pose hypothesis versus the background null hypothesis to be written. The generalized likelihood ratio test is used to accept one of the object poses or to choose the null hypothesis. Results of the method applied to a large set of terrain model board images are presented.
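The directional probe image described above can be sketched as a centered difference. The version below assumes the direction and distance are given as an integer half-offset (dx, dy), so the two sampled pixels are 2*(dx, dy) apart, and it marks border pixels where the probe does not fit as None. Both choices are assumptions made for illustration, as is the function name.

```python
def directional_probe(image, dx, dy):
    """Centered grey-level difference along the direction (dx, dy).
    image is a list of rows; output pixel (r, c) is
    image[r + dy][c + dx] - image[r - dy][c - dx], or None near the border."""
    rows, cols = len(image), len(image[0])
    out = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            r0, c0 = r - dy, c - dx
            r1, c1 = r + dy, c + dx
            if 0 <= r0 < rows and 0 <= r1 < rows and 0 <= c0 < cols and 0 <= c1 < cols:
                out[r][c] = image[r1][c1] - image[r0][c0]
    return out

# A horizontal intensity ramp: grey level equals the column index.
ramp = [[0, 1, 2, 3],
        [0, 1, 2, 3],
        [0, 1, 2, 3]]
horizontal = directional_probe(ramp, dx=1, dy=0)
vertical = directional_probe(ramp, dx=0, dy=1)
```

On this ramp the horizontal probe responds with a constant value wherever it fits and the vertical probe responds with zero, which shows how a set of probes at different directions separates oriented structure from flat background.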
Building integrated models of existing 3-D objects is a key requirement for both reverse engineering and object recognition systems. An automatic 3-D model builder goes through three main steps: i) surface sampling from many views, ii) registration of the sampled views, and iii) integration of the registered views. The accuracy obtained depends on the acquisition and registration errors. The latter is critical since a misalignment of the range views causes their noise distributions to be centered around different means, which makes it difficult to reduce the effect of the acquisition error by simple averaging. In this paper, we propose a general algorithm that significantly reduces the level of the registration errors between all pairs in a set of range views. This algorithm refines initial estimates of the transformation matrices obtained from the calibrated acquisition setup. It considers the network of views as a whole and minimizes the registration errors of all views simultaneously. This leads to a well-balanced network of views in which the registration errors are equally distributed. Experimental results show an improvement of both the calibrated registrations and integrated models.
In binocular visual systems, vergence is the process of directing the gaze so that the optical axes intersect at a surface point. Correlation-based methods of disparity analysis provide fast estimates of the vergence error. Unfortunately, most correlation techniques do not provide mechanisms to determine which image locations contributed to a given correlation peak. The result is that large correlation peaks may have contributions from image areas not relevant to the vergence task. This paper presents a vergence system that applies a cepstral filter to multiscale images obtained from a dominant-eye binocular sensor. As used by this system, the cepstral filter has two main advantages: it enhances targets through narrow-band signal suppression, and it supports a back-projection operation to determine the image locations associated with particular correlation peaks. The use of multiscale images allows the system to have both high resolution for precision in the final vergence and a large field of view for a wide range of initial camera orientations without undue computational cost.
Thinning algorithms are an important sub-component in the construction of computer vision systems, especially optical character recognition (OCR) systems. Important criteria for the choice of a thinning algorithm include the sensitivity of the algorithms to input shape complexity and to the amount of noise. In previous work, we introduced a methodology to quantitatively analyse the performance of thinning algorithms. The methodology uses an ideal world model for thinning based on the concept of Blum ribbons. In this paper we extend this methodology to answer these and other experimental questions of interest. We contaminate the noise-free images using a noise model that simulates the degradation introduced by the process of xerographic copying and laser printing. We then design experiments that study how each of 16 popular thinning algorithms performs relative to the Blum ribbon gold standard and relative to itself as the amount of noise varies. We design statistical data analysis procedures for various performance comparisons. We present the results obtained from these comparisons and a discussion of their implications in this paper.