ISBN: 0819410268 (print)
Before addressing the problem of visual recognition, we need to understand what the result of visual cognition is: What is new in memory after a scene has been understood? For an agent that is to interact with the scene, the most important result of visual understanding is an analysis of the causal structure of the scene: How motion is originated, constrained, and prevented, and what will happen in the immediate future. With respect to the agent's goals, such an understanding describes the scene in terms of its functional properties - how the agent may interact with the scene. In order to arrive at such an understanding, a robot must have a sophisticated theory of how the world is designed. We discuss some of the consequences of this view for the construction of purposeful vision systems, and show examples from our own work in the understanding of complex scenes.
We describe a real-time vision contest that took place in January 1992 at the MIT AI Lab. The task was high-speed visual navigation along a 60-foot winding indoor course. The computational power available was a conventional Sun Sparcstation 1 with a Sun color framegrabber. The imaging device was a standard color Pulnix CCD camera with an auto-iris 4.8 mm lens. The robot base was a commercial B12 mobile robot from Real World Interface. The winning entry completed the course in 1:07 minutes, about three times faster than a human controlling the robot using the same video input. Approximately ten person-days of work were required to program an entry that would complete the course.
Fixation and visual attention are central themes in active vision research, and are closely related. In this paper we discuss one of several ways in which they interact. We describe filtering methods that allow an agent to selectively extract features of the object it is fixating and suppress features of foreground and background objects. The methods are essentially depth filters; they use disparity or motion information to suppress image features that are far from the fixation point in depth. They share a simple computational structure based on the Laplacian pyramid, and are readily amenable to hardware implementation. We present the filters and the properties of fixation geometry that allow them to work, and discuss their behavior. We present methods of implementing them in real time and describe ways of extending them to other features besides depth.
When the image of a moving object is equal in luminance with the background, we observe a startling change in both its apparent motion and its three-dimensional position in space. If we use biological vision as a guide for the construction of machine vision systems, this perceptual phenomenon has profound implications. Motion information can be used in a variety of visual tasks such as detection, calibration, guided movement, navigation, and recognition. Human performance at equiluminance suggests that navigation uses motion information heavily and that, for recognition, motion plays only a limited role, such as separating figure from ground or grossly defining surfaces in space. Equiluminant motion perception cannot tell us much about detection, calibration, or guided movement tasks. We demonstrate an adaptive model of motion perception which presents similar equiluminant responses.
Simple stereo disparity filters can provide 'proximity detectors' shaped like concave shells in front of the observer. Ideally, these are isodisparity surfaces. In practice, a narrowly tuned filter results in a thin shell. The special case of the zero-disparity surface is called the horopter. A disparity filter can also be useful for distinguishing an object that lies on an isodisparity surface from its surroundings. These filters are much less expensive than stereographic scene interpretation since they are local operations. Similarly, they are also less general. We analyze the expected proximity sensitivity of one simple version of the disparity filter and compare this to its empirical performance. We also present some feature-based and correlation-based disparity filters and compare their 'segmentation' performance on various scenes.
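As a rough illustration of the idea of a feature-based disparity filter, the sketch below keeps only those left-image edge pixels that have a matching edge in the right image at a chosen disparity. It is a minimal sketch under stated assumptions, not the filter analyzed in the paper: the function name `disparity_filter`, the binary edge-map input, the sign convention (disparity = x_left - x_right), and the `tolerance` parameter are all hypothetical choices made for the example.

```python
def disparity_filter(left_edges, right_edges, disparity=0, tolerance=0):
    """Keep left-image edge pixels that have a right-image edge at the target disparity.

    Assumptions (hypothetical, for illustration only):
      * inputs are rectified binary edge maps given as lists of rows of 0/1;
      * disparity is defined as x_left - x_right, in pixels;
      * tolerance widens the accepted disparity band (a thicker 'shell').
    """
    height = len(left_edges)
    width = len(left_edges[0])
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if not left_edges[y][x]:
                continue
            # Look for a right-image edge within the accepted disparity band.
            for offset in range(-tolerance, tolerance + 1):
                xr = x - disparity + offset
                if 0 <= xr < width and right_edges[y][xr]:
                    output[y][x] = 1
                    break
    return output


# One image row: the edge at x=1 has zero disparity, the edge at x=4 has disparity -2.
left = [[0, 1, 0, 0, 1, 0, 0, 0]]
right = [[0, 1, 0, 0, 0, 0, 1, 0]]

on_horopter = disparity_filter(left, right, disparity=0)
off_horopter = disparity_filter(left, right, disparity=-2)
thick_shell = disparity_filter(left, right, disparity=0, tolerance=2)
```

With `tolerance=0` only the zero-disparity edge survives, which corresponds to the narrowly tuned, thin-shell case. Widening the tolerance admits the second edge as well. Because the check is purely local, it is cheap, but it can also accept accidental matches between unrelated edges.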
Using the method of camera-space manipulation, high-precision, 3-dimensional rigid-body positioning tasks have been performed with a holonomic, six-axis, GMF S-400 robot. Further development, aimed at expanding the usable region of the robot's workspace and at achieving the higher precision enabled by a narrower field of view for the cameras, includes the use of cameras mounted on servoable platforms or 'pan/tilt' units. The approach followed in the implementation of servoable cameras is designed to make use of information 'learned' before camera repositioning to update view parameter estimates without undergoing large extraneous arm movement. The paper describes this approach and presents the first results of experimental work used for testing it.
A technique for selecting one camera viewpoint from m viewpoints containing zero-mean Gaussian errors is presented. The procedure consists of a two-stage analysis. First, the joint entropy of each viewpoint is found. The viewpoint with minimum entropy possesses the greatest possible lower-bound reliability of meeting any quadratic specification of the pose error. Hence it is the best pose algorithm to select without further analysis. To guarantee a minimum reliability, a second stage of analysis is necessary. Methods of calculating reliability bounds for a given quadratic specification are explained. The reliability calculations require three orders of magnitude fewer computations than the alternative, Monte Carlo simulations. On the other hand, reliability analysis requires an order of magnitude more computations than entropy analysis. The concepts are simulated using a visual pose measurement system developed by NASA. The results indicate that entropy is very effective for selecting pose algorithms, and the reliability greatest lower bound is close to the actual reliability.
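The first stage described above can be sketched with the standard differential entropy of a zero-mean multivariate Gaussian, h = 0.5 * ln((2*pi*e)^n * det(Sigma)), where Sigma is the error covariance of a viewpoint. The code below is a minimal sketch under that assumption and is not taken from the paper: the function names and the example covariance matrices are hypothetical, and the second-stage reliability bounds are not covered.

```python
import math


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign of the determinant
        det *= a[col][col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
    return det


def gaussian_entropy(cov):
    """Differential entropy (in nats) of a zero-mean Gaussian with covariance cov."""
    n = len(cov)
    return 0.5 * (n * math.log(2.0 * math.pi * math.e) + math.log(determinant(cov)))


def select_min_entropy(covariances):
    """Return (index of the minimum-entropy viewpoint, list of all entropies)."""
    entropies = [gaussian_entropy(c) for c in covariances]
    best = min(range(len(entropies)), key=entropies.__getitem__)
    return best, entropies


# Hypothetical pose-error covariances for three candidate viewpoints.
viewpoint_covariances = [
    [[4.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.5], [0.5, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
]
best_index, entropies = select_min_entropy(viewpoint_covariances)
```

Since the entropy depends on the covariance only through its determinant, ranking viewpoints this way needs one determinant per viewpoint, which is consistent with the abstract's remark that entropy analysis is the cheapest of the three options.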
Binocular vision is the coordinated behavior of the two eyes by which a single perception of the external world is obtained and by which the specific sensation of stereoscopic depth perception is made possible. This perception, however, can be reversed by interchanging the left- and right-eye views. In this paper, the mathematical expression of the Vieth-Mueller circle is derived. A point on the line of the primary direction is found which only relates to the convergence angle and the interocular distance. A relation is developed between the position of a point in real space and its reversal if viewed pseudoscopically. It is shown that in some circumstances a concave surface is not necessarily perceived as a convex surface under pseudoscopic viewing conditions. The difference in perceiving real objects and stereograms is briefly discussed.
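For orientation, the textbook geometry of the Vieth-Mueller circle can be written out as follows. This is a sketch from the inscribed angle theorem, not the paper's own derivation, and the coordinate frame and symbols are assumptions: the nodal points of the eyes sit at (-a/2, 0) and (a/2, 0), a is the interocular distance, gamma is the convergence angle, and the y axis is the midline.

```latex
% The circle passes through both eyes and the fixation point.
% Every point on it subtends the same angle \gamma on the chord of length a,
% so the radius r and the centre height c on the midline are
\[
  a = 2r\sin\gamma
  \quad\Longrightarrow\quad
  r = \frac{a}{2\sin\gamma},
  \qquad
  c = r\cos\gamma = \frac{a}{2}\cot\gamma .
\]
% Equation of the circle:
\[
  x^{2} + \left(y - \frac{a}{2}\cot\gamma\right)^{2} = \frac{a^{2}}{4\sin^{2}\gamma}.
\]
% Its intersection with the midline x = 0, on the far side from the eyes:
\[
  y = c + r = \frac{a}{2}\,\frac{1+\cos\gamma}{\sin\gamma} = \frac{a}{2}\cot\frac{\gamma}{2}.
\]
```

The midline intersection depends only on a and gamma, which matches the abstract's description of a point determined solely by the convergence angle and the interocular distance. Whether this is the same point the authors single out is an assumption.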
Pose and orientation of an object are central issues in 3-D recognition problems. Most of today's available techniques require considerable pre-processing, such as detecting edges or joints, fitting curves or surfaces to segment images, and trying to extract higher order features from the input images. In this paper we present a method based on analytical geometry, whereby all the rotation parameters of any quadric surface are determined and subsequently eliminated. This procedure is iterative in nature and has been found to converge to the desired results in as few as three iterations. The approach enables us to position the quadric surface in a desired coordinate system and then utilize the presented shape information to explicitly represent and recognize the 3-D surface. Experiments were conducted with simulated data for objects such as hyperboloids of one and two sheets, elliptic and hyperbolic paraboloids, elliptic and hyperbolic cylinders, ellipsoids, and quadric cones. Real data of quadric cones and cylinders were also utilized. Both of these sets yielded excellent results.
A computer-vision-based automated method for identifying and quantifying flaws in cast metal parts is presented. The specific defects to be isolated consist of small circular concavities in the surface (pits) and larger isolated regions (scratches) that may have been abraded due to cutting or handling operations. The approach taken identifies these anomalous features using two spatially separated light sources with different spectral characteristics to produce highly specular illumination at one wavelength and shallow diffuse illumination at a different wavelength. A bispectral image is processed to yield the sought flaws. This processing consists of identifying regions of interest in the original image that may contain potential flaws and applying a morphological region labelling operation to extract candidate pits and scratches. Geometric constraints are applied to the extracted regions in order to isolate the true flaws. The discussion that follows details the algorithmic approach used to identify flaws and characterizes the results obtained.
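The last two processing steps, region labelling followed by geometric constraints, can be sketched roughly as below. This is an illustrative stand-in rather than the paper's algorithm: plain 4-connected component labelling replaces the morphological labelling operation, the bispectral region-of-interest step is assumed to have already produced a binary mask, and the area and elongation thresholds in `classify_region` are hypothetical.

```python
from collections import deque


def label_regions(mask):
    """Group foreground pixels of a binary mask into 4-connected regions."""
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y0 in range(height):
        for x0 in range(width):
            if not mask[y0][x0] or seen[y0][x0]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def classify_region(pixels, max_pit_area=9, min_scratch_elongation=3.0):
    """Apply simple geometric constraints (hypothetical thresholds)."""
    ys = [p[0] for p in pixels]
    xs = [p[1] for p in pixels]
    box_height = max(ys) - min(ys) + 1
    box_width = max(xs) - min(xs) + 1
    elongation = max(box_height, box_width) / min(box_height, box_width)
    if len(pixels) <= max_pit_area and elongation < 2.0:
        return "pit"        # small and roughly round
    if elongation >= min_scratch_elongation:
        return "scratch"    # long and thin
    return "rejected"       # fails both constraints


# Hypothetical candidate mask: a 2x2 blob and a 1x6 streak.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
labels = [classify_region(region) for region in label_regions(mask)]
```

Bounding-box elongation is only a crude proxy for circularity. A real inspection system would likely use calibrated size limits and a more robust shape measure.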