ISBN: 0819410276 (print)
Techniques are presented for automatically generating optimal vision programs from high-level task descriptions. Vision programs are the object models that describe strategies to recognize and locate objects in an image. The effectiveness of a program depends on the features used for recognition and the order in which the features are evaluated. We describe three probabilistic feature utility measures and a cost function based on program execution time that serve as the basis of our technique. Computation of such utility measures from a statistically representative sample of images has been demonstrated. Problems encountered in computing such measures from computer-generated images are described.
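As a rough illustration of how evaluation order can trade utility against execution cost, the sketch below ranks features by utility per unit time. The abstract does not define its three utility measures or its optimization procedure, so the `Feature` fields, the sample values, and the greedy ratio rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    utility: float  # hypothetical probabilistic utility score in [0, 1]
    cost_ms: float  # hypothetical execution time in milliseconds


def order_features(features):
    """Greedy ordering (an assumption): highest utility per unit cost first."""
    return sorted(features, key=lambda f: f.utility / f.cost_ms, reverse=True)


# Made-up example values, not data from the paper.
features = [
    Feature("corner", 0.6, 12.0),
    Feature("hole", 0.9, 30.0),
    Feature("edge", 0.4, 4.0),
]
plan = [f.name for f in order_features(features)]
```

A cheap, moderately useful feature can outrank an expensive, highly useful one under this rule, which is the kind of trade-off a cost function based on execution time is meant to capture.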
In this paper we present a new neuron architecture, called the dynamic neural unit (DNU). The topology of the proposed neuronal model embodies delay elements, feedforward and feedback signals weighted by the synaptic weights, and a time-varying nonlinear activation function, and is thus different from the conventionally assumed architecture of neurons. The learning algorithm for the proposed neuronal structure and the corresponding implementation scheme are presented. A multi-stage dynamic neural network is developed using the DNU as the basic processing element. The performance evaluation of the dynamic neural network is presented for nonlinear dynamic systems under various situations. The capabilities of the proposed neural network model not only account for the learning and control actions emulating some of the biological control functions, but also provide a promising parallel-distributed intelligent control scheme for large-scale complex dynamic systems.
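The structure described above (delayed inputs, weighted feedforward and feedback paths, then a nonlinearity) can be sketched as a small recurrent unit. The abstract does not state the order of the delay structure or the form of the activation, so the second-order difference equation, the `tanh` activation, and the fixed `gain` parameter below are assumptions; the learning algorithm is not shown.

```python
import math


class DynamicNeuralUnit:
    """Illustrative unit: second-order linear dynamics followed by tanh (assumed)."""

    def __init__(self, a, b, gain=1.0):
        self.a = list(a)  # feedforward weights for x(k), x(k-1), x(k-2)
        self.b = list(b)  # feedback weights for v(k-1), v(k-2)
        self.gain = gain  # slope of the activation (held constant here)
        self.x_hist = [0.0, 0.0]  # delayed inputs x(k-1), x(k-2)
        self.v_hist = [0.0, 0.0]  # delayed internal states v(k-1), v(k-2)

    def step(self, x):
        # Weighted sum of current/delayed inputs minus weighted delayed states.
        v = (self.a[0] * x
             + self.a[1] * self.x_hist[0]
             + self.a[2] * self.x_hist[1]
             - self.b[0] * self.v_hist[0]
             - self.b[1] * self.v_hist[1])
        # Shift the delay lines by one time step.
        self.x_hist = [x, self.x_hist[0]]
        self.v_hist = [v, self.v_hist[0]]
        return math.tanh(self.gain * v)
```

With `a = [0, 1, 0]` and `b = [0, 0]` the unit reduces to a one-step delay followed by `tanh`, which is a convenient sanity check on the delay-line bookkeeping.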
This paper presents the extraction of depth data from stereo image pairs using a nontraditional stereo algorithm taken from computational neuroscience. The technique is based on the workings of the mammalian visual system, using the Gabor representation of an image to mimic the filtering properties of simple and complex cells in the visual cortex. Gabor-transformed images afford an alternate stereo correlation method that, though computationally intensive, is well-suited for solution in parallel. This implementation computes the Gabor transform of input images by sampling at four distinct frequencies and computing correlation at each frequency. We consider four methods of combining the resulting four correlation measures and present results of testing the algorithm on random dot and real image stereograms.
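A one-dimensional sketch of the idea follows: compute complex Gabor coefficients at several frequencies around a point in each image, correlate them per frequency, and combine the per-frequency scores. The abstract does not give the filter parameters or its four combination methods, so the Gaussian width `sigma`, the window `radius`, the normalized correlation, and the plain sum over frequencies are assumptions.

```python
import cmath
import math


def gabor_response(signal, center, freq, sigma=4.0, radius=12):
    """Complex 1-D Gabor coefficient of `signal` around index `center`."""
    total = 0j
    for k in range(-radius, radius + 1):
        i = center + k
        if 0 <= i < len(signal):
            envelope = math.exp(-(k * k) / (2.0 * sigma * sigma))
            total += signal[i] * envelope * cmath.exp(-2j * math.pi * freq * k)
    return total


def estimate_disparity(left, right, x, freqs, max_disp):
    """Pick the shift d whose summed per-frequency correlation is largest."""
    best_d, best_score = 0, -math.inf
    for d in range(max_disp + 1):
        score = 0.0
        for f in freqs:
            gl = gabor_response(left, x, f)
            gr = gabor_response(right, x - d, f)
            denom = abs(gl) * abs(gr)
            if denom > 0:
                # Normalized correlation: cosine of the phase difference.
                score += (gl * gr.conjugate()).real / denom
        if score > best_score:
            best_d, best_score = d, score
    return best_d
```

Each candidate shift and each frequency is evaluated independently, which is why this style of correlation, although expensive, maps naturally onto parallel hardware.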
There are two kinds of depth perception for robot vision systems: quantitative and qualitative. The first can be used to reconstruct the visible surfaces numerically, while the second describes the visible surfaces qualitatively. In this paper, we present a qualitative vision system suitable for intelligent robots. The goal of such a system is to perceive depth information qualitatively using monocular 2-D images. We first establish a set of propositions relating depth information, such as 3-D orientation and distance, to the changes of image region caused by camera motion. We then introduce an approximation-based visual tracking system. Given an object, the tracking system tracks its image while moving the camera in a way dependent upon the particular depth property to be perceived. Checking the data generated by the tracking system against our propositions yields the depth information about the object. The visual tracking system can track image regions in real time even when implemented on a PC AT clone machine, and mobile robots can naturally provide the inputs to our visual tracking system. We are therefore able to construct a real-time, cost-effective, monocular, qualitative, 3-dimensional robot vision system. To verify our idea, we present examples of perception of planar surface orientation, distance, size, dimensionality and convexity/concavity.
In this paper, the single instruction architecture is used to construct circuitry to perform dilation and erosion of gray-valued images, where the gray values are discrete but limited only by the number of bits chosen for the binary encoding. In addition, methods for minimizing the number of cells needed, using basic digital techniques, are discussed. While others have constructed architectures for gray-valued dilation and erosion, these are based on non-homogeneous circuits, and typically use umbra transformations to handle the gray values, rather than binary encoding. Finally, it is shown that the half-adder elements used in the single instruction architecture can easily be replaced with uniform multiplexer cells in deference to the McCulloch-Pitts model of the neuron. This analogy between the single instruction architecture and the neuronal construction of the brain is intentional.
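For reference, the two operations themselves can be written in a few lines of software. This is only a functional sketch of gray-valued dilation and erosion, not the single instruction architecture circuitry; the flat square structuring element and the border handling (windows clipped at the image edge) are assumptions.

```python
def _apply(image, reducer, radius=1):
    """Apply `reducer` (max or min) over a clipped square window at each pixel."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            row.append(reducer(window))
        out.append(row)
    return out


def dilate(image, radius=1):
    """Gray-valued dilation with a flat square element: local maximum."""
    return _apply(image, max, radius)


def erode(image, radius=1):
    """Gray-valued erosion with a flat square element: local minimum."""
    return _apply(image, min, radius)
```

Because only comparisons on integer gray values are involved, the same max/min selection can in principle be carried out bit by bit on binary-encoded values, which is the setting the paper's circuits address.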
Our innate ability to process and interpret large volumes of poorly defined visual data, in essence to perceive visual information, enables us to function effectively in a continually changing complex world. As knowledge engineers, it would be highly desirable to incorporate such flexibility into artificial systems. Fuzzy logic is a mathematical tool created to help synthesize complex systems and decision processes that must deal with imprecise or ambiguous information. In terms of vision, this ambiguity arises from the meanings attached to the sensor inputs and the rules used to describe the relationship between the various informative visual attributes. Notions that pertain to vision perception such as fuzzy images, fuzzy mathematical operators and fuzzy inference procedures are outlined in this paper.
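To make the notions of fuzzy membership and fuzzy operators concrete, here is a minimal sketch using the standard min/max/complement operators and a triangular membership function. The paper's own operator definitions are not given in the abstract, so these particular choices, and the gray-level example, are assumptions.

```python
def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_and(a, b):
    return min(a, b)  # standard (Zadeh) intersection


def fuzzy_or(a, b):
    return max(a, b)  # standard (Zadeh) union


def fuzzy_not(a):
    return 1.0 - a  # standard complement


# Hypothetical example: degree to which gray level 96 is "mid-gray".
mid_gray = triangular(96, 0, 128, 256)
```

A fuzzy image in this sense is simply the array of such membership degrees, one per pixel, to which the operators above are applied pointwise.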
This paper presents a 3-D multiresolution wavelet analysis that provides a tool for analyzing spatial details (e.g., horizontal and vertical edges) of moving objects contained in a sequence of images. Current multiresolution wavelet analysis theory is modified to create an orthonormal wavelet basis for L²(R³) by forming the tensor product of three non-identical multiresolution wavelet analyses on L²(R). An unconventional multiresolution decomposition and reconstruction algorithm is presented which provides a new tool for analyzing moving objects in a scene. Preliminary results demonstrate the new analysis technique's potential for segmenting key characteristics of an object moving against stationary or moving backgrounds.
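The tensor-product construction means a 3-D transform can be computed by applying a 1-D transform along each axis in turn. The sketch below does one decomposition level on an image sequence indexed as `volume[t][y][x]`. It uses the same Haar wavelet on all three axes purely for brevity, whereas the paper combines three non-identical analyses; the Haar choice and the requirement of even dimensions are assumptions.

```python
import math


def haar_1d(v):
    """One Haar level on an even-length list: approximations, then details."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(v[i] + v[i + 1]) * s for i in range(0, len(v), 2)]
    detail = [(v[i] - v[i + 1]) * s for i in range(0, len(v), 2)]
    return approx + detail


def haar_3d(volume):
    """One separable Haar level along x, then y, then t of volume[t][y][x]."""
    T, Y, X = len(volume), len(volume[0]), len(volume[0][0])
    out = [[haar_1d(row) for row in frame] for frame in volume]  # x axis
    for t in range(T):  # y axis
        for x in range(X):
            col = haar_1d([out[t][y][x] for y in range(Y)])
            for y in range(Y):
                out[t][y][x] = col[y]
    for y in range(Y):  # t axis
        for x in range(X):
            line = haar_1d([out[t][y][x] for t in range(T)])
            for t in range(T):
                out[t][y][x] = line[t]
    return out
```

Detail coefficients along the `t` axis respond to change between frames, while those along `x` and `y` respond to vertical and horizontal edges, which is what lets such a decomposition separate moving structure from static structure.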
The selection and placement of cameras and light sources for a specific task (e.g., locating a part in a tray or inspecting an object) is one of the most important steps in creating a successful vision system, because obtaining high-quality images can greatly simplify the vision algorithms and improve their reliability. We describe techniques that use a visual task description stated in terms of features to be detected, and derive a range of light-source locations that satisfy the task requirements. In particular, given a task description that specifies particular object edges to be detected with a given edge detector (e.g., a Sobel edge operator), our techniques determine the constraints on light-source location such that the edge is detected.
In this paper we present a probabilistic, prediction-based approach for CAD-based object recognition. Given a CAD model of an object, the PREMIO system combines techniques of analytic graphics and physical models of lights and sensors to predict how features of the object will appear in images. In nearly 4,000 experiments on analytically generated and real images, we show that in a semi-controlled environment, predicting the detectability of features of the image can successfully guide a search procedure to make informed choices of model and image features in its search for correspondences that can be used to hypothesize the pose of the object. Furthermore, we provide a rigorous experimental protocol that can be used to determine the optimal number of correspondences to seek so that the probabilities of failing to find a pose and of finding an inaccurate pose are minimized.
Passively sensing three-dimensional structure by means of computational stereo has received a great deal of attention in the computer vision community as well as in the traditional photogrammetric and remote sensing communities. The first and most difficult step in recovering 3-D information from a pair of stereo images is that of matching points from one image of the pair to the corresponding points in the second image. In this paper we develop an edge-based, fast and effective stereo matching technique characterized by two matching stages: initial matching and consistency check. Several constraints (epipolar, uniqueness, disparity continuity, stochastic, and disparity range) are used to reduce the combinatorial search and the ambiguity of the false targets. With this approach, we can obtain the globally optimal matches. The algorithm has been experimentally evaluated using a set of real images. The implementation and results have shown the efficacy of the proposed stereo matching technique.
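A few of the listed constraints can be illustrated on a single scanline of a rectified pair. The sketch below is not the paper's two-stage algorithm: it assumes rectified images (so the epipolar constraint reduces to matching within the same row), represents each edge as a hypothetical `(x, strength)` pair, and resolves uniqueness greedily by strength similarity. It does not implement the disparity continuity or stochastic constraints, and a greedy pass does not by itself guarantee globally optimal matches.

```python
def match_scanline(left_edges, right_edges, max_disp):
    """Match edge points on one rectified scanline.

    Each edge is an (x, strength) tuple (an assumed representation).
    Returns {left_index: (right_index, disparity)}.
    """
    candidates = []
    for i, (xl, sl) in enumerate(left_edges):
        for j, (xr, sr) in enumerate(right_edges):
            d = xl - xr
            if 0 <= d <= max_disp:  # disparity range constraint
                candidates.append((abs(sl - sr), i, j, d))
    candidates.sort()  # most similar edge strengths first

    used_left, used_right, matches = set(), set(), {}
    for _cost, i, j, d in candidates:
        # Uniqueness constraint: each edge takes part in at most one match.
        if i not in used_left and j not in used_right:
            used_left.add(i)
            used_right.add(j)
            matches[i] = (j, d)
    return matches


# Made-up edge lists for illustration.
left = [(10, 5.0), (20, 9.0)]
right = [(7, 5.1), (16, 8.8)]
result = match_scanline(left, right, max_disp=5)
```

Restricting candidates to the same row and to a bounded disparity range is what cuts the combinatorial search; a consistency check such as matching again from right to left could then be used to reject remaining false targets.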