ISBN: (print) 0819410276
The selection and placement of cameras and light sources for a specific task (e.g., locating a part in a tray or inspecting an object) is one of the most important steps in creating a successful vision system, because obtaining high-quality images can greatly simplify the vision algorithms and improve their reliability. We will describe techniques that use a visual task description stated in terms of features to be detected, and derive a range of light-source locations that satisfy the task requirements. In particular, given a task description that specifies particular object edges to be detected with a given edge detector (e.g., a Sobel edge operator), our techniques determine the constraints on light-source location such that the edge is detected.
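As a rough illustration of how an edge-detection requirement can be turned into a constraint on light-source location, the sketch below sweeps a light direction and keeps the angles at which an edge between two faces would exceed a Sobel threshold. It is not the authors' method: the Lambertian shading model, the ideal step-edge assumption, the restriction of the light to the x-z plane, and all function names are assumptions made for this example.

```python
import math

def lambertian(normal, light, albedo=1.0):
    """Assumed shading model: albedo * max(0, n . l) for unit vectors."""
    return albedo * max(0.0, sum(n * l for n, l in zip(normal, light)))

def sobel_step_response(i_left, i_right):
    """Magnitude of the unnormalized 3x3 Sobel x-response at an ideal vertical step.

    The kernel column weights are (1, 2, 1), so the response is
    4 * |i_right - i_left|.
    """
    return 4.0 * abs(i_right - i_left)

def feasible_light_angles(n_left, n_right, threshold, step_deg=5):
    """Light angles (degrees from the +x axis, in the x-z plane) at which
    the edge between the two faces meets the detection threshold."""
    feasible = []
    for deg in range(0, 181, step_deg):
        a = math.radians(deg)
        light = (math.cos(a), 0.0, math.sin(a))
        response = sobel_step_response(lambertian(n_left, light),
                                       lambertian(n_right, light))
        if response >= threshold:
            feasible.append(deg)
    return feasible

# Hypothetical edge between an upward-facing and a sideways-facing surface.
angles = feasible_light_angles((0.0, 0.0, 1.0), (1.0, 0.0, 0.0), threshold=2.0)
```

In this toy setup a light at 45 degrees illuminates both faces equally, so the edge has no contrast and that angle is excluded from the feasible range.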
In this paper we present a probabilistic prediction-based approach for CAD-based object recognition. Given a CAD model of an object, the PREMIO system combines techniques of analytic graphics and physical models of lights and sensors to predict how features of the object will appear in images. In nearly 4,000 experiments on analytically generated and real images, we show that in a semi-controlled environment, predicting the detectability of features of the image can successfully guide a search procedure to make informed choices of model and image features in its search for correspondences that can be used to hypothesize the pose of the object. Furthermore, we provide a rigorous experimental protocol that can be used to determine the optimal number of correspondences to seek so that the probabilities of failing to find a pose and of finding an inaccurate pose are minimized.
Our innate ability to process and interpret large volumes of poorly defined visual data, in essence to perceive visual information, enables us to function effectively in a continually changing complex world. As knowledge engineers, it would be highly desirable to incorporate such flexibility into artificial systems. Fuzzy logic is a mathematical tool created to help synthesize complex systems and decision processes that must deal with imprecise or ambiguous information. In terms of vision, this ambiguity arises from the meanings attached to the sensor inputs and the rules used to describe the relationship between the various informative visual attributes. Notions that pertain to visual perception, such as fuzzy images, fuzzy mathematical operators and fuzzy inference procedures, are outlined in this paper.
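To make the notions of fuzzy images and fuzzy operators concrete, here is a minimal sketch using the standard min/max/complement operators on triangular membership functions. The linguistic terms "dark" and "medium", their breakpoints, and the function names are hypothetical and are not taken from the paper.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzy_and(mu1, mu2):
    """Fuzzy conjunction using the min operator."""
    return min(mu1, mu2)

def fuzzy_or(mu1, mu2):
    """Fuzzy disjunction using the max operator."""
    return max(mu1, mu2)

def fuzzy_not(mu):
    """Fuzzy complement."""
    return 1.0 - mu

# Hypothetical membership of an 8-bit gray level in two overlapping terms.
gray = 96
dark = triangular(gray, -128, 0, 128)     # fully dark at gray level 0
medium = triangular(gray, 0, 128, 256)    # fully medium at gray level 128

# Degree to which the rule premise "pixel is medium AND NOT dark" holds.
premise = fuzzy_and(medium, fuzzy_not(dark))
```

A pixel can belong partly to both terms at once, which is how a fuzzy image represents ambiguous intensity values.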
A goal of computer vision is the construction of scene descriptions based on information extracted from one or more 2D images. A reconstruction strategy based on a three-level representational framework is proposed. The first representational level, the Primal Sketch, makes explicit important information about the two-dimensional image, primarily the intensity changes and their geometrical distribution and organization. The intensity changes appear at several spatial scales, and image analysis performed at multiple resolutions is therefore required. We propose a compact pyramidal neural network implementation of the multiresolution representation of the input images. Features of the scene are detected at each resolution level and feedback interaction is built between pyramid levels in order to reinforce edges which correspond to physical features of the observed scene. The second representational level, the raw 2.5D Sketch, makes explicit the orientation and rough depth at the edge location of the visible surfaces. A multiresolution neural network stereo algorithm is designed to compute the disparity at each pixel location and at all the resolution levels. Matching is facilitated by a hierarchical focussing mechanism. The third representational level, the full 2.5D Sketch, makes explicit the orientation and depth estimate at all the visible surface coordinates. Depth information between the edges is computed with a local shape-from-shading algorithm.
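The multiresolution representation can be pictured with a plain image pyramid. The sketch below is not the paper's neural network implementation: it swaps in simple 2x2 block averaging to build the levels and a horizontal intensity difference as a stand-in feature detector, and all names are hypothetical.

```python
def downsample(image):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    rows, cols = len(image) // 2, len(image[0]) // 2
    return [[(image[2 * r][2 * c] + image[2 * r][2 * c + 1]
              + image[2 * r + 1][2 * c] + image[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(cols)]
            for r in range(rows)]

def build_pyramid(image, min_size=1):
    """Return [full resolution, half, quarter, ...] down to min_size pixels."""
    levels = [image]
    while len(levels[-1]) // 2 >= min_size and len(levels[-1][0]) // 2 >= min_size:
        levels.append(downsample(levels[-1]))
    return levels

def horizontal_changes(image):
    """Absolute intensity difference between horizontally adjacent pixels."""
    return [[abs(row[c + 1] - row[c]) for c in range(len(row) - 1)]
            for row in image]

# A 4x4 image with a vertical step edge between columns 1 and 2.
image = [[0, 0, 8, 8] for _ in range(4)]
pyramid = build_pyramid(image)
changes_per_level = [horizontal_changes(level) for level in pyramid]
```

The step edge is still visible one level up, while at the coarsest level it has been averaged away, which is why features need to be examined at several resolutions.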
Hierarchical representation of three-dimensional (3D) object shape has been based on different levels of resolution. This paper introduces a representational hierarchy that is based on the connectedness and neighborliness of object shape expressed through topologies on the bounding surface with increasing strength. The topology at the object part level is weaker than at the level of simply connected elliptic, parabolic, plane and hyperbolic regions, and the strongest topology is given by the classical topology for smooth surfaces. This provides a unified view on the representation of three-dimensional object shape for recognition. The open sets have a natural interpretation in the context of object recognition and relate to different types of recognition processes. More elaborate descriptions are naturally obtained by the introduction of additional structure, such as affine and metric. Qualitative shape features are defined at each level of the hierarchy, and their usefulness and limitations for shape discrimination are discussed. The possibility of deriving the topologies from ordinal structure is considered and examples of object description are presented.
In this study we have measured the spectra from 250 nm to 700 nm of a set of standard colors. The aim was to find out whether the set is suitable for color vision tests in the UV region, outside the range of human vision. A second set of measurements, covering the range from 250 nm to the IR, was made on a set of natural colors consisting of color samples from Finnish natural scenes. The first three principal components of the spectra of the Munsell hue test set are shown. The results show that the color test patches are useful for testing color vision in the human visual region from 380 nm to 700 nm. Results are also given from analysis of the set of natural colors. The principles of color coordinate systems are given for some animals. Color discrimination in animals and in humans is compared.
The proposed parallel clustering technique performs several clustering processes (for the same data set) in parallel, using different sets of initial cluster centers. Each clustering process consists of a sequence of iterations. The clustering processes are iterated in parallel within each parallel step. At the end of each parallel step, the clustering parameters are evaluated according to prespecified criteria. 'Non-promising' cluster center sets are discarded, and new cluster center sets are formed using 'promising' cluster centers. The illustrative examples presented indicate a reduction of 7% to 30% in the number of iterations required for convergence.
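The procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' algorithm: it uses k-means as the underlying clustering process, the sum of squared errors (SSE) as the evaluation criterion, and a replacement rule that rebuilds the worst center set from centers of the two best sets. The processes are advanced in lockstep in a single thread rather than on parallel hardware, and all names and parameters are hypothetical.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_iteration(points, centers):
    """One k-means iteration: assign points, then recompute the centers."""
    labels = [min(range(len(centers)), key=lambda k: sq_dist(pt, centers[k]))
              for pt in points]
    new_centers = []
    for k, old in enumerate(centers):
        members = [pt for pt, lab in zip(points, labels) if lab == k]
        # An empty cluster keeps its previous center.
        new_centers.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else old)
    return new_centers

def sse(points, centers):
    """Assumed evaluation criterion: sum of squared errors (lower is better)."""
    return sum(min(sq_dist(pt, c) for c in centers) for pt in points)

def parallel_clustering(points, k, n_sets=4, iters_per_step=2, n_steps=5, seed=0):
    rng = random.Random(seed)
    center_sets = [rng.sample(points, k) for _ in range(n_sets)]
    for _ in range(n_steps):
        # One "parallel step": advance every clustering process a few iterations.
        for i in range(n_sets):
            for _ in range(iters_per_step):
                center_sets[i] = kmeans_iteration(points, center_sets[i])
        # Evaluate all center sets and rank them, best first.
        center_sets.sort(key=lambda cs: sse(points, cs))
        # Assumed rule: discard the worst set and form a new one from
        # centers of the two most promising sets.
        pool = center_sets[0] + center_sets[1]
        center_sets[-1] = rng.sample(pool, k)
    return min(center_sets, key=lambda cs: sse(points, cs))

# Two well-separated groups of four points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11)]
best = parallel_clustering(points, k=2)
```

A fixed step count is used here for simplicity; a convergence test on the best center set would be needed to measure the iteration savings the abstract reports.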
The proceedings contain 52 papers. The topics discussed include: nonreconstruction approach for road following; applying geometric sensor and scene models for range image understanding; estimation of motion parameters using binocular camera configurations; fusion-based depth estimation from a sequence of monocular images; scene description: interactive computation of stability with friction; planning of an active range sensor structure for pose estimation of 3-D regular objects; and clustering methods for removing outliers from vision-based range estimates.
In this paper, we discuss a statistical framework for multiscale signal and image processing based on a class of multiresolution stochastic models, which can be used to represent spatial random processes at a range of scales. The model class is quite rich, and in fact includes the class of Markov random fields. In addition, the models have a scale-recursive structure which naturally leads to efficient, scale-recursive algorithms for smoothing and likelihood calculation. We discuss an application of the framework to the problem of computing optical flow in image sequences, and demonstrate computational savings of one to two orders of magnitude over standard algorithms.
One of the purposes of computer vision is to reconstruct a three-dimensional description of the environment from multiple sensor images. Because no single image can show all salient features of a complex scene, an intelligently chosen set of images is needed to provide the complete description. Because the sensor data is generally incomplete and errorful, a priori knowledge about the scene and the sensors is used in the reconstruction process. This paper will describe methods for applying scene and sensor knowledge to the problem of three-dimensional reconstruction from multiple images. In particular, the geometric representation and reasoning techniques applied to 3D generic object recognition in the 3D FORM system [Walk88, Walk90] will be extended to reason about segmented 2D images. Knowledge about the sensors and objects in the scene will be represented as frames in the 3D FORM system. For each sensor, the geometric relationship between the sensor's pose, the image features and the world features is modeled, and for each object, the geometric relationships between the object and its parts are modeled. Three-dimensional reconstruction is performed by transforming each sensor image to a set of constraints on the world, and then combining the constraints from all sensors with constraints imposed by the object models to generate an interpretation that satisfies all constraints. The advantage of this method is that the resulting system is able to adjust itself to the available information without knowing in advance which constraints will be specified.