3D space and time in optics and in human vision are linked together in spectral diffractive-optical transformations of the visible world. 4D-RGB correlator hardware, integrated in an optical imaging system such as the human eye, processes a hierarchy of relativistic equilibrium states and a sequence of double-cone transformations. The full chain of light-like events ends in von Laue interference maxima in reciprocal space, where 4D-RGB signals are miniaturized down to the level of individual photoreceptors. The diffractive-optical correlator relates local information to global data in the visual field and illustrates the potential for future development of cameras towards more intelligent 4D optical sensors.
Tactical behavior of UGVs, which is needed for successful autonomous off-road driving, can in many cases be achieved by covering most possible driving situations with a set of rules and switching into a "drive-me-away" semi-autonomous mode when no such rule exists. However, the unpredictable and rapidly changing nature of combat situations requires more intelligent tactical behavior that must be based on predictive situation awareness with ongoing scene understanding and fast autonomous decision making. The implementation of image understanding and active vision is possible in the form of biologically inspired Network-Symbolic models, which combine the power of Computational Intelligence with graph and diagrammatic representation of knowledge. A Network-Symbolic system converts image information into an "understandable" Network-Symbolic format, which is similar to relational knowledge models. The traditional linear bottom-up "segmentation-grouping-learning-recognition" approach cannot provide a reliable separation of an object from its background/clutter, whereas human vision solves this problem unambiguously. Image/video analysis based on the Network-Symbolic approach is a combination of recursive hierarchical bottom-up and top-down processes. The logic of visual scenes can be captured in Network-Symbolic models and used for the reliable disambiguation of visual information, including object detection and identification. Such a system can better interpret images and video for situation awareness, target recognition, navigation, and action, and it integrates seamlessly into the 4D/RCS architecture.
This paper describes the design and implementation of a vision-guided autonomous vehicle that represented BYU in the 2005 Intelligent Ground Vehicle Competition (IGVC), in which autonomous vehicles navigate a course marked with white lines while avoiding obstacles consisting of orange construction barrels, white buckets, and potholes. Our project began in the context of a senior capstone course in which multi-disciplinary teams of five students were responsible for the design, construction, and programming of their own robots. Each team received a computer motherboard, a camera, and a small budget for the purchase of additional hardware, including a chassis and motors. The resource constraints resulted in a simple vision-based design that processes the sequence of images from the single camera to determine motor controls. Color segmentation separates white and orange from each image, and then the segmented image is examined using a 10×10 grid system, effectively creating a low-resolution picture for each of the two colors. Depending on its position, each filled grid square influences the selection of an appropriate turn magnitude. Motor commands determined from the white and orange images are then combined to yield the final motion command for each video frame. We describe the complete algorithm and the robot hardware, and we present results that show the overall effectiveness of our control approach.
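The grid-based steering idea described in this abstract can be sketched in a few lines. The thresholds, weights, and sign convention below are illustrative assumptions, not the values from the BYU implementation; the sketch only shows how a coarse occupancy grid can vote on a turn magnitude:

```python
import numpy as np

def grid_steering(image, grid=10):
    """Sketch of grid-based steering: segment a color, reduce the mask
    to a coarse occupancy grid, and let each filled cell vote on a turn.
    Positive output = steer right; obstacles on the right therefore
    produce a negative (leftward) command. All constants are assumed."""
    h, w, _ = image.shape
    # Crude "white" segmentation: all channels bright (assumed threshold).
    bright = image.min(axis=2) > 200
    occ = np.zeros((grid, grid), dtype=bool)
    for r in range(grid):
        for c in range(grid):
            cell = bright[r*h//grid:(r+1)*h//grid, c*w//grid:(c+1)*w//grid]
            occ[r, c] = cell.mean() > 0.5  # cell is "filled" if mostly white
    col_weight = np.linspace(-1.0, 1.0, grid)   # left .. right
    row_weight = np.linspace(0.2, 1.0, grid)    # far rows count less than near rows
    # Steer away from the side with more filled cells.
    return -(occ * row_weight[:, None] * col_weight[None, :]).sum()
```

A white line filling the right half of the frame yields a negative (leftward) command, and vice versa, which matches the abstract's description of position-dependent turn selection.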
Edwin Land, based on photometric data, tried to explain through a retinex (retina + cortex) model calculating scaled integrated reflectances how human color vision determines the perceived hues of colored Mondrian patches, by relating illuminants and 'energies at the eye' and including a calculation over the whole image. An alternative, purely optical model is illustrated: the diffractive-optical correlator hardware in the aperture and image space of the human eye, which relates 'local' data to 'global' data in color vision. Based on Edwin Land's experimental data, it is shown how the perceived hues result from diffractive-optical transformations and cross-correlations between object space and reciprocal space (RGB space), and from matrix multiplications and divisions in vector space. Because of this optical pre-processing, our eyes do not see what is physically real, but what they have optically calculated. The same diffractive-optical mechanism has also led to an explanation of the phenomenon of paradoxically colored shadows, briefly revisited in the introduction.
A vision system designed to detect people in complex backgrounds is presented. The purpose of the proposed algorithms is to allow the identification and tracking of single persons under difficult conditions: in crowded places, under partial occlusion, and in low-resolution images. In order to detect people reliably, we combine different information channels from video streams. Most emphasis for the initialization of trajectories and the subsequent pedestrian recognition is placed on the detection of the head-shoulder contour. In the first step, a simple and fast shape model selects promising candidates; then a local active shape model is matched against the gradients found in the image with the help of a cost function. Texture analysis in the form of co-occurrence features ensures that shape candidates form coherent trajectories over time. To reduce the number of false positives and to increase robustness, a pattern analysis step based on eigenimage analysis is presented. The cues which form the basis of pedestrian detection are integrated into a tracking algorithm which uses the shape information for initial pedestrian detection and verification, propagates positions into new frames using local motion, and matches pedestrians with the help of texture information.
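The co-occurrence texture cue mentioned above can be illustrated with a minimal gray-level co-occurrence sketch. The quantization depth, pixel offset, and the two derived statistics below are common textbook choices, not necessarily the feature set used in the paper:

```python
import numpy as np

def glcm_features(gray, levels=8, dx=1, dy=0):
    """Minimal gray-level co-occurrence sketch: quantize the image,
    count co-occurring level pairs at offset (dx, dy), normalize the
    counts, and derive two classic Haralick-style statistics."""
    q = (gray.astype(np.float64) / 256.0 * levels).astype(int).clip(0, levels - 1)
    h, w = q.shape
    glcm = np.zeros((levels, levels))
    for y in range(h - dy):
        for x in range(w - dx):
            glcm[q[y, x], q[y + dy, x + dx]] += 1
    glcm /= glcm.sum()
    i, j = np.indices(glcm.shape)
    contrast = ((i - j) ** 2 * glcm).sum()   # high for rough, varying texture
    energy = (glcm ** 2).sum()               # high for uniform regions
    return contrast, energy
```

A perfectly flat patch gives contrast 0 and energy 1, while a checkerboard gives high contrast; comparing such features across frames is one way shape candidates can be linked into coherent trajectories.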
In this paper, a new real-time intelligent mobile robot system for path planning and navigation using a stereo camera mounted on a pan/tilt unit is proposed. In the proposed system, the face area of a moving person is detected from a sequence of stereo image pairs using the YCbCr color model, and depth information is recovered from the disparity map computed from the left and right images captured by the pan/tilt-controlled stereo camera. The distance between the mobile robot and the face of the moving person is then calculated from this depth information. Based on the analysis of these data, three-dimensional objects can be detected. Finally, using the detected data, a 2-D spatial map is constructed for a visually guided robot that can plan paths, navigate around surrounding objects, and explore an indoor environment. In experiments on target tracking with 480 frames of sequential stereo images, the error ratio between the calculated and measured values of the relative position is found to be low, 1.4% on average. The proposed target tracking system also achieves a high speed of 0.04 sec/frame for target detection and 0.06 sec/frame for target tracking.
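Recovering distance from a disparity map follows the standard pinhole stereo relation Z = fB/d. The function below is a generic sketch of that relation; the focal length and baseline are hypothetical parameters, not the calibration of the system described in the abstract:

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Standard pinhole stereo depth: Z = f * B / d, where f is the
    focal length in pixels, B the baseline between the left and right
    cameras in meters, and d the disparity in pixels. Returns meters."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px
```

For example, with an assumed 800 px focal length and a 12 cm baseline, a 32 px disparity corresponds to a face about 3 m away; nearer objects produce larger disparities.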
Virtual Keeper is a goalkeeper simulator. It estimates the trajectory of a ball thrown by a player using machine vision and animates a goalkeeper with a video projector. The first version of Virtual Keeper used only one camera. In this paper, a new version that uses two gray-scale cameras for trajectory estimation is proposed. In addition, a color camera and a microphone are used to determine the intersection point of the ball trajectory and the goal line, in order to enable feedback and online calibration of the machine vision with neural networks, which in turn allows varying the external parameters of the cameras and the video projector. The color camera takes images of the goal and determines the positions of the goalposts with pattern matching. After the gray-scale cameras have observed the ball and estimated its trajectory, the sound processing block is triggered. When the ball hits the screen, the noise pattern is recognized with a neural network whose input consists of temporal and spectral features. The sound processing block in turn triggers the color camera's image processing block. The color of the ball differs from the colors of the background and goalkeeper to make the segmentation problem easier. The ball is recognized with fuzzy color-based segmentation and fuzzy pattern matching.
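The abstract does not specify the trajectory estimator, but one simple way to extrapolate a tracked ball to the goal plane is a polynomial fit: linear in the horizontal coordinate and quadratic in the vertical one (ballistic drop). The sketch below is purely illustrative of that idea:

```python
import numpy as np

def predict_crossing(times, xs, ys, t_goal):
    """Illustrative trajectory extrapolation (not the paper's method):
    fit x(t) with a line and y(t) with a parabola from tracked ball
    positions, then evaluate both fits at the time t_goal when the
    ball is expected to reach the goal plane."""
    cx = np.polyfit(times, xs, 1)   # x(t) ~ a*t + b
    cy = np.polyfit(times, ys, 2)   # y(t) ~ a*t^2 + b*t + c (gravity term)
    return np.polyval(cx, t_goal), np.polyval(cy, t_goal)
```

With noiseless ballistic input, the fit recovers the exact crossing point; with real camera measurements, the least-squares fit averages out per-frame tracking noise.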
Moving cameras are needed for a wide range of applications in robotics, vehicle systems, surveillance, etc. However, many foreground object segmentation methods reported in the literature are unsuitable for such settings; these methods assume that the camera is fixed and the background changes slowly, and are inadequate for segmenting objects in video if there is significant motion of the camera or background. To address this shortcoming, a new method for segmenting foreground objects is proposed that utilizes binocular video. The method is demonstrated in the application of tracking and segmenting people in video who are approximately facing the binocular camera rig. Given a stereo image pair, the system first tries to find faces. Starting at each face, the region containing the person is grown by merging regions from an over-segmented color image. The disparity map is used to guide this merging process. The system has been implemented on a consumer-grade PC and tested on video sequences of people indoors obtained from a moving camera rig. As can be expected, the proposed method works well in situations where other foreground-background segmentation methods typically fail. We believe that this superior performance is partly due to the use of object detection to guide region merging in disparity/color foreground segmentation, and partly due to the use of the disparity information available with a binocular rig, in contrast with most previous methods, which assumed monocular sequences.
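The face-seeded, disparity-guided merging step can be sketched on a toy over-segmentation. The adjacency construction and the disparity tolerance below are illustrative assumptions; the paper's actual merging criteria are not given in the abstract:

```python
import numpy as np
from collections import deque

def grow_person_region(labels, disparity, seed_label, tol=2.0):
    """Toy version of disparity-guided region growing: starting from the
    over-segmented region containing the face, repeatedly absorb
    neighboring segments whose mean disparity is close to the seed's."""
    seg_ids = np.unique(labels)
    mean_disp = {s: disparity[labels == s].mean() for s in seg_ids}
    # Build segment adjacency from 4-neighbor label transitions.
    adj = {s: set() for s in seg_ids}
    for dy, dx in ((1, 0), (0, 1)):
        a = labels[:labels.shape[0] - dy, :labels.shape[1] - dx]
        b = labels[dy:, dx:]
        for u, v in zip(a.ravel(), b.ravel()):
            if u != v:
                adj[u].add(v)
                adj[v].add(u)
    person, frontier = {seed_label}, deque([seed_label])
    while frontier:
        s = frontier.popleft()
        for n in adj[s]:
            if n not in person and abs(mean_disp[n] - mean_disp[seed_label]) < tol:
                person.add(n)
                frontier.append(n)
    return person
```

Segments at the person's depth are merged into one region, while a background segment at a very different disparity is rejected even if its color touches the person, which is the intuition behind using disparity to guide the color-region merging.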
During past decades, the enormous growth of image archives has significantly increased the demand for research efforts aimed at efficiently finding specific images within large databases. This paper investigates matching of images of buildings, architectural designs, blueprints, and sketches. Their geometric constraints lead to the proposed approach: the use of local grey-level invariants based on internal contours of the object. The problem involves three key phases: object recognition in image data, matching two images, and searching the database of images. The emphasis of this paper is on object recognition based on internal contours of image data. In her master's thesis, M.M. Kulkarni described a technique for image retrieval by contour analysis applied to the external contours of an object in image data. This is used to determine the category of a building (tower, dome, flat, etc.). Integrating these results with local grey-level invariant analysis creates a more robust image retrieval system. Thus, the best match is the intersection of the results of contour analysis and grey-level invariant analysis. Experiments conducted on a database of architectural buildings have shown robustness with respect to image rotation, translation, small viewpoint variations, partial visibility, and extraneous features. The recognition rate is above 99% for a variety of tested images taken under different conditions.
ISBN (print): 0780389123
Singular value decomposition (SVD) is a common technique performed on video sequences in a number of computer vision and robotics applications. The left singular vectors represent the eigenimages, while the right singular vectors represent the temporal properties of the video sequence. Spatial reduction techniques obviously affect the left singular vectors; however, the extent of their effect on the right singular vectors is not clear. Understanding how the right singular vectors are affected is important because many SVD algorithms rely on computing them as an intermediate step to computing the eigenimages. The work presented here quantifies the effects of different spatial resolution reduction techniques on the right singular vectors computed from those video sequences. Examples show that using random sampling for spatial resolution reduction, rather than a low-pass filtering technique, results in less perturbation of the temporal properties.
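The left/right singular vector decomposition of a video, and the effect of spatial reduction on the temporal (right) side, can be demonstrated on synthetic data. The construction below is a minimal sketch, not the paper's experimental setup; random row sampling stands in for one of the spatial reduction schemes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic video: each 32x32 frame is flattened into one column, so the
# matrix is (pixels x frames). It is a rank-2 mix of two fixed spatial
# patterns with time-varying weights of clearly different amplitude.
n_pix, n_frames = 32 * 32, 50
patterns = rng.standard_normal((n_pix, 2))
t = np.linspace(0.0, 2.0 * np.pi, n_frames)
weights = np.stack([3.0 * np.sin(t), np.cos(3.0 * t)])  # (2, n_frames)
video = patterns @ weights                              # (n_pix, n_frames)

# Full SVD: columns of U are eigenimages; rows of Vt are temporal vectors.
_, _, vt_full = np.linalg.svd(video, full_matrices=False)

# Spatial reduction by random row (pixel) sampling: keep 1/4 of the pixels.
idx = rng.choice(n_pix, size=n_pix // 4, replace=False)
_, _, vt_sub = np.linalg.svd(video[idx], full_matrices=False)

# Compare the dominant temporal vectors, up to sign.
corr = abs(vt_full[0] @ vt_sub[0])
print(f"|cos angle| between dominant temporal vectors: {corr:.3f}")
```

Because random sampling keeps every retained pixel's full time series, the dominant temporal vector is nearly unchanged (cosine close to 1), which is the kind of perturbation measurement the paper quantifies across reduction techniques.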