ISBN: 0819421413 (print)
Active vision refers to a purposeful change in the camera setup to aid the processing of visual information. An important issue in using active vision is the need to represent the 3D environment in a manner that is invariant to changing camera configurations. Conventional methods require precise knowledge of various camera parameters in order to build this representation. However, these parameters are prone to calibration errors. This motivates us to explore a neural-network-based approach that uses a Vector Associative Map to learn the invariant representation of 3D point targets for active vision. An efficient learning scheme is developed that is suitable for robotic implementation. The representation thus learned is also independent of the intrinsic parameters of the imaging system, making it immune to systematic calibration errors. To evaluate the effectiveness of this scheme, computer simulations were first performed using a detailed model of the University of Illinois Active Vision System (UIAVS). This was followed by experimental verification on the actual UIAVS. Several robotic applications that utilize the invariance property of the learned representation are then explored. These applications include motion detection, active-vision-based robot control, robot motion planning, and saccade sequence planning.
Typical vision-based vehicle detection systems use a video camera mounted on an overpass or an adjacent utility pole to observe vehicles passing on the road. Classical image-processing techniques are applied to the digitized video image to obtain the pulse and presence signals traditionally produced by inductive loop detectors. These image-processing techniques can also be performed by artificial neural networks. Using a neural network requires that the network first be trained on several example video images in which the position of the vehicle has already been indicated by a human operator. The trained network is then used to locate and track vehicles in images it has never been exposed to. Recently, Nestor Inc. and Intel Corporation developed a hardware chip, called the Ni1000 Recognition Accelerator, which is capable of implementing Radial Basis Function (RBF) networks in real time. This paper describes the results of converting a software feedforward-network-based detection system to a real-time, hardware-implemented RBF-network-based detection system. Success rates greater than 90% were obtained for the RBF-network-based detection system.
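For readers unfamiliar with RBF networks, the following is a minimal sketch of a generic Gaussian RBF forward pass. It is not the Ni1000's classifier or the authors' detector: the abstract gives no architectural details, so the Gaussian basis function, the names `rbf_activations` and `rbf_output`, and the toy centers and weights are all assumptions made for illustration.

```python
import math

def rbf_activations(x, centers, width):
    """Gaussian activation of each hidden unit for input vector x (assumed basis function)."""
    return [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, c)) / (2 * width ** 2))
        for c in centers
    ]

def rbf_output(x, centers, weights, width, bias=0.0):
    """Weighted sum of the hidden-unit activations plus a bias term."""
    acts = rbf_activations(x, centers, width)
    return bias + sum(w * a for w, a in zip(weights, acts))

# Hypothetical toy setup: one prototype for "vehicle", one for "empty road".
centers = [(0.0, 0.0), (10.0, 10.0)]
weights = [1.0, 0.0]
score = rbf_output((0.0, 0.0), centers, weights, width=1.0)
```

An input that coincides with the first prototype yields a score close to 1, and the score decays smoothly as the input moves away from it. In a trained network the centers and weights would come from the labeled example images.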
Receptive field structures found in the visual cortex of the mammalian brain act as oriented, localized spatial frequency filters. There has been interest in the use of such receptive field profiles for image coding and texture processing. These receptive field structures resemble Gabor filters. Systems employing such Gabor filters have been implemented in software for a variety of applications. We believe a hardware implementation of such cells will be helpful in artificial visual processing. We have implemented analog VLSI cells whose outputs resemble the receptive field profiles found in the visual cortex. We describe experimental results of our circuit. Our circuit is the first silicon model of visual cortical processing.
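As a software point of comparison for the Gabor-like receptive fields mentioned above, here is a minimal sketch of the textbook real-valued 2D Gabor function (a Gaussian envelope multiplied by a cosine carrier). It does not model the authors' analog VLSI circuit. The function name `gabor_kernel` and all parameter values are assumptions chosen for illustration.

```python
import math

def gabor_kernel(size, wavelength, theta, sigma, gamma=1.0, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows (size should be odd)."""
    half = size // 2
    kernel = []
    for row in range(size):
        y = row - half
        values = []
        for col in range(size):
            x = col - half
            # Rotate coordinates so the cosine carrier varies along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope localizes the filter in space.
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            # Cosine carrier selects a spatial frequency of 1 / wavelength.
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            values.append(envelope * carrier)
        kernel.append(values)
    return kernel

# Illustrative parameters only: a 9x9 kernel tuned to 45-degree structure.
kernel = gabor_kernel(size=9, wavelength=4.0, theta=math.pi / 4, sigma=2.0)
```

Convolving an image with a bank of such kernels at several orientations and wavelengths gives the oriented, localized spatial-frequency responses the abstract describes.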
ISBN: 0819421413 (print)
This paper uses a high-level vision model to describe the information passing and linking within the primate visual system. Information linking schemes, such as state-dependent modulation and temporal synchronization, are presented as methods the vision system uses to combine information, using expectation to fill in missing information and remove unneeded information. The possibility of using linking methods derived from physiologically based theoretical models to combine current image processing techniques for pattern recognition purposes is investigated. These image processing techniques are transforms such as (but not limited to) wavelet filters, hit-or-miss filters, morphological filters, and difference-of-Gaussian filters. These particular filters are chosen because they simulate functions that are performed in the primate visual system. To implement the physiologically motivated linking methods, the Pulse Coupled Neural Network (PCNN) is chosen as a basic building block for the vision model, which performs linking at the neuronal pulse level. Last, an image fusion network that incorporates information linking based on the PCNN is described, and initial results are presented.
ISBN: 0819421413 (print)
A neural network is used to extract the flight model of guided, short to medium range, tripod and shoulder-fired missile systems which is then integrated into a training simulator. The simulator uses injected video to replace the optical sight and is fitted with a multi-axis positioning system which senses the gunner's movement. The movement creates an image shift and affects the input data to the missile control algorithm. Accurate flight dynamics are a key to efficient training, particularly in the case of closed loop guided systems. However, flight model data is not always available, either because it is proprietary, or because it is too complex to embed in a real time simulator. A solution is to reverse engineer the flight model by analyzing the missile's response when submitted to typical input conditions. Training data can be extracted from either recorded video or from a combination of weapon and missile positioning data. The video camera can be located either on the weapon or attached to a through-sight adapter. No knowledge of the missile flight transfer function is used in the process. The data is fed to a three-layer back-propagation type neural network. The network is configured within a standard spreadsheet application and is optimized with the built-in solver functions. The structure of the network, the selected inputs and outputs, as well as training data, output data after training, and output data when embedded in the simulator are presented.
ISBN: 7505338900 (print)
Computational complexity and image fidelity are two key issues of vector quantization (VQ), especially for real-time applications. This paper proposes a three-phase SOFM (TPSOFM) algorithm to selectively design three-level tree-structured codebooks. The computational complexity during the coding process is reduced by a factor of 20 over a full search. A ten-times speedup in training time is also achieved. Degradation of reconstructed image quality is less than 0.38 dB in PSNR while the bit rate is reduced.
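To illustrate why a tree-structured codebook reduces coding cost relative to a full search, the sketch below compares the number of distance computations in each. It is a toy example, not the TPSOFM algorithm: the tree is built by naively splitting a sorted codeword list, and the helper names (`full_search`, `build_tree`, `tree_search`) and the 8-entry, one-dimensional codebook are assumptions for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def full_search(vector, codebook):
    """Return (index of nearest codeword, number of distance computations)."""
    best = min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))
    return best, len(codebook)

def build_tree(codewords, branching=2):
    """Toy builder: recursively split an ordered codeword list into groups."""
    if len(codewords) == 1:
        return {"centroid": codewords[0], "children": []}
    dim = len(codewords[0])
    centroid = tuple(sum(c[d] for c in codewords) / len(codewords) for d in range(dim))
    step = max(1, len(codewords) // branching)
    groups = [codewords[i:i + step] for i in range(0, len(codewords), step)]
    return {"centroid": centroid, "children": [build_tree(g, branching) for g in groups]}

def tree_search(vector, node):
    """Descend the tree greedily; return (leaf codeword, number of distance computations)."""
    comparisons = 0
    while node["children"]:
        comparisons += len(node["children"])
        node = min(node["children"], key=lambda c: sq_dist(vector, c["centroid"]))
    return node["centroid"], comparisons

# Hypothetical 1-D codebook with 8 entries, giving a three-level binary tree.
codebook = [(float(v),) for v in range(8)]
tree = build_tree(codebook)
index, full_cost = full_search((5.2,), codebook)
leaf, tree_cost = tree_search((5.2,), tree)
```

Here the full search evaluates 8 distances and the tree search evaluates 6; the gap widens quickly as the codebook grows, since tree cost scales with depth times branching factor. The greedy descent is not guaranteed to return the true nearest codeword in general, which is one source of the small PSNR loss that tree-structured VQ trades for speed.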
ISBN: 0819423092 (print)
With the current trend of integrating machine vision systems in industrial manufacturing and inspection applications comes the issue of camera and illumination stabilization. Unless each application is built around a particular camera and a highly controlled lighting environment, the interchangeability of cameras and fluctuations in lighting become a problem, as each camera usually has a different response. An empirical approach is proposed in which color tile data is acquired using the camera of interest, and a mapping to some predetermined reference image is developed using neural networks. A similar analytical approach based on a rough analysis of the imaging systems is also considered for deriving a mapping between cameras. Once a mapping has been determined, all data from one camera is mapped to correspond to the images of the other prior to performing any processing on the data. Instead of writing separate image processing algorithms for the particular image data being received, the image data is adjusted based on each particular camera and lighting situation. All that is required when swapping cameras is the new mapping for the camera being inserted. The image processing algorithms can remain the same, as the input data has been adjusted appropriately. The results of utilizing this technique will be presented for an inspection application.
ISBN: 0819421413 (print)
Although the properties of the human visual system have been studied extensively, knowledge of its operation is usually considered only for comparison, and models based on its function are rarely utilized. Complexity and speed of computation are the reasons most often quoted for choosing not to implement such models. Nonetheless, if we are to achieve the flexibility and power of the biological visual system, researchers would be wise to continue to explore practical, yet comprehensive, models based on human vision. This paper examines the first stage of the primate visual system, the retina, and how simple models of its neurons, along with their properties and interactions, can mimic what is presently believed to be some of the initial forms of visual image processing. A static model based on the previously explored ideas of shunting dynamics will be presented, along with the introduction of image (photoreceptor) blurring driven by feedback from the shunting network. Simulations are used to demonstrate the model.
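The abstract does not state its equations, so the following sketch assumes a common textbook form of the shunting equation, dx/dt = -A x + (B - x) E - (x + D) I, whose equilibrium is x = (B E - D I) / (A + E + I). It shows only the static (steady-state) center-surround response on a 1D signal and omits the feedback-driven blurring the paper introduces. All names and parameter values are hypothetical.

```python
def shunting_steady_state(excitation, inhibition, decay=1.0, upper=1.0, lower=1.0):
    """Equilibrium of dx/dt = -decay*x + (upper - x)*excitation - (x + lower)*inhibition."""
    return (upper * excitation - lower * inhibition) / (decay + excitation + inhibition)

def center_surround_response(signal, decay=1.0, upper=1.0, lower=1.0):
    """Steady state per sample: center = the sample, surround = mean of its neighbours.

    Assumes a non-negative signal with at least two samples.
    """
    out = []
    for i, center in enumerate(signal):
        neighbours = [signal[j] for j in (i - 1, i + 1) if 0 <= j < len(signal)]
        surround = sum(neighbours) / len(neighbours)
        out.append(shunting_steady_state(center, surround, decay, upper, lower))
    return out

# Toy luminance step: the response is zero in uniform regions and
# nonzero on either side of the edge.
response = center_surround_response([1.0, 1.0, 5.0, 5.0])
```

Because the inputs appear in both the numerator and the denominator, the response stays bounded between `-lower` and `upper` regardless of input intensity, which is the normalizing property that makes shunting models attractive for retinal processing.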
ISBN: 0819421413 (print)
Image processing was combined with neural network sorting to demonstrate the feasibility of automated cervical smear screening. Nuclei were isolated to generate a series of data points relating to the density and size of individual nuclei. This was followed by segmentation to isolate entire cells for subsequent generation of data points to bound the size of the cytoplasm. Data points were taken on as many as ten cells per image frame and included correlation against a series of filters providing size and density readings on nuclei. Additional point data was taken on nuclei images to refine size information and on whole cells to bound the size of the cytoplasm; twenty data points per assessed cell were generated. These data point sets, designated as neural tensors, comprise the inputs for training and use of a unique neural network to sort the images and identify those indicating evidence of disease. The neural network, named the Fast Analog Associative Memory, accumulates data and establishes lookup tables for comparison against images to be assessed. Six networks were trained to differentiate normal cells from those evidencing various levels of abnormality that may lead to cancer. A blind test was conducted on 77 images to evaluate system performance. The image set included 31 positives (diseased) and 46 negatives (normal). Our system correctly identified all 31 positives and 41 of the 46 negatives, with 5 false positives. We believe this technology can lead to more efficient automated screening of cervical smears.
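The blind-test counts reported above can be turned into the usual screening metrics. The sketch below uses only the figures stated in the abstract (31 true positives, 0 false negatives, 41 true negatives, 5 false positives); the function name `screening_metrics` is introduced here for illustration.

```python
def screening_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # fraction of diseased images flagged
    specificity = tn / (tn + fp)          # fraction of normal images cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Counts taken from the blind test on 77 images.
sens, spec, acc = screening_metrics(tp=31, fn=0, tn=41, fp=5)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
# prints "sensitivity=1.000 specificity=0.891 accuracy=0.935"
```

For a screening application, the 100% sensitivity is the more important figure: no diseased image was missed, at the cost of 5 of 46 normal images being referred for review.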
ISBN: 0819423092 (print)
Visual quality control in many food processing operations continues to be a manual, difficult, and tedious task. Computer/machine vision systems offer a solution, but the development of effective algorithms that are able to accommodate the natural variability of food products has proved to be problematic. This paper examines and compares three techniques for processing multi-spectral imagery for these applications. One technique is to use artificial neural networks (ANNs). ANNs can be fault tolerant when establishing decision surfaces within the test data and can operate in parallel at high speeds, which makes them well suited to this application. The main drawback of ANNs is their inability to provide a meaningful justification for the decision boundaries they establish when classifying data. Another image processing technique, which uses a more deterministic data classification method, is vector quantization (VQ). VQ uses a data clustering and splitting algorithm that can be modified to improve speed and accuracy according to the application. In an effort to include all levels of algorithm complexity, a modified thresholding approach is also compared to the more computationally demanding ANN and VQ techniques. The strengths and weaknesses of each of these algorithms are highlighted based on their performance in these domains.