Edge detection and feature extraction methods are based on converting a change of gray level between two regions of an image into a variation function that gives the difference between the gray level of each region and the gray level of the line of discontinuity. Computing the relative difference yields the magnitude, the direction, and the slope sign of the magnitude, which in turn characterize the features of an edge. By focusing on the relative variation of gray level for pixels between small regions, the local filtering process identifies and smooths uncorrelated features, giving a truer edge than is possible by smoothing a larger region.
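As a rough illustration of how a local gray-level difference gives an edge magnitude, direction, and sign, the sketch below uses plain central differences on a small synthetic image. It is a generic gradient operator chosen for illustration, not the variation function of the abstract; the function name `gradient_at` and the test image are assumptions.

```python
import math

def gradient_at(img, r, c):
    """Central-difference gray-level change at an interior pixel (r, c)."""
    gx = (img[r][c + 1] - img[r][c - 1]) / 2.0  # change along the row (x direction)
    gy = (img[r + 1][c] - img[r - 1][c]) / 2.0  # change along the column (y direction)
    magnitude = math.hypot(gx, gy)              # strength of the local change
    direction = math.atan2(gy, gx)              # radians, measured from the +x axis
    return gx, gy, magnitude, direction

# Hypothetical 4x4 image with a vertical step edge between columns 1 and 2.
step = [
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]

gx, gy, magnitude, direction = gradient_at(step, 1, 1)
print(gx, gy, magnitude, direction)
```

The sign of `gx` indicates whether the transition runs from dark to bright or the reverse along the row; here it is positive because the gray level rises to the right.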
We are interested in processing video data for the purpose of solving a variety of problems in video search, analysis, indexing, browsing and compression. Instead of concentrating on a particular problem, in this pape...
If 3D rigid motion is estimated with some error, a distorted version of the scene structure will in turn be computed. Of computational interest are those regions in space where the distortions are such that the depths become negative, because in order to be visible the scene has to lie in front of the image. The stability analysis for the structure-from-motion problem presented in this paper investigates the optimal relationship between the errors in the estimated translational and rotational parameters of a rigid motion, that is, the relationship that results in the estimation of a minimum number of negative depth values. The input used is the value of the flow along some direction, which is more general than optic flow or correspondence. For a planar retina it is shown that the optimal configuration is achieved when the projections of the translational and rotational errors on the image plane are perpendicular. Furthermore, the projections of the actual and the estimated translation lie on a line passing through the image center. For a spherical retina, given a rotational error, the optimal translation is the correct one, while given a translational error, the optimal rotational error is normal to the translational one at an equal distance from the real and estimated translations. The proofs, besides illuminating the confounding of translation and rotation in structure from motion, have an important application to ecological optics, explaining differences of planar and spherical eye or camera designs in motion and shape estimation.
An algorithm is presented to answer window queries in a quadtree-based spatial database environment by retrieving all of the quadtree blocks in the underlying spatial database that cover the quadtree blocks that comprise the window. It works by decomposing the window operation into sub-operations over smaller window partitions. These partitions are the quadtree blocks corresponding to the window. Although a block b in the underlying spatial database may cover several of the smaller window partitions, b is only retrieved once rather than multiple times. This is achieved by using an auxiliary main memory data structure called the active border which requires O(n) additional storage for a window query of size n × n. As a result, the algorithm generates an optimal number of disk I/O requests to answer a window query (i.e., one request per covering quadtree block). A proof of correctness and an analysis of the algorithm's execution time and space requirements are given, as are some experimental results.
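The decomposition step described above can be sketched as a recursive split of the space into quadrants, keeping each block that lies wholly inside the window. This is only a minimal sketch of the window-to-blocks decomposition under assumed conventions (integer coordinates, a square space whose side is a power of two, half-open intervals); the function name `window_blocks` is hypothetical, and the active border structure and disk access of the actual algorithm are not reproduced.

```python
def window_blocks(wx, wy, ww, wh, size, x=0, y=0):
    """Return maximal quadtree blocks (x, y, side) lying wholly inside a window.

    The window covers [wx, wx + ww) x [wy, wy + wh) in a size x size space,
    where size is a power of two and all coordinates are integers.
    """
    # The current block does not touch the window at all.
    if x >= wx + ww or x + size <= wx or y >= wy + wh or y + size <= wy:
        return []
    # The current block is fully contained, so it is one window partition.
    if wx <= x and x + size <= wx + ww and wy <= y and y + size <= wy + wh:
        return [(x, y, size)]
    # Partial overlap: split into four quadrants and recurse.
    half = size // 2
    blocks = []
    for dy in (0, half):
        for dx in (0, half):
            blocks += window_blocks(wx, wy, ww, wh, half, x + dx, y + dy)
    return blocks

# A 3 x 3 window with its corner at (1, 1) in an 8 x 8 space.
blocks = window_blocks(1, 1, 3, 3, 8)
print(blocks)
```

For this window the decomposition yields five unit blocks and one 2 x 2 block, whose areas add up to the 9 cells of the window.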
Algorithms in computer vision are characterized by (i) complex and repetitive operations; (ii) large amounts of data; and (iii) a variety of data interactions (e.g., point operations, neighborhood operations, global operations). Based on their computation and communication complexity, vision algorithms have been grouped into three categories: (i) low-level, (ii) intermediate-level and (iii) high-level. In this paper, we describe the use of a custom computing approach to meet the computation and communication needs of computer vision algorithms. By customizing the hardware architecture for every application at the instruction level, the optimal grain size needed for the problem at hand and the instruction granularity can be matched. Field Programmable Gate Array (FPGA) based processing elements (PEs) are used to provide this facility. Using programmable communication resources, the diverse communication requirements can be met. A vision system needs to integrate hardware for the three levels. A custom computing approach alleviates the problem of achieving optimal granularity for different stages, as the same hardware is reconfigured at the software level for different levels of the application. We demonstrate the advantages of our approach using Splash 2, a Xilinx 4010-based custom computer.
Multilayer perceptrons (MLPs) are one of the most popular neural network models for solving pattern classification and image classification problems. Because of their ability to learn complex decision boundaries, MLPs are used in many practical computer vision applications involving classification (or supervised segmentation). Once the connection weights in an MLP have been learnt, the network can be used repeatedly for classification of new input patterns. Several special-purpose architectures have been described in the literature for neural networks, as they are slow on a conventional uniprocessor. In this paper, we describe a mapping of MLPs onto Splash 2, a "custom computing machine". The main features of the proposed mapping are: (i) the number of nodes in a layer is not fixed; (ii) the number of layers in the network is not fixed; (iii) it is based on a set of reprogrammable FPGAs and a programmable crossbar; and (iv) it has a significant speedup over a uniprocessor. The mapping has been used to implement a 3-layer MLP for a page segmentation application, with a speedup of approximately 150 over a SPARCstation 20 for one million pattern vectors with 20 features per pattern.
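For readers unfamiliar with the classification step that such a mapping accelerates, the following is a minimal software sketch of an MLP forward pass in which neither the number of layers nor the number of nodes per layer is fixed. It says nothing about the Splash 2 implementation; the sigmoid activation, the function name `mlp_forward`, and the toy weights are assumptions made for illustration.

```python
import math

def mlp_forward(layers, x):
    """Propagate input vector x through a list of (weights, biases) layers.

    weights[j] holds the incoming weights of node j, so each layer may have
    any number of nodes and the list may hold any number of layers.
    """
    activation = x
    for weights, biases in layers:
        activation = [
            # Weighted sum of the previous layer followed by a sigmoid (assumed).
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activation)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activation

# Toy network: 2 inputs -> 3 hidden nodes -> 2 outputs, all parameters zero.
zero_net = [
    ([[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]], [0.0, 0.0, 0.0]),
    ([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [0.0, 0.0]),
]
outputs = mlp_forward(zero_net, [1.0, -2.0])

# Predicted class is the index of the largest output.
predicted = max(range(len(outputs)), key=lambda i: outputs[i])
print(outputs, predicted)
```

With all weights and biases set to zero, every sigmoid node outputs 0.5 regardless of the input, which makes the example easy to check by hand.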
A framework for learning parameterized models of optical flow from image sequences is presented. A class of motions is represented by a set of orthogonal basis flow fields that are computed from a training set using principal component analysis. Many complex image motions can be represented by a linear combination of a small number of these basis flows. The learned motion models may be used for optical flow estimation and for model-based recognition. For optical flow estimation we describe a robust, multi-resolution scheme for directly computing the parameters of the learned flow models from image derivatives. As examples we consider learning motion discontinuities, non-rigid motion of human mouths, and articulated human motion.
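The idea of representing a flow field as a linear combination of orthogonal basis flows can be sketched as follows. The basis here is hand-picked and orthonormal rather than learned by principal component analysis, the mean flow is omitted, and the coefficients are obtained by simple projection rather than by the robust multi-resolution estimation from image derivatives described in the abstract; the names `project` and `reconstruct` and the flattened layout `[u1, v1, u2, v2]` are assumptions.

```python
def dot(u, v):
    """Inner product of two flattened flow fields."""
    return sum(a * b for a, b in zip(u, v))

def project(flow, basis):
    """Coefficients of a flow field on an orthonormal set of basis flows."""
    return [dot(flow, b) for b in basis]

def reconstruct(coeffs, basis):
    """Linear combination of basis flows weighted by the coefficients."""
    n = len(basis[0])
    return [sum(c * b[i] for c, b in zip(coeffs, basis)) for i in range(n)]

# Two hypothetical orthonormal basis flows over two pixels, stored as [u1, v1, u2, v2].
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]

flow = [3.0, -1.0, 3.0, -1.0]
coeffs = project(flow, basis)
approx = reconstruct(coeffs, basis)
print(coeffs, approx)
```

Because this flow lies in the span of the two basis flows, the two coefficients reproduce it exactly; a flow outside the span would only be approximated.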
In this paper we discuss a new approach to invariant signatures for recognizing curves under viewing distortions and partial occlusion. The approach is intended to overcome the ill-posed problem of finding derivatives...