ISBN: 0780342364 (print)
This paper presents a new algorithm for detecting objects in images, one of the fundamental tasks of computer vision. The algorithm extends the representational efficiency of eigenimage methods to binary features, which are less sensitive to illumination changes than the gray-level values normally used with eigenimages. Binary features (square subtemplates) are automatically chosen on each training image. Using features rather than whole templates makes the algorithm more robust to background clutter and partial occlusions. Instead of representing the features with real-valued eigenvector principal components, we use binary vector quantization to avoid floating-point computations. The object is detected in the image using a simple geometric hash table and Hough transform. On a test of 1000 images, the algorithm succeeds on 99.3%. We present a theoretical analysis of the algorithm in terms of the receiver operating characteristic, which consists of the probabilities of detection and false alarm. We verify this analysis with the results of our 1000-image test, and we use the analysis as a principled way to select some of the algorithm's important operating parameters.
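The abstract does not say how the binary codebook is built or searched, so the following is only a minimal sketch of the general idea behind binary vector quantization: each binary feature is mapped to its nearest codeword under Hamming distance, using integer operations only. The bit-packed integer representation, the names `hamming` and `quantize`, and the example codebook are assumptions made for illustration, not details from the paper.

```python
def hamming(a, b):
    """Number of differing bits between two bit-packed binary vectors."""
    return bin(a ^ b).count("1")


def quantize(feature, codebook):
    """Return the index of the codeword closest to `feature` in Hamming distance."""
    return min(range(len(codebook)), key=lambda i: hamming(feature, codebook[i]))


# Hypothetical 8-bit codebook and feature, for illustration only.
codebook = [0b00000000, 0b11110000, 0b11111111]
index = quantize(0b11100000, codebook)
```

Because the distance is an XOR followed by a bit count, no floating-point arithmetic is involved, which is the property the abstract highlights.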
ISBN: 0818672587 (print)
We describe a novel technique for face recognition based on deformable intensity surfaces which incorporates both the shape and texture components of the 2D image. The intensity surface of the facial image is modeled as a deformable 3D mesh in (x, y, I(x, y)) space. Using an efficient technique for matching two surfaces (in terms of the analytic modes of vibration), we obtain a dense correspondence field (or 3D warp) between two images. The probability distributions of two classes of warps are then estimated from training data: intrapersonal and extrapersonal variations. These densities are then used in a Bayesian framework for image matching and recognition. Experimental results with facial data from the US Army FERET database demonstrate an increased recognition rate over the previous best methods.
ISBN: 0818672587 (print)
A method is described for determining the viewing parameters of randomly acquired projections of asymmetric objects. It extends the common lines algorithm by determining the relative orientation of projections from the location of lines of intersection among the Fourier transforms of the projections in three-dimensional Fourier space. A new technique for finding the lines of intersection in the presence of translational displacement, and for subsequently finding the translational displacement, is presented. A new technique for dealing with noise is also presented. The complete algorithm is described and its efficacy is demonstrated using real data. This technique may be applied to the three-dimensional reconstruction of viruses, molecules, and cells from in vivo images. It also has many other applications, including the reconstruction of underwater scenes, radio astronomy, geoseismic analysis, and portable radiography for medical diagnosis and industrial inspection.
ISBN: 0818672587 (print)
Despite many successful applications of robust statistics, they have yet to be completely adapted to many computer vision problems. Range reconstruction, particularly in unstructured environments, requires a robust estimator that not only tolerates a large outlier percentage but also tolerates several discontinuities, extracting multiple surfaces in an image region. Observing that random outliers and/or points from across discontinuities increase a hypothesized fit's scale estimate (standard deviation of the noise), our new operator, called MUSE (Minimum Unbiased Scale Estimator), evaluates a hypothesized fit over potential inlier sets via an objective function of unbiased scale estimates. MUSE extracts the single best fit from the data by minimizing its objective function over a set of hypothesized fits and can sequentially extract multiple surfaces from an image region. We show MUSE to be effective on synthetic data modelling small-scale discontinuities and in preliminary experiments on complicated range data.
ISBN: 0818672587 (print)
This paper presents a formal model of an active recognition system that can be programmed by learning. At each time step the system decides between producing an action to generate new data and stopping to issue the name of the object observed. The actions can be directed either towards the external environment or towards the internal perceptual system of the agent. The decision strategy is based on a quantitative evaluation of the system's learning experience. The problem studied is the recognition of chess pieces using a moving camera and a multiscale feature detector. The recognition is difficult because the objects are complex (neither polyhedral nor smooth) and rather similar between classes, especially in certain view configurations. The system uses the information obtained by observing internal state transitions when the camera is moved or when the feature detector scale is changed. A simulation of the agent and the environment is used for experimental measures of the model's performance.
We develop a classification algorithm for hybrid autoregressive models of human motion for the purpose of video-based analysis and recognition. We assume that some temporal statistics are extracted from the images, an...
ISBN: 0818672587 (print)
The efficiency of pattern recognition is particularly crucial in two scenarios: whenever there are a large number of classes to discriminate, and whenever recognition must be performed a large number of times. We propose a single technique, namely pattern rejection, that greatly enhances efficiency in both cases. A rejector is a generalization of a classifier that quickly eliminates a large fraction of the candidate classes or inputs. This allows a recognition algorithm to dedicate its efforts to a much smaller number of possibilities. Importantly, a collection of rejectors may be combined to form a composite rejector, which is shown to be far more effective than any of its individual components. A simple algorithm is proposed for the construction of each of the component rejectors. Its generality is established through close relationships with the Karhunen-Loeve expansion and Fisher's discriminant analysis. Composite rejectors were constructed for two representative applications, namely appearance-matching-based object recognition and local feature detection. The results demonstrate substantial efficiency improvements over existing approaches, most notably Fisher's discriminant analysis.
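The abstract does not specify how the component rejectors are constructed, so the sketch below only illustrates the composition idea: each rejector cheaply discards some candidate classes, and applying several in sequence narrows the set further than any one alone. The projection-and-interval test, the function names `make_rejector` and `composite_reject`, and the toy class ranges are all assumptions for illustration, not the authors' construction.

```python
def make_rejector(direction, intervals):
    """Build a rejector from a projection direction and per-class ranges.

    intervals maps each class name to the (low, high) range of projected
    values assumed to have been observed for that class during training.
    """
    def reject(x, candidates):
        p = sum(d * v for d, v in zip(direction, x))  # 1D projection of the input
        return {c for c in candidates if intervals[c][0] <= p <= intervals[c][1]}
    return reject


def composite_reject(rejectors, x, classes):
    """Apply rejectors in turn, stopping once at most one class survives."""
    candidates = set(classes)
    for reject in rejectors:
        candidates = reject(x, candidates)
        if len(candidates) <= 1:
            break
    return candidates


# Hypothetical two-dimensional example with three classes.
r1 = make_rejector((1, 0), {"A": (0, 2), "B": (1, 3), "C": (5, 7)})
r2 = make_rejector((0, 1), {"A": (0, 1), "B": (4, 6), "C": (0, 9)})
remaining = composite_reject([r1, r2], (1.5, 5), ["A", "B", "C"])
```

In this toy case the first rejector alone leaves two candidates and the second alone leaves two, while the pair together leaves one, which mirrors the abstract's claim that a composite is more effective than its components.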
ISBN: 0818672587 (print)
Focus-of-attention mechanisms for robot vision are discussed. A new method for neglecting low-level filter responses from already modelled structures is presented. The method is based on a filtering technique termed normalized convolution. In one experiment, the robot is continuously moving its arm in the scene while tracking other objects. It is shown how the arm can be made 'invisible' so that only the moving object of interest is detected. This makes tracking of objects much simpler. In another experiment, the attention of the system is shifted between objects by simply cancelling the mask of the object to be attended to. With this strategy the low-level processes do not need to know the difference between a new object entering the scene and a mask being cancelled, and thus a complex communication structure between high and low levels is avoided.
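The abstract gives no implementation details, so the following is a minimal one-dimensional sketch of the simplest (zeroth-order) form of normalized convolution: the signal is weighted by a per-sample certainty before filtering, and the result is divided by the filtered certainty. Setting the certainty to zero over a region plays the role of a mask. The function name, the list-based representation, and the example values are assumptions for illustration.

```python
def normalized_convolution(signal, certainty, kernel):
    """Zeroth-order normalized convolution of a 1D signal.

    signal    -- list of sample values
    certainty -- list of weights in [0, 1]; 0 marks samples to ignore
    kernel    -- odd-length list of non-negative applicability weights
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        num = 0.0  # kernel-weighted sum of certainty * signal
        den = 0.0  # kernel-weighted sum of certainty
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                num += w * certainty[j] * signal[j]
                den += w * certainty[j]
        # With no certain sample under the kernel, the output is undefined.
        out.append(num / den if den > 0 else None)
    return out


signal = [1.0, 1.0, 9.0, 1.0, 1.0]   # 9.0 stands in for an already modelled structure
certainty = [1, 1, 0, 1, 1]          # certainty 0 masks that sample out
smoothed = normalized_convolution(signal, certainty, [1.0, 2.0, 1.0])
```

With the middle sample masked, the filter output is determined only by the surrounding samples, so the masked structure leaves no trace in the response.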
ISBN: 0780342364 (print)
Image-based virtual reality is emerging as a major alternative to the more traditional 3D-based VR. The main advantages of image-based VR are its photo-quality realism and 3D illusion without any 3D information. Unfortunately, creating content for image-based VR is usually a very tedious process. This paper proposes to use a non-perspective fisheye lens to capture the spherical panorama with very few images. Unlike most camera calibration in computer vision, self-calibration of the fisheye lens poses new questions regarding the parameterization of the distortion and wrap-around effects. Because of its unique projection model and large field of view (near 180 degrees), most of the ambiguity problems in self-calibrating a traditional lens can be solved trivially. We demonstrate that with four fisheye lens images, we can seamlessly register them to create the spherical panorama, while self-calibrating its distortion and field of view.
ISBN: 9781424439942 (print)
We demonstrate a concept of computer vision as a secure, live service on the Internet. We show a platform to distribute a real-time vision algorithm using simple, widely available web technologies, such as Adobe Flash. We allow a user to access this service without downloading an executable or sharing the image stream with anyone. We enable developers to publish without distribution complexity. Finally, the platform supports user-permitted aggregation of data for computer vision research or analysis. We describe results of a simple distributed motion detection algorithm. We discuss future scenarios for organically extending the horizon of computer vision research.