ISBN (print): 0769512720
Dynamic events can be regarded as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences (possibly of different lengths) based on their behavioral content. This measure is non-parametric and can thus handle a wide range of dynamic events. We use this measure for isolating and clustering events within long continuous video sequences. This is done without prior knowledge of the types of events, their models, or their temporal extent. An outcome of such a clustering process is a temporal segmentation of long video sequences into event consistent sub-sequences, and their grouping into event consistent clusters. Our event representation and associated distance measure can also be used for event-based indexing into long video sequences, even when only one short example-clip is available. However, when multiple example-clips of the same event are available (either as a result of the clustering process, or given manually), these can be used to refine the event representation, the associated distance measure, and accordingly the quality of the detection and clustering process.
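As a rough illustration of a non-parametric distance between sequences of different lengths, the sketch below compares normalized feature histograms at several temporal scales. The abstract does not specify the statistic or the features, so the chi-square distance, the scalar features in [0, 1), the bin count, and all function names are assumptions made for this example.

```python
def normalized_histogram(values, n_bins=16, lo=0.0, hi=1.0):
    """Histogram of scalar samples over [lo, hi), normalized to sum to 1.

    Normalizing removes the dependence on how many samples (frames)
    a sequence contributes, so sequences of different lengths compare.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range samples
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def chi_square_distance(h1, h2):
    """Chi-square distance between two normalized histograms (0 = identical)."""
    d = 0.0
    for a, b in zip(h1, h2):
        if a + b > 0:
            d += (a - b) ** 2 / (a + b)
    return d


def sequence_distance(features_a, features_b):
    """Average the histogram distance over temporal scales.

    features_a / features_b: one list of scalar feature samples per
    temporal scale (hypothetical input layout for this sketch).
    """
    dists = [
        chi_square_distance(normalized_histogram(fa), normalized_histogram(fb))
        for fa, fb in zip(features_a, features_b)
    ]
    return sum(dists) / len(dists)
```

Because only empirical distributions are compared, no event model or event duration has to be specified in advance, which is the property the abstract emphasizes.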
ISBN (print): 078037293X
This paper presents an approach to object feature extraction and object matching for line object processing. The invariant features of objects have been extracted based on inertia coordinate systems. Weighted fuzzy dissimilarity is proposed as a matching variable. To simplify the optimal object matching, an optimal matching pair theorem is proposed. Experiments are conducted for verifying the feature extraction and the optimal matching pair theorem.
Maximization of mutual information is a powerful method for registering images (and other data) captured with different sensors or under varying conditions, since the technique is robust to variations in the image formation process. On the other hand, the high level of robustness allows false positives when matching over a large search space and also makes it difficult to formulate an efficient search strategy for this case. We describe techniques to overcome these problems by aligning image entropies, which are robust to illumination variation and can be applied to multi-sensor registration. This results in a lower rate of false positives and a more efficient method to search an image for the matching position. The techniques are applied to real imagery and compared to methods based on mutual information and gradients to demonstrate their effectiveness.
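For readers unfamiliar with the baseline criterion, the following minimal sketch computes the mutual information between two equally sized images from their joint intensity histogram. It illustrates the standard mutual-information measure only, not the entropy-alignment technique the abstract proposes; images are assumed to be flat lists of already-quantized intensity levels, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of `samples`."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(img_a, img_b):
    """Mutual information I(A; B) = H(A) + H(B) - H(A, B) in bits.

    img_a, img_b: flat lists of discrete intensity levels, pixel i of
    one image corresponding to pixel i of the other.
    """
    if len(img_a) != len(img_b) or not img_a:
        raise ValueError("images must be non-empty and of equal size")
    joint = list(zip(img_a, img_b))  # joint intensity pairs
    return entropy(img_a) + entropy(img_b) - entropy(joint)
```

Since the score depends only on how well one image's intensities predict the other's, an intensity-inverted copy of an image scores as high as the image itself. This is the robustness to the image formation process, and also the source of false positives, that the abstract refers to.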
This paper deals with the problem of estimating structure and motion from long continuous image sequences, applying the Expectation Maximization algorithm based on an extended Kalman smoother to impose the time-continuity of the motion parameters. By repeatedly estimating the state transition matrix of the dynamic equation and the parameters of the noise processes in the dynamic and measurement equations, this optimization gives the maximum likelihood estimates of the motion and structure parameters. Practically, this research is essential for dealing with a long video-rate image sequence with a partially unknown system equation and noise. The algorithm is implemented and tested on a real image sequence.
This paper addresses the problem of calibrating camera lens distortion, which can be significant in medium to wide angle lenses. Our approach is based on the analysis of distorted images of straight lines. We derive new distortion measures that can be optimized using non-linear search techniques to find the best distortion parameters that straighten these lines. Unlike the other existing approaches, we also provide fast, closed-form solutions to the distortion coefficients. Experiments to evaluate the performance of this approach on synthetic and real data are reported.
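The idea of choosing distortion parameters that straighten imaged lines can be sketched as follows. This is an illustration under stated assumptions, not the paper's distortion measures or its closed-form solution: it assumes a one-parameter radial correction p_u = p_d (1 + k r_d^2) about a known distortion centre at the origin, scores straightness by the mean squared orthogonal distance of the corrected points to their best-fit line, and finds k by a plain grid search. All names and the search range are hypothetical.

```python
import math


def undistort(points, k):
    """Apply the assumed radial correction p_u = p_d * (1 + k * r_d^2)."""
    out = []
    for x, y in points:
        s = 1.0 + k * (x * x + y * y)
        out.append((x * s, y * s))
    return out


def straightness_error(points):
    """Mean squared orthogonal distance of points to their best-fit line.

    This equals the smallest eigenvalue of the 2x2 covariance matrix.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    half_trace = 0.5 * (sxx + syy)
    root = math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    return max(0.0, half_trace - root)  # clamp tiny negative round-off


def estimate_k(lines, k_min=-0.5, k_max=0.5, steps=1001):
    """Grid search for the k that makes all point chains straightest.

    lines: list of point lists, each sampled along one distorted
    image of a straight line, in normalized image coordinates.
    """
    best_k, best_err = k_min, float("inf")
    for i in range(steps):
        k = k_min + (k_max - k_min) * i / (steps - 1)
        err = sum(straightness_error(undistort(line, k)) for line in lines)
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```

The search range is kept narrow on purpose: with strongly negative k the correction factor can approach zero and collapse all points toward the centre, which would trivially minimize the error. A non-linear optimizer, as mentioned in the abstract, would replace the grid search in practice.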
ISBN (print): 0780372115
Video-based eye gaze detection systems are useful for eye-slaved support systems for the severely disabled. The pupil center in the video image is a focal point to determine the eye gaze. Recently, to overcome the disadvantages of traditional pupil detection methods, a pupil detection technique using two light sources (LEDs) and the image difference method was proposed [1]. In addition, for users or subjects wearing corrective eyeglasses, a method for eliminating the images of the light sources reflected in the glass lens was proposed. However, image-processing hardware for implementing these methods is rather expensive. In the present paper, the hardware construction is replaced by a construction consisting of a combination of a conventional image grabber and a personal computer. An algorithm for windowing around the pupil image with an automatic thresholding method for pupil detection is proposed. The results show that the algorithm works well when the user or the subject is wearing eyeglasses and under normal ambient lighting conditions. The calculation time is quick enough for real-time processing. These algorithms would contribute to consistent and reliable pupil detection.
Accurate estimation of effective camera focal length is crucial to the success of panoramic image stitching. Fast techniques for estimating the focal length exist, but are dependent upon a close initial approximation or the existence of a full circle panoramic image sequence. Numerical solutions of the focal length demonstrate strong coupling between the focal length and the angles used to position each component image about the common spherical center. This paper demonstrates that parameterizing panoramic image positions using spherical arc length instead of angles effectively decouples the focal length from the image position. This new parameterization does not require an initial focal length estimate for quick convergence, nor does it require a full circle panorama in order to refine the focal length. Experiments with synthetic and real image sets demonstrate the robustness of the method and a speedup of 5 to 20 times over angle based positioning.
ISBN (print): 0769512720
In this paper we describe techniques for 3D textured model construction of urban areas using acquisition devices such as intensity cameras, as well as 2D laser scanners. Our experimental setup consists of a truck equipped with one camera and two fast, inexpensive 2D laser scanners, traveling on city streets under normal traffic conditions. The horizontal laser scans are used to determine the approximate component of motion along the movement of the acquisition vehicle. The vertical scanner is used to build 3D models of the facade of the buildings. To improve the accuracy of localization of the truck and hence our resulting 3D models of the city, two different methods are developed and compared: the first method employs a correlation technique and the second method is based on Markov Monte Carlo localization. Both techniques use digital road maps and aerial photographs in conjunction with laser scans. A fairly accurate textured 3D model of a downtown area has been acquired in a matter of a few minutes, limited only by traffic conditions during the data acquisition phase.
ISBN (print): 0769512720
In this research, we introduce a reasonable noise model for range data which is obtained by a laser radar range finder, and derive two simple approximate solutions of the optimal local plane fitting the range data under the noise model. Then we compare our methods with the general least-squares-based methods, such as Z-function fitting, the eigenvalue method, and the maximum likelihood estimation method, as well as the re-normalization method, which is an iterative method to obtain the optimal fitting of planes of range data under the noise model. All the methods are compared and evaluated using both synthetic range data and real range data with ground truth. From the experimental evaluation results, the proposed methods are shown to be effective, and the general least-squares-based methods are shown to be unsuitable for the assumed noise model.
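To make the baseline concrete, the sketch below shows ordinary least-squares Z-function fitting, one of the general methods the abstract compares against. It fits z = a x + b y + c by solving the 3x3 normal equations and implicitly assumes equal, independent noise in z only. It is not the proposed approximate solution, and the function names are hypothetical.

```python
def _det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _replace_col(m, col, v):
    """Copy of matrix m with column `col` replaced by vector v."""
    return [[v[r] if c == col else m[r][c] for c in range(3)] for r in range(3)]


def fit_plane_z(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples.

    Minimizes the sum of squared residuals in z only, then solves the
    normal equations with Cramer's rule. Returns (a, b, c).
    """
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)

    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    rhs = [sxz, syz, sz]
    d = _det3(normal)
    if abs(d) < 1e-12:
        raise ValueError("degenerate point configuration")
    return tuple(_det3(_replace_col(normal, i, rhs)) / d for i in range(3))
```

Range-finder noise acts along each measurement ray rather than along the z axis, so the equal-weight assumption built into this fit does not match such a sensor. That mismatch is consistent with the abstract's finding that the general least-squares-based methods are unsuitable under the assumed noise model.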
Eye movements are an important aspect of human visual behavior. The temporal and space-variant nature of sampling a visual scene requires frequent attentional gaze shifts, saccades, to fixate onto different parts of an image. Experimental evidence suggests that fixations are often directed towards the most informative regions in the visual scene. We develop a model and its simulation that can select such regions based on prior knowledge of similar scenes. Having representations of scene categories as a probabilistic combination of hypothetical objects, i.e., prototypical regions with certain properties, it is possible to assess the likely contribution of each image region to the successive recognition process. Using conditional probabilities for each region given the scene category, the model can then predict its informative value and initiate a sequential spatial information-gathering algorithm analogous to an eye movement saccade to a new fixation. This algorithm establishes the most likely scene category for a given image.
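The sequential information-gathering idea can be sketched with a small Bayesian model. The sketch is an interpretation made for illustration, not the authors' model: it assumes discrete region observations that are conditionally independent given the scene category, and it measures a region's informative value as the expected reduction in entropy of the category belief. The data layout and all names are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (bits) of a {label: probability} distribution."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def posterior(belief, likelihood, region, obs):
    """Bayes update of the category belief after seeing `obs` in `region`.

    likelihood[category][region][obs] = P(obs | category) for that region.
    """
    unnorm = {c: belief[c] * likelihood[c][region].get(obs, 0.0) for c in belief}
    z = sum(unnorm.values())
    if z == 0:
        raise ValueError("observation impossible under every category")
    return {c: v / z for c, v in unnorm.items()}


def expected_gain(belief, likelihood, region):
    """Expected entropy reduction from fixating `region`."""
    outcomes = set()
    for c in belief:
        outcomes.update(likelihood[c][region])
    gain = entropy(belief)
    for o in outcomes:
        p_o = sum(belief[c] * likelihood[c][region].get(o, 0.0) for c in belief)
        if p_o > 0:
            gain -= p_o * entropy(posterior(belief, likelihood, region, o))
    return gain


def classify(prior, likelihood, image, max_fixations):
    """Fixate the most informative unvisited region, update, repeat.

    image: {region: observed feature}. Returns (category, path, belief).
    """
    belief = dict(prior)
    unvisited = set(image)
    path = []
    for _ in range(max_fixations):
        if not unvisited:
            break
        region = max(sorted(unvisited),
                     key=lambda r: expected_gain(belief, likelihood, r))
        belief = posterior(belief, likelihood, region, image[region])
        unvisited.remove(region)
        path.append(region)
    return max(belief, key=belief.get), path, belief
```

Each loop iteration plays the role of one saccade: the region expected to discriminate best between the remaining scene hypotheses is visited first, and regions that look the same under every category are visited last.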