This paper deals with an absolute mobile robot self-localization algorithm in an indoor environment. Until now, localization methods based on conical omnidirectional vision sensors have used only the radial segments produced by projecting vertical environment landmarks. The main motivation of this work is to demonstrate that the SYCLOP sensor can be used as a vision sensor rather than a goniometric one. We show how calibration lets us model the omnidirectional image formation process and compute a base of synthetic images. We then present the spatial localization method, which uses this base of synthetic images and one real omnidirectional image. Finally, some experimental results obtained with real noisy omnidirectional images are shown.
In this paper, we present an approach for the classification of remote sensing multispectral data, which consists of two sequential stages. The first stage exploits the capabilities of the Support Vector Machines (SVM) approach for density estimation and uses it in a Bayes classification setup. In a typical image, the class of a pixel is highly dependent on the classes of its neighboring pixels. The second stage exploits this dependency between classes. We incorporate it by stochastically modeling the context as a Markov Random Field (MRF). The MRF is modeled using the Besag model and implemented using the Iterated Conditional Modes (ICM) algorithm. Results show that the stochastic modeling approach improves classification and provides reasonable smoothness in the classified image.
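The second stage described above can be illustrated with a minimal ICM sketch. This is not the authors' implementation: it assumes a simple Potts-style prior over 4-connected neighbors (the abstract only names the Besag model), and the function name `icm`, the weight `beta`, and the input layout `log_lik[y][x][c]` (per-pixel class log-likelihoods, such as the first stage might produce) are all hypothetical.

```python
def icm(log_lik, beta=1.0, n_iter=10):
    """Iterated Conditional Modes on a label grid (illustrative sketch).

    log_lik[y][x][c] is an assumed per-pixel log-likelihood of class c.
    beta is an assumed Potts-style weight rewarding agreement with neighbors.
    """
    h, w, k = len(log_lik), len(log_lik[0]), len(log_lik[0][0])
    # Start from the pixelwise maximum-likelihood labelling.
    labels = [[max(range(k), key=lambda c: log_lik[y][x][c]) for x in range(w)]
              for y in range(h)]
    for _ in range(n_iter):
        changed = False
        for y in range(h):
            for x in range(w):
                # Count how many 4-neighbors currently carry each label.
                votes = [0] * k
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        votes[labels[ny][nx]] += 1
                # Greedy local update: data term plus neighbor agreement.
                best = max(range(k),
                           key=lambda c: log_lik[y][x][c] + beta * votes[c])
                if best != labels[y][x]:
                    labels[y][x] = best
                    changed = True
        if not changed:  # converged to a local optimum
            break
    return labels


# Toy 5x5 image: class 0 is favored everywhere except a weakly
# class-1 center pixel, which the context term smooths away.
grid = [[[0.0, -1.0] for _ in range(5)] for _ in range(5)]
grid[2][2] = [-0.5, 0.0]
smoothed = icm(grid, beta=1.0)
```

With `beta=0` the context term vanishes and the isolated center pixel keeps its pixelwise label, which shows how the neighbor term is what produces the smoothing.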
When tracking in a particular environment, objects tend to appear and disappear at certain locations. These locations may correspond to doors, garages, tunnel entrances, or even the edge of a camera view. A tracking system with knowledge of these locations is capable of improved initialization of tracking sequences, reconstitution of broken tracking sequences, and determination of tracking sequence termination. Further, knowledge of these locations is useful for activity-level descriptions of tracking sequences and for understanding relationships between non-overlapping camera views. This paper introduces a method for simultaneously solving these coupled problems: inferring the parameters of a source and sink model for a scene; and fixing broken tracking sequences and other tracking failures. A model selection criterion is also explained which allows determination of the number of sources and sinks in an environment. Results in multiple environments illustrate the effectiveness of this method.
We present an approach to rendering stereo pairs of views from a set of omnidirectional mosaic images, allowing an arbitrary viewing direction and vergence angle for the viewer's two eyes. Moreover, we allow the viewer to move his head aside to see behind occluding objects. We propose a representation of the scene as a set of omnidirectional mosaic images composed from a sequence of images acquired by an omnidirectional camera equipped with a lens with a field of view of 183°. The proposed representation allows fast access to high-resolution mosaic images and is compact in memory. The proposed method can be applied to the representation of a real scene in which the viewer is assumed to stand at one spot and look around.
This paper presents a perceptual interface for visualization navigation using gesture recognition. Scientists are interested in developing interactive settings for exploring large data sets in an intuitive environment. The input consists of registered 3-D data. Bezier curves are used for trajectory analysis and classification of gestures. The method is robust and reliable: the correct hand identification rate is 99.9% (from 1641 frames), the modes of hand movements are correct 95.6% of the time, and the recognition rate (given the right mode) is 97.9%. An application to gesture-controlled visualization is also presented. The paper advances the state of the art in human-computer interaction with robust attachment- and marker-free gestural information processing for visualization.
In this paper we present a new method which enables a robust calculation of the LDA classification rule, thus making the recognition of objects under non-ideal conditions possible, i.e., in situations when objects are occluded or they appear on a varying background, or when their images are corrupted by outliers. The main idea behind the method is to translate the task of calculating the LDA classification rule into the problem of determining the coefficients of an augmented generative model (PCA). Specifically, we construct an augmented PCA basis which, on the one hand, contains information necessary for the classification (in the LDA sense), and, on the other hand, enables us to calculate the necessary coefficients by means of a subsampling approach resulting in a high breakdown point classification. The theoretical results are evaluated on the ORL face database showing that the proposed method significantly outperforms the standard LDA.
In applications of egomotion estimation, such as real-time vision-based navigation, one must deal with the double-edged sword of small relative motions between images. On one hand, tracking feature points is easier, while on the other, two-view structure-from-motion algorithms are poorly conditioned due to the low signal-to-noise ratio. In this paper, we derive a multi-frame structure-from-motion algorithm for calibrated central panoramic cameras. Our algorithm avoids the conditioning problem by explicitly incorporating the small baseline assumption in the algorithm's design. The proposed algorithm is linear, amenable to real-time implementation, and performs well in the small baseline domain for which it is designed.
Perceptual user interfaces promise modes of fluid computer-human interaction that complement the mouse and keyboard, and have been especially motivated in non-desktop scenarios, such as kiosks or smart rooms. Such interfaces, however, have been slow to see use for a variety of reasons, including the computational burden they impose, a lack of robustness outside the laboratory, unreasonable calibration demands, and a shortage of sufficiently compelling applications. We have tackled some of these difficulties by using a fast stereo vision algorithm for recognizing hand positions and gestures. Our system uses two inexpensive video cameras to extract depth information. This depth information enhances automatic object detection and tracking robustness, and may also be used in applications. We demonstrate the algorithm in combination with speech recognition to perform several basic window management tasks, report on a user study probing the ease of using the system, and discuss the implications of such a system for future user interfaces.
We adapted a vision-based face tracking system for cursor control by head movement. An additional vision-based algorithm allowed the user to enter a click by opening the mouth. The Fitts' law information throughput rate of cursor movements was measured to be 2.0 bits/sec using the ISO 9241-9 international standard method for testing input devices. A usability assessment was also conducted, and we report and discuss the results. A practical application of this facial gesture interface was studied: text input using the Dasher system, which allows a user to type by moving the cursor. The measured typing speed was 7-12 words/minute, depending on the level of user expertise. Performance of the system is compared to a conventional mouse interface.
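For readers unfamiliar with the throughput figure quoted above, the following sketch shows how a Fitts' law throughput is commonly computed in ISO 9241-9 style evaluations: the effective target width is 4.133 times the standard deviation of the selection endpoints, the index of difficulty is log2(D/W + 1), and throughput is that index divided by the mean movement time. The function names and the sample numbers are illustrative assumptions, not data or code from the paper.

```python
import math
import statistics


def effective_width(endpoint_errors):
    """Effective target width: 4.133 x sample std. dev. of endpoint errors."""
    return 4.133 * statistics.stdev(endpoint_errors)


def index_of_difficulty(distance, width):
    """Shannon formulation of the index of difficulty, in bits."""
    return math.log2(distance / width + 1)


def throughput(distance, width, movement_times):
    """Throughput in bits/sec for one distance/width condition."""
    return index_of_difficulty(distance, width) / statistics.mean(movement_times)


# Hypothetical condition: 300-pixel movements, 100-pixel effective width,
# movement times averaging one second.
tp = throughput(300, 100, [0.8, 1.2])
```

In a full evaluation the width passed to `throughput` would come from `effective_width` applied to the measured endpoint deviations, and the result would be averaged over participants and conditions.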
We present a platform for human-machine interfaces that provides functionality for robust, unencumbered interaction: the 4D Touchpad (4DT). The goal is direct interaction with interface components through intuitive actions and gestures. The 4DT is based on the 3D-2D Projection-based mode of the VICs framework. The fundamental idea behind VICs is that expensive global image processing with user modeling and tracking is not necessary in general vision-based HCI. Instead, interface components operating under simple-to-complex rules in local image regions provide more robust and less costly functionality with 3 spatial dimensions and 1 temporal dimension. A prototype realization of the 4DT platform is presented; it operates through a set of planar homographies with uncalibrated cameras.