This paper presents a system for view-invariant gesture recognition. The approach is based on 3D data from a CSEM SwissRanger SR-2 camera. This camera produces both a depth map and an intensity image of a scene. Since the two information types are aligned, we can use the intensity image to define a region of interest for the relevant 3D data. This data fusion improves the quality of the range data and hence results in better recognition. The gesture recognition is based on finding motion primitives in the 3D data. The primitives are represented compactly and view-invariantly using harmonic shape context. A probabilistic Edit Distance classifier is applied to identify which gesture best describes a string of primitives. The approach is trained on data from one viewpoint and tested on data from a different viewpoint. The recognition rate is 92.9%, which is similar to the recognition rate when training and testing on gestures from the same viewpoint; hence the approach is indeed view-invariant.
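To illustrate the classification step described above, the following is a minimal sketch of matching a string of primitives against gesture templates by edit distance. It uses the plain, unweighted Levenshtein distance rather than the probabilistic variant the abstract mentions, and the gesture names, template strings, and the `classify` helper are hypothetical.

```python
def edit_distance(a, b):
    """Plain Levenshtein distance between two sequences (unit costs)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def classify(observed, templates):
    """Return the name of the template closest to the observed string."""
    return min(templates, key=lambda name: edit_distance(observed, templates[name]))


# Hypothetical templates: each letter stands for one motion primitive.
TEMPLATES = {"point_right": "ABC", "raise_arm": "DEF", "wave": "GHGH"}

if __name__ == "__main__":
    # "ABBC" contains one spurious primitive relative to "ABC".
    print(classify("ABBC", TEMPLATES))
```

A probabilistic variant would presumably replace the unit costs with costs derived from the likelihood of each detected primitive; that weighting is not specified in the abstract and is left out here.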
In this paper we study the problem of visual loop closing for long trajectories in an urban environment. We use GPS positioning only to narrow down the search area and use pre-built vocabulary trees to find the best matching image in this search area. Geometric consistency is then used to prune out the bad matches. We compare several vocabulary trees on a sequence of 6.5 kilometers. We experiment with hierarchical k-means based trees as well as extremely randomized trees and compare results obtained using five different trees. We obtain the best results using extremely randomized trees. After enforcing geometric consistency the matched images look promising for structure from motion applications.
In this work, we describe a white matter trajectory clustering algorithm that allows for incorporating and appropriately weighting anatomical information. The influence of the anatomical prior reflects confidence in its accuracy and relevance. It can either be defined by the user or it can be inferred automatically. After a detailed description of our novel clustering framework, we demonstrate its properties through a set of preliminary experiments.
This paper describes a real-time method for foreground/background segmentation of a color video sequence based primarily on range data from a time-of-flight sensor. The method uses depth information from a TOF sensor paired with a high-resolution color video camera to efficiently segment foreground from background in a two-step process. First, a trimap is produced using only range data: areas are located in each frame that have a high probability of being background or foreground, respectively. Pixels that cannot be definitively classified as foreground or background, typically about 1-2% of the frame, are assigned alpha-matte values using cross bilateral filtering, applied directly to an estimate of the alpha matte.
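The first step above, building a trimap from range data alone, can be sketched as follows. This is only an illustration: it assumes that simple depth thresholds stand in for the "high probability" test, and the `near` and `far` parameters, the label constants, and the function name are hypothetical rather than taken from the paper.

```python
FOREGROUND, BACKGROUND, UNKNOWN = 1, 0, -1


def trimap_from_depth(depth, near, far):
    """Label each pixel of a depth image (rows of distances in metres).

    Pixels closer than `near` are treated as confident foreground,
    pixels farther than `far` as confident background, and everything
    in between is left unknown for a later matting step.
    """
    return [
        [FOREGROUND if d < near else BACKGROUND if d > far else UNKNOWN
         for d in row]
        for row in depth
    ]


if __name__ == "__main__":
    frame = [[0.8, 1.5, 3.0],
             [0.9, 2.5, 3.1]]
    for row in trimap_from_depth(frame, near=1.0, far=2.0):
        print(row)
```

Only the pixels labelled `UNKNOWN` would then need alpha values from the filtering step, which is what keeps the second stage cheap.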
In this paper we present a natural feature tracking algorithm based on on-line boosting used for localizing a mobile computer. Mobile augmented reality requires highly accurate and fast six-degrees-of-freedom tracking in order to provide registered graphical overlays to a mobile user. With advances in mobile computer hardware, vision-based tracking approaches have the potential to provide efficient solutions that are non-invasive, in contrast to the currently dominating marker-based approaches. We propose a tracking approach that can be used in an unknown environment, i.e. the target need not be known beforehand. The core of the tracker is an on-line learning algorithm, which updates the tracker as new data becomes available. This is suitable for many mobile augmented reality applications. We demonstrate the applicability of our approach on tasks where the target objects are not known beforehand, e.g. interactive planning.
Phase-based optical flow algorithms are characterized by high precision and robustness, but also by high computational requirements. Using the CUDA platform, we have implemented a phase-based algorithm that maps exceptionally well on the GPU's architecture. This optical flow algorithm revolves around a reliability measure that evaluates the consistency of phase information over time. By exploiting efficient filtering operations, the high internal bandwidth of the GPU, and the texture units, we obtain dense and reliable optical flow estimates in real time at high resolutions (640 × 512 pixels and beyond). Even though the algorithm is local and does not involve iterative regularization, highly accurate results are obtained on synthetic and complex real-world sequences.
This paper presents a method for robustly tracking and estimating the face pose of a person in both indoor and outdoor environments. The method is invariant to identity and does not require previous training. A face model is automatically initialized and constructed on-line when the face is frontal to the stereo camera system. To build the model, a fixed point distribution is superposed over the frontal face, and several appropriate points close to those locations are chosen for tracking. Using the stereo correspondence of the two cameras, the 3D coordinates of these points are extracted, and the 3D model is created. RANSAC and POSIT are used for tracking and 3D pose calculation at each frame. The approach runs in real time and has been tested on sequences recorded in the laboratory and in a moving car.
Doors are important landmarks for indoor mobile robot navigation. Most existing algorithms for door detection use range sensors or work in limited environments because of restrictive assumptions about color, pose, or lighting. We present a vision-based door detection algorithm that achieves robustness by utilizing a variety of features, including color, texture, and intensity edges. We introduce two novel geometric features that increase performance significantly: concavity and bottom-edge intensity profile. The features are combined using AdaBoost to ensure optimal linear weighting. On a large database of images collected in a wide variety of conditions, the algorithm achieves more than 90% detection with a low false positive rate. Additional experiments demonstrate the suitability of the algorithm for real-time applications using a mobile robot equipped with an off-the-shelf camera and laptop.
Humans perceive some objects as more complex than others, and learning or describing a particular object is directly related to its judged complexity. Towards the goal of understanding why the geometry of some 3D objects appears more complex than that of others, we conducted a psychophysical study and identified contributing attributes. Our experiments conclude that surface variation, symmetry, part count, simpler part decomposability, intricate details and topology are six significant dimensions that influence 3D visual shape complexity. With that knowledge, we present a method of quantifying complexity and show that the informational aspect of Shannon's theory agrees with the human notion of shape complexity.
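As a reminder of the informational quantity referred to above, here is a minimal sketch of Shannon entropy computed over a sequence of discrete symbols. How the paper derives symbols from a 3D shape is not stated in the abstract; the idea of feeding in quantized surface measurements (for example, binned curvature values) is an assumption made only for illustration.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits, of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical binned curvature values sampled over two surfaces.
    flat_surface = [0, 0, 0, 0, 0, 0, 0, 0]
    varied_surface = [0, 1, 2, 3, 0, 1, 2, 3]
    print(shannon_entropy(flat_surface))    # uniform surface: no uncertainty
    print(shannon_entropy(varied_surface))  # four equally likely bins
```

Under this reading, a surface whose measurements spread over more bins carries more information, which matches the intuition that greater surface variation looks more complex.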
A novel method is proposed for the problem of frame-to-frame correspondence search in video sequences. The method, based on hashing of low-dimensional image descriptors, establishes dense correspondences and allows large motions. Because all image pixels are considered for matching, the notion of interest points is reviewed. In our formulation, points of interest are those that can be reliably matched. Their saliency depends on properties of the chosen matching function and on actual image content. Both computational time and memory requirements of the correspondence search are asymptotically linear in the number of image pixels, irrespective of correspondence density and of image content. All steps of the method are simple and allow for a hardware implementation. Functionality is demonstrated on sequences taken from a vehicle moving in an urban environment.
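The general idea of hash-based correspondence search can be sketched as below. This is not the paper's algorithm: it assumes descriptors are quantized onto a grid to form hash keys and that a match is "reliable" when its key occurs exactly once in each frame. The function names, the `step` parameter, and the sample descriptors are hypothetical.

```python
from collections import defaultdict


def quantize(descriptor, step):
    """Map a low-dimensional descriptor to a hashable grid cell."""
    return tuple(int(v // step) for v in descriptor)


def match_by_hashing(desc_a, desc_b, step=1.0):
    """Match pixels of frame A to frame B through a shared hash table.

    desc_a and desc_b map pixel coordinates to descriptor tuples.
    A pixel is matched only if its hash key is unique in both frames.
    Each pixel is visited a constant number of times, so time and
    memory grow linearly with the number of pixels.
    """
    table_b = defaultdict(list)
    for pixel, d in desc_b.items():
        table_b[quantize(d, step)].append(pixel)

    count_a = defaultdict(int)
    for d in desc_a.values():
        count_a[quantize(d, step)] += 1

    matches = {}
    for pixel, d in desc_a.items():
        key = quantize(d, step)
        if count_a[key] == 1 and len(table_b.get(key, ())) == 1:
            matches[pixel] = table_b[key][0]
    return matches


if __name__ == "__main__":
    frame_a = {(0, 0): (0.2, 3.7), (0, 1): (5.1, 5.2), (1, 0): (9.0, 9.0)}
    frame_b = {(2, 2): (0.4, 3.1), (2, 3): (5.5, 5.9), (3, 3): (5.6, 5.1)}
    print(match_by_hashing(frame_a, frame_b))
```

Because the lookup does not depend on where a pixel lies in the image, large motions cost no more than small ones in this scheme.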