ISBN: (print) 9781424423392
We give a brief discussion of denoising algorithms for depth data and introduce a novel technique based on the NL-Means filter. A unified approach is presented that removes outliers from depth data and accordingly achieves an unbiased smoothing result. This robust denoising algorithm takes intra-patch similarity and optional color information into account in order to handle strong discontinuities and to preserve fine detail structure in the data. We achieve fast computation times with a GPU-based implementation. Results using data from a time-of-flight camera system show a significant gain in visual quality.
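To make the NL-Means idea concrete, here is a minimal sketch of the plain (non-robust) filter on a 1D depth profile. It is not the authors' algorithm: the outlier removal, the optional color term, and the GPU implementation are not reproduced, and the function name `nl_means_1d` and the parameters `patch_radius`, `search_radius`, and `h` are assumptions chosen for illustration.

```python
import math

def nl_means_1d(signal, patch_radius=1, search_radius=3, h=0.5):
    """Plain NL-Means on a 1D signal (illustrative sketch only).

    Each sample is replaced by a weighted average of nearby samples,
    where the weight depends on how similar the surrounding patches are.
    """
    n = len(signal)

    def patch(i):
        # Patch around index i, clamped at the signal borders.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    out = []
    for i in range(n):
        p_i = patch(i)
        weight_sum = 0.0
        acc = 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            p_j = patch(j)
            # Mean squared difference between the two patches.
            d2 = sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)
            w = math.exp(-d2 / (h * h))
            weight_sum += w
            acc += w * signal[j]
        out.append(acc / weight_sum)  # weight_sum >= 1 because j == i gives w == 1
    return out
```

Because dissimilar patches receive near-zero weight, samples on opposite sides of a strong depth discontinuity barely influence each other, which is why this family of filters tends to keep edges sharp.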
Inferring the 3D spatial layout from a single 2D image is a fundamental visual task. We formulate it as a grouping problem where edges are grouped into lines, quadrilaterals, and finally depth-ordered planes. We demonstrate that the 3D structure of planar objects in indoor scenes can be inferred quickly and accurately without any learning or indexing.
The combination of biometric matching scores can be enhanced by taking into account the matching scores related to all enrolled persons, in addition to traditional combinations that use only the matching scores related to a single person. Identification models take into account the dependence between matching scores assigned to different persons and can be used for such enhancement. In this paper we compare the use of two such models: T-normalization and the second-best-score model. The comparison is performed using two combination algorithms: likelihood ratio and multilayer perceptron. The results show that, while the second-best-score model delivers a greater performance improvement than T-normalization, the two models are complementary and can be used together for further improvements.
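The two identification models can be sketched as follows. This is an illustration under assumptions, not the paper's exact formulation: T-normalization is written in one common form (each score is standardized by the mean and standard deviation of the other scores in the same identification trial), and the second-best-score model is reduced to pairing each score with the best competing score. The function names are hypothetical.

```python
from statistics import mean, pstdev

def t_normalize(scores):
    """Standardize each score against the other scores of the same trial.

    `scores` holds one matcher's scores of a probe against every enrolled
    person; at least two enrolled persons are required.
    """
    normalized = []
    for i, s in enumerate(scores):
        others = scores[:i] + scores[i + 1:]
        mu = mean(others)
        sigma = pstdev(others)
        # Assumption: fall back to 0.0 when the other scores do not vary.
        normalized.append((s - mu) / sigma if sigma > 0 else 0.0)
    return normalized

def second_best_features(scores):
    """Pair each score with the best score among the other enrolled persons."""
    return [(s, max(scores[:i] + scores[i + 1:])) for i, s in enumerate(scores)]
```

Either output could then be fed to a combination algorithm such as a likelihood-ratio estimator or a multilayer perceptron in place of the raw scores.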
Feature definition and selection are two important aspects in the visual analysis of motion. In this paper, spatiotemporal local binary patterns computed at multiple resolutions are proposed for describing dynamic events, combining static and dynamic information from different spatiotemporal resolutions. Appearance and motion are the key components for visual analysis related to movements. The AdaBoost algorithm is used to learn the principal appearance and motion from spatiotemporal descriptors derived from three orthogonal planes, providing important information about the locations and types of features for further analysis. In addition, learners are designed for selecting the most important features for each specific pair of different classes. Experiments on two diverse visual analysis tasks, facial expression recognition and visual speech recognition, show the effectiveness of the approach.
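As an illustration of descriptors taken from three orthogonal planes, the sketch below computes a basic 8-neighbour local binary pattern code on the XY, XT, and YT planes through one voxel. It is a simplified assumption of how such codes can be formed: the multiresolution sampling, histogramming, and AdaBoost feature selection described in the abstract are not shown, and the names `lbp_code` and `lbp_top` are hypothetical.

```python
# Offsets of the 8 neighbours, clockwise from the top-left one.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_code(plane, r, c):
    """Basic LBP: set bit k when neighbour k is >= the centre value."""
    center = plane[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if plane[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code

def lbp_top(volume, t, y, x):
    """LBP codes on the XY, XT and YT planes through voxel volume[t][y][x].

    The voxel must not lie on the border of the volume.
    """
    frames, height, width = len(volume), len(volume[0]), len(volume[0][0])
    xy = volume[t]
    xt = [[volume[tt][y][xx] for xx in range(width)] for tt in range(frames)]
    yt = [[volume[tt][yy][x] for yy in range(height)] for tt in range(frames)]
    return lbp_code(xy, y, x), lbp_code(xt, t, x), lbp_code(yt, t, y)
```

The XY code captures appearance within a frame, while the XT and YT codes respond to change over time, so concatenating the three gives a joint appearance and motion description.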
This paper presents an efficient partial shape matching method based on the Smith-Waterman algorithm. For two contours of m and n points respectively, the complexity of our method to find similar parts is only O(mn). In addition to this improvement in efficiency, we also obtain comparably accurate matching with fewer shape descriptors. Also, in contrast to the arbitrary distance functions used by previous methods, we use a probabilistic similarity measure, the p-value, to evaluate the similarity of two shapes. Our experiments on several public shape databases indicate that our method outperforms state-of-the-art global and partial shape matching algorithms in various scenarios.
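The O(mn) cost follows from the Smith-Waterman dynamic program, which fills one table cell per pair of points. The sketch below applies it to two sequences of scalar descriptors. The scoring scheme (`match`, `mismatch`, `gap`, and the tolerance `tol`) is an assumption for illustration; the paper's shape descriptors and p-value similarity are not reproduced.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1, tol=0.0):
    """Local alignment of two numeric sequences in O(len(a) * len(b)) time.

    Returns the best local alignment score and the (i, j) table position
    where that alignment ends (1-based; (0, 0) if nothing aligns).
    """
    m, n = len(a), len(b)
    H = [[0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Two descriptors "match" when they differ by at most tol.
            s = match if abs(a[i - 1] - b[j - 1]) <= tol else mismatch
            H[i][j] = max(0,                    # start a new local alignment
                          H[i - 1][j - 1] + s,  # extend diagonally
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return best, best_pos
```

For example, `smith_waterman([1, 2, 3, 4, 9], [7, 2, 3, 4, 8])` finds the shared run `2, 3, 4`, which is the kind of partial correspondence a global matcher would miss.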
In this paper we present a natural feature tracking algorithm based on on-line boosting, used for localizing a mobile computer. Mobile augmented reality requires highly accurate and fast six-degrees-of-freedom tracking in order to provide registered graphical overlays to a mobile user. With advances in mobile computer hardware, vision-based tracking approaches have the potential to provide efficient solutions that are non-invasive, in contrast to the currently dominant marker-based approaches. We propose a tracking approach that can be used in an unknown environment, i.e. the target does not have to be known beforehand. The core of the tracker is an on-line learning algorithm, which updates the tracker as new data becomes available. This is suitable for many mobile augmented reality applications. We demonstrate the applicability of our approach on tasks where the target objects are not known beforehand, i.e. interactive planning.
The intensity images captured by Time-of-Flight (ToF) cameras are biased in several ways. The values differ significantly, depending on the integration time set within the camera and on the distance of the scene. Whereas the integration time leads to an almost linear scaling of the whole image, the attenuation due to distance is nonlinear, resulting in higher intensities for objects closer to the camera. Background regions that are farther away contain comparably low values, leading to poor contrast within the image. Another effect is that some kind of specularity may be observed due to unusual reflection conditions at some points within the scene. These three effects lead to intensity images that exhibit significantly different values depending on the integration time of the camera and the distance to the scene, making the parameterization of processing steps such as edge detection, segmentation, registration, and threshold computation a tedious task. Additionally, outliers with exceptionally high values lead to inadequate visualization results and problems in processing. In this work we propose scaling techniques that generate images whose intensities are independent of the integration time of the camera and the measured distance. Furthermore, a simple approach for reducing specularity effects is introduced.
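The abstract does not state the scaling formulas, so the following is only a sketch under a stated assumption: intensity grows linearly with integration time and falls off with the square of the distance, as it would for an idealized active light source. The function `normalize_intensity` and its reference parameters are hypothetical and should not be read as the authors' method.

```python
def normalize_intensity(intensity, distance, integration_time,
                        ref_distance=1.0, ref_time=1.0):
    """Rescale a ToF intensity to a reference distance and integration time.

    Assumed model (not from the paper): I = k * integration_time / distance**2,
    where k depends only on the surface. Inverting it removes the dependence
    on integration time and distance.
    """
    distance_gain = (distance / ref_distance) ** 2
    time_gain = ref_time / integration_time
    return intensity * distance_gain * time_gain
```

Under this model, the same surface observed at different distances and integration times maps to the same normalized value, which is the property that makes a single threshold or edge-detector setting usable across recordings.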
Stereo vision has become a very interesting sensing technology for robotic platforms. It offers various advantages, but its drawback is a very high algorithmic effort. Because certain non-parametric techniques are well suited to Field Programmable Gate Array (FPGA) based stereo matching, these algorithms can be implemented in a highly parallel design while offering adequate real-time behavior. To enable the stereo sensor to provide color images for object classification tasks, we propose a technique for extending the rank and the census transform for increased robustness on gray-scaled Bayer-patterned images. Furthermore, we analyze the behavior of the extended and the original algorithms on image sets created in controlled environments as well as on real-world images, and compare their resource usage when implemented on our FPGA-based stereo matching architecture.
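For readers unfamiliar with the two non-parametric transforms, here is a minimal sketch of the original rank and census transforms on a gray-scale image stored as a list of rows. The Bayer-pattern extension proposed in the paper is not shown, and the function names and the `radius` parameter are assumptions for illustration.

```python
def rank_transform(img, r, c, radius=1):
    """Number of window pixels that are darker than the centre pixel."""
    center = img[r][c]
    return sum(1
               for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if (dr, dc) != (0, 0) and img[r + dr][c + dc] < center)

def census_transform(img, r, c, radius=1):
    """Bit string with one bit per window pixel: 1 if darker than the centre."""
    center = img[r][c]
    bits = 0
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            if (dr, dc) == (0, 0):
                continue
            bits = (bits << 1) | (1 if img[r + dr][c + dc] < center else 0)
    return bits

def hamming(a, b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(a ^ b).count("1")
```

Both transforms depend only on the ordering of intensities within the window, so they are unaffected by a constant brightness offset between the two cameras, and they need only comparisons and bit counting, which maps well onto parallel hardware.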
Minutiae, which are the endpoints and bifurcations of fingerprint ridges, allow a very discriminative classification of fingerprints. However, a minutiae set is an unordered set, and the minutiae locations suffer from various deformations such as translation, rotation, and scaling. In this paper, we introduce a novel method to represent a minutiae set as a fixed-length feature vector, which is invariant to translation, and in which rotation and scaling become translations, so that they can be easily compensated for. By applying the spectral minutiae representation, we can combine the fingerprint recognition system with a template protection scheme, which requires a fixed-length feature vector. This paper also presents two spectral minutiae matching algorithms and shows experimental results.
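The translation invariance can be illustrated with a small sketch. Assuming each minutia is modelled as a unit impulse at its location, the magnitude of the Fourier transform of the point set does not change when all points are shifted, and it does not depend on the order of the points. This is only the underlying principle, not the paper's full representation; the polar-logarithmic sampling that turns rotation and scaling into translations is omitted, and `spectrum_magnitude` is a hypothetical name.

```python
import cmath

def spectrum_magnitude(points, wx, wy):
    """|F(wx, wy)| for a set of unit impulses located at `points`.

    A shift by (dx, dy) multiplies F by exp(-1j * (wx*dx + wy*dy)), which
    has magnitude 1, so the returned value is translation invariant.
    """
    return abs(sum(cmath.exp(-1j * (wx * x + wy * y)) for x, y in points))

def spectral_vector(points, frequencies):
    """Fixed-length feature vector: one magnitude per sampled frequency."""
    return [spectrum_magnitude(points, wx, wy) for wx, wy in frequencies]
```

Sampling the magnitude at a fixed list of frequencies yields a vector whose length is independent of the number of minutiae, which is the property a template protection scheme needs.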
The lack of a written representation for American Sign Language (ASL) makes it difficult to do something as commonplace as looking up an unknown word in a dictionary. The majority of printed dictionaries organize ASL signs (represented in drawings or pictures) based on their nearest English translation, so unless one already knows the meaning of a sign, dictionary look-up is not a simple proposition. In this paper we introduce the ASL Lexicon Video Dataset, a large and expanding public dataset containing video sequences of thousands of distinct ASL signs, as well as annotations of those sequences, including start/end frames and the class label of every sign. This dataset is being created as part of a project to develop a computer vision system that allows users to look up the meaning of an ASL sign. At the same time, the dataset can be useful for benchmarking a variety of computer vision and machine learning methods designed for learning and/or indexing a large number of visual classes, and especially approaches for analyzing gestures and human communication.