ISBN: (print) 9781424423392
The intensity images captured by Time-of-Flight (ToF) cameras are biased in several ways. The values differ significantly depending on the integration time set within the camera and on the distance to the scene. Whereas the integration time leads to an almost linear scaling of the whole image, the attenuation due to distance is nonlinear, resulting in higher intensities for objects closer to the camera. Background regions that are farther away contain comparatively low values, leading to poor contrast within the image. Another effect is that some form of specularity may be observed due to unusual reflection conditions at some points within the scene. These three effects lead to intensity images that exhibit significantly different values depending on the integration time of the camera and the distance to the scene, making the parameterization of processing steps such as edge detection, segmentation, registration, and threshold computation a tedious task. Additionally, outliers with exceptionally high values lead to poor visualization results and problems in processing. In this work we propose scaling techniques that generate images whose intensities are independent of the integration time of the camera and the measured distance. Furthermore, a simple approach for reducing specularity effects is introduced.
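As an illustration of the kind of scaling described above, the following is a minimal sketch and not the authors' method. It assumes that intensity scales linearly with integration time and falls off with the square of the distance; the inverse-square model, the function name `normalize_intensity`, and the `ref_distance` parameter are all assumptions made for this example.

```python
def normalize_intensity(intensity, distance, integration_time, ref_distance=1.0):
    """Rescale a ToF intensity image so values no longer depend on
    integration time or per-pixel distance.

    Assumptions (not taken from the abstract): intensity is linear in
    integration time and attenuates with the square of the distance.
    Images are lists of rows of floats with matching shapes.
    """
    if integration_time <= 0:
        raise ValueError("integration_time must be positive")
    normalized = []
    for row_i, row_d in zip(intensity, distance):
        normalized.append([
            # Undo the assumed inverse-square falloff, then the linear time scaling.
            i * (d / ref_distance) ** 2 / integration_time
            for i, d in zip(row_i, row_d)
        ])
    return normalized


if __name__ == "__main__":
    # Two pixels of the same surface brightness at 1 m and 2 m.
    print(normalize_intensity([[100.0, 25.0]], [[1.0, 2.0]], 2.0))
```

Under these assumptions, a surface seen at twice the distance with a quarter of the raw intensity maps to the same normalized value as the nearer one.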
This paper presents an efficient partial shape matching method based on the Smith-Waterman algorithm. For two contours of m and n points respectively, the complexity of our method to find similar parts is only O(mn). In addition to this improvement in efficiency, we also obtain comparably accurate matching with fewer shape descriptors. Also, in contrast to the arbitrary distance functions used by previous methods, we use a probabilistic similarity measure, the p-value, to evaluate the similarity of two shapes. Our experiments on several public shape databases indicate that our method outperforms state-of-the-art global and partial shape matching algorithms in various scenarios.
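The O(mn) cost follows from the Smith-Waterman dynamic program, which fills one table cell per pair of contour points. The sketch below is the generic local-alignment recurrence applied to two sequences of scalar descriptors. The tolerance-based match test and the scoring constants are assumptions for illustration; the paper's p-value similarity is not reproduced here.

```python
def smith_waterman(a, b, tol=0.1, match=2.0, mismatch=-1.0, gap=-1.0):
    """Return (best_score, (i, j)) for the best local alignment of a and b.

    a, b: sequences of scalar descriptors (hypothetical, e.g. turning angles).
    Two descriptors count as a match when they differ by at most `tol`.
    (i, j) are 1-based end positions of the aligned parts in a and b.
    """
    m, n = len(a), len(b)
    # H[i][j] = best score of a local alignment ending at a[i-1], b[j-1].
    H = [[0.0] * (n + 1) for _ in range(m + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            s = match if abs(a[i - 1] - b[j - 1]) <= tol else mismatch
            H[i][j] = max(
                0.0,                  # start a new local alignment
                H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                H[i - 1][j] + gap,    # skip a point of a
                H[i][j - 1] + gap,    # skip a point of b
            )
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    return best, best_pos


if __name__ == "__main__":
    print(smith_waterman([0.0, 1.0, 2.0, 3.0, 9.0], [5.0, 1.0, 2.0, 3.0, 7.0]))
```

The zero floor in the recurrence is what makes the alignment local: a similar part can start and end anywhere on either contour.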
In this paper we present a natural feature tracking algorithm based on on-line boosting used for localizing a mobile computer. Mobile augmented reality requires highly accurate and fast six-degrees-of-freedom tracking in order to provide registered graphical overlays to a mobile user. With advances in mobile computer hardware, vision-based tracking approaches have the potential to provide efficient solutions that are non-invasive, in contrast to the currently dominating marker-based approaches. We propose a tracking approach that can be used in an unknown environment, i.e., the target need not be known beforehand. The core of the tracker is an on-line learning algorithm, which updates the tracker as new data becomes available. This is suitable in many mobile augmented reality applications. We demonstrate the applicability of our approach on tasks where the target objects are not known beforehand, such as interactive planning.
Stereo vision has become a very interesting sensing technology for robotic platforms. It offers various advantages, but the drawback is a very high algorithmic effort. Because certain non-parametric techniques are well suited to Field Programmable Gate Array (FPGA) based stereo matching, these algorithms can be implemented in a highly parallel design while offering adequate real-time behavior. To enable the provision of color images by the stereo sensor for object classification tasks, we propose a technique for extending the rank and the census transform for increased robustness on gray-scale Bayer-patterned images. Furthermore, we analyze the extended and the original algorithms' behavior on image sets created in controlled environments as well as on real-world images, and compare their resource usage when implemented on our FPGA-based stereo matching architecture.
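For readers unfamiliar with the census transform, the sketch below shows the standard 3 x 3 version on a plain grayscale image. It does not implement the Bayer-pattern extension proposed in the paper, whose details the abstract does not give. The function names and the bit ordering are choices made for this example.

```python
def census_transform_3x3(img):
    """Standard 3x3 census transform of a grayscale image (list of rows).

    Each interior pixel becomes an 8-bit code with one bit per neighbor,
    set when that neighbor is darker than the center. Border pixels are
    left as 0.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            center = img[y][x]
            bits = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dy == 0 and dx == 0:
                        continue  # the center pixel is not compared with itself
                    bits = (bits << 1) | (1 if img[y + dy][x + dx] < center else 0)
            out[y][x] = bits
    return out


def hamming(a, b):
    """Number of differing bits; the usual matching cost between census codes."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    print(census_transform_3x3(img)[1][1])
```

Because only the ordering of intensities matters, the codes are unchanged by any monotonically increasing change in brightness, which is the source of the transform's robustness to radiometric differences between the two cameras.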
Minutiae, which are the endpoints and bifurcations of fingerprint ridges, allow a very discriminative classification of fingerprints. However, a minutiae set is an unordered set, and the minutiae locations suffer from various deformations such as translation, rotation and scaling. In this paper, we introduce a novel method to represent a minutiae set as a fixed-length feature vector, which is invariant to translation, and in which rotation and scaling become translations, so that they can be easily compensated for. By applying the spectral minutiae representation, we can combine the fingerprint recognition system with a template protection scheme, which requires a fixed-length feature vector. This paper also presents two spectral minutiae matching algorithms and shows experimental results.
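One generic way to turn an unordered point set into a fixed-length, translation-invariant vector is to take the magnitude of its Fourier spectrum at a fixed set of frequencies. The sketch below illustrates that property only. It assumes each minutia is modeled as a unit impulse at its location, and it does not reproduce the sampling grid or any other details of the paper's representation; the names `spectrum_magnitude` and `freqs` are hypothetical.

```python
import cmath


def spectrum_magnitude(points, freqs):
    """Magnitude of the Fourier transform of a set of unit impulses.

    points: iterable of (x, y) locations (order does not matter).
    freqs:  fixed list of (wx, wy) angular frequencies; its length sets
            the length of the returned feature vector.
    """
    magnitudes = []
    for wx, wy in freqs:
        total = sum(cmath.exp(-1j * (wx * x + wy * y)) for x, y in points)
        # A translation multiplies `total` by a unit-modulus phase factor,
        # so the magnitude is unchanged.
        magnitudes.append(abs(total))
    return magnitudes


if __name__ == "__main__":
    pts = [(1.0, 2.0), (3.0, 0.5), (-2.0, 4.0)]
    shifted = [(x + 5.0, y - 3.0) for x, y in pts]
    freqs = [(0.1, 0.0), (0.0, 0.2), (0.3, 0.4)]
    print(spectrum_magnitude(pts, freqs))
    print(spectrum_magnitude(shifted, freqs))
```

The two printed vectors agree up to floating-point error, and reordering the points leaves the result unchanged as well, since the sum does not depend on order.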
The lack of a written representation for American Sign Language (ASL) makes it difficult to do something as commonplace as looking up an unknown word in a dictionary. The majority of printed dictionaries organize ASL signs (represented in drawings or pictures) based on their nearest English translation; so unless one already knows the meaning of a sign, dictionary look-up is not a simple proposition. In this paper we introduce the ASL Lexicon Video Dataset, a large and expanding public dataset containing video sequences of thousands of distinct ASL signs, as well as annotations of those sequences, including start/end frames and the class label of every sign. This dataset is being created as part of a project to develop a computer vision system that allows users to look up the meaning of an ASL sign. At the same time, the dataset can be useful for benchmarking a variety of computer vision and machine learning methods designed for learning and/or indexing a large number of visual classes, and especially approaches for analyzing gestures and human communication.
In this paper we describe a fully integrated, real-time, miniaturized embedded stereo vision system (MESVS-II), which fits within 5 x 5 cm and consumes very low power. This is a significant improvement over the original MESVS-I system in terms of performance, quality and accuracy of results. MESVS-II, running at 600 MHz per core, is capable of operating at up to 20 fps, which is twice as fast as MESVS-I, due to the efficient implementation of stereo-vision algorithms, improved memory and data management, an in-place processing scheme, code optimization, and the pipelined programming model that takes advantage of the dual-core architecture of the embedded processor. The firmware incorporates sub-sampling, rectification, pre-processing, matching, an LRC (Left/Right Consistency) check and post-processing. As demonstrated by our experimental results, we have also enhanced the robustness of the stereo-matching engine to radiometric variations by choosing the census transform over the rank transform.
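The LRC step in the pipeline above can be sketched as follows. This is a common textbook formulation and not the MESVS-II firmware. It assumes rectified images, the convention that a left pixel at column x with disparity d corresponds to the right pixel at column x - d, and hypothetical parameter names `max_diff` and `invalid`.

```python
def lrc_check(disp_left, disp_right, max_diff=1, invalid=-1):
    """Left/Right Consistency check on two integer disparity maps.

    A left-image disparity is kept only if the right-image disparity at
    the matched column agrees with it to within `max_diff`; otherwise it
    is replaced by `invalid` (typically an occlusion or a mismatch).
    """
    checked = []
    for row_l, row_r in zip(disp_left, disp_right):
        width = len(row_l)
        new_row = []
        for x, d in enumerate(row_l):
            xr = x - d  # column of the corresponding pixel in the right image
            if d >= 0 and 0 <= xr < width and abs(d - row_r[xr]) <= max_diff:
                new_row.append(d)
            else:
                new_row.append(invalid)
        checked.append(new_row)
    return checked


if __name__ == "__main__":
    left = [[0, 1, 1, 3]]
    right = [[1, 1, 0, 0]]
    print(lrc_check(left, right))
```

In this example the last pixel is rejected because its left disparity of 3 points at a right pixel whose own disparity is 1.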
A growing number of applications depend on accurate and fast 3D scene analysis. Examples are object recognition, collision prevention, 3D modeling, mixed reality, and gesture recognition. The estimation of a range map by image analysis or laser scan techniques is still a time-consuming and expensive part of such systems. Time-of-Flight (ToF) cameras are a lower-priced, fast and robust alternative for distance measurements. Recently, significant improvements have been made in order to achieve low-cost and compact ToF devices that have the potential to revolutionize many fields of research, including computer vision, computer graphics and human-computer interaction (HCI). These technologies are starting to have an impact on research and commercial applications. The upcoming generation of ToF sensors, however, will be even more powerful and will have the potential to become "ubiquitous geometry devices" for gaming, web-conferencing, and numerous other applications. This paper will give an account of some recent developments in ToF technology and will discuss applications of this technology for vision, graphics, and HCI.
Object recognition systems designed for Internet applications typically need to adapt to users' needs in a flexible fashion and scale up to very large data sets. In this paper, we analyze the complexity of several multiclass SVM-based algorithms and highlight the computational bottleneck they suffer at test time: comparing the input image to every training image. We propose an algorithm that overcomes this bottleneck; it offers not only the efficiency of a simple nearest-neighbor classifier, by voting on class labels based on the k nearest neighbors quickly determined by a vocabulary tree, but also recognition accuracy comparable to that of a complex SVM classifier, by incorporating SVM parameters into the voting scores incrementally accumulated from individual image features. Empirical results demonstrate that adjusting votes by relevant support vector weights can improve the recognition accuracy of a nearest-neighbor classifier without sacrificing speed. Compared to existing methods, our algorithm achieves a ten-fold speed increase while incurring an acceptable accuracy loss that can be easily offset by showing about two more labels in the result. The speed, scalability, and adaptability of our algorithm make it suitable for Internet vision applications.
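The idea of adjusting nearest-neighbor votes by per-example weights can be illustrated with a small sketch. It is not the authors' algorithm: the vocabulary-tree lookup is omitted, the neighbors are assumed to be given already, and the weights are arbitrary numbers standing in for support vector weights. The function name `weighted_vote` and the `top` parameter are hypothetical.

```python
from collections import defaultdict


def weighted_vote(neighbors, top=1):
    """Rank class labels by accumulated weighted votes.

    neighbors: iterable of (label, weight) pairs for the retrieved
               nearest neighbors; weights here are illustrative.
    top:       number of labels to return, best first.
    """
    scores = defaultdict(float)
    for label, weight in neighbors:
        scores[label] += weight  # votes accumulate incrementally per neighbor
    # Sort by descending score; ties are broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [label for label, _ in ranked[:top]]


if __name__ == "__main__":
    neighbors = [("cat", 0.2), ("dog", 0.9), ("cat", 0.3), ("cat", 0.1)]
    print(weighted_vote(neighbors))         # best label
    print(weighted_vote(neighbors, top=2))  # two best labels
```

In this toy input an unweighted majority vote would pick "cat" (three neighbors against one), whereas the weighted vote prefers "dog". Returning more than one label corresponds to the abstract's remark about showing a few more labels in the result.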
An algorithm is proposed to answer the challenges of autonomous corridor navigation and mapping by a mobile robot equipped with a single forward-facing camera. Using a combination of corridor ceiling lights, visual homing, and entropy, the robot is able to perform straight-line navigation down the center of an unknown corridor. Turning at the end of a corridor is accomplished using Jeffrey divergence and time-to-collision, while deflection from dead ends and blank walls uses a scalar entropy measure of the entire image. When combined, these metrics allow the robot to navigate in both textured and untextured environments. The robot can autonomously explore an unknown indoor environment, recovering from difficult situations like corners, blank walls, and initial heading toward a wall. While exploring, the algorithm constructs a Voronoi-based topo-geometric map with nodes representing distinctive places like doors, water fountains, and other corridors. Because the algorithm is based entirely upon low-resolution (32 x 24) grayscale images, processing occurs at over 1000 frames per second.
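A scalar entropy measure of a whole image is commonly computed as the Shannon entropy of its gray-level histogram. The sketch below shows that standard definition; whether the paper uses exactly this form is an assumption, and the function name `image_entropy` is chosen for this example.

```python
import math
from collections import Counter


def image_entropy(img):
    """Shannon entropy (in bits) of the gray-level histogram of an image.

    img: list of rows of integer gray values. A blank wall (few distinct
    values) gives a low entropy; a textured scene gives a higher one.
    """
    pixels = [p for row in img for p in row]
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    blank = [[128] * 4 for _ in range(4)]
    textured = [[0, 64, 128, 255] for _ in range(4)]
    print(image_entropy(blank))
    print(image_entropy(textured))
```

A uniform image yields zero entropy, which is why a low value is a plausible cue that the camera is facing a blank wall or dead end.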