In this work, we investigate the applicability of the Kinect depth camera as a robot-mounted measurement unit. In contrast to traditional head-mounted robot sensors, the Kinect is small and cheap, and it delivers robust depth measurements on a variety of scenes. In the course of applying it on a robot arm, we solve a number of problems: we reduce the sensor working distance to a few centimeters, replace the laser projector unit with a focusable projector, and calibrate the resulting sensor unit. We further exploit the motion capabilities of the robot arm to integrate multiple depth maps at 30 Hz in a volumetric fusion approach. We show that this method considerably improves the completeness of the scanned models, even under severe reflections and difficult surface properties. We employ our approach in a classical bin-picking setting, where the robot scans the object during its approach motion and picks it afterwards.
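The abstract above does not say which fusion rule is used. As a rough illustration only, the following sketch assumes a truncated signed distance function (TSDF) with a running average, a common choice for volumetric depth fusion, reduced to the voxels along a single camera ray. The names `fuse_ray` and `surface_depth`, the truncation distance, and the sample depths are all hypothetical.

```python
import numpy as np

def fuse_ray(tsdf, weight, voxel_z, depth, trunc=0.02):
    """Integrate one depth sample (metres) into the voxels along one camera ray."""
    if not np.isfinite(depth) or depth <= 0:      # missing sample, e.g. a reflection
        return tsdf, weight
    sdf = depth - voxel_z                         # positive in front of the surface
    valid = sdf > -trunc                          # skip voxels far behind the surface
    d = np.clip(sdf / trunc, -1.0, 1.0)           # truncated, normalised distance
    new_weight = weight + valid                   # one more observation where valid
    averaged = (tsdf * weight + d) / np.maximum(new_weight, 1)
    return np.where(valid, averaged, tsdf), new_weight

def surface_depth(tsdf, weight, voxel_z):
    """Locate the first positive-to-negative zero crossing by linear interpolation."""
    seen = weight > 0
    hits = np.where(seen[:-1] & seen[1:] & (tsdf[:-1] > 0) & (tsdf[1:] <= 0))[0]
    if hits.size == 0:
        return None
    i = hits[0]
    t = tsdf[i] / (tsdf[i] - tsdf[i + 1])
    return voxel_z[i] + t * (voxel_z[i + 1] - voxel_z[i])

# Hypothetical data: four noisy samples of a surface at 0.10 m, one of them missing.
voxel_z = np.linspace(0.05, 0.15, 101)
tsdf = np.zeros_like(voxel_z)
weight = np.zeros_like(voxel_z)
for depth in [0.101, float("nan"), 0.099, 0.100]:
    tsdf, weight = fuse_ray(tsdf, weight, voxel_z, depth)
estimate = surface_depth(tsdf, weight, voxel_z)
```

Missing samples simply leave the volume untouched, so a region that one view fails to measure can still be filled in by later views, which is one plausible reading of the completeness gain described above.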
In this paper, we analyze the special requirements of a dynamic memory allocator designed for massively parallel architectures such as graphics processing units (GPUs). We show that traditional strategies, which work well on CPUs, are not well suited for use on GPUs, and we present the detailed design of ScatterAlloc, which can efficiently deal with hundreds of requests in parallel. Our allocator greatly reduces collisions and congestion by scattering memory requests based on hashing. We analyze ScatterAlloc in terms of allocation speed, data access time and fragmentation, and compare it to current state-of-the-art allocators, including the one provided with the NVIDIA CUDA toolkit. Our results show that ScatterAlloc clearly outperforms these other approaches, yielding speed-ups between 10 and 100.
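The idea of scattering requests by hashing can be pictured with a small sequential sketch. This is not ScatterAlloc's actual hash function or data layout, which the abstract does not give; the multiplier constants, the page capacity, and the function name `scatter_alloc` are assumptions chosen for illustration, and a real GPU allocator would need atomic updates where this sketch uses plain list writes.

```python
def scatter_alloc(pages, request_id, size, page_capacity=4096):
    """Pick a page by hashing the request, then probe linearly if that page is full.

    pages: list of bytes-used counters, one per page.
    Returns the chosen page index, or None if no page can hold the request.
    """
    n = len(pages)
    start = (request_id * 2654435761 + size * 40503) % n   # hypothetical hash
    for probe in range(n):
        p = (start + probe) % n
        if pages[p] + size <= page_capacity:
            pages[p] += size
            return p
    return None

# 256 requests of 16 bytes each over 64 pages.
pages = [0] * 64
placed = [scatter_alloc(pages, request_id, 16) for request_id in range(256)]
distinct_pages = len(set(placed))   # how widely the requests were spread
```

A first-fit allocator would send every one of these requests to page 0 until it fills up, so all requests would contend for the same counter. Hashing spreads the starting points across the pages, which is the effect the abstract credits for reduced collisions and congestion.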
Micro aerial vehicles (MAVs) are gaining importance as image acquisition tools in urban environments, where areas of interest are often close to buildings and to the ground. While GPS is still the most widely used sensor for outdoor localization, urban applications motivate a shift towards visual localization. We present a framework based on metric, geo-referenced visual landmarks, which can be obtained by taking images with a consumer camera at ground level. Visual landmarks serve as prior knowledge to the MAV and allow robust, high-accuracy localization in urban environments. The issue of differing camera views at higher altitudes is reduced by incremental feature updates, a novel technique which boosts the performance by 30% in comparison to previous work, facilitates long-term operation, and results in a localization rate of 83%. We validate the visual pose estimation in flight by comparison to IMU and GPS data, and evaluate our positioning accuracy with respect to differential GPS.
In this paper, we raise important issues on scalability and the required degree of supervision of existing Mahalanobis metric learning methods. Often, rather tedious optimization procedures are applied that become computationally intractable on a large scale. Further, given the constantly growing amount of data, it is often infeasible to specify fully supervised labels for all data points. Instead, it is easier to specify labels in the form of equivalence constraints. We introduce a simple yet effective strategy to learn a distance metric from equivalence constraints, based on a statistical inference perspective. In contrast to existing methods, we do not rely on complex optimization problems requiring computationally expensive iterations. Hence, our method is orders of magnitude faster than comparable methods. Results on a variety of challenging and diverse benchmarks demonstrate the power of our method. These include faces in unconstrained environments, matching previously unseen object instances, and person re-identification across spatially disjoint cameras. In the latter two benchmarks we clearly outperform the state of the art.
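The abstract above does not state the estimator, so the following is only one way a statistical-inference view of equivalence constraints can yield a Mahalanobis matrix without iterative optimization. It assumes that pairwise differences of similar and of dissimilar pairs each follow a zero-mean Gaussian, takes the quadratic form of the log-likelihood ratio, and clips negative eigenvalues so the result is a valid metric. The function names, the regularizer `eps`, and the toy data are hypothetical.

```python
import numpy as np

def learn_metric(X, similar_pairs, dissimilar_pairs, eps=1e-6):
    """Closed-form Mahalanobis matrix from index pairs labelled similar/dissimilar."""
    def difference_covariance(pairs):
        d = np.array([X[i] - X[j] for i, j in pairs])
        return d.T @ d / len(pairs)               # zero-mean assumption: no centring

    dim = X.shape[1]
    cov_s = difference_covariance(similar_pairs) + eps * np.eye(dim)
    cov_d = difference_covariance(dissimilar_pairs) + eps * np.eye(dim)
    M = np.linalg.inv(cov_s) - np.linalg.inv(cov_d)
    w, V = np.linalg.eigh(M)                      # M is symmetric by construction
    return (V * np.clip(w, 0.0, None)) @ V.T      # drop negative eigenvalues

def mahalanobis_sq(M, a, b):
    """Squared distance (a - b)^T M (a - b)."""
    d = a - b
    return float(d @ M @ d)

# Toy data: feature 0 separates two classes, feature 1 is high-variance noise.
rng = np.random.default_rng(0)
class_a = np.column_stack([rng.normal(0, 0.1, 40), rng.normal(0, 3, 40)])
class_b = np.column_stack([rng.normal(5, 0.1, 40), rng.normal(0, 3, 40)])
X = np.vstack([class_a, class_b])
similar = [(i, i + 1) for i in range(39)] + [(40 + i, 41 + i) for i in range(39)]
dissimilar = [(i, 40 + i) for i in range(40)]
M = learn_metric(X, similar, dissimilar)
```

The only costly steps are two covariance estimates, two matrix inversions and one eigendecomposition, which is consistent with the abstract's point about avoiding expensive iterations. On the toy data the learned matrix should weight the discriminative feature far more heavily than the noisy one.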
We present a novel system that is capable of generating live dense volumetric reconstructions based on input from a micro aerial vehicle. The distributed reconstruction pipeline is based on state-of-the-art approaches to visual SLAM and variational depth map fusion, and is designed to exploit the individual capabilities of the system components. Results are visualized in real time on a tablet interface, which gives the user the opportunity to interact. We demonstrate the performance of our approach by capturing several indoor and outdoor scenes on the fly and by evaluating our results with respect to a ground-truth model.
We present a novel approach to adapt a watertight polygonal model of the human body to multiple synchronized camera views. While previous approaches yield excellent quality for this task, they require processing times of several seconds, especially for high-resolution meshes. Our approach delivers high-quality results at interactive rates when a roughly initialized pose and a generic articulated body model are available. The key novelty of our approach is to use a Gauss-Seidel type solver to iteratively solve nonlinear constraints that deform the surface of the model according to silhouette images. We evaluate both the visual quality and accuracy of the adapted body shape on multiple test persons. While maintaining reconstruction quality similar to that of previous approaches, our algorithm reduces processing times by a factor of 20. Thus, it is possible to use a simple human model for representing the body shape of moving people in interactive applications.
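To make the phrase "Gauss-Seidel type solver" concrete, here is a minimal sketch on a toy problem. Each nonlinear constraint is satisfied in turn, and every later constraint in the same sweep already sees the updated positions. The edge-length constraints and pinned target points are hypothetical stand-ins for the paper's silhouette constraints, which the abstract does not spell out, and the function names are likewise assumptions.

```python
import math

def project_edge(p, inv_mass, i, j, rest):
    """Move points i and j along their connecting line until they are `rest` apart."""
    dx, dy = p[j][0] - p[i][0], p[j][1] - p[i][1]
    dist = math.hypot(dx, dy)
    w = inv_mass[i] + inv_mass[j]
    if dist == 0.0 or w == 0.0:                   # degenerate edge or both points pinned
        return
    s = (dist - rest) / (dist * w)
    p[i][0] += inv_mass[i] * s * dx
    p[i][1] += inv_mass[i] * s * dy
    p[j][0] -= inv_mass[j] * s * dx
    p[j][1] -= inv_mass[j] * s * dy

def gauss_seidel_solve(points, edges, pinned, sweeps=20):
    """Sweep over all constraints repeatedly, updating positions in place."""
    p = [list(q) for q in points]
    inv_mass = [1.0] * len(p)
    for idx, target in pinned.items():            # pinned points never move
        p[idx] = list(target)
        inv_mass[idx] = 0.0
    for _ in range(sweeps):
        for i, j, rest in edges:                  # each projection uses the latest positions
            project_edge(p, inv_mass, i, j, rest)
    return p

# A three-point chain whose end point is pulled to a new target position.
chain = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
edges = [(0, 1, 1.0), (1, 2, 1.0)]
solved = gauss_seidel_solve(chain, edges, pinned={0: (0.0, 0.0), 2: (1.2, 0.8)})
```

Because each projection is a cheap local update and no global linear system is assembled, this style of solver suits interactive rates, at the cost of convergence that depends on the constraint ordering and the number of sweeps.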
Recognizing persons across a system of disjoint cameras is a hard task for human operators and even harder for automated systems. In particular, realistic setups show difficulties such as different camera angles or different camera properties. Additionally, the appearance of exactly the same person can change dramatically due to different views (e.g., frontal/back) of carried objects. In this paper, we mainly address the first problem by learning the transition from one camera to the other. This is realized by learning a Mahalanobis metric using pairs of labeled samples from different cameras. Building on the ideas of Large Margin Nearest Neighbor classification, we obtain a more efficient solution which additionally provides much better generalization properties. To demonstrate these benefits, we run experiments on three different publicly available datasets, showing results that match or exceed the state of the art at much lower computational cost. This is particularly interesting since we use quite simple color and texture features, whereas other approaches build on rather complex image descriptions.
We present an image-based 3D reconstruction pipeline for acquiring geo-referenced semi-dense 3D models. Multiple overlapping images captured from a micro aerial vehicle platform provide a highly redundant source for multi-view reconstructions. Publicly available geo-spatial information sources are used to obtain an approximation to a digital surface model (DSM). Models obtained by the semi-dense reconstruction are automatically aligned to the DSM to allow the integration of highly detailed models into the original DSM and to provide geographic context.
We present an efficient natural feature tracking pipeline implemented solely in JavaScript. It is embedded in a web technology-based Augmented Reality system running plugin-free in web browsers. The evaluation shows that real-time frame rates are achieved on desktop computers and interactive frame rates on smartphones.