ISBN: 9780769549903 (print)
In this paper we present and begin to analyze the iCubWorld data-set, an object recognition data-set that we acquired using a Human-Robot Interaction (HRI) scheme and the iCub humanoid robot platform. Our setup allows for rapid acquisition and annotation of data with corresponding ground truth. While more constrained in its scope (the iCub world is essentially a robotics research lab), we demonstrate how the proposed data-set poses challenges to current recognition systems. The iCubWorld data-set is publicly available (1).
Recently released depth cameras provide effective estimation of 3D positions of skeletal joints in temporal sequences of depth maps. In this work, we propose an efficient yet effective method to recognize human actions based on the positions of joints. First, the body skeleton is decomposed into a set of kinematic chains, and the position of each joint is expressed in a locally defined reference system, which makes the coordinates invariant to body translations and rotations. A multi-part bag-of-poses approach is then defined, which permits the separate alignment of body parts through a nearest-neighbor classification. Experiments conducted on the Florence 3D Action dataset and the MSR Daily Activity dataset show promising results.
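As a rough illustration of the invariance property described in this abstract, the following minimal sketch expresses joints in a body-centred frame. It is not the authors' implementation: it uses a single frame for the whole skeleton rather than one per kinematic chain, and the joint names (`hip`, `right_hip`, `spine`, `hand`) and the function `local_coordinates` are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)

def local_coordinates(joints, origin, right, up):
    """Express every joint in a frame attached to the body.

    joints: dict mapping a joint name to an (x, y, z) position.
    origin, right, up: names of three non-collinear joints (assumed inputs).
    """
    o = joints[origin]
    x_axis = normalize(sub(joints[right], o))
    # Gram-Schmidt: remove the x component from the "up" direction.
    up_vec = sub(joints[up], o)
    proj = dot(up_vec, x_axis)
    y_axis = normalize(tuple(u - proj * x for u, x in zip(up_vec, x_axis)))
    z_axis = cross(x_axis, y_axis)
    axes = (x_axis, y_axis, z_axis)
    return {name: tuple(dot(sub(p, o), axis) for axis in axes)
            for name, p in joints.items()}

def rotate_z_and_shift(p, angle, shift):
    """Rigid motion used only to check the invariance."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1] + shift[0],
            s * p[0] + c * p[1] + shift[1],
            p[2] + shift[2])

pose = {"hip": (0.0, 0.0, 0.0), "right_hip": (1.0, 0.0, 0.0),
        "spine": (0.2, 1.0, 0.0), "hand": (0.5, 0.7, 0.3)}
moved = {k: rotate_z_and_shift(v, 0.7, (3.0, -2.0, 5.0)) for k, v in pose.items()}

before = local_coordinates(pose, "hip", "right_hip", "spine")
after = local_coordinates(moved, "hip", "right_hip", "spine")
```

Because the frame moves with the body, `before` and `after` agree up to floating-point error, which is what allows poses to be compared with a nearest-neighbor rule regardless of where the subject stands or faces.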
Manifold learning has been effectively used in computer vision applications for dimensionality reduction that improves classification performance and reduces computational load. Grassmann manifolds are well suited for computer vision problems because they promote smooth surfaces where points are represented as subspaces. In this paper we propose Grassmannian Sparse Representations (GSR), a novel subspace learning algorithm that combines the benefits of Grassmann manifolds with sparse representations using least squares loss L1-norm minimization for optimal classification. We further introduce a new descriptor that we term Motion Depth Surface (MDS) and compare its classification performance against the traditional Motion History Image (MHI) descriptor. We demonstrate the effectiveness of GSR on computationally intensive 3D action sequences from the Microsoft Research 3D-Action and 3D-Gesture datasets.
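The "least squares loss L1-norm minimization" mentioned in this abstract can be illustrated in its plain Euclidean form. The sketch below uses the iterative soft-thresholding algorithm (ISTA), a standard solver chosen here for illustration; the abstract does not say which solver the authors use, and the Grassmann-manifold part of GSR is not covered. The names `ista`, `D`, `y`, `lam` and `step` are assumptions.

```python
import math

def soft_threshold(v, t):
    """Shrink each entry toward zero by t (proximal operator of t * L1 norm)."""
    return [math.copysign(max(abs(x) - t, 0.0), x) for x in v]

def ista(D, y, lam, step, iters=200):
    """Approximately minimise 0.5 * ||D x - y||^2 + lam * ||x||_1.

    D: dictionary as a list of rows, y: signal, lam: sparsity weight.
    step should not exceed 1 / (largest eigenvalue of D^T D).
    """
    m, n = len(D), len(D[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = D x - y and gradient g = D^T r of the least squares term.
        r = [sum(D[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(D[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = soft_threshold([x[j] - step * g[j] for j in range(n)], step * lam)
    return x

# With an identity dictionary the minimiser is the soft-thresholded signal.
identity = [[1.0, 0.0], [0.0, 1.0]]
code = ista(identity, [3.0, 0.2], lam=0.5, step=1.0)
```

The small second coefficient is driven exactly to zero while the large one is only shrunk, which is the sparsity effect that L1 regularisation is used for in sparse-representation classifiers.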
New SoCs like the Xilinx Zynq 7045 allow researchers and developers to combine the advantages of writing software for control functionality with accelerators in the FPGA logic for the number crunching. The dual-core ARM Cortex-A9 processor runs at up to 1 GHz, and the FPGA has up to 900 DSP slices, allowing a performance of up to 1,334 GMACs. SCS is porting many algorithms, such as SGM stereo [1], Stixel clustering, and optical flow [2], to such devices, allowing new cars to see their environment and react appropriately. The newly developed SCS Zynq 7045 module will allow accelerated development using this technology.
In recent years, with the advent of cheap and accurate RGBD (RGB plus Depth) active sensors like the Microsoft Kinect and devices based on time-of-flight (ToF) technology, there has been increasing interest in 3D-based applications. At the same time, several effective improvements to passive stereo vision algorithms have been proposed in the literature. Despite these facts and the frequent deployment of stereo vision for many research activities, it is often perceived as a bulky and expensive technology not well suited to consumer applications. In this paper, we review a subset of state-of-the-art stereo vision algorithms that have the potential to fit a target computing architecture based on low-cost field-programmable gate arrays (FPGAs), without additional external devices (e.g., FIFOs, DDR memories). Mapping these algorithms onto such a low-power, low-cost architecture would make RGBD sensors based on stereo vision suitable for a wider class of application scenarios currently not addressed by this technology.
While most approaches to symmetry detection in machine vision try to explain the gray-values or colors of the pixels, Gestalt algebra has no room for such measurement data. The entities (i.e. Gestalten) are only defined with respect to each other. They form a generic hierarchy, and live in a continuous domain without any pixel raster. There is also no constraint forcing them to completely fill an image, or prohibiting overlap. Yet, when used as a tool for symmetry recognition, the algebra must be somehow connected to the given data. In this paper this is done only on the primitive level using the well-known SIFT feature detector. From a set of such SIFT-based Gestalten follows a combinatorial set of higher-order symmetric Gestalten by constructing all possible terms using the operations of the algebra. The Gestalt domain contains a quality or assessment dimension. Taking the best Gestalten with respect to this attribute and clustering them yields the output for this competition participation.
Face recognition techniques have been widely used in real-world applications over the past decade. Unlike other biometric traits such as fingerprint and iris, the face is the natural biological cue by which humans recognise a person, even one met just once. In this paper, we propose a novel method, which simulates the mechanism of fixations and saccades in human visual perception, to handle the problem of face recognition from a single image per person. Our method is robust to local deformations of the face (i.e., expression changes and occlusions). Especially for occlusion-related problems, which have not received as much attention as other challenging variations such as illumination, expression and pose, our method significantly outperforms the state-of-the-art approaches despite the various types of occlusion. Experimental results on the FRGC and the AR databases confirm the effectiveness of our method.
There is a growing demand in automated public safety systems for detecting unauthorized vehicle parking, intrusions, unintended baggage, etc. Object detection and recognition significantly impact these applications. O...
We propose hinge-loss Markov random fields (HL-MRFs), a powerful class of continuous-valued graphical models, for high-level computer vision tasks. HL-MRFs are characterized by log-concave density functions, and are able to perform efficient, exact inference. Their templated hinge-loss potential functions naturally encode soft-valued logical rules. Using the declarative modeling language probabilistic soft logic, one can easily define HL-MRFs via familiar constructs from first-order logic. We apply HL-MRFs to the task of activity detection, using principles of collective classification. Our model is simple, intuitive and interpretable. We evaluate our model on two datasets and show that it achieves significant lift over the low-level detectors.
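To make "hinge-loss potential functions encode soft-valued logical rules" concrete, here is a minimal sketch of one such potential. It assumes the Lukasiewicz relaxation commonly used in probabilistic soft logic, where a rule of the form (b1 AND ... AND bn) -> h over truth values in [0, 1] is penalised by its distance to satisfaction; the abstract itself does not spell this out, and the function names and the example rule are hypothetical.

```python
def distance_to_satisfaction(body, head):
    """Distance to satisfaction of the soft rule (b1 AND ... AND bn) -> head.

    body: list of truth values in [0, 1]; head: truth value in [0, 1].
    Uses the Lukasiewicz conjunction max(0, sum(body) - (n - 1)).
    """
    body_truth = max(0.0, sum(body) - (len(body) - 1))
    return max(0.0, body_truth - head)

def hinge_potential(body, head, weight=1.0, squared=False):
    """Weighted hinge-loss potential; squared=True gives the squared variant."""
    d = distance_to_satisfaction(body, head)
    return weight * (d * d if squared else d)

# Hypothetical rule: near(A, B) AND doing(A, talking) -> doing(B, talking)
violated = hinge_potential([0.9, 0.8], 0.3, weight=2.0)   # body is truer than head
satisfied = hinge_potential([0.9, 0.8], 0.75, weight=2.0)  # head is true enough
```

The potential is zero whenever the head is at least as true as the body and grows linearly (or quadratically) with the violation. Each potential is a convex function of the truth values, which is consistent with the log-concave densities mentioned in the abstract.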
"Big Data" analysis is an emerging topic in computer vision and pattern recognition. As one example problem of big data, we study semantic age labels and facial aging pattern analysis on a large database. In aging analysis, one of the great challenges is the lack of a large number of face images with ground truth age labels. Unlike many other example-based recognition problems, where human annotations can be used as the ground truth labels for both training and testing, it is quite difficult for human annotators to label the exact ages in face images. An alternative is to exploit face images with unlabeled ages to enhance age estimation performance. However, it is unclear whether face images with unlabeled ages can be used for age estimation, and if so, how to use them. In this paper, we study these two problems comprehensively under two paradigms: semi-supervised learning and unsupervised learning for aging pattern analysis. We emphasize the importance of using ground truth age labels and a large database in order to derive a meaningful measure in the context of big data. Our study can have an impact on the collection of aging patterns, which is very expensive and time-consuming in practice.