In this paper, we present an approach to streaming graph-based hierarchical video segmentation by simple label propagation. We transform streaming video segmentation into a graph partitioning problem in which each part corresponds to one region of the video, and we apply a simple method for merging the segmentations of two consecutive blocks to achieve temporal coherence. Spatio-temporal coherence is obtained from color information alone, rather than from more complex features. We provide an extensive comparative analysis of our method against methods in the literature, showing the efficiency, ease of use, and temporal coherence of ours. According to the experiments, our method produces good video segmentation results while incurring low space and time costs compared to other methods.
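The abstract does not describe how the segmentations of two consecutive blocks are merged. As an illustration only, the sketch below assumes that both blocks share one frame and that each region label of the newer block is relabeled to the older block's label it overlaps most in that frame. The function names `match_labels` and `propagate`, the shared-frame assumption, and the majority-overlap rule are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def match_labels(prev_labels, next_labels):
    """Map each label of the newer block to the older label it overlaps most.

    Both arguments are flat sequences holding the labels of the same pixels
    in a frame shared by two consecutive blocks (an assumption of this sketch).
    """
    votes = {}
    for old, new in zip(prev_labels, next_labels):
        votes.setdefault(new, Counter())[old] += 1
    # For every new label, keep the old label with the largest pixel overlap.
    return {new: counts.most_common(1)[0][0] for new, counts in votes.items()}


def propagate(prev_labels, next_labels):
    """Relabel the newer segmentation so it continues the older one."""
    mapping = match_labels(prev_labels, next_labels)
    return [mapping[new] for new in next_labels]


if __name__ == "__main__":
    prev = [1, 1, 2, 2, 2]   # labels from the older block
    nxt = [7, 7, 7, 9, 9]    # labels from the newer block, same pixels
    print(propagate(prev, nxt))
```

A rule of this kind keeps region identities stable across block boundaries without comparing any feature other than the labels themselves.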
Spoofing detection, which differentiates illegitimate users from genuine ones, is a challenging task in biometric systems. Although iris scans are far more inclusive than fingerprints, and also more precise for person authentication, iris recognition systems are vulnerable to spoofing via textured cosmetic contact lenses. Iris spoofing detection is also referred to as liveness detection (binary classification of fake and real images). In this work, we focus on a three-class detection problem: images with textured (colored) contact lenses, soft contact lenses, and no lenses. Our approach uses a convolutional network to build a deep image representation and an additional single fully-connected layer with softmax regression for classification. We compare our approach with a state-of-the-art approach (SOTA) on two public iris image databases for contact lens detection: 2013 Notre Dame and IIIT-Delhi. Our approach improves on SOTA on the former database from 80% to 86%, a gain of 6 percentage points (equivalently, a 30% reduction in the remaining error), and achieves comparable results on the latter. Since IIIT-Delhi does not provide segmented iris images and our approach, unlike SOTA, does not yet segment the iris, we conclude that these are very promising results.
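To make the classification stage concrete, here is a minimal sketch of a single fully-connected layer followed by softmax over the three classes. It assumes the deep image representation is already available as a feature vector; the weights, biases, feature values, and the names `softmax` and `classify` are hypothetical placeholders, not the trained model from the paper.

```python
import math

CLASSES = ("textured lens", "soft lens", "no lens")


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(features, weights, biases):
    """One fully-connected layer (one weight row per class) plus softmax."""
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs


if __name__ == "__main__":
    # Hypothetical 2-D feature vector and hand-picked parameters.
    label, probs = classify([1.0, 2.0],
                            [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]],
                            [0.0, 0.0, 0.0])
    print(label, probs)
```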
Agent path planning is an essential part of games and crowd simulations. In those contexts, agents are usually restricted to planar surfaces due to the huge computational cost of mapping arbitrary surfaces to a plane without distortion. Mapping is required to benefit from the lower computational cost of distance calculations on a plane (Euclidean distance) compared to distances on arbitrary surfaces (geodesic distance). Although solutions have been presented, none has properly handled non-planar surfaces around the agent. In this paper, we present mesh parametrization techniques that unfold the region around the agent, extending existing path planning algorithms initially designed only for planar surfaces to arbitrary surfaces. To mitigate the high computational cost of unfolding the entire surface dynamically, we propose pre-processing stages and massive parallelization, resulting in performance similar to that obtained on a planar surface. We also present a GPU implementation scheme that allows a solution to be computed in real time, enabling agents to navigate on deformable surfaces that require dynamic unfolding. We present results with over 100k agents to demonstrate the practicality of the approach.
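The difference between Euclidean and geodesic distance can be illustrated on the simplest curved surface, a sphere, where the geodesic has a closed form. The sketch below is only an illustration of the two notions of distance; the paper works with arbitrary meshes, for which no such closed form exists, and the function names here are hypothetical.

```python
import math


def euclidean(p, q):
    """Straight-line distance between two 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def geodesic_on_sphere(p, q, radius=1.0):
    """Great-circle distance between two points on an origin-centered sphere."""
    cos_angle = sum(a * b for a, b in zip(p, q)) / (radius * radius)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    return radius * math.acos(cos_angle)


if __name__ == "__main__":
    p, q = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
    print(euclidean(p, q))           # chord through the sphere
    print(geodesic_on_sphere(p, q))  # path along the surface
```

The path along the surface is longer than the chord, and on a general mesh it has to be computed by a search over the surface rather than by a single formula, which is the cost that unfolding the surface onto a plane is meant to avoid.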
Deaf people use systems of communication based on sign language and finger spelling. Finger spelling is a system where each letter of the alphabet is represented by a unique and discrete movement of the hand. RGB and depth images can be used to characterize hand shapes corresponding to letters of the alphabet. Depth sensors such as Kinect have an advantage over color cameras for finger spelling recognition: depth images provide 3D information about the hand. In this paper, we propose a four-stage model for finger spelling recognition based on depth information using kernel descriptors. The performance of this approach is evaluated on a dataset of real images of American Sign Language finger spelling. Different experiments were performed using a combination of both descriptors over depth information. Our approach obtains a mean accuracy of 92.92% using 50% of the samples for training, outperforming other state-of-the-art methods.
This paper proposes a solution capable of processing aerial images from UAVs to identify failures in plantations, and compares the system running on lightweight computers and low-power computing platforms. An algorithm based on watersheds was developed using the OpenCV library. The solution was embedded on x86 architecture (Altera DE2i-150) and Intel Edison boards, as well as on ARM architecture (Raspberry Pi 2). The results show that the proposed system is a cost-effective solution to the problem of fault identification in plantations and can be embedded in UAVs for processing images in real time.
In this paper, we propose a new approach for finger spelling recognition using depth information captured by a Kinect sensor. We use only depth information to characterize hand configurations corresponding to alphabet letters. First, we use depth data to generate a binary hand mask, which is used to segment the hand area from the background. Then, the major hand axis is determined and aligned with the Y axis in order to achieve rotation invariance. Next, we convert the depth data into a 3D point cloud. The point cloud is divided into subregions and, in each one, we use direction cosines to calculate three histograms of cumulative magnitudes, Hx, Hy and Hz, one for each axis. Finally, these histograms are concatenated and used as input to our Support Vector Machine (SVM) classifier. The performance of this approach is quantitatively and qualitatively evaluated on a dataset of real images of American Sign Language (ASL) hand shapes. The dataset is composed of 60,000 depth images. According to our experiments, our approach has an accuracy rate of 99.37%, outperforming other state-of-the-art methods.
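The histogram step can be sketched as follows for a single subregion. The abstract does not give the binning details, so this sketch makes several assumptions: each point is treated as a vector from a local origin, its direction cosine with respect to each axis selects a bin over [-1, 1], and the vector's magnitude is accumulated in that bin. The bin count and the function name `direction_cosine_histograms` are hypothetical.

```python
import math


def direction_cosine_histograms(points, n_bins=8):
    """Return (Hx, Hy, Hz) for one subregion of a 3D point cloud.

    Each point contributes its magnitude to one bin per axis; the bin is
    chosen by the direction cosine of the point with respect to that axis.
    """
    hists = [[0.0] * n_bins for _ in range(3)]
    for p in points:
        mag = math.sqrt(sum(c * c for c in p))
        if mag == 0.0:
            continue  # direction is undefined at the origin
        for axis, c in enumerate(p):
            cos_a = c / mag  # direction cosine, in [-1, 1]
            b = int((cos_a + 1.0) / 2.0 * n_bins)
            b = max(0, min(b, n_bins - 1))  # clamp cos_a == 1 into last bin
            hists[axis][b] += mag
    return tuple(hists)


if __name__ == "__main__":
    hx, hy, hz = direction_cosine_histograms(
        [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)], n_bins=4)
    feature_vector = hx + hy + hz  # concatenation, as fed to a classifier
    print(feature_vector)
```

Concatenating the three histograms of every subregion would give the fixed-length vector that a classifier such as an SVM expects.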
In this paper, we propose a clustering-based learning approach to improve an existing model for human head-shoulder contour estimation. The contour estimation is guided by a learned head-shoulder shape model, initialized automatically by a face detector. A dataset with labeled data is used to create the head-shoulder shape model and to quantitatively analyze the results. In the proposed approach, geometric features are first extracted from the learning dataset. Then, the number of shape models to be learned is obtained by an unsupervised clustering algorithm. In the segmentation stage, different graphs with an omega-like shape, one for each learned shape model, are built around the detected face. For each graph, a path with maximal cost defines an initial estimate of the head-shoulder contour. The final estimate is given by the path with maximum average energy. Experimental results indicate that the proposed technique outperforms the original model, which is based on a single shape model learned in a simpler way. In addition, it achieves accuracy comparable to other state-of-the-art models.
This paper aims at reducing the ocular discomfort created by stereoscopy due to the effect called 'frame cancellation', for movies and interactive applications. This effect appears when a virtual object in neg...
The rapidly growing number of applications based on morphological operations in image processing and computer vision makes efficient implementations of these key blocks an important topic of research. Nevertheless, a detailed comparison of the energy efficiency and performance of these implementations that covers all major available hardware platforms is still missing. In this paper, we evaluate in detail the performance and power consumption of the most efficient available morphological image processing algorithms for CPU, GPU, and FPGA platforms. In addition, we study the suitability of available morphological library units for high-level synthesis and compare the results with an optimized hand-coded FPGA implementation. We demonstrate that even high-end GPUs fall far short of the throughput of modern CPUs and FPGAs. Our experimental results show that an FPGA implementation is 8-10 times more energy efficient for this application, while being comparable in speed to CPUs for large kernels.
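For readers unfamiliar with the operations being benchmarked, the sketch below shows the two basic morphological operations, dilation and erosion, in their naive form: each output pixel is the maximum (dilation) or minimum (erosion) over a square window. This is not one of the optimized algorithms evaluated in the paper; the square structuring element, the border handling (the window is clipped to the image), and the function names are assumptions of this sketch.

```python
def _window_op(img, k, op):
    """Apply op (max or min) over a (2k+1) x (2k+1) window, clipped at borders."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = op(
                img[yy][xx]
                for yy in range(max(0, y - k), min(h, y + k + 1))
                for xx in range(max(0, x - k), min(w, x + k + 1))
            )
    return out


def dilate(img, k=1):
    """Morphological dilation: each pixel becomes the window maximum."""
    return _window_op(img, k, max)


def erode(img, k=1):
    """Morphological erosion: each pixel becomes the window minimum."""
    return _window_op(img, k, min)


if __name__ == "__main__":
    img = [[0] * 5 for _ in range(5)]
    img[2][2] = 1  # a single foreground pixel
    for row in dilate(img):
        print(row)
```

The naive form costs on the order of the window area per pixel, which is why kernel size matters in comparisons such as the one reported for large kernels.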
ISBN (print): 9781467395885
Whereas traditional GPUs mainly support parallel computing in SIMD (Single Instruction Multiple Data) and SIMT (Single Instruction Multiple Thread) modes, the Firefly2 GPU (Graphics Processing Unit) has a special hardware configuration mechanism and can be used for parallel computing at the data level, thread level, and operation level. This paper presents a parallel implementation of OpenVX kernels on the Firefly2 GPU that combines operation-level parallelism with data-level parallelism. Experimental results indicate a satisfactory speedup for the parallel implementation and show that the Firefly2 is suitable for graphics and image processing.