The growth of available image collections and multimedia resources has been remarkable. One of the most common approaches to support image searches relies on Content-Based Image Retrieval (CBIR) systems. CBIR systems aim at retrieving the most similar images in a collection, given a query image. Since the effectiveness of those systems depends heavily on the accuracy of ranking approaches, re-ranking algorithms have been proposed to exploit contextual information and improve the effectiveness of CBIR systems. Image re-ranking algorithms typically consider the relationships among all images in a given dataset when computing the new ranking. This approach demands a huge amount of computational power, which may render it prohibitive on very large datasets. In order to mitigate this problem, we propose using the computational power of Graphics Processing Units (GPUs) to speed up the computation of image re-ranking algorithms. GPUs are fast-emerging and relatively inexpensive parallel processors that are becoming available on a wide range of computer systems. In this paper, we propose a parallel implementation of an image re-ranking algorithm designed to fit the computational model of GPUs. Experimental results demonstrate that relevant performance gains can be obtained by our approach.
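As a rough illustration of why contextual re-ranking is expensive, the pure-Python sketch below (not the paper's algorithm; the function name and the neighborhood-overlap measure are illustrative assumptions) derives new distances from a precomputed distance matrix by comparing k-nearest-neighbor sets for every pair of images:

```python
def rerank(dists, k=2):
    # Contextual re-ranking sketch: the new distance between images i and j
    # is 1 minus the overlap of their k-nearest-neighbor sets (self included).
    n = len(dists)
    knn = [set(sorted(range(n), key=lambda j: dists[i][j])[:k + 1])
           for i in range(n)]
    new = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            new[i][j] = 1.0 - len(knn[i] & knn[j]) / (k + 1)
    return new
```

The nested loops over all image pairs are the O(n²) cost that motivates offloading such computations to the GPU.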
A common neural network used for complex data clustering is the Self-Organizing Map (SOM). This algorithm has an expensive training step, which is especially costly in high-dimensional applications such as image clustering. This makes it impossible for some of these applications to run in real time, or even in feasible time. In this paper we explore the use of GPUs with the NVIDIA CUDA language to decrease the computational cost of SOM. We propose a three-step implementation able to reduce the computational complexity of the algorithm under the SIMD paradigm while also making good use of the GPU's resources. In the end, we obtained a peak speed-up of 44 times over a C CPU implementation, a fact that demonstrates SOM's data parallelism.
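For readers unfamiliar with SOM training, a minimal single-sample update can be sketched in pure Python (an illustrative simplification: the neighborhood function is omitted and only the best-matching unit is updated; names are assumptions, not the paper's code):

```python
def train_som_step(weights, x, lr=0.5):
    # Find the best-matching unit (BMU) by squared Euclidean distance,
    # then pull its weight vector toward the input sample x.
    def d2(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    bmu = min(range(len(weights)), key=lambda i: d2(weights[i]))
    weights[bmu] = [wi + lr * (xi - wi) for wi, xi in zip(weights[bmu], x)]
    return bmu
```

The BMU search, a distance computation over every unit and every dimension, is the data-parallel part that maps naturally onto a GPU.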
Recent advances in biometrics, information forensics, and security have improved the accuracy of biometric systems, mainly those based on facial information. However, an ever-growing challenge is the vulnerability of such systems to impostor attacks, in which users without access privileges try to authenticate themselves as valid users. In this work, we present a solution to video-based face spoofing of biometric systems. This type of attack is characterized by presenting a video of a real user to the biometric system. To the best of our knowledge, this is the first attempt to deal with video-based face spoofing based on the analysis of global information that is invariant to video content. Our approach takes advantage of the noise signatures generated by the recaptured video to distinguish between fake and valid access. To capture the noise and obtain a compact representation, we use the Fourier spectrum followed by the computation of the visual rhythm and the extraction of gray-level co-occurrence matrices, used as feature descriptors. Results show the effectiveness of the proposed approach in distinguishing between valid and fake users for video-based spoofing, with near-perfect classification results.
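The gray-level co-occurrence matrix (GLCM) used as a feature descriptor can be sketched in a few lines of pure Python. This is a simplified, unnormalized count matrix for a single pixel offset (the real pipeline would apply it to the visual-rhythm image and derive statistics such as contrast and energy from it):

```python
def glcm(img, dx=1, dy=0, levels=4):
    # Count how often gray level a co-occurs with gray level b
    # at the fixed offset (dx, dy); img holds integer levels in [0, levels).
    m = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[img[y][x]][img[y2][x2]] += 1
    return m
```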
To cope with the complexity of programming GPU accelerators for medical imaging computations, we developed a framework to describe image processing kernels in a domain-specific language, which is embedded into C++. The description uses decoupled access/execute metadata, which allow the programmer to specify both execution constraints and memory access patterns of kernels. A source-to-source compiler translates this high-level description into low-level CUDA and OpenCL code with automatic support for boundary handling and filter masks. Taking the annotated metadata and the characteristics of the parallel GPU execution model into account, two-layered parallel implementations - utilizing SPMD and MPMD parallelism - are generated. An abstract hardware model of graphics card architectures makes it possible to model GPUs from multiple vendors such as AMD and NVIDIA, and to generate device-specific code for multiple targets. It is shown that the generated code is faster than manual implementations and those relying on hardware support for boundary handling. Implementations from RapidMind, a commercial framework for GPU programming, are outperformed, and similar results are achieved compared to the GPU backend of the widely used image processing library OpenCV.
ISBN (print): 9781467309745
Heterogeneous parallel systems including accelerators such as Graphics Processing Units (GPUs) are expected to play a major role in architecting the largest systems in the world, as well as the most powerful embedded devices. Impressive computational speedups have been reported for numerous algorithms in the fields of medical image processing, digital signal processing, astrophysics, and modeling and simulation. However, it is frequently assumed that the working data set of the application fits in the memory of the accelerator. In this paper, we first lift this constraint by presenting a simple and scalable compile-time approach for processing large data sets based on I/O tiling. Second, we combine tiling with streaming in our asynchronous execution model, which enables efficient data-driven processing of large data sets on heterogeneous platforms with accelerators. Finally, we present results for several micro-benchmarks and three data-parallel kernels.
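The I/O tiling idea can be sketched host-side in pure Python (illustrative only; the actual framework generates asynchronous code in which the transfer of one tile overlaps the processing of the previous one):

```python
def process_tiled(data, tile, f):
    # Split a data set that exceeds accelerator memory into tiles,
    # process each tile independently, and reassemble the results.
    out = []
    for start in range(0, len(data), tile):
        chunk = data[start:start + tile]
        # On a GPU this step would be: copy tile in, launch kernel, copy out.
        out.extend(f(chunk))
    return out
```

For example, `process_tiled(list(range(5)), 2, lambda c: [x * 2 for x in c])` doubles every element while never holding more than two elements "on the device" at a time.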
ISBN (print): 9781467350839
High-quality view generation and interpolation is a crucial problem for the success of 3D video systems. It is known that interpolation artifacts occur especially along object boundaries, where depth values fluctuate. We therefore present a novel view interpolation algorithm that introduces specific processing for such regions without producing any geometric distortion. The paper also describes a method to solve the problem of wrongly exposed pixels in depth image based rendering (DIBR), which degrade the quality of the intermediate view.
Two stages are commonly employed in modern algorithms for image/video quality assessment (QA): (1) a local frequency-based decomposition, and (2) block-based statistical comparisons between the frequency coefficients of the reference and distorted images. This paper presents a performance analysis of, and techniques for accelerating, these stages. We specifically analyze and accelerate one representative QA algorithm recently developed by the authors (Larson and Chandler, 2010). We identify the bottlenecks in the abovementioned stages, and we present methods of acceleration using integral images, inline expansion, a GPGPU implementation, and other code modifications. We show how a combination of these approaches can yield a speedup of 47×.
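Integral images, one of the acceleration techniques mentioned, let any rectangular block sum be evaluated in constant time after a single pass over the image. A minimal pure-Python sketch (function names are illustrative, not taken from the paper):

```python
def integral_image(img):
    # ii[y][x] holds the sum of all pixels above and to the left of (y, x),
    # with an extra zero row/column so block_sum needs no bounds checks.
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            ii[y + 1][x + 1] = img[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii

def block_sum(ii, y0, x0, y1, x1):
    # Sum of img[y0:y1][x0:x1] in O(1) using four lookups.
    return ii[y1][x1] - ii[y0][x1] - ii[y1][x0] + ii[y0][x0]
```

This turns the per-block statistics of stage (2) from O(block area) into O(1) per block, which is where much of the reported speedup in such pipelines typically comes from.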
ISBN (print): 9781467309745
Application programming for GPUs (Graphics Processing Units) is complex and error-prone, because the popular approaches - CUDA and OpenCL - are intrinsically low-level and offer no special support for systems consisting of multiple GPUs. The SkelCL library presented in this paper is built on top of the OpenCL standard and offers pre-implemented recurring computation and communication patterns (skeletons) which greatly simplify programming for multi-GPU systems. The library also provides an abstract vector data type and a high-level data (re)distribution mechanism to shield the programmer from the low-level data transfers between the system's main memory and multiple GPUs. In this paper, we focus on the specific support in SkelCL for systems with multiple GPUs and use a real-world application study from the area of medical imaging to demonstrate the reduced programming effort and competitive performance of SkelCL as compared to OpenCL and CUDA. In addition, we illustrate how SkelCL adapts to large-scale, distributed heterogeneous systems in order to simplify their programming.
This paper describes a new encoder control method for multiview video plus depth coding. Since large parts of a multiview scenery are present in more than one of the captured video sequences, a depth-aware encoder control is introduced, which identifies those regions based on given depth maps and omits the coding of the residual signal for those regions. Experimental results indicate that bit rate reductions of about 5-9 %, depending on the bit rate, can be achieved for the 2-view case at a constant subjective quality.
This paper shows that the principles of video coding that are related to removing temporal redundancy by means of motion estimation and compensation can be successfully used to compress still images. If all polyphase components of an image are identified with correlated video frames, only one of them needs to be intra-coded, as the rest can be encoded using mainly bidirectional prediction. Using the H.264 reference software, it has been experimentally verified that the approach offers compression comparable to intra-coding the whole image as a single video frame, the common way of applying the H.264 standard to still pictures. As the H.264 encoder is not optimized for processing polyphase components, the results suggest that, based on the presented idea, it is possible to develop a new image codec that could compete with state-of-the-art algorithms.
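A polyphase decomposition of the kind described splits an image into its four 2×2 subsampling phases, each a half-resolution image strongly correlated with the others; a minimal pure-Python sketch (illustrative, not the paper's implementation):

```python
def polyphase_split(img):
    # Return the four 2x2 polyphase components of an image (list of rows):
    # one component per combination of row phase py and column phase px.
    return [[row[px::2] for row in img[py::2]]
            for py in (0, 1) for px in (0, 1)]
```

Treating these four components as consecutive "frames" is what lets a video encoder predict three of them from the intra-coded one.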