ISBN: 9781479913725 (print)
In this paper we present how Intel's Single-Chip-Cloud (SCC) processor behaves for parallel macro pipeline applications. Subsets of the SCC's available cores can be arranged as a pipeline in which each core processes one stage of the overall workload. Each of the independent cores processes a small part of a larger task and feeds the following core with new data once it finishes its work. Our case study is a parallel rendering system that renders successive images and applies different filters to them. On normal graphics adapters this is usually done in multiple cycles, whereas we do it in a single pipeline pass. We show that a significant speedup can be achieved by using multiple parallel pipelines on the SCC, and that performance can be improved further by using the SCC's controlling PC in conjunction with the SCC. We also identify aspects of the SCC that hinder overall performance, mainly the lack of local memory banks for each core. The results presented in this paper are not limited to image processing; users can expect similar experiences wherever macro pipelining is used in other applications on the SCC.
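The macro pipeline idea described above can be illustrated with a minimal sketch, assuming one Python thread stands in for each SCC core and FIFO queues stand in for core-to-core transfers. The stage functions `render`, `brighten`, and `invert` are hypothetical placeholders, not the filters used in the paper.

```python
import queue
import threading

SENTINEL = object()  # unique marker for the end of the frame stream


def stage(work, inbox, outbox):
    """Run one pipeline stage: apply `work` to each item and pass it on."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(work(item))


def run_pipeline(frames, stages):
    """Push `frames` through `stages`, one thread per stage, preserving order."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    workers = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for w in workers:
        w.start()
    for frame in frames:
        queues[0].put(frame)
    queues[0].put(SENTINEL)

    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for w in workers:
        w.join()
    return results


# Hypothetical stand-ins for a renderer followed by two image filters.
def render(i):
    return [i, i + 1, i + 2]


def brighten(img):
    return [p + 10 for p in img]


def invert(img):
    return [255 - p for p in img]


if __name__ == "__main__":
    print(run_pipeline(range(3), [render, brighten, invert]))
```

Because every stage is fed by a FIFO queue, frame order is preserved while different frames occupy different stages at the same time. Running several such pipelines side by side would correspond to the multiple parallel pipelines mentioned in the abstract.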
Color cast is a crucial problem in color image processing. White balance has been widely used to eliminate color cast and improve image quality. Most white balance implementations are based on a color constancy hypothesis. A well-known color constancy hypothesis is given in [1], unifying the White Patch [2], Grey World [3], Shades of Grey [4], and Grey Edge [1] assumptions in one expression. However, this general hypothesis is less reliable on underwater images than on common images. The color constancy hypothesis for common scenes assumes that the ambient light source is spatially constant. In underwater scenes, however, light suffers from serious attenuation, especially in the red part of the visible spectrum. This attenuation causes the ambient light source to vary spatially, which makes classic color constancy hypotheses fail. In this paper, we propose a novel color constancy hypothesis for underwater scenes based on low-level image features. Based on this hypothesis, we propose an algorithm that uses a distance map to estimate multiple gain factors to remove the color cast.
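For orientation, the classic global baseline that the abstract contrasts against can be sketched as follows. This is a minimal illustration of Grey World and Shades of Grey gain estimation with a single, spatially constant illuminant. It is not the distance-map algorithm proposed in the paper, and the function names are assumptions made for this example.

```python
def shades_of_grey_gains(pixels, p=1):
    """Estimate one gain per channel from a list of (r, g, b) tuples.

    p=1 corresponds to Grey World; larger p moves toward White Patch (max-RGB).
    A single global illuminant is assumed, which is exactly the assumption
    the abstract says breaks down under water.
    """
    n = len(pixels)
    norms = []
    for c in range(3):
        # Minkowski p-norm mean of channel c
        norms.append((sum(px[c] ** p for px in pixels) / n) ** (1.0 / p))
    target = sum(norms) / 3.0
    return [target / v if v > 0 else 1.0 for v in norms]


def apply_gains(pixels, gains):
    """Scale every channel by its gain, clipping to the 8-bit range."""
    return [
        tuple(min(255.0, px[c] * gains[c]) for c in range(3))
        for px in pixels
    ]


if __name__ == "__main__":
    cast = [(100, 200, 50), (100, 200, 50)]  # strongly green-tinted patch
    print(apply_gains(cast, shades_of_grey_gains(cast)))
```

A spatially varying method would replace the three global gains with gains that depend on pixel position, which is where the abstract's multiple gain factors come in.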
ISBN: 9781467364560 (print)
We present a novel real-time implementation of local phase feature extraction from volumetric image data based on 3D directional (log-Gabor) filters. We achieve drastic performance gains without compromising the signal-to-noise ratio by pre-computing the filters and adaptive noise estimation parameters, and by streamlining the remaining computations to run efficiently on a multi-processor graphics processing unit (GPU). We validate our method on clinical ultrasound data and demonstrate a 15-fold speedup in computation time over state-of-the-art methods, which could facilitate a wide range of practical applications for real-time image-guided procedures.
Graphics Processing Units (GPUs) have become increasingly popular due to their high computational power and low prices. This makes them particularly suitable for high-performance computing applications such as data processing and image processing. In these fields, the ability to work properly even in the presence of faults is mandatory. This paper presents an innovative approach that combines a Software-Based Self-Test & Diagnosis (SBSTD) methodology with a fault mitigation strategy to increase the robustness of a CUDA Fermi GPU-based system.
Using image information to reconstruct 3D models of real objects is attracting growing attention in computer vision and computer graphics. Widely used methods include reconstruction from multiple silhouettes and from stereo vision. In this work, we design a fault-tolerant system for reconstructing multiple moving objects using three cameras located in a Cartesian coordinate frame. In the first stage, digital image processing (DIP) is used to extract vertices and edges. In the second stage, we apply the principle of perspective projection to analyze the coordinates of features in 3D space, and then iteratively remove redundant vertices by reasoning. We also develop an adaptive method to prevent avoidable misjudgements and sudden transitions.
ISBN: 9781467360661 (print)
Graphics processing units (GPUs) offer significant speedups over CPUs for certain classes of applications. However, programming for GPUs is challenging. Many parameters affect performance, and their values may change depending on both the problem instance and GPU hardware specifics. In addition, most GPU kernels are compiled once; performance optimizations are applied at application compile time. As a result, many GPU libraries and programs have limited adaptability to variations among problem instances and hardware configurations. These factors limit code reuse and the applicability of GPU computing to a wider variety of problems. This paper introduces GPGPU kernel specialization, a technique used to describe highly adaptable kernels that exhibit high performance across a wide range of programmer variables as well as different generations of GPUs. We also introduce our GPU Prototyping Framework (GPU-PF) for dynamic runtime generation of customized GPU kernels incorporating both problem-specific and implementation-specific parameters. GPU-PF fully separates the GPU and CPU code so that the GPU code can be compiled during program execution once all the parameters are known. This work explores the implementation and parameterization of two real-world applications targeting two generations of NVIDIA CUDA-enabled GPUs using kernel specialization and GPU-PF: large template matching and cone-beam image reconstruction via backprojection. Starting with high-performance GPU kernels that compare favorably to multi-threaded reference implementations, kernel specialization is shown to increase adaptability while providing performance improvements, including improved run time and reduced resource usage. Kernel specialization offers productivity benefits, improved library code, and a means to increase the parameterizability of GPGPU implementations.
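The general idea of kernel specialization, baking parameters into the kernel source once they are known at run time, can be sketched as below. This is a hedged illustration, not GPU-PF's actual interface: the template, the `sad_1d` kernel, and the `specialize` helper are all hypothetical, and handing the generated source to a runtime compiler is deliberately left out.

```python
from functools import lru_cache
from string import Template

# Hypothetical CUDA kernel template; $tmpl_len and $block_size are filled in
# at run time so they become compile-time constants in the generated source.
KERNEL_TEMPLATE = Template("""
#define TMPL_LEN $tmpl_len
#define BLOCK_SIZE $block_size

extern "C" __global__ void sad_1d(const float *signal, const float *tmpl,
                                  float *out, int n)
{
    int i = blockIdx.x * BLOCK_SIZE + threadIdx.x;
    if (i + TMPL_LEN > n) return;
    float acc = 0.0f;
    // The trip count is a constant, so the compiler is free to unroll.
    #pragma unroll
    for (int k = 0; k < TMPL_LEN; ++k)
        acc += fabsf(signal[i + k] - tmpl[k]);
    out[i] = acc;
}
""")


@lru_cache(maxsize=None)
def specialize(tmpl_len, block_size):
    """Return kernel source specialized for one parameter combination.

    The cache means each combination is generated only once per process.
    """
    return KERNEL_TEMPLATE.substitute(tmpl_len=tmpl_len, block_size=block_size)


if __name__ == "__main__":
    print(specialize(16, 128))
```

Caching by parameter tuple reflects the trade-off that a specialized kernel costs one extra compilation per distinct configuration, which is then reused for every launch that shares it.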
The ever-growing access to high-resolution images has prompted the development of region-based classification methods for remote sensing images. However, in agricultural applications, the recognition of specific regions is still a challenge, as there can be many different spectral patterns within the same study area. In this context, depending on the features used, different learning methods can be used to create complementary classifiers. Many researchers have developed solutions based on machine learning techniques to address these problems. Examples of successful initiatives are those dedicated to the development of learning techniques for data fusion or Multiple Classifier Systems (MCS). In MCS, diversity is an essential factor for success. Several works have used diversity measures to select appropriate high-performance classifiers, but the challenge of finding the optimal number of classifiers for a target task has not yet been properly addressed. In general, the proposed solutions rely on the a priori use of ad hoc strategies for selecting classifiers, followed by an evaluation of their effectiveness during training. Searching for the optimal number of classifiers, however, makes the selection process more expensive. In this paper, we address this issue by proposing a novel strategy for selecting the classifiers to be combined, based on the correlation of different diversity measures. Diversity measures are used to rank pairs of classifiers, and the agreement among the ranked lists guides the classifier selection process. A fusion framework has been used in our experiments, which target the classification of coffee crops in remote sensing images. Experimental results demonstrate that the novel strategy yields effectiveness comparable to several baselines while using far fewer classifiers.
ISBN: 9781467360661 (print)
Thanks to their massive computational power and their SIMT computational model, Graphics Processing Units (GPUs) have been successfully used to accelerate a wide variety of regular applications (linear algebra, stencil computations, image processing, and bioinformatics algorithms, among others). However, many established and emerging problems are based on irregular data structures, such as graphs. Examples can be drawn from different application domains: networking, social networking, machine learning, electrical circuit modeling, discrete event simulation, compilers, and computational sciences. It has been shown that irregular applications based on large graphs do exhibit runtime parallelism; moreover, the amount of available parallelism tends to increase with the size of the datasets. In this work, we explore an implementation space for deploying a variety of graph algorithms on GPUs. We show that the dynamic nature of the parallelism that can be extracted from graph algorithms makes it impossible to find an optimal solution. We propose a runtime system able to dynamically transition between different implementations with minimal overhead, and investigate heuristic decisions applicable across algorithms and datasets. Our evaluation is performed on two graph algorithms: breadth-first search and single-source shortest paths. We believe that our proposed mechanisms can be extended and applied to other graph algorithms that exhibit similar computational patterns.
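To make the notion of switching implementations at run time concrete, here is a small CPU-side sketch of a level-synchronous breadth-first search that chooses between two interchangeable expansion strategies per level. The frontier-size heuristic and the `dense_threshold` value are assumptions made for illustration only; the abstract does not state which heuristics the proposed runtime system uses.

```python
def bfs_adaptive(adj, source, dense_threshold=0.25):
    """Level-synchronous BFS that picks an expansion strategy per level.

    adj: adjacency list, adj[v] is the list of neighbours of vertex v.
    Returns (level, choices), where level[v] is the BFS depth of v
    (-1 if unreachable) and choices records the strategy used per level.
    """
    n = len(adj)
    level = [-1] * n
    level[source] = 0
    frontier = [source]
    depth = 0
    choices = []

    while frontier:
        # Hypothetical heuristic: large frontiers use the bitmap variant.
        use_dense = len(frontier) > dense_threshold * n
        choices.append("dense" if use_dense else "sparse")
        nxt = []
        if use_dense:
            # Bitmap variant: scan all vertices, expand those in the frontier.
            in_frontier = [False] * n
            for v in frontier:
                in_frontier[v] = True
            active = [v for v in range(n) if in_frontier[v]]
        else:
            # Queue variant: touch only the vertices in the frontier list.
            active = frontier
        for v in active:
            for w in adj[v]:
                if level[w] == -1:
                    level[w] = depth + 1
                    nxt.append(w)
        frontier = nxt
        depth += 1
    return level, choices


if __name__ == "__main__":
    graph = [[1, 2, 3], [0, 4], [0, 4], [0, 4], [1, 2, 3]]
    print(bfs_adaptive(graph, 0))
```

Both variants compute the same levels; only the work per level differs, which is why the choice can be revisited at every level as the frontier grows and shrinks.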
The electromagnetic environment of the modern battlefield is becoming increasingly complex. Describing it with direct volume rendering helps commanders understand the electromagnetic situation. To realize volume rendering of electromagnetic environment data, this paper uses a ray casting algorithm to render the sampled data, and analyzes and solves the ray correction problem that arises when the sample data are transformed between different coordinate systems. To improve rendering efficiency, CUDA is used to speed up data updating and rendering. The experimental results show that, with CUDA, a complex electromagnetic environment can be rendered in real time and the electromagnetic situation can be displayed clearly with real-time interaction.
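The core step of ray casting, accumulating color and opacity along one ray, can be sketched as follows. This is a minimal scalar illustration of standard front-to-back compositing with early ray termination. The `transfer` function, the grayscale color model, and the cutoff value are assumptions for this example and are not taken from the paper, whose coordinate-system ray correction is not shown here.

```python
def composite_ray(samples, transfer, opacity_cutoff=0.99):
    """Front-to-back compositing of the samples taken along a single ray.

    samples: scalar field values ordered from the eye into the volume.
    transfer: maps a sample value to (color, opacity), opacity in [0, 1].
    Returns the accumulated (color, opacity) for the pixel.
    """
    color = 0.0
    alpha = 0.0
    for s in samples:
        c, a = transfer(s)
        # Each sample contributes only through the remaining transparency.
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
        if alpha >= opacity_cutoff:
            break  # early ray termination: later samples are hidden
    return color, alpha


def half_opaque(value):
    """Hypothetical transfer function: brightness = value, opacity = 0.5."""
    return value, 0.5


if __name__ == "__main__":
    print(composite_ray([1.0, 1.0], half_opaque))
```

In a CUDA implementation each thread would typically run this loop for one pixel's ray, which is what makes the algorithm a natural fit for GPU acceleration.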