Heterogeneous multi-core architectures have become an integral component of high performance systems and high performance scientific computing (HPC). The use of these systems has been vital for research applications b...
This paper presents a new technique for test pattern generation based on a genetic algorithm and parallel processing techniques. This new method offers compact test sets, compared to other methods, that achieve maximu...
Data centre power consumption can be reduced by switching off servers during low load. However, excess switching is wasteful. This paper reviews online algorithms for optimizing this tradeoff, including the benefits o...
ISBN: 0769515126 (print)
This paper presents the design and analysis of finite difference domain decomposition algorithms for the two-dimensional heat equation. Numerical results demonstrate the stability and accuracy of the algorithms, which further extend those developed by Dawson and others [6].
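As a rough illustration of the kind of update such schemes build on, the sketch below advances the two-dimensional heat equation by one explicit finite difference step. It is not the domain decomposition algorithm of the paper: the function name `heat_step`, the list-of-lists grid, the equal grid spacing, and the fixed boundary values are all assumptions made for this example.

```python
def heat_step(u, r):
    """Advance the 2D heat equation by one explicit (FTCS) time step.

    u : list of lists of floats; boundary rows and columns are held fixed.
    r : alpha * dt / h**2, assuming the same spacing h in x and y.
        The explicit scheme is stable for r <= 0.25.
    """
    n, m = len(u), len(u[0])
    new = [row[:] for row in u]  # copy so boundary values carry over unchanged
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            # Five-point discrete Laplacian (without the 1/h**2 factor).
            laplacian = (u[i + 1][j] + u[i - 1][j]
                         + u[i][j + 1] + u[i][j - 1] - 4 * u[i][j])
            new[i][j] = u[i][j] + r * laplacian
    return new
```

In a domain decomposition setting, each subdomain would typically apply an update of this kind to its own block of the grid, with some separate treatment of the values on the interfaces between blocks.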
ISBN: 9783642551956 (digital)
ISBN: 9783642551956 (print)
We present a family of algorithms for local optimization that exploit the parallel architectures of contemporary computing systems to achieve significant performance enhancements. This capability is important for demanding real-time applications, as well as for problems with time-consuming objective functions. The proposed concurrent schemes, namely nomadic and bundle search, are based upon well-established techniques such as quasi-Newton updates and line searches. The parallelization strategy consists of (a) distributed computation of an approximation to the Hessian matrix and (b) parallel deployment of line searches on different directions (bundles) and from different starting points (nomads). Preliminary results showed that the new parallel algorithms can solve problems in fewer iterations than their serial rivals.
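Part (b) of the strategy, running several line searches at once, can be sketched as follows. This is only a minimal illustration under assumed names (`backtracking`, `best_of_bundle`) with a plain step-halving search; it does not reproduce the nomadic or bundle search schemes or the quasi-Newton updates of the paper.

```python
from concurrent.futures import ThreadPoolExecutor


def backtracking(f, x, d, step=1.0, shrink=0.5, max_halvings=30):
    """Halve the step along direction d until f decreases.

    Returns (value, point); falls back to the starting point if no
    decrease is found within max_halvings trials.
    """
    fx = f(x)
    for _ in range(max_halvings):
        candidate = [xi + step * di for xi, di in zip(x, d)]
        fc = f(candidate)
        if fc < fx:
            return fc, candidate
        step *= shrink
    return fx, list(x)


def best_of_bundle(f, x, directions, workers=4):
    """Run one line search per direction concurrently and keep the best."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda d: backtracking(f, x, d), directions))
    return min(results, key=lambda r: r[0])
```

Threads are used here only to keep the sketch short. For a CPU-bound objective written in pure Python, a process pool would be the more realistic choice.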
ISBN: 9783642131189 (print)
The Smith-Waterman algorithm is a classic dynamic programming algorithm for biological sequence alignment. However, with the rapid growth in the number of DNA and protein sequences, the original sequential algorithm has become very time consuming, because the same computing task is repeated over large-scale data. Today's GPU (graphics processing unit) consists of hundreds of processors, so it has more computational capacity than current multicore CPUs. As the programmability of GPUs has continuously improved, using them for general-purpose computing has become very popular. To accelerate sequence alignment, previous researchers exploited the parallelism along the anti-diagonals of the similarity matrix to parallelize the Smith-Waterman algorithm on the GPU. In this paper, we design a new parallel algorithm that exploits the parallelism of the columns of the similarity matrix to parallelize the Smith-Waterman algorithm on a heterogeneous system based on CPU and GPU. Experimental results show that our new parallel algorithm is more efficient than previous ones: it takes full advantage of the features of both the CPU and the GPU and obtains an approximately 37-fold speedup over the sequential algorithm named OSEARCH implemented on an Intel dual-core E2140 processor.
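The anti-diagonal parallelism mentioned above comes from the dependency pattern of the similarity matrix. The sketch below fills the matrix one anti-diagonal at a time in plain, sequential Python to make that pattern visible. The scoring values (`match=2`, `mismatch=-1`, `gap=-1`) and the linear gap penalty are assumptions for illustration, and the column-wise CPU/GPU scheme proposed in the paper is not reproduced here.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score of a and b with a linear gap penalty.

    Cells are visited anti-diagonal by anti-diagonal: every cell with the
    same i + j depends only on earlier anti-diagonals, so the iterations
    of the inner loop are independent and could run in parallel.
    """
    n, m = len(a), len(b)
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    for k in range(2, n + m + 1):  # k = i + j
        for i in range(max(1, k - m), min(n, k - 1) + 1):
            j = k - i
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,  # align a[i-1] with b[j-1]
                          H[i - 1][j] + gap,    # gap in b
                          H[i][j - 1] + gap)    # gap in a
            best = max(best, H[i][j])
    return best
```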
ISBN: 9783642552243 (print)
The recent advent of novel multi- and many-core architectures forces application programmers to deal with hardware-specific implementation details and to be familiar with software optimisation techniques in order to benefit from new high-performance computing machines. Extra care must be taken with communication-intensive algorithms, which may be a bottleneck in the forthcoming era of exascale computing. This paper presents a performance evaluation of preliminary adaptation techniques for the hybrid MPI+OpenMP parallelisation schemes we introduced into the EULAG code. Various techniques are discussed, and the results will lead us toward efficient algorithms and methods for scaling a communication-intensive elliptic solver with preconditioner, including on GPU architectures to be supported in the future.
ISBN: 9780735412125 (print)
This paper presents a CIVA optimized ultrasonic inspection simulation tool, which takes advantage of the power of massively parallel architectures: graphical processing units (GPU) and multi-core general purpose processors (GPP). This tool is based on the classical approach used in CIVA: the interaction model is based on Kirchhoff, and the ultrasonic field around the defect is computed by the pencil method. The model has been adapted and parallelized for both architectures. At this stage, the configurations addressed by the tool are: multi- and mono-element probes, planar specimens made of simple isotropic materials, and planar rectangular defects or side-drilled holes of small diameter. Validation of the model accuracy and performance measurements are presented.
ISBN: 9781467393232 (print)
NVIDIA's graphics processing units (GPUs) have been widely adopted in many application domains to shorten execution time through parallel processing, and the Compute Unified Device Architecture (CUDA) platform enables high-performance, many-core parallel programming for NVIDIA GPUs. Various metaheuristic algorithms, which aim to find an acceptably good solution rather than the optimum solution for NP-complete problems, have been studied for parallel execution on GPUs. The simulated annealing (SA) algorithm is one such metaheuristic and has been widely used to solve hard problems in many application areas. In general, when the number of iterations is decreased, the execution time is shortened but the solution quality becomes poorer. It is therefore hard for programmers to choose an appropriate number of iterations for the SA algorithm when they parallelize the sequential SA. This paper proposes an approach that optimizes the mapping of the simulated annealing algorithm onto CUDA-enabled GPUs. Unlike previous research, our goal in this work is to parallelize the SA algorithm while setting the number of iterations to that adopted in the sequential version, which results in high speedup and good solution quality.
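For readers unfamiliar with the sequential baseline, a generic simulated annealing loop with a fixed iteration count might look like the sketch below. The function signature, the geometric cooling schedule, and the default parameters are assumptions chosen for illustration. This is not the CUDA mapping proposed in the paper.

```python
import math
import random


def simulated_annealing(cost, neighbour, start, iterations,
                        t0=1.0, cooling=0.995, seed=0):
    """Generic sequential SA; returns (best_state, best_cost).

    cost(state) gives a number to minimise, and neighbour(state, rng)
    proposes a nearby state using the supplied random generator.
    """
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    t = t0
    for _ in range(iterations):
        candidate = neighbour(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept a worse move with
        # probability exp(-delta / t), which shrinks as t cools.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        t = max(t * cooling, 1e-12)  # geometric cooling, kept above zero
    return best, best_cost
```

The `iterations` argument is the quantity the abstract discusses: lowering it shortens the run but gives the search fewer chances to escape poor solutions.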
ISBN: 9781424403127 (print)
This paper describes an architecture dedicated to the real-time processing of census correlation for the realization of passive stereovision sensors. Although DSP circuits have dramatically increased their performance in terms of frequency (about 600 MHz today), DSP cores (several multiplier-accumulators) and pipelines (Super Harvard architectures, for example), FPGA circuits remain the best way to design massively parallel architectures when ultra-fast computation is needed, as is the case in real-time vision systems for collision avoidance.
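Census correlation compares pixels by the bit signatures of their neighbourhoods rather than by raw intensities. A minimal software sketch of the two building blocks, assuming a 3x3 window and the hypothetical helper names `census_3x3` and `hamming`, is given below. The FPGA architecture described in the paper is not modelled.

```python
def census_3x3(img, y, x):
    """Return the 8-bit census signature of pixel (y, x).

    One bit per neighbour in the 3x3 window, scanned row by row,
    set when the neighbour is darker than the centre pixel.
    The pixel must not lie on the image border.
    """
    centre = img[y][x]
    bits = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre pixel itself
            darker = 1 if img[y + dy][x + dx] < centre else 0
            bits = (bits << 1) | darker
    return bits


def hamming(a, b):
    """Number of differing bits between two census signatures."""
    return bin(a ^ b).count("1")
```

In a stereo matcher, the Hamming distance between signatures from the left and right images would serve as the matching cost for a candidate disparity. Both operations reduce to comparisons and bit counts, which is why they map naturally onto parallel hardware.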