The force-directed approach is one of the most widely used methods in graph drawing research. However, its running time increases intolerably as the graph grows, which restricts the algorith...
Constant-degree P2P systems are becoming a promising hotspot in the P2P domain due to their good properties. However, it is often hard to convert a standard constant-degree digraph to a flexible DHT schema adapti...
Chip Multi-Processors (CMPs) emerge as a mainstream architectural design alternative for high performance parallel and distributed computing. Last Level Cache (LLC) management is critical to CMPs because off-chip acce...
Communication efficiency is a key factor in the performance of networking applications, and concurrent communication is an important way to improve it. However, many concurrency opport...
This paper quantitatively studies the effects of traces on the performance and accuracy of the BigSim Emulator, a scalable parallel emulator for large-scale computers. To assess the accuracy effect we modify the emulator ...
We introduce a novel method for the consolidation of unorganized point clouds with noise, outliers, non-uniformities as well as sharp features. This method is feature preserving, in the sense that given an initial est...
Successive interference cancellation (SIC) is an effective multipacket reception technique for combating interference. As not all collisions are resolvable, careful transmission coordination is required. We study link s...
ISBN: 9783642133732 (print)
Multicore systems offer the potential to improve application performance. However, substantial programming effort is required to exploit their parallelism. This paper presents a single-source compiler that maps data-parallel programs onto the Cell Broadband Engine. Based on the distributed memory model, the compiler performs automatic data distribution and generates SPMD programs with message-passing primitives for Cell. We evaluate our compiler using a range of computation-intensive benchmarks and achieve high performance on the Cell platform. In contrast to OpenMP, our method can fully exploit data locality by managing shared data through inter-processor communication instead of main memory accesses, which significantly reduces the off-chip memory access overhead.
Recently, GPGPU has been widely adopted in the High Performance Computing (HPC) field. The limited global memory bandwidth poses a great challenge to many GPGPU programmers trying to exploit parallelism within the CPUGP...
Many toolboxes for accelerating MATLAB with GPUs are now available [1], but users are often unsure which toolbox is best suited to a particular task. Three toolboxes are selected: Jacket, GPUmat, and the Parallel Computing Toolbox of MATLAB. For each toolbox, its advantages and pitfalls are reviewed, with the aim of helping the reader identify which toolbox is appropriate for a given task. Strategies for deciding whether a function should execute on the GPU are given after a formula analysis. The analysis also serves as a framework that lets a program automatically decide which functions are cost-efficient to execute on the GPU. A series of benchmarks covering different types of computing, including data transfer between GPU and CPU, matrix generation, matrix operations, and GPU functions, was run on all three toolboxes. The results show that Jacket performs best. Some advice on improving the performance of the toolboxes is given at the end.
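The abstract does not reproduce the paper's formula. The Python sketch below shows one plausible shape such a decision rule could take: offload a function only when its estimated CPU time exceeds the estimated GPU time plus the cost of moving data between host and device. The names `CostEstimate` and `should_offload`, the bandwidth figure, and the timings are all hypothetical and are not taken from the paper or from any of the three toolboxes.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """Hypothetical per-call cost estimate for one function."""
    cpu_seconds: float   # estimated run time on the CPU
    gpu_seconds: float   # estimated kernel run time on the GPU
    bytes_moved: int     # bytes copied host->device plus device->host


def should_offload(est: CostEstimate,
                   bandwidth_bytes_per_s: float,
                   launch_overhead_s: float = 0.0) -> bool:
    """Return True when the GPU path is estimated to be faster.

    Assumed cost model (not the paper's formula):
        gpu_total = gpu_seconds + bytes_moved / bandwidth + launch_overhead
    """
    transfer_s = est.bytes_moved / bandwidth_bytes_per_s
    gpu_total = est.gpu_seconds + transfer_s + launch_overhead_s
    return est.cpu_seconds > gpu_total


# Illustrative numbers only: a small operation dominated by transfer cost,
# and a large one where the compute saving outweighs the transfer.
small = CostEstimate(cpu_seconds=0.001, gpu_seconds=0.0001, bytes_moved=8_000_000)
large = CostEstimate(cpu_seconds=2.0, gpu_seconds=0.1, bytes_moved=800_000_000)

PCIE_BANDWIDTH = 4e9  # assumed effective host-device bandwidth in bytes/s

print(should_offload(small, PCIE_BANDWIDTH))
print(should_offload(large, PCIE_BANDWIDTH))
```

Under this assumed model, the transfer term is what makes small operations unprofitable on the GPU, which is consistent with the abstract listing CPU-GPU data transfer as one of the benchmarked categories.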