As supercomputers grow in scale, communication time during execution increases. This phenomenon has drawn the interest of architecture researchers. In this paper, based on the fat-tree topology, ...
On chip multiprocessor (CMP) platforms, multiple co-scheduled applications can severely degrade performance and quality of service (QoS) when they contend for last-level cache (LLC) resources. Whether an application...
Network calculus is a promising theory, based on min-plus algebra, for analyzing and modeling networks. Using network calculus, we propose formulas for the arrival curve and service curve of end-to-end communication,...
Many applications need to distribute data with different contents efficiently in network environments with unreliable links and high node churn. Existing approaches mostly focus on optimizing either efficiency o...
As one of the most popular accelerators, the Graphics Processing Unit (GPU) has demonstrated high computing power in several application fields. On the other hand, the GPU also consumes considerable power and has been one...
Network size is a fundamental piece of information for distributed applications. An approach for estimating network size must offer both high accuracy and robustness in order to adapt to the dynamic environment in diff...
As one of the most popular many-core architectures, GPUs have demonstrated their power in many non-graphics applications. Traditional general-purpose computing systems tend to integrate the GPU as a co-processor to accelerate pa...
ISBN: (Print) 9783642133732
Multicore systems offer the potential to improve application performance. However, substantial programming effort is required to exploit their parallelism. This paper presents a single-source compiler that maps data-parallel programs onto the Cell Broadband Engine. Based on the distributed memory model, the compiler performs automatic data distribution and generates SPMD programs with message-passing primitives for Cell. We evaluate our compiler on a range of computation-intensive benchmarks and achieve high performance on the Cell platform. In contrast to OpenMP, our method can fully exploit data locality by managing shared data through inter-processor communication instead of main-memory accesses, which significantly reduces off-chip memory access overhead.
As a fast on-chip SRAM managed by software (the application and/or compiler), Scratchpad Memory (SPM) is widely used in many fields. This paper presents a SimpleScalar-based multi-level SPM memory hierarchy architectu...
Nested Circular Directional MAC, a medium access control protocol modified from the DMAC protocol, is proposed in this paper to support both directional and omni-directional antennas simultaneously in one Ad Hoc ne...