ISBN: 9781538659892 (print)
Single-Instruction-Multiple-Data (SIMD) architectures are widely used to accelerate applications that exhibit Data-Level Parallelism (DLP). In such architectures, the on-chip memory system handles communication between the Processing Elements (PEs) and the on-chip vector memory, and inefficiency in this memory system is often a performance bottleneck. In this paper, we describe the design and implementation of an efficient vector data memory system. The proposed memory system consists of two novel parts: an access-pattern-aware memory controller and an automatic loading mechanism. The memory controller reduces data reorganization overhead. The automatic loading mechanism loads data according to the access pattern without explicit load instructions, which eliminates the overhead of fetching and decoding those instructions. The proposed design is implemented and synthesized with Cadence tools. Experimental results demonstrate that, on average, our design improves the performance of 8 application kernels by 44% and reduces energy consumption by 26%.
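The abstract does not specify how access patterns are described or how the loader is built, so the following is only a minimal software sketch of the general idea of pattern-driven loading, under stated assumptions. The `AccessPattern` descriptor, its fields (`base`, `stride`, `lanes`, `count`), and the `auto_load` function are hypothetical names introduced for illustration and are not taken from the paper. The sketch assumes a simple strided pattern: once the pattern is configured, each step delivers one vector to the PEs without a per-vector load instruction.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class AccessPattern:
    """Hypothetical strided access descriptor (an assumption, not the paper's format)."""
    base: int    # index of the first element in vector memory
    stride: int  # distance between consecutive elements
    lanes: int   # number of PEs fed by each vector
    count: int   # number of vectors to deliver


def auto_load(memory: List[int], pattern: AccessPattern) -> Iterator[List[int]]:
    """Yield one vector per step, with addresses derived from the pattern alone."""
    for step in range(pattern.count):
        # First address of this vector: skip the elements consumed by earlier steps.
        start = pattern.base + step * pattern.lanes * pattern.stride
        yield [memory[start + lane * pattern.stride] for lane in range(pattern.lanes)]


# Toy vector memory whose contents equal their own indices.
memory = list(range(32))
pattern = AccessPattern(base=1, stride=2, lanes=4, count=2)
vectors = list(auto_load(memory, pattern))
```

In this toy model the pattern is set up once and the loop plays the role of a hardware address generator, so the consumer never issues an individual load. A real design would also have to deal with bank conflicts and data reorganization, which this sketch leaves out.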