This paper presents an innovative multimedia reconfigurable accelerator for mobile systems associated to a programmingmodel and a compiler flow. The architecture implements a flexible memory subsystem based on softwa...
详细信息
ISBN:
(纸本)9783540786092
This paper presents an innovative multimedia reconfigurable accelerator for mobile systems associated to a programmingmodel and a compiler flow. The architecture implements a flexible memory subsystem based on software controlled scratchpad sharedmemory banks. The main concern of the paper is sharedmemory management as it is a dominant factor in current designs and influences the performance of embedded systems as well as their energy consumption. An embedded shared-memory programming model is presented that abstracts the details of the hardware architecture but yet exposing parallelism to the user. It is open and user friendly while the hardware can execute complex data feeding on heavily pipelined datapath for compute intensive kernels. The architecture has been designed, and synthesized for 65nm technology for an operating frequency of 200MHz.
In this paper, we describe our experience of creating an OpenMP implementation of the SPICE3 circuit simulator program. Given the irregular patterns of access to dynamic data structures in the SPICE code, a paralleliz...
详细信息
In this paper, we describe our experience of creating an OpenMP implementation of the SPICE3 circuit simulator program. Given the irregular patterns of access to dynamic data structures in the SPICE code, a parallelization using current standard OpenMP directives is impossible without major rewriting of the original program. The aim of this work is to present a case study showing the development of a sharedmemory parallel code with minimum effort. We present two implementations, one with minimal code modification and one without modification to the original SPICE3 program using Intel's taskq construct. We also discuss the results of the case study in terms of what future compiler tools may be needed to help OpenMP application developers with similar porting goals. Our experiments using SPICE3, based on SRAM model simulation, were compiled by the SUN compiler running on a SunFire V880 UltraSPARC-III 750 MHz and by the Intel icc compiler running on both an IBM Itanium with four CPUs and Intel Xeon of two processors machines. The results are promising.
暂无评论