We propose a software transactional memory (STM) for heterogeneous multicores with small local memory. The heterogeneous multicore architecture consists of a general-purpose processor element (GPE) and multiple accele...
详细信息
ISBN:
(纸本)9781450301787
We propose a software transactional memory (STM) for heterogeneous multicores with small local memory. The heterogeneous multicore architecture consists of a general-purpose processor element (GPE) and multiple accelerator processor elements (APEs). The GPE is typically backed by a deep, on-chip cache hierarchy and hardware cache coherence. On the other hand, the APEs have small, explicitly addressed local memory that is not coherent with the main memory. Programmers of such multicore architectures suffer from explicit memory management and coherence problems. The STM for such multicores can alleviate the burden of the programmer and transparently handle data transfers at run time. Moreover, it makes the programmer free from controlling locks. Our TM is based on an existing software SVM for the accelerator architecture. The software SVM exploits software-managed caches and coherence protocols between the GPE and APEs. We also propose an optimization technique, called abort prediction, for the TM. It blocks a transaction from running until the chance of potential conflicts is eliminated. We implement the TM system and the optimization technique for a single Cell BE processor and evaluate their effectiveness with six compute-intensive benchmark applications.
The Cell BE processor is a heterogeneous multicore that contains one PowerPC Processor Element (PPE) and eight Synergistic Processor Elements (SPEs). Each SPE has a small software-managed local store. Applications mus...
详细信息
ISBN:
(纸本)9781605582825
The Cell BE processor is a heterogeneous multicore that contains one PowerPC Processor Element (PPE) and eight Synergistic Processor Elements (SPEs). Each SPE has a small software-managed local store. Applications must explicitly control all DMA transfers of code and data between the SPE local stores and the main memory, and they must perform any coherence actions required for data transferred. The need for explicit memory management, together with the limited size of the SPE local stores, makes it challenging to program the Cell BE and achieve high performance. In this paper, we present the design and implementation of our COMIC runtime system and its programming model. It provides the program with an illusion of a globally sharedmemory, in which the PPE and each of the SPEs can access any shared data item, without the programmer having to worry about where the data is, or how to obtain it. COMIC is implemented entirely in software with the aid of user-level libraries provided by the Cell SDK. For each rea or write operation in SPE code, a COMIC runtime function is inserted to check whether the data is available in its local store, and to automatically fetch it if it is not. We propose a memory consistency model and a programming model or COMIC, in which the management of synchronization and coherence is centralized in the PPE. To characterize the effectiveness of the COMIC runtime system, we evaluate it with twelve OpenMP benchmark applications on a Cell BE system and an SMP-like homogeneous multicore (Xeon).
暂无评论