This paper describes the methods used to formulate and validate the memory subsystem of the cache-coherent Sun Scalable shared-memory MultiProcessor (***) at three levels of abstraction: the memory consistency model, the cache coherence protocol, and the implementation.
ISBN: (print) 9780818671203
Mechanisms for managing message buffers in Time Warp parallel simulations executing on cache-coherent shared-memory multiprocessors are studied. Two simple buffer management strategies called the sender pool and receiver pool mechanisms are examined with respect to their efficiency, and in particular, their interaction with multiprocessor cache-coherence protocols. Measurements of implementations on a Kendall Square Research KSR-2 machine using both synthetic workloads and benchmark applications demonstrate that sender pools offer significant performance advantages over receiver pools. However, it is also observed that both schemes, especially the sender pool mechanism, are prone to severe performance degradations due to poor locality of reference in large simulations using substantial amounts of message buffer memory. A third strategy called the partitioned buffer pool approach is proposed that exploits the advantages of sender pools, but exhibits much better locality. Measurements of this approach indicate that the partitioned pool mechanism yields substantially better performance than both the sender and receiver pool schemes for large-scale, small-granularity parallel simulation *** central conclusions from this study are: (1) buffer management strategies play an important role in determining the overall efficiency of multiprocessor-based parallel simulators, and (2) the partitioned buffer pool organization offers significantly better performance than the sender and receiver pool schemes. These studies demonstrate that poor performance may result if proper attention is not paid to realizing an efficient buffer management mechanism.
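The abstract names the sender pool and receiver pool mechanisms without spelling out their mechanics. The following is a minimal sketch of one plausible reading, not the authors' implementation: it assumes that each processor owns a free list of buffers, that the policy only decides whose free list a message buffer is taken from, and that the receiver reclaims a processed buffer into its own free list. The names `Processor`, `send`, and `reclaim` are hypothetical, and the partitioned buffer pool is not modeled.

```python
from collections import deque


class Processor:
    """Hypothetical processor with a private free list of message buffers."""

    def __init__(self, pid, n_buffers):
        self.pid = pid
        # Each buffer is a small dict; "payload" holds the message contents.
        self.free = deque({"payload": None} for _ in range(n_buffers))
        self.inbox = deque()


def send(src, dst, payload, policy):
    """Allocate a buffer according to the policy and deliver it to dst.

    policy == "sender":   take the buffer from the sender's own free list.
    policy == "receiver": take the buffer from the receiver's free list.
    """
    pool = src.free if policy == "sender" else dst.free
    buf = pool.popleft()  # raises IndexError if the chosen pool is empty
    buf["payload"] = payload
    dst.inbox.append(buf)


def reclaim(dst):
    """Assumption: the receiver returns a processed buffer to its own free list."""
    buf = dst.inbox.popleft()
    buf["payload"] = None
    dst.free.append(buf)
```

Under these assumptions, a sender-pool send followed by a reclaim moves one buffer from the sender's free list to the receiver's, whereas a receiver-pool send leaves both free lists at their original sizes. The sketch models only buffer ownership; it says nothing about cache behavior or the performance results reported in the abstract.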