Modern multicore processors, such as the Cell Broadband Engine, achieve high performance by equipping accelerator cores with small "scratch-pad" memories. The price for increased performance is higher programming complexity: the programmer must manually orchestrate data movement using direct memory access (DMA) operations. Programming with asynchronous DMA operations is error-prone, and DMA races can lead to nondeterministic bugs that are hard to reproduce and fix. We present a method for DMA race analysis in C programs. Our method works by automatically instrumenting a program with assertions modeling the semantics of a memory flow controller. The instrumented program can then be analyzed using state-of-the-art software model checkers. We show that bounded model checking is effective for detecting DMA races in buggy programs. To enable automatic verification of the correctness of instrumented programs, we present a new formulation of k-induction geared towards software, as a proof rule operating on loops. Our techniques are implemented as a tool, Scratch, which we apply to a large set of programs supplied with the IBM Cell SDK, in which we discover a previously unknown bug. Our experimental results indicate that our k-induction method performs extremely well on this problem class. To our knowledge, this marks both the first application of k-induction to software verification and the first example of software model checking in the context of heterogeneous multicore processors.
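The idea of instrumenting DMA calls with assertions can be illustrated with a toy model. The sketch below is not the Scratch tool and does not use the Cell SDK API: the names `FlowControllerModel`, `DmaRace`, and `has_race` are hypothetical, and the race condition is a simplifying assumption (two pending transfers touch overlapping local-store ranges and at least one of them, a "get", writes the local store).

```python
class DmaRace(AssertionError):
    """Raised when a newly issued transfer conflicts with a pending one."""


class FlowControllerModel:
    """Toy stand-in for a memory flow controller (hypothetical, not a real API)."""

    def __init__(self):
        # Each pending entry is (tag, local_start, size, writes_local).
        self.pending = []

    def _assert_no_race(self, start, size, writes_local):
        # The "instrumentation": checked every time a transfer is issued.
        for tag, p_start, p_size, p_writes in self.pending:
            overlaps = start < p_start + p_size and p_start < start + size
            if overlaps and (writes_local or p_writes):
                raise DmaRace(f"conflict with pending transfer tagged {tag}")

    def get(self, tag, start, size):
        """Issue a transfer that writes local store range [start, start + size)."""
        self._assert_no_race(start, size, writes_local=True)
        self.pending.append((tag, start, size, True))

    def put(self, tag, start, size):
        """Issue a transfer that reads local store range [start, start + size)."""
        self._assert_no_race(start, size, writes_local=False)
        self.pending.append((tag, start, size, False))

    def wait(self, tag):
        """Block until all transfers with this tag complete (modeled as removal)."""
        self.pending = [p for p in self.pending if p[0] != tag]


def has_race(ops):
    """Replay a sequence of (method_name, *args) operations; report any race."""
    model = FlowControllerModel()
    try:
        for name, *args in ops:
            getattr(model, name)(*args)
    except DmaRace:
        return True
    return False
```

In this model, issuing a `put` over a buffer that a still-pending `get` is filling is flagged, while the same sequence with an intervening `wait` on the first tag is accepted. A model checker would explore such sequences exhaustively rather than replaying a single trace.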
Multicore processor designs have become increasingly popular for embedded applications in recent years, but diversified inter-core communication mechanisms have led to difficulties in software development, integration, and migration. A unified, portable, and efficient inter-core communication mechanism would significantly reduce these difficulties, but no such solution exists today. We propose a scheme called MSG, which provides users with a set of essential message-passing programming interfaces adopted from MPI and MCAPI, including blocking and non-blocking point-to-point communications, one-sided communications, and collective operations. We evaluated our design methodology with case studies on two heterogeneous multicore platforms: IBM CELL and ITRI PAC DUO. On the CELL platform, our MSG library fit in the 256 KB local memory of each individual processor core and outperformed two existing communication libraries, DaCS and CML. In the second case study, we were able to port MSG to the PAC DUO platform within two weeks of receiving the platform. Using a systematic approach, we show how optimizations can be applied to improve the performance of the MSG libraries. We hope our experiences will help the design and development of communication libraries for existing and future multicore platforms. (C) 2010 Elsevier B.V. All rights reserved.
ISBN: 9781450309424 (print)
Multicore programming is both prevalent and difficult. Industry programmers deal with large amounts of legacy code and are increasingly relying on multithreading to provide scalability. For legacy systems, it may not be possible to change this programming model. The Transitioning to Multicore (TMC) workshop focuses on tools and systems for parallel programming that are interoperable with legacy code, that minimize the annotation burden for developers, and that match well with current industry practice. We solicit industry experience reports about working or unworkable examples of such tools or systems, as well as research reports.
ISBN: 9781450309424 (print)
The impending multi/many-core processor revolution requires that programmers leverage explicit concurrency to improve performance. Unfortunately, a large body of applications and algorithms are inherently hard to parallelize, either because of execution order constraints imposed by data and control dependencies, or because they are sensitive to their input data and do not scale perfectly, leaving several cores idle. The goal of this research is to enable such applications to leverage multi/many-core processors efficiently to improve their performance.
ISBN: 9780889869073 (print)
Speculative multithreading is one of the most promising methods for speeding up program execution on multicore systems. Each loop has many possible execution paths; in many cases, however, only a few of them are executed frequently. We focus on the two-path limited speculation method, which speculates on only the two most frequent paths, selected from path profiling results of the whole program execution. To maximize the performance of the method, this paper discusses 'phased behavior' in program execution. It first shows actual program behavior in terms of execution paths, and then introduces practical methods for dynamically selecting speculation paths. Preliminary estimates show that a realistically optimal method achieves a speedup of up to 1.78 times, and that the practical method can increase the speculation success ratio by up to 10 percent.
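The difference between whole-program and phase-aware path selection can be sketched as follows. This is an illustrative assumption, not the paper's algorithm: the function names, the fixed `phase_length` window, and the representation of a path profile as a list of path identifiers are all hypothetical.

```python
from collections import Counter


def select_speculation_paths(path_trace, limit=2):
    """Return the `limit` most frequently executed path IDs in the trace."""
    counts = Counter(path_trace)
    return [path for path, _ in counts.most_common(limit)]


def select_per_phase(path_trace, phase_length, limit=2):
    """Re-select the hot paths for each fixed-size window (phase) of the trace."""
    return [
        select_speculation_paths(path_trace[i:i + phase_length], limit)
        for i in range(0, len(path_trace), phase_length)
    ]


# Hypothetical loop-iteration trace with two phases of nine iterations each.
trace = ["A"] * 5 + ["B"] * 3 + ["C"] * 1 + ["C"] * 6 + ["D"] * 2 + ["A"] * 1

whole_program_choice = select_speculation_paths(trace)
per_phase_choice = select_per_phase(trace, phase_length=9)
```

For this trace, the whole-program profile selects paths C and A, while the per-phase selection picks A and B in the first window and C and D in the second. This shows why a static choice can miss the paths that dominate a particular phase.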
With the growing amount of parallelism available on today's multicore processors, achieving good performance at scale is challenging. We approach this issue through an alternative to traditional thread-based parad...
Multicore programming is both prevalent and difficult. Industry programmers deal with large amounts of legacy code and are increasingly relying on multithreading to provide scalability. The Transitioning to multicore ...
In this paper, we describe challenges and solutions for programming multi-processor systems-on-a-chip, based on our experience in programming Platform2012, a large-scale multicore fabric under development by STMicroel...
Nowadays, one of the most important challenges in programming is the efficient use of multicore processors. Many new programming languages and libraries support multicore programming. Cilk++ is one of the best-known language extensions of C++, providing new keywords for multicore programming. The C++ Standard Template Library is an efficient generic library, but it does not support parallelism. It is optimized for sequential execution, and hence can become an efficiency bottleneck when used in a multicore environment. In this paper we argue for a multicore implementation of the C++ Standard Template Library for Cilk++. We consider the implementation of containers, algorithms, and functors. Our implementation takes advantage of the generative techniques of C++. We also measure the speedup of our implementation.