This article consists of a collection of slides from the author's conference presentation on software transactional memory (STM). Some of the specific topics discussed include: how to translate a language construct via STM, including runtime and compiler support; an example of hybrid TM; and the outlook for the future development of STM and the applications to be supported.
Presents a collection of slides covering the following topics: Transactional Memory language construct; synchronised HashMap; transactional HashMap; Java; failure recovery; and parallel programming.
Search is essential for constraint programming. Search engines typically combine several features like state restoration for back-tracking, best solution search, parallelism, or visualization. In current implementatio...
Business process modeling (BPM) is one of the key factors in defining service-oriented solutions for business collaborations. As in traditional software engineering, there is a need for adaptable methodologies to develop information and communication technology (ICT) systems that support collaborative business processes. In this work we introduce a categorization of the modeling languages and approaches used to model collaborative business processes. Using an example, we show how this classification facilitates the development of methodologies for collaborative business processes.
ISBN: (print) 1932415602
With the industry trend towards multi-core chips, we need to understand the practical issues faced by users who are porting large existing sequential software to a parallel platform. There is much literature on inventing, implementing and optimizing parallel algorithms. We study what a user needs in order to come up with an efficient parallel program, starting from a sequential application. We parallelized, debugged and optimized a large, public-domain ray-tracing code (POV-Ray). We built a performance tool to predict parallel POV-Ray's scalability on different platforms. The performance prediction tool and modeling were an important part of the optimization methodology because they guide the user's expectations of the potential benefits of further effort to identify performance bottlenecks. We describe the steps in our POV-Ray work, and summarize potentially relevant research issues. Even for a sequential algorithm which can be "embarrassingly parallelized", there is still much scope for further research to simplify the user's parallelization and optimization efforts.
Dynamic voltage and frequency scaling (DVFS) is an effective technique for controlling microprocessor energy and performance. Existing DVFS techniques are primarily based on hardware, OS time interrupts, or static-comp...
ISBN: (print) 1595931619
Code density is an important issue in memory-constrained systems. Some RISC processors, e.g. ARM processors with the THUMB extension, support aggressive code size reduction even at the cost of significant performance loss. In this paper, we develop an algorithm that utilizes a set of novel variable-length Echo instructions and evaluate its effectiveness for IA32 binaries. Our experiments show that an IA32 processor equipped with Echo instructions is capable of achieving a code density similar to that of the THUMB extension in the ARM instruction set, with a significantly lower performance penalty. Copyright 2005 ACM.
We propose a lexicalized formulation of dependency grammar that addresses both immediate dependence and linear precedence. Our approach distinguishes two orthogonal, yet mutually constraining dependency trees: an ID tree of syntactic dependencies and an LP tree of topological dependencies. The ID tree is non-ordered and non-projective, and its edges are labeled by grammatical functions. The LP tree is ordered and projective and expresses licensed linearizations; its edges are labeled by topological fields. The LP tree can be regarded as deriving from the ID tree through a process of emancipation controlled by lexicalized constraints and principles. In the present article, we formalize valid ID/LP analyses and show how they can be characterized as the solutions of a constraint satisfaction problem. The latter can be solved by constraint programming and forms the basis of our implementation.
The emerging hardware support for thread-level speculation opens new opportunities to parallelize sequential programs beyond the traditional limits. By speculating that many data dependences are unlikely to occur at runtime, consecutive iterations of a sequential loop can be executed speculatively in parallel. Runtime parallelism is obtained when the speculation is correct. To take full advantage of this new execution model, a program needs to be programmed or compiled in such a way that it exhibits a high degree of speculative thread-level parallelism. We propose a comprehensive cost-driven compilation framework to perform speculative parallelization. Based on a misspeculation cost model, the compiler aggressively transforms loops into optimal speculative parallel loops and selects only those loops whose speculative parallel execution is likely to improve program performance. The framework also supports and uses enabling techniques such as loop unrolling, software value prediction and dependence profiling to expose more speculative parallelism. The proposed framework was implemented on the ORC compiler. Our evaluation showed that the cost-driven speculative parallelization was effective. Our compiler was able to generate good speculative parallel loops in ten Spec2000Int benchmarks, which currently achieve an average 8% speedup. We anticipate an average 15.6% speedup when all enabling techniques are in place.
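The loop-selection idea in the abstract above can be illustrated with a small sketch. This is not the paper's cost model, which the abstract does not specify; it is a toy model under stated assumptions: misspeculated work is re-executed sequentially, and spawn/commit overhead is a fixed per-loop cost. The names `LoopProfile`, `estimated_speculative_cycles` and `select_loops`, and all fields, are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LoopProfile:
    """Hypothetical per-loop profile data; every field name is illustrative."""
    name: str
    sequential_cycles: float     # estimated cycles when run sequentially
    num_threads: int             # speculative threads available
    misspeculation_prob: float   # estimated fraction of work that is re-executed
    overhead_cycles: float       # assumed fixed spawn/commit overhead for the loop


def estimated_speculative_cycles(loop: LoopProfile) -> float:
    """Toy cost model: ideal parallel time + re-executed work + overhead."""
    ideal = loop.sequential_cycles / loop.num_threads
    # Assumption: misspeculated work is redone sequentially.
    reexecution = loop.misspeculation_prob * loop.sequential_cycles
    return ideal + reexecution + loop.overhead_cycles


def select_loops(loops: List[LoopProfile]) -> List[str]:
    """Keep only loops whose estimated speculative time beats sequential time."""
    return [
        loop.name
        for loop in loops
        if estimated_speculative_cycles(loop) < loop.sequential_cycles
    ]


if __name__ == "__main__":
    candidates = [
        LoopProfile("mostly_independent", 1000.0, 4, 0.1, 50.0),
        LoopProfile("heavily_dependent", 1000.0, 4, 0.8, 50.0),
    ]
    print(select_loops(candidates))
```

Under these assumptions a loop with frequent cross-iteration dependences is rejected because its re-execution cost outweighs the parallel gain, which mirrors the abstract's point that only loops likely to benefit are selected.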