ISBN: 0818678763 (print)
This paper surveys the program dependence analysis technique for parallel and/or distributed programs and its applications from the viewpoint of software engineering. We present the primary program dependences that may exist in a parallel and/or distributed program; a general approach to defining, analyzing, and representing these dependences formally; and applications of an explicit, program-dependence-based representation of parallel and/or distributed programs in various software engineering activities. We also suggest some research problems in this direction.
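As a minimal illustration of what a program dependence is, the following Python sketch computes flow (read-after-write) data dependences for a straight-line sequence of statements from the variables each statement defines and uses. The statement encoding and the name `flow_dependences` are assumptions made for this example only; the paper's formal definitions for parallel and/or distributed programs are not reproduced here.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: (label, variables defined, variables used).
Statement = Tuple[str, Set[str], Set[str]]


def flow_dependences(stmts: List[Statement]) -> List[Tuple[str, str, str]]:
    """Return (source, target, variable) triples in which `target` reads
    a value of `variable` most recently written by `source`."""
    last_def: Dict[str, str] = {}  # variable -> label of its latest definition
    deps: List[Tuple[str, str, str]] = []
    for label, defs, uses in stmts:
        # Record uses first, so a statement never depends on its own write.
        for var in sorted(uses):
            if var in last_def:
                deps.append((last_def[var], label, var))
        for var in defs:
            last_def[var] = label
    return deps


program = [
    ("s1", {"a"}, set()),       # a = 1
    ("s2", {"b"}, {"a"}),       # b = a + 1
    ("s3", {"a"}, {"b"}),       # a = b * 2
    ("s4", {"c"}, {"a", "b"}),  # c = a + b
]
deps = flow_dependences(program)
```

Each triple is one edge of a dependence graph, so `deps` can be read as an explicit dependence-based representation of this small sequential fragment.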
ISBN: 076951104X; 0769511058 (print)
When used to simulate manufacturing systems, most existing parallel simulation languages cannot easily implement some features of those systems, such as the scheduling rules of a machine or the sharing of operators by multiple machines. This paper presents the design and implementation of a parallel object-oriented manufacturing simulation language called POMSim. A POMSim simulation is developed using the concepts of classes (entity types) and inheritance to support iterative design of efficient simulation models. POMSim completely hides the details of parallel simulation and provides simple, direct constructs to efficiently model scheduling rules in manufacturing simulation. It also provides asynchronous method invocation and synchronous function calls. The POMSim libraries predefine a set of basic classes for manufacturing simulation, each of which represents a particular component of the physical manufacturing system.
ISBN: 9781424416936 (print)
Today, for many reasons, such as the inherent heterogeneity, diversity, and continuous evolution of current computing platforms, writing efficient parallel applications for such systems is a great challenge. One way to address this problem is to optimize the communications of such applications. Our objective in this work is to design a realistic model that accurately predicts the cost of communication operations in execution environments characterized by both heterogeneity and hierarchical structure. We principally aim to guarantee good prediction quality with negligible additional overhead. The proposed model was applied to point-to-point and collective communication operations; experiments on a hierarchical cluster-based system with heterogeneous resources showed that the predicted performance is close to the measured performance.
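The abstract does not give the model's equations. As a rough sketch of the kind of prediction involved, the Python code below uses a generic latency/bandwidth cost (time = latency + size / bandwidth) with separate parameters for each cluster's internal network and for the inter-cluster link. The class `Link`, both functions, the single-port sequential broadcast scheme, and all parameter values are hypothetical choices for illustration and are not the paper's model.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Link:
    latency_s: float      # per-message startup cost, in seconds
    bandwidth_Bps: float  # sustained bandwidth, in bytes per second


def p2p_time(link: Link, nbytes: int) -> float:
    """Predicted time of one point-to-point message over `link`."""
    return link.latency_s + nbytes / link.bandwidth_Bps


def hierarchical_broadcast_time(cluster_sizes: List[int],
                                intra_links: List[Link],
                                inter_link: Link,
                                nbytes: int) -> float:
    """Predicted completion time of a two-level broadcast.

    Assumed scheme: the root (in cluster 0) sends to one coordinator per
    remote cluster, one message at a time; every coordinator, and finally
    the root itself, then sends to its local nodes one at a time.
    """
    t_inter = p2p_time(inter_link, nbytes)
    n = len(cluster_sizes)
    finish = []
    for k, (size, link) in enumerate(zip(cluster_sizes, intra_links)):
        # Cluster 0 starts local sends after all remote sends are done;
        # the coordinator of cluster k receives the data at k * t_inter.
        start = (n - 1) * t_inter if k == 0 else k * t_inter
        finish.append(start + (size - 1) * p2p_time(link, nbytes))
    return max(finish)


fast = Link(latency_s=50e-6, bandwidth_Bps=1e9)   # hypothetical fast cluster
slow = Link(latency_s=100e-6, bandwidth_Bps=1e8)  # hypothetical slow cluster
wan = Link(latency_s=1e-3, bandwidth_Bps=1e7)     # hypothetical inter-cluster link
predicted = hierarchical_broadcast_time([4, 8], [fast, slow], wan, 1_000_000)
```

In this toy configuration the slower remote cluster determines the predicted completion time, which is the kind of effect a heterogeneity-aware model is meant to capture.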
ISBN: 9781424416936 (print)
The adage "the whole is not equal to the sum of its parts" is very appropriate in the context of verifying a range of systemic properties, such as deadlocks, correctness, and conformance to quality of service (QoS) requirements, for component-based distributed real-time and embedded (DRE) systems. For example, computing the end-to-end worst-case response time (WCRT) in component-based DRE systems is not as simple as accumulating the WCRT of each individual component in the system, because of the inherent complexities introduced by the large solution space of possible deployments and configurations. This paper describes a novel process and tool-based artifacts that simplify the formal specification of component-based DRE systems for verification of systemic QoS properties. Our approach is based on the mathematical formalism of Timed Input/Output Automata and uses generative programming techniques to automate the verification of systemic QoS properties for component-based DRE systems.
ISBN: 9780738110646 (print)
Task-based programming models have shown their potential for efficiency and scalability in parallel and distributed systems. With such a model, a parallel application is broken down into a graph of tasks, which are subsequently scheduled for execution. Recently, implementations of task-based models have addressed distributed memory and heterogeneous systems with accelerators. However, scheduling tasks and allocating resources at runtime remains a challenge. In this paper, we propose coordinated and cooperative task scheduling across multiple applications. The main idea is to exploit an application's idle time, e.g., from load imbalance, to serve tasks from another application. The experiments use Chameleon, a task-based framework for reactive tasking in distributed memory systems. In various example scenarios, we show improvements in CPU utilization of 5% to 15% with coordinated scheduling.
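The idea of using one application's idle time to serve another application's tasks can be sketched with a toy simulation. The Python code below assumes one worker per application, independent tasks with known durations, and zero cost for moving a task between applications; the function `simulate` and the example workloads are hypothetical and do not use the Chameleon API or reproduce the paper's experiments.

```python
import heapq
from typing import List, Tuple


def simulate(queues: List[List[float]], cooperative: bool) -> Tuple[float, float]:
    """Return (makespan, CPU utilization) for one worker per application.

    queues[a] lists the task durations of application a. In cooperative
    mode, a worker whose own queue is empty serves the application with
    the most remaining work instead of idling.
    """
    pending = [list(q) for q in queues]
    free_at = [(0.0, w) for w in range(len(queues))]  # (time free, worker)
    heapq.heapify(free_at)
    busy = 0.0
    makespan = 0.0
    while free_at:
        t, w = heapq.heappop(free_at)
        source = w if pending[w] else None
        if source is None and cooperative:
            candidates = [a for a in range(len(pending)) if pending[a]]
            if candidates:
                source = max(candidates, key=lambda a: sum(pending[a]))
        if source is None:
            continue  # no work left for this worker; it idles until the end
        duration = pending[source].pop(0)
        busy += duration
        makespan = max(makespan, t + duration)
        heapq.heappush(free_at, (t + duration, w))
    return makespan, busy / (makespan * len(queues))


workloads = [[2, 2, 2, 2], [1, 1]]  # imbalanced: application 1 finishes early
independent = simulate(workloads, cooperative=False)
coordinated = simulate(workloads, cooperative=True)
```

The utilization gain in this toy example is much larger than the 5% to 15% reported in the abstract, because the sketch ignores task migration cost and uses a deliberately extreme imbalance.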
ISBN: 0769523625 (print)
The second international workshop on Emerging Technologies for Next-generation GRID (ETNGRID) gathered many researchers working on emerging aspects of Grid computing. Submitted papers focused on challenging open problems, proposing solutions based on approaches borrowed from the world of distributed systems, as well as new and promising techniques. We provide here a brief overview of the papers, showing their contributions and the advances they introduce in the field of Grid computing.
Many simulations in the natural sciences and engineering require the numerical solution of nonlinear differential equations. For this class of numerical methods, we propose an appropriate parallel computation model on distributed memory machines that supports the prediction of execution times. As a case study, we investigate the parallel implementation of the diagonal-implicitly iterated Runge-Kutta method, a solution method for stiff systems of ordinary differential equations. An implementation on the Intel iPSC/860 confirms the accuracy of the prediction model.
ISBN: 9781467345651; 9780769549033 (print)
In linear algebra, Cholesky factorization is useful for solving a system of equations with a symmetric positive definite coefficient matrix. Cholesky factorization is roughly twice as fast as LU factorization, which applies to general matrices. In recent years, with advances in technology, a Fermi GPU card can accommodate hundreds of cores, compared with the 8 or 16 cores on a CPU. A trend has therefore emerged of using the graphics card as a general-purpose graphics processing unit (GPGPU) for parallel computation. In this work, Volkov's hybrid implementation of Cholesky factorization is evaluated on the new Fermi GPU alongside others, and some improvement strategies are proposed. In our experiments, compared with the CPU version using the Intel Math Kernel Library (MKL), our proposed GPU improvement strategy achieves a speedup of 3.85x for the Cholesky factorization of a square matrix of dimension 10,000.
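For readers unfamiliar with the factorization itself, the following pure-Python sketch factors a symmetric positive definite matrix A as L L^T and uses the factor to solve A x = b by forward and back substitution. It is a sequential textbook version for illustration, not the hybrid CPU/GPU implementation evaluated in the paper; the function names and the small test matrix are chosen for this example. The "twice as fast" remark reflects operation counts: Cholesky needs about n^3/3 floating-point operations, versus about 2n^3/3 for LU.

```python
import math
from typing import List

Matrix = List[List[float]]


def cholesky(a: Matrix) -> Matrix:
    """Return lower-triangular L with A = L * L^T (A must be SPD)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = a[j][j] - sum(L[j][k] ** 2 for k in range(j))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        L[j][j] = math.sqrt(s)
        for i in range(j + 1, n):
            L[i][j] = (a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))) / L[j][j]
    return L


def solve_spd(a: Matrix, b: List[float]) -> List[float]:
    """Solve A x = b using the Cholesky factor of A."""
    L = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


A = [[4.0, 12.0, -16.0],
     [12.0, 37.0, -43.0],
     [-16.0, -43.0, 98.0]]
b = [1.0, 2.0, 3.0]
L = cholesky(A)
x = solve_spd(A, b)
```

Only the lower triangle of A is read, which is where the saving over LU comes from: symmetry means half of the matrix carries no extra information.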
The proceedings contain 26 papers from the 11th Workshop on Parallel and Distributed Simulation (PADS'97). Topics discussed include: conservative simulation; architecture and VLSI simulation; event simultaneity; VLSI circuit partitioning; optimistic simulation; Petri net simulation; interconnection computer network; distributed simulation; Hierarchical Tool HIT; Multi-Resolution Entity; bulk synchronous parallel models; and parallel simulation environments.
ISBN: 0780394925 (print)
Developing applications for parallel and distributed systems is hard due to their nondeterministic nature; developing debugging tools for such systems and applications is even harder. A number of distributed debugging tools and techniques exist; however, we believe that they lack the infrastructure to scale to large distributed systems with hundreds or thousands of nodes, such as Grids. In this paper, we introduce PDB, our prototype debugger, which is based on a hierarchical, scalable architecture. We explain the design of PDB, highlight its functionality, and demonstrate its usability with two case studies. Before concluding, we discuss portability and extensibility issues for PDB and some possible solutions.