The hybrid MPI+OpenMP programming model is useful for solving the load-balancing problems of parallel applications independently of the architecture. Typical approaches to balancing parallel applications that use two levels of parallelism, or MPI only, consist of adding complex code that dynamically detects which data domains are more computationally intensive and then manually redistributes either the allocated processors or the data. This approach has two drawbacks: it is time consuming and it requires an expert in application analysis. In this paper we present an automatic and dynamic approach for load balancing MPI+OpenMP applications. The system calculates the percentage of load imbalance and decides on a processor distribution for the MPI processes that eliminates the computational load imbalance. Results show that this method can effectively balance applications without analyzing or modifying them, and that when the application is already well balanced, the dynamic instrumentation and analysis do not incur a large overhead.
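The abstract does not give the formula or the distribution policy, so the following is only a minimal sketch of the idea under stated assumptions: imbalance is measured as the gap between the slowest MPI process and the average, OpenMP threads scale perfectly within a process, and processors are handed out greedily. The function names and the sample timings are hypothetical, not taken from the paper.

```python
def imbalance_percentage(times):
    """Percentage gap between the slowest process and the average process.

    Assumed metric: 100 * (max - mean) / max. The paper may define it differently.
    """
    slowest = max(times)
    average = sum(times) / len(times)
    return 100.0 * (slowest - average) / slowest


def distribute_processors(times, total_cpus):
    """Give every MPI process one CPU, then hand out the remaining CPUs one at
    a time to the process with the highest estimated time."""
    if total_cpus < len(times):
        raise ValueError("need at least one CPU per MPI process")
    cpus = [1] * len(times)
    for _ in range(total_cpus - len(times)):
        # Assumption: perfect OpenMP scaling, so estimated time = time / cpus.
        worst = max(range(len(times)), key=lambda i: times[i] / cpus[i])
        cpus[worst] += 1
    return cpus


# Hypothetical single-CPU compute times measured for four MPI processes.
measured = [40.0, 10.0, 10.0, 20.0]
print(imbalance_percentage(measured))       # 50.0
print(distribute_processors(measured, 8))   # [4, 1, 1, 2]
```

With eight processors the greedy pass ends with every process at an estimated 10 time units, which is the kind of outcome the abstract describes as eliminating the computational imbalance.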
Gamma is a simple but powerful parallel programming language. Its only data structure is the multiset, or bag. Banatre and Metayer proposed the language as a means of systematically deriving programs in the spirit of Dijkstra's discipline of programming. It has been applied to problems in scheduling, image processing, and fractal generation. Different operational semantics models for Gamma have been proposed. We single out one of them and also produce a denotational semantics, which we prove fully abstract with respect to the chosen operational model. The result can be used as a basis for a further proof system for Gamma.
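To make the multiset-rewriting style concrete, here is a small sketch of a Gamma-like program: a reaction condition and an action are applied to pairs of elements until no pair satisfies the condition. This is a sequential simulation of one possible execution order, not the operational or denotational semantics studied in the paper; the `gamma` function and its pair-only reactions are simplifying assumptions.

```python
from collections import Counter
from itertools import permutations


def gamma(multiset, condition, action):
    """Repeatedly replace a pair (x, y) satisfying `condition` with the
    elements returned by `action`, until the multiset is stable."""
    bag = list(multiset)
    while True:
        for i, j in permutations(range(len(bag)), 2):
            if condition(bag[i], bag[j]):
                products = action(bag[i], bag[j])
                # Remove the two reacting elements, then add the products.
                bag = [v for k, v in enumerate(bag) if k not in (i, j)]
                bag.extend(products)
                break
        else:
            # No pair can react any more: the computation has terminated.
            return Counter(bag)


# Maximum of a multiset: a pair x <= y reacts and leaves only y.
largest = gamma([3, 1, 4, 1, 5], lambda x, y: x <= y, lambda x, y: [y])
print(largest)  # Counter({5: 1})

# Primes up to 19: a multiple y of x reacts with x and leaves only x.
primes = gamma(range(2, 20), lambda x, y: y % x == 0, lambda x, y: [x])
print(sorted(primes))  # [2, 3, 5, 7, 11, 13, 17, 19]
```

In Gamma itself the choice of reacting elements is nondeterministic and reactions on disjoint elements may proceed in parallel; the fixed scan order here is only a convenience of the simulation.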
This paper presents a model that combines the processing power of parallel computation with the ease of Web service usage. In this model, a parallel programming environment can be embedded in a visual environment. Parallelization of Web services is provided by using multithreading technology with dataset parameters. This work also enables parallel use of computers located in different places via a wide area network such as the Internet.
The pipeline is a simple and intuitive structure for speeding up many problems. Novice parallel programmers are usually taught this structure early on. However, expert parallel programmers typically avoid the pipeline in coarse-grained applications because it has three serious problems that make it difficult to implement efficiently. First, processors are idle when the pipeline is not full. Second, load balancing is crucial to obtaining good speedup. Third, it is difficult to incrementally incorporate more processors into an existing pipeline. Instead, experts recast the problem as a master/slave structure, which does not suffer from these problems. This paper details a transformation that allows programs written in a pipeline style to execute using the master/slave structure. Parallel programmers can thus benefit from both the intuitive simplicity of the pipeline and the efficient execution of a master/slave structure. This is demonstrated by performance results from two applications.
ISBN: 9780780384309 (print)
We describe the design and implementation of a "Grid-enabled" message passing library, in the context of the Phoenix message passing model. It supports: (1) message routing between nodes that are not directly reachable due to firewalls and/or NAT; (2) resource discovery that eases configuration and allows nodes without static names (e.g., DHCP nodes) to participate in computation without special effort; and (3) nodes dynamically joining and leaving the computation at runtime. We argue that, in future Grid environments, all of the above functions, not just routing across firewalls, will become important issues for Grid-enabled message passing systems, including MPI. Unlike solutions commonly proposed by previous work on Grid-enabled MPI, our system runs a distributed resource discovery and routing table construction algorithm, rather than assuming that all such information is available in a static configuration file or the like. Experimental results using 400 nodes in three LANs indicate that our algorithm is able to dynamically discover participating peers, connect them to each other, and calculate a routing table. The elapsed time of our algorithm is only approximately twice that of offline route calculation, which simply connects nodes based on a fully given configuration.
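As an illustration of what "calculating a routing table" over discovered peers can mean, the sketch below builds a next-hop table for one node with a breadth-first search over the links it knows about. This is a centralized, single-node simplification and an assumption on our part: the system described above uses a distributed algorithm whose details the abstract does not give. The node names and the `routing_table` function are hypothetical.

```python
from collections import deque


def routing_table(links, source):
    """Map every reachable node to the direct neighbor of `source` through
    which messages should be forwarded (shortest path in hop count)."""
    next_hop = {}
    queue = deque()
    for neighbor in sorted(links.get(source, ())):
        next_hop[neighbor] = neighbor
        queue.append(neighbor)
    while queue:
        node = queue.popleft()
        for neighbor in sorted(links.get(node, ())):
            if neighbor != source and neighbor not in next_hop:
                # Reach `neighbor` via the same first hop used to reach `node`.
                next_hop[neighbor] = next_hop[node]
                queue.append(neighbor)
    return next_hop


# Hypothetical topology: "a" and "b" sit behind firewalls and can only
# talk to their own gateways, which can talk to each other.
links = {
    "a": {"gw1"},
    "gw1": {"a", "gw2"},
    "gw2": {"gw1", "b"},
    "b": {"gw2"},
}
print(routing_table(links, "a"))    # {'gw1': 'gw1', 'gw2': 'gw1', 'b': 'gw1'}
print(routing_table(links, "gw1"))  # {'a': 'a', 'gw2': 'gw2', 'b': 'gw2'}
```

Node "a" never connects to "b" directly; it only needs to know that messages for "b" go to "gw1", which is the kind of relay that routing across firewalls or NAT requires.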
Summary form only given. Since the introduction of the Java language less than a decade ago, there have been several attempts to create a runtime system for distributed execution of multithreaded Java applications. The goal of these attempts was to gain increased computational power while preserving Java's convenient parallel programming paradigm. This paper gives a detailed overview of the existing distributed runtime systems for Java and presents a new approach, implemented in a system called JavaSplit. Unlike previous works, which either forfeit Java's portability or introduce unconventional programming constructs, JavaSplit is able to execute standard multithreaded Java while preserving portability. JavaSplit works by rewriting the bytecodes of a given parallel application, transforming it into a distributed application that incorporates all the runtime logic. Each runtime node carries out its part of the resulting distributed computation using nothing but its local standard (unmodified) Java virtual machine (JVM).
Summary form only given. Within the trend of object-based distributed computing, we present the design and implementation of a numerical simulation of electromagnetic wave propagation. A sequential Java design and implementation is presented first. A distributed and parallel version is then derived from it using an active object pattern. In addition, benchmarks are presented for this non-embarrassingly parallel application. A first contribution lies in the sequential object-oriented design, which proved to be very modular and extensible: the classes and abstractions are designed to allow both element and volume type methods and are valid on structured, unstructured, or hybrid meshes. Compared to a Fortran version, the performance of this highly modular version proved to be in the same range. It is also shown how smoothly the sequential version can be distributed while keeping the same structure and object abstractions, which makes it possible to handle larger data sizes. Finally, benchmarks on up to 64 processors compare the performance of the sequential and parallel versions and put it in perspective with a comparable Fortran version.
Shared virtual memory (SVM) is a practical approach for providing a simple parallel programming environment on a cluster of computers, since it permits programmers to assume the existence of a shared memory image across physically distributed memory systems, obviating the need for explicit message passing operations. While several fault-tolerant techniques for crash recovery support in SVM have been proposed and studied extensively, very little attention has been given to fault detection issues. In this paper, we propose and evaluate a new fault detection technique, called lightweight fault detection (LFD). Our experimental results confirm that LFD provides swift fault detection support for SVM and incurs very little overhead (1.42% on average) during failure-free execution. Hence, a combination of LFD with previously proposed crash recovery techniques makes cluster computing on SVM more reliable and attractive.
This work presents an original approach for identifying and analyzing cognitive aspects specific to artifact creation in the context of a senior-level project course. We analyzed the effort spent by student teams on each specific artifact for two different processes carried out in parallel: UPEDU (unified process for education), the traditional activity-role-artifact based methodology, and XP (extreme programming), an agile methodology. For this purpose we extend the notion of mental model to the software process and propose the terms mental artifact and physical artifact. Even though the two processes used are very different, our results show that the same effort is spent on mental artifacts as on physical artifacts in both processes. The comparison results may allow a better understanding of students' cognitive behavior and help identify the actions required to improve academic projects.
Summary form only given. The article is devoted to the concept of stepping commands in parallel debuggers. It reviews the main existing schemes (synchronous and asynchronous step implementations) and introduces a new kind of synchronous scheme that has several advantages over existing ones. In this scheme the debugger performs a dynamic analysis of the program state, thus simplifying the presentation and control of the program state. A possible implementation of the suggested scheme in MPI debuggers is discussed, and the existing implementation in the mpC Workshop parallel debugger is presented.