The field of distributed parallel programming is dominated by tools such as the Parallel Virtual Machine (PVM) and the Message Passing Interface (MPI). Distributed computing, on the other hand, relies mainly on standards such as the Common Object Request Broker Architecture (CORBA), Remote Method Invocation (RMI), and the Distributed Component Object Model (DCOM). In this paper, we examine the suitability of CORBA-based solutions for meeting application requirements in the field of parallel programming. We outline concepts defined within CORBA that are helpful for the development of parallel applications. Subsequently, we present our design of an object group service and a join service, which facilitate the development of CORBA-based distributed and parallel software applications by transparently encapsulating the forking and joining mechanisms often needed in that context.
The design pattern concept is widely used in large object-oriented software development, but it need not be limited to the object-oriented field: it can be used in many other areas. Explicit parallel programming is well known to be complex and error-prone, and design patterns can ease this work. This paper introduces a pattern-based approach to parallel programming, in which we classify design patterns into two levels to support (a) the parallel algorithm design phase and (b) the parallel coding phase, respectively. With this approach, a programmer does not need much additional knowledge about parallel computing; the programmer only needs to describe the problem to be solved and supply some parameters, sequential code, or components. We demonstrate this approach with a case study.
Author:
P. Kacsuk, MTA SZTAKI
Hungarian Academy of Sciences (ATOMKI) Budapest Hungary
The paper describes PROVE, the performance visualization tool of an integrated parallel program development environment called GRADE. All four major aspects of performance visualization (source code instrumentation, data acquisition, data analysis, and visualization) are explained both generally and in the context of PROVE.
This paper presents some results of programming efficient matching algorithms on a new asynchronous parallel programming model. Matching algorithms are widely used in image processing for high-level treatments: pattern analysis, database search, and 2D and 3D reconstruction all rely on them. Our experiments were mainly oriented towards a particular matching problem, the stable marriage problem. Different implementations of the stable marriage algorithm were developed on a massively parallel asynchronous model. This model relies on a network of asynchronously communicating processors, leading to very fast SIMD treatments. The asynchronous model and the implementations of the matching algorithm are presented. An example image processing problem is also used for illustration and supports the architectural discussion and results.
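For readers unfamiliar with the stable marriage problem named above, the following is a minimal sequential sketch of the classic Gale-Shapley procedure. It is not the asynchronous parallel implementation the abstract describes; the function name `stable_marriage` and the sample preference lists are illustrative assumptions.

```python
def stable_marriage(proposer_prefs, acceptor_prefs):
    """Sequential Gale-Shapley sketch (illustrative, not the paper's parallel version).

    Both arguments map a name to that participant's preference list,
    most preferred first. Returns a dict proposer -> acceptor.
    """
    # rank[a][p] is the position of proposer p in acceptor a's list (lower is better).
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # index of the next acceptor to try
    engaged_to = {}                               # acceptor -> current proposer
    free = list(proposer_prefs)

    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(a)
        if current is None:
            engaged_to[a] = p            # acceptor was free: accept
        elif rank[a][p] < rank[a][current]:
            engaged_to[a] = p            # acceptor trades up; the old partner is free again
            free.append(current)
        else:
            free.append(p)               # rejected: p will try the next choice later

    return {p: a for a, p in engaged_to.items()}


# Hypothetical sample data, chosen only to exercise the function.
proposers = {"A": ["X", "Y", "Z"], "B": ["Y", "X", "Z"], "C": ["X", "Z", "Y"]}
acceptors = {"X": ["B", "A", "C"], "Y": ["A", "B", "C"], "Z": ["A", "B", "C"]}
matching = stable_marriage(proposers, acceptors)
```

In this sample, C first proposes to X but is displaced by A, whom X ranks higher, and ends up matched with Z.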
Parallel programming tools and packages are evolving rapidly. However, the complexity of parallel thinking prevents end users from implementing many algorithms themselves. In most cases only expert programmers venture into parallel programming and program debugging. In this paper we extend the ideas of template programming from [3] to a class of problems that can be solved using the general master-slave paradigm. The template is suitable for coarse-grain and medium-grain problems. In fact, it can be applied to any problem $P$ that is decomposable into a set of tasks, $P = \bigcup_{i=0}^{N} t_i$. The most effective applications are obtained for problems where all $t_i$ are independent. Template programming sets some requirements for the sequential version of the user program: (1) the main program must consist of several code blocks: data initialization, computation of one task $t_i$, and processing of the result; (2) the user has to define the data structures: the initial data, the data of one task, and the result data. These requirements do not force the user to rewrite the existing sequential code, only to organize it into logical parts. Once these requirements (and the naming conventions) are fulfilled, the parallel version of the code is obtained automatically by compiling and linking the code with the Master-Slave Template library. In this paper we introduce the idea of template programming and describe the layer structure of the Master-Slave Template library. We show how the user has to adjust the sequential code to obtain a valid parallel version of the initial program. We also give examples for the prime number search problem and the Mandelbrot set calculation problem.
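The three-block organization described above (data initialization, computation of one task, processing of the result) can be sketched as follows. This is a hedged illustration in Python, not the Master-Slave Template library itself: the names `init_data`, `compute_task`, `process_results`, and `run_master_worker` are hypothetical, and a thread pool stands in for the worker processes such a library would manage.

```python
from concurrent.futures import ThreadPoolExecutor


def init_data():
    """Block 1: data initialization. Splits [2, 100) into independent tasks t_i."""
    return [(lo, min(lo + 25, 100)) for lo in range(2, 100, 25)]


def compute_task(task):
    """Block 2: computation of one task t_i (primes in a half-open range)."""
    lo, hi = task
    return [n for n in range(lo, hi)
            if all(n % d for d in range(2, int(n ** 0.5) + 1))]


def process_results(results):
    """Block 3: processing of the result (merge the partial lists)."""
    return sorted(p for part in results for p in part)


def run_master_worker(init, compute, process, workers=4):
    """Hypothetical driver: the master creates tasks, hands them to workers,
    and collects the results. The user-supplied blocks stay sequential."""
    tasks = init()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(compute, tasks))
    return process(results)


primes = run_master_worker(init_data, compute_task, process_results)
```

The user code contains no synchronization or communication; only the driver knows about workers, which is the separation the template approach relies on when all tasks are independent.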
We are currently studying the implementation of a C++-based parallel programming library that simplifies parallel programming and efficiently executes parallel programs on locally distributed computers with nonuniform performance. The class library employs a dynamic allocation scheme for Light-Weight Processes (LWPs) to execute parallel applications efficiently. In this paper, we describe the dynamic allocation scheme implemented in the class library, discuss its experimental evaluation with parallel applications, and compare it with static allocation. The experimental results show that dynamic allocation as implemented in the class library gives a significant performance advantage in a cluster environment whose machines have nonuniform performance. Furthermore, this study provides insights into the trade-offs between task or process allocation and memory-access costs.
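The intuition behind comparing dynamic with static allocation on machines of unequal speed can be shown with a small simulation. This is only a toy model under stated assumptions: the function names, task costs, and machine speeds are invented for illustration, it does not model LWPs or memory-access costs, and it says nothing about the paper's actual measurements.

```python
import heapq


def static_makespan(task_costs, speeds):
    """Static allocation: equal-sized contiguous blocks, one block per machine."""
    n, m = len(task_costs), len(speeds)
    chunk = -(-n // m)  # ceiling division
    return max(sum(task_costs[i * chunk:(i + 1) * chunk]) / s
               for i, s in enumerate(speeds))


def dynamic_makespan(task_costs, speeds):
    """Dynamic allocation: whichever machine becomes idle first takes the next task."""
    idle = [(0.0, i) for i in range(len(speeds))]  # (time the machine is free, machine id)
    heapq.heapify(idle)
    finish = 0.0
    for cost in task_costs:
        t, i = heapq.heappop(idle)
        t += cost / speeds[i]
        finish = max(finish, t)
        heapq.heappush(idle, (t, i))
    return finish


# Hypothetical workload: 12 unit-cost tasks, two slow machines and one 4x faster machine.
costs = [1.0] * 12
speeds = [1.0, 1.0, 4.0]
static_time = static_makespan(costs, speeds)
dynamic_time = dynamic_makespan(costs, speeds)
```

In this toy setup the static split leaves the fast machine idle while the slow ones finish their blocks, whereas the dynamic scheme lets the fast machine absorb most of the tasks.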
We present a new parallel programming tool environment that is (1) accessible and executable "anytime, anywhere" through standard Web browsers and (2) integrated, in that it provides tools that adhere to a common underlying methodology for parallel programming and performance tuning. The environment is based on a new network computing infrastructure developed at Purdue University. We evaluate our environment qualitatively by comparing our tool access method with conventional schemes of software download and installation. We also quantitatively evaluate the efficiency of interactive tool access in our environment by measuring the response times of various functions of the URSA MINOR tool and comparing them with those of a Java Applet-based "anytime, anywhere" tool access method. We found that our environment offers significant advantages in terms of tool accessibility, integration, and efficiency.
There are two classes of dataflow schemata: DF and ADF. ADF is known to be equivalent to EF, and DF, the class of ordinary dataflow schemata, is known to be equivalent to EF^d. ADF is obtained by strengthening DF with two devices. One is recursion, and the other is the arbiter, which allows timing-dependent processing. We are interested in whether both devices are necessary for ADF to have such expressive power. The author examines the expressive power of the class RDF, which strengthens DF with recursion only. It is shown that RDF is also equivalent to EF^d, which means that some kind of timing dependency is necessary for a class of dataflow schemata to be sufficiently powerful.
We present a programming methodology that reduces parallel programming complexity while creating portable and automatically scalable parallel software. To support this methodology, two separate tools have been developed: the PARSA software development environment and an accompanying thread manager. The development environment addresses programming issues via an object-based graphical programming methodology that automatically transforms a project into portable and scalable source code. The generated source code makes calls to the user-level thread manager, which manages the run-time execution of the parallel software. Two sample applications that contain various forms of parallelism have been developed and compiled on three different systems with diverse native threading mechanisms to demonstrate portability. Finally, the automatic scalability is demonstrated with the run-time performance of the applications on multiprocessor systems.
Author:
Saito, T., Tohoku Univ
Shock Wave Res Ctr Inst Fluid Sci Aoba Ku Sendai Miyagi 9808577 Japan
This paper presents the development of a numerical code for simulating unsteady dusty-gas flows including shock and rarefaction waves. The numerical results obtained for a shock tube problem are used to validate the accuracy and performance of the code. The code is then extended to simulate two-dimensional problems. Since the interactions between the gas and particle phases are calculated with the operator splitting technique, we can choose numerical schemes independently for the different phases. A semi-analytical method is developed for the dust phase, while the TVD scheme of Harten and Yee is chosen for the gas phase. Throughout this study, computations are carried out on an SGI Origin2000, a parallel computer with multiple RISC-based processors. The efficient use of the parallel computer system is an important issue, and the code implementation on the Origin2000 is also described. Flow profiles of both the gas and solid particles behind the steady shock wave are calculated by integrating the steady conservation equations. The good agreement between the pseudo-stationary solutions and those from the current numerical code validates the numerical approach and the actual coding. The pseudo-stationary shock profiles can also be used as initial conditions for unsteady multidimensional simulations. (C) 2002 Elsevier Science (USA).