This paper presents a new, highly efficient procedure for determining the dynamical equations of motion of complex multibody systems and for their subsequent temporal integration using parallel computing. The met...
ISBN: 0818678763 (print)
In this paper we study the parallel aspects of PCGLS, a basic iterative method that applies the preconditioned conjugate gradient method to the normal equations, and of the incomplete modified Gram-Schmidt (IMGS) preconditioner, for solving sparse least squares problems on massively parallel distributed memory computers. The performance of these methods on this kind of architecture is limited by the global communication required for the inner products. We describe two improvements to the parallelization of PCGLS and the IMGS preconditioner. One is to assemble the results of several inner products collectively; the other is to create situations in which communication can be overlapped with computation. A theoretical model of the computation and communication phases is presented, which allows us to determine the number of processors that minimizes the runtime. Several numerical experiments on the Parsytec GC/PowerPlus are presented.
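To make the role of the inner products concrete, the following is a minimal sequential sketch of unpreconditioned CGLS (conjugate gradients applied to the normal equations) in plain Python. It is not the authors' PCGLS or IMGS code: the helper names `dot`, `matvec`, `rmatvec`, and `cgls` are illustrative assumptions, and no preconditioner or message passing is included. The comments mark the two inner products per iteration that would each become a global reduction on a distributed memory machine.

```python
def dot(u, v):
    # On a distributed machine this would be a global reduction.
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    # Computes A x for a dense list-of-rows matrix.
    return [dot(row, x) for row in A]


def rmatvec(A, y):
    # Computes A^T y without forming the transpose.
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def cgls(A, b, iters=50, tol=1e-20):
    """Hypothetical helper: solve min ||A x - b||_2 by CGLS (no preconditioner)."""
    x = [0.0] * len(A[0])
    r = list(b)                 # residual b - A x (x starts at zero)
    s = rmatvec(A, r)           # residual of the normal equations, A^T r
    p = list(s)                 # search direction
    gamma = dot(s, s)           # inner product 1
    for _ in range(iters):
        if gamma <= tol:
            break
        q = matvec(A, p)
        delta = dot(q, q)       # inner product 2
        if delta == 0.0:
            break
        alpha = gamma / delta
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        s = rmatvec(A, r)
        gamma_new = dot(s, s)   # inner product 1 of the next iteration
        p = [si + (gamma_new / gamma) * pi for si, pi in zip(s, p)]
        gamma = gamma_new
    return x
```

For example, `cgls([[1, 0], [0, 1], [1, 1]], [1, 2, 3])` approximates the least squares solution of a small overdetermined system. Each call to `dot` is a synchronization point in a parallel setting, which is why grouping several inner products into one communication step, or overlapping them with local work, can pay off.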
ISBN: 0818678763 (print)
This paper proposes the design of SMMP, a scalable shared-memory multiprocessing system that supports Client/Server mode. The SMMP system is composed of two-level interconnection networks, a three-level memory subsystem and a three-level I/O subsystem. The design has several advantages: it is scalable, easy to implement and operate, general purpose, and offers large I/O throughput. It can serve as an excellent server for high-speed communication networks.
We introduce a new parallel programming paradigm, namely synchronous parallel critical sections. Such parallel critical sections must be seen in the context of switching between synchronous and asynchronous modes of computation. Thread farming makes it possible to generate bunches of threads that solve independent subproblems asynchronously and in parallel. In contrast, synchronous parallel critical sections make it possible to organize bunches of asynchronous parallel threads to execute a certain task jointly and synchronously. We show how the PRAM language Fork95 can be extended by a construct, join, that supports parallel critical sections. We explain its semantics and implementation, and discuss possible applications.
ISBN: 0818678763 (print)
The paper proposes a new process algebra, called χ-calculus. The language differs from π-calculus in several aspects. First, it takes a more uniform view of input and output. Second, the closed names of the language are homogeneous in the sense that there is only one kind of bound name. Third, the effects of communications in χ-calculus are delimited by localization operators, not by a sequentiality combinator. Finally, the language allows more freedom of parallelism than π-calculus. The algebraic properties of χ-processes are studied in terms of local bisimulation. It is shown that local bisimilarity is a congruence equivalence on χ-processes.
ISBN: 0818678763 (print)
Snapshot algorithms are fundamental for many distributed applications and must often be executed repeatedly. We present three snapshot algorithms. The first is based on the assumption of global time and computes channel states using several schemes. By taking a consistent cut at a global time instant, we show that the algorithm is applicable to existing snapshot algorithms. The second is a real token-passing algorithm for non-FIFO asynchronous distributed systems. Its control message complexity is O(n). The last algorithm is the repeated version of the second one. Using this algorithm, processes can concurrently obtain consistent global states at their convenience.
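The notions of a consistent cut and of channel states can be illustrated with a small sketch. This is not one of the three algorithms from the abstract; it only checks the standard consistency condition on a recorded cut. The data model is an assumption made for illustration: `cut[p]` is the number of local events of process `p` included in the cut, and each message is a hypothetical tuple `(sender, send_index, receiver, recv_index)` of zero-based local event indices.

```python
def is_consistent_cut(cut, messages):
    """Return True if no message is received inside the cut but sent outside it."""
    for sender, send_idx, receiver, recv_idx in messages:
        received_inside = recv_idx < cut[receiver]
        sent_inside = send_idx < cut[sender]
        if received_inside and not sent_inside:
            return False  # the cut would record an effect without its cause
    return True


def channel_state(cut, messages):
    """Return the messages in transit: sent inside the cut, received outside it."""
    return [
        m for m in messages
        if m[1] < cut[m[0]] and not (m[3] < cut[m[2]])
    ]
```

With a single message sent at event 1 of process 0 and received at event 0 of process 1, the cut `[2, 1]` is consistent, the cut `[1, 1]` is not, and the cut `[2, 0]` is consistent with that message recorded as part of the channel state.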
In this paper we present the results of a parallel implementation of a heart field simulation algorithm. The application of biomagnetic fields offers a wide range of uses for parallel algorithms. Pathological changes in the human body, especially in the heart muscle, can be diagnosed and localised by means of biomagnetic field parameters. The benefit of this diagnostic method is that it fits an individual reference model of the patient's heart field. Based on differences between the reference model and the measured biomagnetic field parameters, the type and position of defects in the heart can be located. The most time-consuming components of the whole algorithm are the matrix computations, especially the matrix inversion. The matrix inversion can be implemented on a parallel distributed memory system. In this paper we discuss the routing, the parallel matrix inversion, and the speedup for different network topologies as a function of the number of processors and the problem size.
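The abstract does not state which inversion method is used. As one common possibility, the sketch below shows sequential Gauss-Jordan inversion with partial pivoting in plain Python; the function name `invert` and the singularity threshold are illustrative assumptions. In each pivot step the updates of the different rows are independent of one another, which is the property a row-distributed parallel version would typically exploit.

```python
def invert(matrix):
    """Hypothetical helper: invert a square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Each row update below is independent of the others.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                if factor != 0.0:
                    aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in aug]
```

For instance, `invert([[4, 7], [2, 6]])` returns a matrix close to `[[0.6, -0.7], [-0.2, 0.4]]`.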
ISBN: 0818678763 (print)
Delaunay triangulation is widely used in applications such as volume rendering, shape representation, and terrain modeling. Its main disadvantage is the large computation time required to obtain the triangulation of an input point set. This time can be reduced by using more than one processor, and several parallel algorithms for Delaunay triangulation have been proposed. In this paper, we propose an improved parallel algorithm for Delaunay triangulation, which partitions the bounding convex region of the input point set into a number of regions by using Delaunay edges and generates Delaunay triangles in each region by applying an incremental construction approach. Partitioning by Delaunay edges eliminates the merging step otherwise required to integrate subresults. Experiments show that the proposed algorithm has good load balance and is more efficient than Cignoni et al.'s algorithm (1993) and our previous algorithm (1996).
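Incremental construction approaches rest on the empty-circumcircle property: a triangle is Delaunay if no other input point lies strictly inside its circumcircle. The sketch below shows that predicate for 2D points in plain Python. It is a generic building block rather than the algorithm proposed in the abstract, the function names are illustrative assumptions, and it uses ordinary floating-point arithmetic, so it is not robust for nearly degenerate inputs.

```python
def orientation(a, b, c):
    # Positive if a, b, c are in counter-clockwise order, negative if clockwise.
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, d):
    """Return True if point d lies strictly inside the circumcircle of triangle abc."""
    # Translate so that d is the origin, then evaluate the 3x3 in-circle determinant.
    ax, ay = a[0] - d[0], a[1] - d[1]
    bx, by = b[0] - d[0], b[1] - d[1]
    cx, cy = c[0] - d[0], c[1] - d[1]
    det = (
        (ax * ax + ay * ay) * (bx * cy - by * cx)
        - (bx * bx + by * by) * (ax * cy - ay * cx)
        + (cx * cx + cy * cy) * (ax * by - ay * bx)
    )
    # The sign convention assumes counter-clockwise order; flip it otherwise.
    if orientation(a, b, c) < 0:
        det = -det
    return det > 0
```

For the triangle `(0, 0), (1, 0), (0, 1)`, the point `(0.25, 0.25)` is reported as inside the circumcircle and the point `(2, 2)` as outside, regardless of the order in which the triangle's vertices are given.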
ISBN: 3540633839 (print)
The proceedings contain 31 papers. The special focus in this conference is on Coordination Languages and Models. The topics include: exposing the skeleton in the coordination closet; design for open systems in Java; checking assumptions in component dynamics at the architectural level; security benefits from software architecture; regulated coordination in open distributed systems; debugging distributed applications using a coordination architecture; coordinating durative actions; communication-passing style for coordination languages; software architecture for large control systems; evaluation of software architectures for a control system; modeling railway control systems using graph grammars; formal description of Linda as a reactive system; three semantics of the output operation for generative communication; coordinating mobile agents via blackboards and access rights; modeling coordination via asynchronous communication; partial order and SOS semantics for linear constraint programs; programmable coordination media; coordinating actions systems; approximating unity; mobile unity coordination constructs applied to packet forwarding for mobile hosts; object-oriented protocol refinement in kannel; an asynchronous model of locality, failure, and process mobility; a component calculus for modeling the olan configuration language; a coordination model for distributed object systems; coordination patterns for parallel computing; concurrent MetateM as a coordination language; and control-based coordination of human and other activities in cooperative information systems.