ISBN: 9780818678950 (print)
In this paper we present an event-driven multi-threading architecture and its underlying event flow system model of computation as a framework for the implementation of complex reactive and communication systems. Existing process-oriented specification languages can be used to specify the system and can be embedded in the model. The target architecture covers a wide variety of architectures, from small FSMs to large processors, which are interconnected by a network template that performs dynamic scheduling and communication for different levels of process granularity and timing. Interconnect and module implementation and optimisation are based on an event flow graph (EFG) model. We present our system model and the architectural template and show how they can be applied to an industrial application example.
In this paper a modified parallel Jacobi-conditioned conjugate gradient (CG) method is proposed for solving linear elastic finite element systems of equations. The conventional element-by-element and diagonally conditioned approaches are discussed with respect to parallel implementation on distributed memory MIMD architectures. The effects of communication overheads on the efficiency of the parallel CG solver are considered, and it is shown that, for a parallel CG solver to perform efficiently, the interprocessor communication has to be carried out concurrently. A concurrent communication scheme is proposed by relating the semi-bandwidth of the stiffness matrix to the number of independent degrees of freedom and the number of processors, and by inducing directionalization of communication within the processor pipeline. With the aid of two examples, the effectiveness of the proposed method is demonstrated, showing that the cost of communication remains low and relatively insensitive to an increase in the number of processors. Copyright (C) 1996 Civil-Comp Limited and Elsevier Science Limited.
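As an illustration of the underlying numerical kernel only, the following is a minimal serial sketch of a Jacobi (diagonally) preconditioned conjugate gradient solver. It does not reproduce the modified parallel method or the concurrent communication scheme of the abstract; the function name `jacobi_pcg`, its parameters, and the dense list-of-lists matrix format are assumptions made for this example.

```python
def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (dense list of lists).

    Hypothetical helper for illustration; serial, not the paper's parallel scheme.
    """
    n = len(b)
    inv_diag = [1.0 / A[i][i] for i in range(n)]   # Jacobi preconditioner M^-1
    x = [0.0] * n
    r = list(b)                                    # residual b - A x for x = 0
    z = [inv_diag[i] * r[i] for i in range(n)]     # preconditioned residual
    p = list(z)                                    # first search direction
    rz = sum(r[i] * z[i] for i in range(n))
    for _ in range(max_iter):
        if sum(ri * ri for ri in r) ** 0.5 < tol:  # converged on residual norm
            break
        Ap = [sum(A[i][j] * p[j] for j in range(n)) for i in range(n)]
        alpha = rz / sum(p[i] * Ap[i] for i in range(n))
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = sum(r[i] * z[i] for i in range(n))
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x


if __name__ == "__main__":
    # Small symmetric positive definite test system.
    print(jacobi_pcg([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0]))
```

In a distributed-memory setting, the matrix-vector product and the two inner products per iteration are the steps that require interprocessor communication, which is where the overheads discussed in the abstract arise.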
Author: Lenke, M. (LRR-TUM)
Lehrstuhl für Rechnertechnik und Rechnerorganisation, Institut für Informatik, Technische Universität München, 80290 München, Germany
Typical applications of the so-called Grand Challenges need massively parallel computer system architectures. Tools like parallel debuggers, performance analysers and visualizers help the code designer to develop efficient parallel algorithms. Such tools merely support the development cycle. But technical and scientific engineers who make use of parallel high-performance computing applications, e.g. numerical simulation algorithms in computational fluid dynamics (CFD), must be supported in their engineering work by another kind of tool. A tool for the application cycle is required because conventional arrangements for the application cycle rely on strictly sequential procedures. They are a heritage of traditional work on former vector computers. That formative influence is still felt in today's arrangements for the application cycle, prevents more efficient engineering work and, therefore, must be overcome. New tool concepts have to be introduced to enable online interaction between technical and scientific engineers and their running parallel simulations. VIPER stands for VIsualization of parallel numerical simulation algorithms for Extended Research and offers physical parameters of the mathematical model and parameters of the numerical method as objects of a graphical user tool interface for online observation and online modification. A special client-server-client process architecture enables technical and scientific engineers sitting at their graphics workstations to interact with their parallel simulation algorithms running on a remote parallel computer system. The VIPER prototype is applied to ParNsflex, a parallel Navier-Stokes solver for real-world aerodynamic problems. A Paragon XP/S was selected as the test parallel computer system. A first evaluation indicates the superiority of the VIPER concept over conventional procedures. Copyright (C) 1996 Published by Elsevier Science L
A methodology for constructing parallel embedded DSP systems is described. The method uses a software and embedded processor abstraction to help raise the level of problem analysis above the raw state machine concept....
ISBN: 0818675829 (print)
In this paper we present a robust, scalable parallelization of a multitarget tracking algorithm developed for air traffic surveillance. We couple the state estimation and data association problems by embedding an Interacting Multiple Model (IMM) state estimator into an optimization-based assignment framework. An SPMD distributed-memory parallelization is described, wherein the interface to the optimization problem, namely, computing the rather numerous gating and IMM state estimates, covariance calculations, and likelihood function evaluations (used as cost coefficients in the assignment problem), is parallelized. We describe several heuristic algorithms developed for the inherent task allocation problem, wherein the problem is one of assigning track tasks, having uncertain processing costs and negligible communication costs, across a set of homogeneous processors to minimize workload imbalances. Using a measurement database based on two FAA air traffic control radars, courtesy of Rome Laboratory, we show that near-linear speedups are obtainable on a 32-node Intel Paragon supercomputer using simple task allocation algorithms.
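The abstract does not say which task allocation heuristics were used. As one common example of a simple heuristic for this kind of problem (independent tasks, estimated costs, negligible communication, identical processors), the sketch below applies the greedy "longest processing time first" rule. The function name `greedy_allocate` and the cost values are hypothetical.

```python
import heapq


def greedy_allocate(costs, num_procs):
    """Assign tasks to processors, largest estimated cost first.

    costs[t] is the estimated cost of task t. Returns a dict mapping each
    processor index to its list of task indices. Illustrative only; not
    necessarily one of the heuristics evaluated in the paper.
    """
    heap = [(0.0, p) for p in range(num_procs)]  # (current load, processor)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    for task in sorted(range(len(costs)), key=lambda t: -costs[t]):
        load, p = heapq.heappop(heap)            # least-loaded processor so far
        assignment[p].append(task)
        heapq.heappush(heap, (load + costs[task], p))
    return assignment


if __name__ == "__main__":
    track_costs = [5, 3, 3, 2, 2, 1]             # made-up cost estimates
    print(greedy_allocate(track_costs, 2))
```

Because the processing costs are uncertain, the quality of such an allocation depends on how well the estimates predict the real costs.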
ISBN: 0818672552 (print)
We study parallel algorithms for the minimum spanning tree problem, based on the sequential algorithm of Boruvka. The target architectures for our algorithm are asynchronous, distributed-memory machines. Analysis of our parallel algorithm, on a simple model that is reminiscent of the LogP model, shows that in principle a speedup proportional to the number of processors can be achieved, but that communication costs can be significant. To reduce these costs, we develop a new randomized linear-work pointer jumping scheme that performs better than previous linear-work algorithms. We also consider empirically the effects of data imbalance on the running time. For the graphs used in our experiments, load balancing schemes result in little improvement in running times. Our implementations on sparse graphs with 64,000 vertices on Thinking Machines' CM-5 achieve a speedup factor of about 4 on 16 processors. In this environment, packaging of messages turns out to be the most effective way to reduce communication costs.
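For reference, the sequential algorithm of Boruvka on which the parallel variants build can be sketched as follows. This is a plain serial version with a union-find structure, not the distributed or pointer jumping scheme of the abstract; the name `boruvka_mst` and the edge format `(weight, u, v)` are assumptions of this example.

```python
def boruvka_mst(num_vertices, edges):
    """Return (total_weight, mst_edges) for edges given as (weight, u, v).

    Serial illustration: in each round every component selects its cheapest
    outgoing edge, and the selected edges are contracted.
    """
    parent = list(range(num_vertices))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    mst, components = [], num_vertices
    while components > 1:
        cheapest = {}                      # component root -> (w, idx, u, v)
        for idx, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue                   # edge lies inside one component
            for root in (ru, rv):
                # Ties are broken by edge index so equal weights cannot
                # cause a cycle to be selected.
                if root not in cheapest or (w, idx) < cheapest[root][:2]:
                    cheapest[root] = (w, idx, u, v)
        if not cheapest:
            break                          # graph is disconnected
        for w, idx, u, v in cheapest.values():
            ru, rv = find(u), find(v)
            if ru != rv:                   # skip edges chosen from both sides
                parent[ru] = rv
                mst.append((w, u, v))
                components -= 1
    return sum(w for w, _, _ in mst), mst


if __name__ == "__main__":
    graph = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 0, 3), (5, 0, 2)]
    print(boruvka_mst(4, graph))
```

Each round at least halves the number of components, and the per-component edge selection is independent across components, which is what makes the algorithm attractive for parallelization.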
ISBN: 0818675829 (print)
A universal spatial automaton, called WAVE, for highly parallel processing in arbitrary distributed systems is described. The automaton is based on a virus principle where recursive programs, or waves, self-navigate in networks of data or processes in multiple cooperative parts while controlling and modifying the environment they exist in and move through. The layered general organisation of the automaton as well as its distributed implementation in computer networks are discussed. As the automaton dynamically creates, modifies, activates and processes any knowledge networks arbitrarily distributed in computer networks, it can easily model any other paradigms for parallel and distributed computing. A comparison of WAVE with some known programming models and languages, and ideas for their possible integration, are also given.
ISBN: 0818675829 (print)
In this paper we discuss the runtime support required for the parallelization of unstructured data-parallel applications on nonuniform and adaptive environments. The approach presented is reasonably general and is applicable to a wide variety of regular as well as irregular applications. We present performance results for the solution of an unstructured mesh on a cluster of heterogeneous workstations.
We study the scalability of 2-D discrete wavelet transform algorithms on fine-grained parallel architectures. The principal operation in the 2-D DWT is the filtering operation used to implement the filter banks of the...
ISBN: 0818675829 (print)
TOP-C is a task-oriented parallel C interface. It presents a master-slave task architecture that greatly eases the parallelization of code. It is intended for applications where a compiler would have difficulty recognizing opportunities for data parallelism. The model has been implemented for both shared-memory processors and networks of workstations. There is also a sequential version, useful during development, which runs the same application code. Ease of use has been a strong motivation behind its design. For this reason, TOP-C is organized in an SPMD style, with one primary subroutine call to invoke it. Its main features are: (a) task parallelism, (b) a single shared, global data structure, and (c) restricted master-slave communication.
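The three features listed above can be illustrated with a small task-farm sketch. This is not the TOP-C API and is written in Python rather than C; `run_master_worker`, `do_task` and `update_shared` are hypothetical names. A coordinating thread hands out tasks, workers only read the shared data, and only the coordinator updates it.

```python
from concurrent.futures import ThreadPoolExecutor


def run_master_worker(tasks, do_task, update_shared, shared, workers=4):
    """Run do_task(task, shared) on a worker pool through one entry point.

    Workers only read `shared` and return a result; the coordinating thread
    alone applies update_shared(shared, task, result). Illustrative only.
    """
    tasks = list(tasks)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda t: do_task(t, shared), tasks)
        for task, result in zip(tasks, results):   # results arrive in task order
            update_shared(shared, task, result)
    return shared


def scale(task, shared):
    return task * shared["factor"]                 # reads a read-only field


def accumulate(shared, task, result):
    shared["total"] += result                      # coordinator-only update


if __name__ == "__main__":
    state = {"factor": 3, "total": 0}
    print(run_master_worker([1, 2, 3, 4], scale, accumulate, state))
```

In this sketch the workers must not depend on the fields the coordinator modifies while tasks are still running; a real system needs a rule for results that were computed from shared data that has since changed.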