Two parallel matrix multiplication algorithms are presented in this paper. These algorithms execute on a grid with toroidal connections. Their novelty is the use of communication schemes that are theoretically distance-insensitive; the impact on communication and computational complexities and costs, compared with a theoretical analysis, is analyzed and evaluated. The proposed algorithms have been implemented on a MasPar™ array, and an experimental evaluation of these algorithms is performed. A comparison of matrix multiplication performance is made between the MasPar™ and the SUN™-4/390.
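The abstract does not describe the two algorithms themselves, so the following is only an illustration of matrix multiplication on a grid with toroidal (wraparound) connections. It simulates Cannon's algorithm, a well-known textbook scheme, sequentially in Python; it is not the authors' method, and the function name `cannon_matmul` is a hypothetical choice made for this sketch.

```python
def cannon_matmul(a, b):
    """Multiply two n x n matrices by simulating Cannon's algorithm.

    Assumption: one matrix element per grid cell on an n x n torus.
    This is a sequential simulation, not the paper's algorithms.
    """
    n = len(a)
    # Initial skew: row i of A rotates left by i, column j of B rotates up by j.
    a_loc = [[a[i][(j + i) % n] for j in range(n)] for i in range(n)]
    b_loc = [[b[(i + j) % n][j] for j in range(n)] for i in range(n)]
    c = [[0] * n for _ in range(n)]
    for _ in range(n):
        # Local multiply-accumulate; on a real array all cells do this at once.
        for i in range(n):
            for j in range(n):
                c[i][j] += a_loc[i][j] * b_loc[i][j]
        # Shift A one cell left and B one cell up, using the wraparound links.
        a_loc = [[a_loc[i][(j + 1) % n] for j in range(n)] for i in range(n)]
        b_loc = [[b_loc[(i + 1) % n][j] for j in range(n)] for i in range(n)]
    return c


if __name__ == "__main__":
    print(cannon_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Every communication step here moves data to a nearest neighbour only, which is why the wraparound links matter: without them the cells on the grid edge would need a long-distance transfer.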
ISBN: 0818656506 (print)
Rule-based systems are computationally very demanding, since a large number of rules have to be evaluated every time new input data are observed in order to take a corresponding action. In this paper we study the possible improvements in performance obtained by using parallel processing. Both fine-grained algorithms and standard multiprocessor architectures for Mamdani-type fuzzy systems with two inputs and one output are considered. It is shown that a speed-up close to linear may be achieved. The results may be extended to systems with more than two inputs.
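To make the source of parallelism concrete, here is a minimal sketch of Mamdani-type inference with two inputs and one output. The fuzzy sets, the rule base, the min/max operators, and the centroid defuzzification are all assumptions chosen for illustration; the abstract does not specify them. The point to notice is that each rule's firing strength depends only on the inputs, so the rules can be evaluated independently of one another.

```python
def tri(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Hypothetical fuzzy sets on the range [0, 10], shared by inputs and output.
SETS = {"low": (-5.0, 0.0, 5.0), "mid": (0.0, 5.0, 10.0), "high": (5.0, 10.0, 15.0)}

# Hypothetical rule base: IF x1 is A AND x2 is B THEN y is C.
RULES = [
    ("low", "low", "low"),
    ("low", "high", "mid"),
    ("high", "low", "mid"),
    ("high", "high", "high"),
]


def infer(x1, x2, steps=101):
    """Return the centroid-defuzzified output, or None if no rule fires."""
    # Each firing strength is independent of the others, so this list
    # could be computed with one rule per processor.
    strengths = [
        min(tri(x1, *SETS[s1]), tri(x2, *SETS[s2])) for s1, s2, _ in RULES
    ]
    num = den = 0.0
    for k in range(steps):
        y = 10.0 * k / (steps - 1)
        # Clip each consequent at its rule's strength, then aggregate with max.
        mu = max(
            min(w, tri(y, *SETS[out])) for w, (_, _, out) in zip(strengths, RULES)
        )
        num += y * mu
        den += mu
    return num / den if den > 0 else None


if __name__ == "__main__":
    print(infer(2.0, 8.0))
```

The defuzzification loop over output samples is also data-parallel, since each sample's aggregated membership is computed independently before the final sums.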
The Massively Parallel Processing Project started in 1992 as a Priority Area of Research for the Ministry of Education in Japan. The objective of this research project is to establish the basic technology of massively parallel processing, which is expected to be a fundamental tool for developing the high-level technologies of the 21st century. The main goal of the project is to build a prototype of a massively parallel processing system. This paper describes the organization of the project and discusses the research results to date.
In this paper, a distributed algorithm for validating message-passing machines is presented and evaluated. Our approach is based on adaptive distributed diagnosis of multiprocessor systems in a user environment where a full self-diagnosis is not needed. We analyze the performance of the algorithm using a model based on an open queueing network.
Parallel multi-layer classifier architectures with an increasing hierarchical order offer much flexibility in design for dealing with a wide variety of properties. The pipeline processing model is especially appropriate for realising such architectures. This gives hierarchical classifiers a distinct advantage in real-time applications, where they must meet the demand for high operating speed, in addition to potentially better classification performance. An example application of a cascaded form of the BWS and FWS networks, both of which are representatives of the array-memory-based statistical classifier, is described in this paper. As with most pipelined architectures, the complex interactions between successive processing layers of the cascaded network are a major drawback: they impose performance bottlenecks that challenge a highly parallel realisation of the classifier. This paper describes an efficient data-parallel implementation of the BWS-FWS. For completeness, a brief review of multi-layer classifiers is presented first. The new algorithm for combining the BWS and FWS networks is described and implemented on two distributed-memory processor arrays, the MasPar MP-1 and a network of transputers. An analysis of the performance obtained is also presented.
A virtual parallel machine is presented. The virtual machine includes the following programs: loop parallelization, dependence graph construction, job scheduling, a compiler, and simulation programs. The basic principles and ideas on which the programs are based are explained. The basic capabilities and advantages of the virtual machine are enumerated. The results of simulation on the virtual machine are presented.
Dynamic load balancing is crucial for the performance of many parallel algorithms. Random Polling, a simple randomized load balancing algorithm, has proved to be very efficient in practice for applications such as parallel depth-first search. This paper presents a detailed analysis of the algorithm, taking into account many aspects of the underlying machine and of the application to be load balanced. It derives tight scalability bounds that are able, for the first time, to explain the superior performance of Random Polling analytically. In some cases, the algorithm even turns out to be optimal. Some of the proof techniques employed might also be useful for the analysis of other parallel algorithms.
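As a rough illustration of the idea behind Random Polling, the sketch below simulates it in synchronous rounds: an idle processor asks one uniformly random other processor for work and receives half of that processor's remaining units if it has more than one. The round structure, the unit-sized work items, the halving rule, and the name `random_polling` are simplifying assumptions for this sketch; the paper's machine and application model is more detailed.

```python
import random


def random_polling(num_procs, total_work, seed=0):
    """Return the number of rounds needed to finish all work units.

    Assumptions: all work starts on processor 0, each busy processor
    completes one unit per round, and polling costs nothing.
    """
    rng = random.Random(seed)
    work = [0] * num_procs
    work[0] = total_work
    done = 0
    rounds = 0
    while done < total_work:
        rounds += 1
        # Every busy processor completes one unit of work.
        for p in range(num_procs):
            if work[p] > 0:
                work[p] -= 1
                done += 1
        # Every idle processor polls one random other processor.
        for p in range(num_procs):
            if work[p] == 0 and num_procs > 1:
                victim = rng.randrange(num_procs - 1)
                if victim >= p:
                    victim += 1  # skip self so the victim is always another processor
                if work[victim] > 1:
                    share = work[victim] // 2
                    work[victim] -= share
                    work[p] += share
    return rounds


if __name__ == "__main__":
    print(random_polling(8, 1000))
```

Comparing the returned round count with the ideal `total_work / num_procs` for different processor counts gives a feel for the overhead that the scalability analysis quantifies.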
ISBN: 0818656026 (print)
We introduce the HYperC language, a data-parallel extension of C intended for portability over a wide range of architectures. We present the main features of the language: explicit parallelism through the data; synchronous semantics with parallel flow control that allows asynchronous execution; new function qualifiers that emphasize the locality properties of code; and, finally, new communication techniques that allow communication and computation to overlap even for irregular computations. All these features are discussed with respect to portability and code reusability.
In this paper, we investigate some properties of identification matrices and exhibit some uses of identification matrices in studying the graph isomorphism problem, a well-known long-standing open problem. We show that, given two m×n identification matrices representing two graphs according to a certain relation, isomorphism can be decided efficiently in parallel if an m × (n - c) submatrix, for a constant c, satisfies the consecutive/circular 1's property. The result presented here significantly broadens the class of graphs for which there are known efficient parallel isomorphism testing algorithms.
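For readers unfamiliar with the consecutive and circular 1's properties, the following sketch tests them by brute force on a small 0/1 matrix. It assumes the row-oriented definition (the columns can be permuted so that the 1's in every row are contiguous, or contiguous with wraparound in the circular case). It is unrelated to the paper's parallel algorithm and runs in factorial time, so it is only suitable for tiny examples; the function names are hypothetical.

```python
from itertools import permutations


def row_is_consecutive(row):
    """True if the 1's in this row form one contiguous block (or there are none)."""
    ones = [k for k, v in enumerate(row) if v]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def row_is_circular(row):
    """True if the 1's are contiguous when the row is wrapped into a circle.

    On a circle, the 1's are contiguous exactly when either the 1's or
    the 0's are contiguous in the linear view.
    """
    return row_is_consecutive(row) or row_is_consecutive([1 - v for v in row])


def has_ones_property(matrix, circular=False):
    """Brute-force test: does some column order make every row contiguous?"""
    n = len(matrix[0]) if matrix else 0
    check = row_is_circular if circular else row_is_consecutive
    for order in permutations(range(n)):
        if all(check([row[k] for k in order]) for row in matrix):
            return True
    return False


if __name__ == "__main__":
    triangle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
    print(has_ones_property(triangle), has_ones_property(triangle, circular=True))
```

The `triangle` example shows why the two properties differ: its three rows pair up the columns cyclically, so no linear order works, but any circular order does.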
ISBN: 0818656026 (print)
Parallel algorithms developed for CAD problems today suffer from three important drawbacks. First, they are machine specific and tend to perform poorly on architectures other than the one for which they were designed. Second, they cannot use the latest advances in improved versions of the sequential algorithms for solving the problem. Third, the quality of results degrades significantly during parallel execution. In this paper we address these three problems for an important CAD application: standard cell placement. We have developed a new parallel placement algorithm that is portable across a range of MIMD parallel architectures. The algorithm is part of the ProperCAD project, which allows a parallel algorithm to be developed and implemented such that it can be executed on a wide variety of parallel machines without any change to the source. The parallel placement algorithm is based on an existing implementation of the sequential simulated annealing algorithm, TimberWolfSC 6.0 [1].
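The sequential technique the algorithm builds on, simulated annealing, can be sketched on a toy placement problem. The example below places cells in a single row and minimizes the total span of each net; the cost function, the swap move, the cooling schedule, and all names are assumptions made for this illustration and do not reflect TimberWolfSC 6.0 or the ProperCAD implementation.

```python
import math
import random


def wirelength(order, nets):
    """Sum over nets of the distance between the leftmost and rightmost cell."""
    pos = {cell: k for k, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net) for net in nets)


def anneal_placement(cells, nets, temp=5.0, cooling=0.95, moves_per_temp=50, seed=0):
    """Return (best_order, best_cost) found by a toy annealer; needs >= 2 cells."""
    rng = random.Random(seed)
    order = list(cells)
    cost = wirelength(order, nets)
    best, best_cost = order[:], cost
    while temp > 0.01:
        for _ in range(moves_per_temp):
            # Propose swapping two distinct positions in the row.
            i, j = rng.sample(range(len(order)), 2)
            order[i], order[j] = order[j], order[i]
            delta = wirelength(order, nets) - cost
            # Always accept improvements; accept a worse move with
            # probability exp(-delta / temp).
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                cost += delta
                if cost < best_cost:
                    best, best_cost = order[:], cost
            else:
                order[i], order[j] = order[j], order[i]  # undo the swap
        temp *= cooling
    return best, best_cost


if __name__ == "__main__":
    cells = list("abcdef")
    nets = [("a", "f"), ("b", "e"), ("c", "d")]
    print(anneal_placement(cells, nets))
```

Because each accepted move depends on the current cost, the loop is inherently sequential, which hints at why result quality can suffer when moves are evaluated in parallel on stale state.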