In the directed acyclic graph (dag) model of algorithms, consider the following problem for precedence-constrained multiprocessor schedules for array computations: Given a sequence of dags and linear schedules paramet...
详细信息
ISBN:
(纸本)0769515797
In the directed acyclic graph (dag) model of algorithms, consider the following problem for precedence-constrained multiprocessor schedules for array computations: Given a sequence of dags and linear schedules parameterized by n, compute a lower bound on the number of processors required by the schedule as a function of n. This problem is formulated so that the number of tasks that are scheduled for execution during any fixed time step is the number of non-negative integer solutions d(n) to a set of parametric linear Diophantine equations. Generating function methods are then used for constructing a formula for the numbers dn. We implemented this algorithm as a Mathematica program. This paper is an overview of the techniques involved and their applications to well-known schedules for Matrix-Vector Product, Triangular Matrix Product, and Gaussian Elimination dags. Some example runs and automatically produced symbolic formulas for processor lower bounds by the algorithm are given.
Observations about an approximate parallel algorithm for the Point Robot Motion Planning problem are presented The algorithm solves not only the original problem, but related problems as well.
ISBN:
(纸本)0769518672
Observations about an approximate parallel algorithm for the Point Robot Motion Planning problem are presented The algorithm solves not only the original problem, but related problems as well.
Many problems require determining the root of a real-valued function of a single variable. Maple and Marletta [6] presented a parallel algorithm that can be used to determine a simple root of such functions. High leve...
详细信息
ISBN:
(纸本)0769517315
Many problems require determining the root of a real-valued function of a single variable. Maple and Marletta [6] presented a parallel algorithm that can be used to determine a simple root of such functions. High levels of efficiency were proved for functions satisfying a set of conditions. In this work we present a framework and subsequently a piece of software to evaluate the performance of their algorithm under various conditions and for functions with varying characteristics. We also consider the effects of implementing the algorithm over a heterogeneous network, presenting results and recommendations.
Scientific, symbolic, and multimedia applications present diverse computing workloads with different types of inherent of what should be highly parallel algorithms. Retargeting to parallel mechanisms is difficult, but...
详细信息
ISBN:
(纸本)0769517994
Scientific, symbolic, and multimedia applications present diverse computing workloads with different types of inherent of what should be highly parallel algorithms. Retargeting to parallel mechanisms is difficult, but can provide significant increases in efficiency. It is desirable to estimate potential parallelism before undertaking the expensive process of reverse engineering and retargeting. This paper presents a lightweight dynamic analysis technique for characterizing the types of parallelism that are inherent in a given program to estimate the potential benefit of retargeting. The technique is validated on Spec95 and MediaBench benchmarks widely used to evaluate processor performance. Results correlate well with previous experience in parallelizing these well-understood applications.
In the paper the task of concurrent analysis of a Petri net is considered. A Petri net is given, and several processes able to simulate transition firings. The methods of analysis described in this paper are based on ...
详细信息
ISBN:
(纸本)0769517307;0769517315
In the paper the task of concurrent analysis of a Petri net is considered. A Petri net is given, and several processes able to simulate transition firings. The methods of analysis described in this paper are based on the original approach to net decomposition and oriented for the so-called operational nets and a class of cyclic Petri nets. The methods analyze the nets by reduced state space constructing;both their sequential and parallel versions are described. Also the algorithm of decomposition oriented to concurrent analysis is described. The suggested methods of analysis can be implemented as a multithread application.
PVM and MPI, two systems for programming clusters, are often compared. The comparisons usually start with the unspoken assumption that PVM and MPI represent different solutions to the same problem. In this paper we sh...
详细信息
ISBN:
(纸本)0769517455
PVM and MPI, two systems for programming clusters, are often compared. The comparisons usually start with the unspoken assumption that PVM and MPI represent different solutions to the same problem. In this paper we show that, in fact, the two systems often are solving different problems. In cases where the problems do match but the solutions chosen by PVM and MPI are different, we explain the reasons for the differences. Usually such differences can be traced to explicit differences in the goals of the two systems, their origins, or the relationship between their specifications and their implementations. For example, we show that the requirement for portability and performance across many platforms caused MPI to choose approaches different from those made by PVM, which is able to exploit the similarities of network-connected systems.
On the basis of analysis to the thermal process of power plant and simple genetic algorithm, a improved genetic algorithm is introduced to optimize controller parameters of thermal process. In the genetic algorithm, f...
详细信息
ISBN:
(纸本)7506255715
On the basis of analysis to the thermal process of power plant and simple genetic algorithm, a improved genetic algorithm is introduced to optimize controller parameters of thermal process. In the genetic algorithm, floating-point genes, rank-based model, elitist model and parallel algorithm are used. The premature convergence is eliminated, the global and local searching ability is improved. The MATLAB program of genetic algorithm of controller parameters optimization is given. A single loop PID control system and a superheat steam control system is optimized and optimal parameters are given. It is shown by simulation research that the genetic algorithm is better than traditional methods, satisfactory, control results can be got with the optimized parameters.
We present a new parallelization technique that significantly improves performance of certain data-parallel algorithms on heterogeneous clusters of workstations. The two main goals of our technique are to improve exec...
详细信息
ISBN:
(纸本)0769517455
We present a new parallelization technique that significantly improves performance of certain data-parallel algorithms on heterogeneous clusters of workstations. The two main goals of our technique are to improve execution times (compared to traditional parallelization techniques) and to efficiently use the computing resources available in the cluster. The technique is based on a pre-processing phase where information about the cluster is obtained, a load balanced data decomposition is derived, and information is generated to guide the cluster node utilization during the execution of the parallel algorithm. We applied our technique to Gaussian Elimination and Pairwise Interaction Problems, the experiments show speedup improvements up to 133% and 275% respectively and the cluster utilization efficiency improves up to 180% and 300% when compared to traditional parallelization techniques.
In this paper, we propose a new method for the computation of the algorithm of robust Q-mode principal component analysis (RQMPCA) used in statistics. We will show how we can reduce the computation complexity of this ...
详细信息
ISBN:
(纸本)0769516262
In this paper, we propose a new method for the computation of the algorithm of robust Q-mode principal component analysis (RQMPCA) used in statistics. We will show how we can reduce the computation complexity of this algorithm by p, where p is the number of variables. An application, on web document retrieval time, was studied using this algorithm. We will report some statistical results on retrieval time and its relationship with document's size and its number of objects.
In this paper, we first present theoretical results, helping to understand the unfolding algorithm presented in [6,7]. We then propose a modification of this algorithm, which can be efficiently parallelised and admits...
详细信息
ISBN:
(纸本)3540434194
In this paper, we first present theoretical results, helping to understand the unfolding algorithm presented in [6,7]. We then propose a modification of this algorithm, which can be efficiently parallelised and admits a more efficient implementation. Our experiments demonstrate that the degree of parallelism is usually quite high and resulting algorithms potentially can achieve significant speedup comparing with the sequential case.
暂无评论