ISBN (print): 9781595936677
This paper addresses the problem of scheduling a DAG of unit-length tasks on asynchronous processors, that is, processors having different and changing speeds. The objective is to minimize the makespan, that is, the time to execute the entire DAG. Asynchrony is modeled by an oblivious adversary, which determines the processor speeds at each point in time. The oblivious adversary may change processor speeds arbitrarily and arbitrarily often, but makes its speed decisions independently of any random choices of the scheduling algorithm. This paper gives bounds on the makespan of two randomized online firing-squad scheduling algorithms, ALL and LEVEL. These two schedulers are shown to have good makespan even when asynchrony is arbitrarily extreme. Let W and D denote, respectively, the number of tasks and the length of the longest path in the DAG, and let π_ave denote the average speed of the p processors during the execution. In ALL, each processor repeatedly chooses a random task to execute from among all ready tasks (tasks whose predecessors have already been executed). Scheduler ALL is shown to have makespan T_p = Θ(W/(p·π_ave)) when W/D ≥ p log p; T_p = Θ((log p)^α · W/(p·π_ave) + (log p)^(1−α) · D/π_ave) when W/D = p(log p)^(1−2α), for α ∈ [0,1]; and T_p = Θ(D/π_ave) when W/D ≤ p/log p, both in expectation and with high probability. A family of DAGs is exhibited for which this analysis is tight. In LEVEL, each processor repeatedly chooses a random task to execute from among all critical tasks (ready tasks at the lowest level of the DAG). This second scheduler is shown to have makespan T_p = Θ(W/(p·π_ave) + [log* p − log*(pD/W)] · D/π_ave), both in expectation and with high probability. Thus, LEVEL is asymptotically always at least as good as ALL, and sometimes better.
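The ALL rule has a simple operational reading. Below is a rough, hedged simulation sketch in Java (not code from the paper): time is discretized, an adversary-supplied speed function sets each processor's per-step speed, and every free processor picks a uniformly random ready task. The SpeedOracle interface, the discrete-time model, and the completion rule are illustrative assumptions.

```java
import java.util.*;

/** Minimal sketch of the ALL scheduler on a DAG of unit-length tasks.
 *  Illustrative only: the discretized time model and SpeedOracle are
 *  assumptions, not the paper's formal model. */
public class AllSchedulerSketch {
    /** Adversary-chosen speed of processor proc at time step (oblivious:
     *  it must not depend on the scheduler's random choices). */
    interface SpeedOracle { double speed(int proc, int step); }

    /** dag.get(v) lists the predecessors of task v; tasks are numbered 0..n-1. */
    static int simulate(List<List<Integer>> dag, int p, SpeedOracle oracle, Random rng) {
        int n = dag.size();
        boolean[] done = new boolean[n];
        int[] current = new int[p];          // task each processor works on (-1 = none)
        double[] progress = new double[p];   // work accumulated on the current task
        Arrays.fill(current, -1);
        int finished = 0, step = 0;
        while (finished < n) {
            // Ready tasks: not yet done, and all predecessors done.
            List<Integer> ready = new ArrayList<>();
            for (int v = 0; v < n; v++) {
                if (done[v]) continue;
                boolean ok = true;
                for (int u : dag.get(v)) if (!done[u]) { ok = false; break; }
                if (ok) ready.add(v);
            }
            // ALL: a free processor picks a uniformly random ready task
            // (several processors may redundantly pick the same task).
            for (int q = 0; q < p; q++) {
                if (current[q] == -1 || done[current[q]]) {
                    current[q] = ready.get(rng.nextInt(ready.size()));
                    progress[q] = 0;
                }
            }
            // Each processor advances its task by its adversary-chosen speed.
            for (int q = 0; q < p; q++) {
                progress[q] += oracle.speed(q, step);
                if (progress[q] >= 1.0 && !done[current[q]]) {
                    done[current[q]] = true;
                    finished++;
                }
            }
            step++;
        }
        return step; // makespan in discrete time steps
    }
}
```

The returned step count plays the role of T_p in the bounds above; with all speeds fixed at 1 the simulation reduces to ordinary greedy random scheduling.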
ISBN (print): 1595934529
The proceedings contain 43 papers. The topics discussed include: publish and perish: definition and analysis of an n-person publication impact game; exponential separation of quantum and classical online space complexity; minimizing the stretch when scheduling flows of biological requests; position paper and brief announcement: the FG programming environment - good and good for you; efficient parallel algorithms for dead sensor diagnosis and multiple access channels; on the communication complexity of randomized broadcasting in random-like graphs; strip packing with precedence constraints and strip packing with release times; on space-stretch trade-offs: lower bounds; a performance analysis of local synchronization; the cache complexity of multithreaded cache oblivious algorithms; and deterministic load balancing and dictionaries in the parallel disk model.
Andrews et al. [Automatic method for hiding latency in high bandwidth networks, in: Proceedings of the ACM Symposium on Theory of Computing, 1996, pp. 257-265; Improved methods for hiding latency in high bandwidth networks, in: Proceedings of the Eighth Annual ACM Symposium on Parallel Algorithms and Architectures, 1996, pp. 52-61] introduced a number of techniques for automatically hiding latency when performing simulations of networks with unit-delay links on networks with arbitrary unequal delay links. In their work, they assume that the processors of the host network are identical in computational power to those of the guest network being simulated. They further assume that the links of the host are able to pipeline messages, i.e., they are able to deliver P packets in time O(P + d), where d is the delay on the link. In this paper we examine the effect of eliminating one or both of these assumptions. In particular, we provide an efficient simulation of a linear array of homogeneous processors connected by unit-delay links on a linear array of heterogeneous processors connected by links with arbitrary delays. We show that the slowdown achieved by our simulation is optimal. We then consider the case of simulating cliques by cliques, i.e., a clique of heterogeneous processors with arbitrary delay links is used to simulate a clique of homogeneous processors with unit-delay links. We reduce the slowdown from the obvious bound of the maximum link delay to the average of the link delays. In the case of the linear array we consider links both with and without pipelining. For the clique simulation the links are not assumed to support pipelining. The main motivation of our results (as was the case with Andrews et al.) is to mitigate the degradation of performance when executing parallel programs designed for different architectures on a network of workstations (NOW). In such a setting it is unlikely that the links provided by the NOW will support pipelining, and it is quite probable ...
ISBN (print): 9781595934529
Hydra PPS is a collection of annotations, classes, a runtime, and a compiler designed to provide Java programmers with a fairly simple method of producing programs for Symmetric Multiprocessing (SMP) architectures. This paper introduces the basics of this new system, including the basic constructs of the new programming language and the relationship between the Java VM, the compiler, the runtime, and the parallel program. Hydra will exploit parallelism when the underlying architecture supports it and will run as a normal sequential Java program when the architecture has no support for parallelism. Parallelism is expressed through events in Hydra; it is easy to use, and programs run efficiently on parallel architectures.
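Hydra's own annotations, classes, and runtime are not reproduced here. As a loose illustration of the idea described above (parallelism expressed through events, with a sequential fallback when the platform offers no parallelism), the following sketch uses only plain java.util.concurrent; the Event interface and the availableProcessors() test are assumptions for illustration and are not Hydra's API.

```java
import java.util.*;
import java.util.concurrent.*;

/** Rough illustration of event-style parallelism with a sequential fallback.
 *  This is NOT Hydra PPS's API; it only sketches the general idea. */
public class EventRunnerSketch {
    interface Event { void handle(); }   // hypothetical event abstraction

    private final ExecutorService pool;  // null => run sequentially

    EventRunnerSketch() {
        int cores = Runtime.getRuntime().availableProcessors();
        // Exploit parallelism only if the platform actually offers it.
        this.pool = (cores > 1) ? Executors.newFixedThreadPool(cores) : null;
    }

    /** Dispatch a batch of independent events and wait for all of them. */
    void dispatchAll(List<Event> events) throws InterruptedException {
        if (pool == null) {                   // behaves as a sequential Java program
            for (Event e : events) e.handle();
            return;
        }
        CountDownLatch latch = new CountDownLatch(events.size());
        for (Event e : events) {
            pool.submit(() -> { try { e.handle(); } finally { latch.countDown(); } });
        }
        latch.await();                        // barrier: all events handled
    }

    void shutdown() { if (pool != null) pool.shutdown(); }
}
```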
ISBN (print): 9781595934529
Suppose we have a parallel or distributed system whose nodes have limited capacities, such as processing speed, bandwidth, memory, or disk space. How does the performance of the system depend on the amount of heterogeneity in its capacity distribution? We propose a general framework to quantify the worst-case effect of increasing heterogeneity in models of parallel systems. Given a cost function g(C,W) representing the system's performance as a function of its nodes' capacities C and workload W (such as the completion time of an optimum schedule of jobs W on machines C), we say that g has price of heterogeneity α when, for any workload, the cost cannot increase by more than a factor of α if the node capacities become arbitrarily more heterogeneous. We give constant bounds on the price of heterogeneity of several well-known job scheduling and graph degree/diameter problems, indicating that increasing heterogeneity can never be much of a disadvantage. On the other hand, with the introduction of timing constraints such as release times or precedence constraints on the jobs, the dependence on node capacities becomes more complex, so increasing heterogeneity may be quite detrimental.
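As a concrete, hypothetical instance of the framework, the sketch below takes g(C, W) to be the makespan of a greedy schedule of jobs W on machines with speeds C, and evaluates it for a homogeneous capacity vector and a more skewed vector with the same total capacity. The greedy rule and the particular numbers are illustrative assumptions, not constructions from the paper.

```java
import java.util.*;

/** Illustrative cost function g(C, W): makespan of a greedy schedule of jobs W
 *  (sizes) on machines with speeds C, assigning each job to the machine that
 *  would finish it earliest. Hypothetical example, not the paper's construction. */
public class HeterogeneitySketch {
    static double greedyMakespan(double[] speeds, double[] jobs) {
        double[] finish = new double[speeds.length];    // current finish time per machine
        double[] sorted = jobs.clone();
        Arrays.sort(sorted);                            // schedule largest jobs first (LPT-style)
        for (int i = sorted.length - 1; i >= 0; i--) {
            int best = 0;
            for (int m = 1; m < speeds.length; m++) {
                if (finish[m] + sorted[i] / speeds[m] < finish[best] + sorted[i] / speeds[best]) best = m;
            }
            finish[best] += sorted[i] / speeds[best];
        }
        return Arrays.stream(finish).max().orElse(0);
    }

    public static void main(String[] args) {
        double[] jobs = {4, 3, 3, 2, 2, 1, 1};
        double[] homogeneous   = {1, 1, 1, 1};          // total capacity 4
        double[] heterogeneous = {2.5, 0.5, 0.5, 0.5};  // same total, more skewed
        double gHom = greedyMakespan(homogeneous, jobs);
        double gHet = greedyMakespan(heterogeneous, jobs);
        // The ratio gHet / gHom is one data point; the price of heterogeneity
        // bounds such ratios over all workloads and all ways of skewing C.
        System.out.printf("g(C_hom, W) = %.2f, g(C_het, W) = %.2f, ratio = %.2f%n",
                gHom, gHet, gHet / gHom);
    }
}
```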