A heterogeneous computing system provides a variety of different machines, orchestrated to perform an application whose subtasks have diverse execution requirements. The subtasks must be assigned to machines (matching) and ordered for execution (scheduling) such that the overall application execution time is minimized. A new dynamic mapping (matching and scheduling) heuristic called the hybrid remapper is presented here. The hybrid remapper is based on a centralized policy and improves a statically obtained initial matching and scheduling by remapping to reduce the overall execution time. The remapping is non-preemptive, and the execution of the hybrid remapper can be overlapped with the execution of the subtasks. During application execution, the hybrid remapper uses run-time values for the subtask completion times and machine availability times whenever possible. Therefore, the hybrid remapper bases its decisions on a mixture of run-time and expected values. The potential of the hybrid remapper to improve the performance of initial static mappings is demonstrated using simulation studies.
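The idea of deciding on "a mixture of run-time and expected values" can be illustrated with a minimal sketch. It is not the hybrid remapper itself, whose details the abstract does not give. The function name `remap_subtask`, the dictionary inputs, and the rule of moving a subtask only on strict improvement are all assumptions made for illustration.

```python
def remap_subtask(expected_exec, expected_avail, observed_avail, initial_machine):
    """Pick a machine for one pending subtask (hypothetical sketch).

    expected_exec[m]  : expected execution time of the subtask on machine m
    expected_avail[m] : statically estimated time at which machine m becomes free
    observed_avail[m] : run-time availability, present only for machines where
                        an actual value is already known
    """
    def completion(m):
        # Prefer the run-time value when it exists, else fall back to the estimate.
        avail = observed_avail.get(m, expected_avail[m])
        return avail + expected_exec[m]

    best = min(expected_exec, key=completion)
    # Assumption: keep the static choice unless another machine is strictly better.
    if completion(best) < completion(initial_machine):
        return best, completion(best)
    return initial_machine, completion(initial_machine)


expected_exec = {"A": 4.0, "B": 6.0}
expected_avail = {"A": 10.0, "B": 10.0}

# With no run-time information the static choice "A" stands.
print(remap_subtask(expected_exec, expected_avail, {}, "A"))
# Once machine "A" is observed to be running late, the subtask moves to "B".
print(remap_subtask(expected_exec, expected_avail, {"A": 15.0}, "A"))
```

In this sketch the static mapping is only overridden when observed values make another machine finish earlier, which mirrors the abstract's description of improving, rather than replacing, the initial mapping.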
ISBN: 9780818685798 (print)
General-purpose programmers have come to expect a high degree of portability among widely varying architectures. Advances in run-time systems for parallel programs have been proposed in order to harness available resources as efficiently as possible. Simultaneously, advances in algorithmic methods of dynamically balancing computational load have been proposed in order to respond to variations in actual performance and therefore in run-time. The primary mechanism for harnessing idle resources effectively, task migration, can be used alongside the primary mechanism for dynamic load balancing, data redistribution. Besides the fact that the two methods can be used simultaneously to spur further increases in performance, the run-time information-gathering infrastructure necessary to detect and use idle resources can also benefit dynamically load-balanced applications. This paper describes an architecture for and preliminary implementation of a system that combines data-parallel load balancing with task-parallel load balancing. Performance test results are included as well.
It is increasingly common for computer users to have access to several computers on a network, and hence to be able to execute many of their tasks on any of several computers. The choice of which computers execute which tasks is commonly determined by users based on a knowledge of computer speeds for each task and the current load on each computer. A number of task scheduling systems have been developed that balance the load of the computers on the network, but such systems tend to minimize the idle time of the computers rather than minimize the idle time of the users. The paper focuses on the benefits that can be achieved when the scheduling system considers both the computer availabilities and the performance of each task on each computer. The SmartNet resource scheduling system is described and compared to two different resource allocation strategies: load balancing and user-directed assignment. Results are presented in which the operation of hundreds of different networks of computers running thousands of different mixes of tasks is simulated in a batch environment. These results indicate that, for the computer environments simulated, SmartNet outperforms both load balancing and user-directed assignment, based on the maximum time users must wait for their tasks to finish.
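The contrast drawn above, between balancing machine load and accounting for how fast each task runs on each machine, can be sketched with two greedy assigners. This is an illustrative sketch only and not SmartNet's actual algorithm. The function names and the `etc` matrix (expected time of task `t` on machine `m`) are assumptions.

```python
def assign_min_completion(etc):
    """Greedy assignment using both machine availability and per-task speed.

    etc[t][m] is the expected run time of task t on machine m (hypothetical input).
    Returns (mapping, makespan), where mapping[t] is the chosen machine index.
    """
    ready = [0.0] * len(etc[0])  # time at which each machine becomes free
    mapping = []
    for times in etc:
        # Choose the machine on which this task would finish earliest.
        m = min(range(len(ready)), key=lambda j: ready[j] + times[j])
        ready[m] += times[m]
        mapping.append(m)
    return mapping, max(ready)


def assign_least_loaded(etc):
    """Pure load balancing: send each task to the machine that is free soonest."""
    ready = [0.0] * len(etc[0])
    mapping = []
    for times in etc:
        m = min(range(len(ready)), key=lambda j: ready[j])
        ready[m] += times[m]
        mapping.append(m)
    return mapping, max(ready)


etc = [[10, 2], [2, 10], [9, 3]]
print(assign_min_completion(etc))  # ([1, 0, 1], 5.0)
print(assign_least_loaded(etc))    # ([0, 1, 0], 19.0)
```

On this small made-up workload the load balancer keeps both machines busy yet finishes far later, because it ignores which machine suits which task. The makespan returned here corresponds to the abstract's metric of the maximum time users must wait.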
Even the more or less "canonical", lower-level architecture of information systems needs to be revisited from time to time. Notions like persistence and transactions belong traditionally to the area of database management systems. There are, however, many applications, such as CAD, VLSI design or simulation, which need persistence and could take advantage of transactions, but require especially fast implementations not provided by a DBMS. We describe a low-level transaction concept used to implement our parallel main memory object store (PPOST), to provide main memory access times combined with the safety and convenience of transactions.
The proceedings contain 26 papers from the 11th Workshop on Parallel and Distributed Simulation (PADS'97). Topics discussed include: conservative simulation; architecture and VLSI simulation; event simultaneity; VLSI circuit partitioning; optimistic simulation; Petri net simulation; interconnection computer network; distributed simulation; Hierarchical Tool HIT; Multi-Resolution Entity; bulk synchronous parallel models; and parallel simulation environments.
This paper shows how the Prosit system, a new C++-based framework for both sequential and distributed discrete event simulation, developed at INRIA-Sophia-Antipolis, enables an easy and efficient integration of classical discrete event simulation with new high-speed simulation techniques based on evolution equations. This demonstrates in particular the feasibility of distributed simulations involving simulators of different nature. Important applications of these techniques may be found in the simulation of communication switches, as illustrated in an example.
ISBN: 081867931X (print)
The unthrottled optimism underlying the Time Warp (TW) parallel simulation protocol can lead to excessive aggressiveness in memory consumption due to saving state histories, and waste of CPU cycles due to overoptimistically progressing simulations that eventually have to be "rolled back". Furthermore, in TW simulations executing in distributed memory environments, the communication overhead induced by the rollback mechanism can cause pathological overall simulation performance. In this work direct optimism control mechanisms are used to overcome these shortcomings by probabilistically controlling simulation progression based on the forecasted time stamp of forthcoming messages. Several forecast methods are presented and their performance is compared for very large Petri net simulation models executed with the TW protocol on the Meiko CS-2.
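A minimal sketch of probabilistic throttling driven by a forecast might look as follows. The abstract does not specify the forecast methods or the probability function, so both are assumptions here: the forecast is the last observed time stamp plus the mean gap between arrivals, and the probability of proceeding falls off linearly once the local event lies beyond the forecast. All names and the `scale` parameter are hypothetical.

```python
import random


def forecast_next_timestamp(arrival_stamps):
    """Forecast the next message time stamp as last stamp plus the mean gap.

    Assumption: a simple arithmetic-mean forecast, chosen only for illustration.
    """
    if len(arrival_stamps) < 2:
        return None  # not enough history to forecast
    gaps = [b - a for a, b in zip(arrival_stamps, arrival_stamps[1:])]
    return arrival_stamps[-1] + sum(gaps) / len(gaps)


def proceed_probability(event_ts, forecast_ts, scale):
    """Probability of optimistically executing a local event with time stamp event_ts."""
    if forecast_ts is None or event_ts <= forecast_ts:
        return 1.0  # no message is expected before this event: proceed
    # Assumption: linear decay with the distance past the forecast.
    return max(0.0, 1.0 - (event_ts - forecast_ts) / scale)


def should_execute(event_ts, arrival_stamps, scale=10.0, rng=random):
    """Randomized decision: execute now, or wait for the forecast message."""
    p = proceed_probability(event_ts, forecast_next_timestamp(arrival_stamps), scale)
    return rng.random() < p


history = [1, 3, 5]                        # time stamps of messages received so far
print(forecast_next_timestamp(history))    # 7.0
print(proceed_probability(12, 7.0, 10.0))  # 0.5
```

Events well ahead of the forecast are likely to be delayed, which limits the state saving and rollback traffic that unthrottled optimism would otherwise incur.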
ISBN: 0818679654 (print)
With two examples we show the suitability of the bulk-synchronous parallel (BSP) model for discrete-event simulation of homogeneous large-scale systems. This model provides a unifying approach for general-purpose parallel computing which, in addition to efficient and scalable computation, ensures portability across different parallel architectures. A valuable feature of this approach is a simple cost model that enables precise performance prediction of BSP algorithms. We show both theoretically and empirically that systems with uniform event occurrence among their components, such as colliding hard spheres and Ising-spin models, can be efficiently simulated in practice on current parallel computers supporting the BSP model.
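The cost model mentioned above is usually stated per superstep as w + h·g + l, where w is the largest local computation on any processor, h is the largest number of messages any processor sends or receives, g is the machine's per-message communication cost, and l is the barrier synchronization cost. The sketch below applies that standard formula. The function name and the sample numbers are made up for illustration and do not come from the paper.

```python
def bsp_cost(supersteps, g, l):
    """Predicted running time of a BSP program under the standard cost model.

    supersteps : iterable of (w, h) pairs, one per superstep
                 w = maximum local computation on any processor
                 h = maximum messages sent or received by any processor
    g          : communication cost per message (machine parameter)
    l          : barrier synchronization cost (machine parameter)
    """
    return sum(w + h * g + l for w, h in supersteps)


# Two supersteps on a hypothetical machine with g = 4 and l = 30.
print(bsp_cost([(100, 10), (50, 20)], g=4, l=30))  # 330
```

Because g and l are measured once per machine, the same (w, h) profile of an algorithm yields predictions for different parallel architectures, which is the portability argument the abstract makes.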
ISBN: 0818679654 (print)
This paper presents the tolerant, hybrid synchronization schema and its benefits for the parallel and distributed simulation of interconnected computer networks. The hybrid schema combines conservative and optimistic synchronization approaches by using lookahead for scheduling special events and using the flexibility of Time Warp in certain cases. In addition to these classical approaches, the introduction of "tolerance" allows the distributed modules to simulate further ahead than guaranteed by the conservative synchronization schema. This results in significantly smaller simulation runtimes and many other benefits.