ISBN (print): 081867931X
The unthrottled optimism underlying the Time Warp (TW) parallel simulation protocol can lead to excessive memory consumption due to saving state histories, and to wasted CPU cycles due to overoptimistically progressing simulations that eventually have to be "rolled back". Furthermore, in TW simulations executing in distributed memory environments, the communication overhead induced by the rollback mechanism can cause pathological overall simulation performance. In this work, direct optimism control mechanisms are used to overcome these shortcomings by probabilistically controlling simulation progression based on the forecasted time stamp of forthcoming messages. Several forecast methods are presented and their performance is compared for very large Petri net simulation models executed with the TW protocol on the Meiko CS-2.
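The idea of throttling progress by a forecast of the next message's time stamp can be illustrated with a minimal Python sketch. It is not one of the forecast methods of the paper: the arithmetic-mean forecast, the exponentially decaying execution probability, and all names (`forecast_next_timestamp`, `proceed_probability`, `should_proceed`, `scale`) are assumptions chosen for illustration.

```python
import math
import random


def forecast_next_timestamp(arrival_stamps):
    """Forecast the next message time stamp as last stamp + mean gap.

    Assumed forecast rule, not taken from the paper.
    """
    if len(arrival_stamps) < 2:
        raise ValueError("need at least two observed time stamps")
    gaps = [b - a for a, b in zip(arrival_stamps, arrival_stamps[1:])]
    return arrival_stamps[-1] + sum(gaps) / len(gaps)


def proceed_probability(next_event_time, forecast, scale=1.0):
    """Probability of executing the next local event optimistically.

    Events scheduled at or before the forecast are always executed; events
    beyond it are executed with a probability that decays with the distance
    to the forecast (assumed rule).
    """
    if next_event_time <= forecast:
        return 1.0
    return math.exp(-(next_event_time - forecast) / scale)


def should_proceed(next_event_time, forecast, scale=1.0, rng=random):
    """Draw a Bernoulli decision: True means execute, False means wait."""
    return rng.random() < proceed_probability(next_event_time, forecast, scale)
```

A logical process that waits when `should_proceed` returns False delays events that a forthcoming message is likely to invalidate, which is the kind of saving in rollbacks and state history that the abstract describes.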
ISBN (print): 0818677732
The proceedings contain 7 papers. The topics discussed include: high level architecture for simulation; DEVS formalism as a framework for advanced distributed simulation; simulation of fine-grained parallel algorithms with the ALT (animating language tools) system; event synchronization in multi-user virtual reality environments; feedback control in Time Warp synchronized parallel simulators; COVERS 3.0 - an object-oriented environment for modeling, simulation, and analysis of real-time concurrent systems; design and simulations of cellular neural-like associative memory; and design of high-speed parallel arithmetic algorithms and architectures.
This paper studies the problem of load balancing for conservative parallel simulations executed on a multicomputer. The synchronization protocol makes use of Chandy-Misra null-messages. We propose a dynamic load balancing algorithm which assumes no compile-time knowledge about the workload parameters. It is based upon a process migration mechanism and the notion of CPU-queue length, which indicates the workload at each processor. We examine two variations of the algorithm, which we refer to as the centralized and multi-level hierarchical methods, in the context of queueing network simulation of a torus. The torus was chosen because its many cycles aid in the formation of deadlock, making it a stress test for any conservative synchronization protocol. Our experiments indicate that our dynamic load balancing schemes significantly reduce the run time of an optimized version of the Chandy-Misra null-message approach, and decrease the synchronization overhead by 30-40% when compared to the use of a static partitioning algorithm. Significantly, the results obtained also indicate that the multi-level scheme always outperforms both the centralized load balancing approach and the static partitioning algorithm.
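A centralized decision driven by CPU-queue length can be sketched as follows. This is only an illustration under stated assumptions, not the algorithm of the paper: the function name `pick_migration`, the `threshold` parameter, and the rule of pairing the most loaded with the least loaded processor are all hypothetical.

```python
def pick_migration(queue_lengths, threshold=2):
    """Return (source, target) processor ids, or None if load is balanced.

    queue_lengths maps a processor id to its current CPU-queue length.
    A migration is suggested only when the gap between the most and the
    least loaded processor reaches the (assumed) threshold.
    """
    if len(queue_lengths) < 2:
        return None
    source = max(queue_lengths, key=queue_lengths.get)
    target = min(queue_lengths, key=queue_lengths.get)
    if queue_lengths[source] - queue_lengths[target] < threshold:
        return None
    return source, target
```

For example, `pick_migration({0: 7, 1: 2, 2: 4})` suggests moving a simulation process from processor 0 to processor 1. A multi-level hierarchical variant could apply the same rule within groups of processors first and between groups afterwards.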
It is important to understand and efficiently predict the performance of large codes executing on massively parallel machines. However, these very large machines are scarce, expensive, and generally unavailable to large segments of the research community. It is therefore important to implement performance analysis tools for such machines on platforms that are readily available to the research community at large. To meet this need, we have ported LAPSE, a parallel direct-execution simulator, from the Intel Paragon to an ordinary cluster of workstations. The goal of this research is to give researchers the opportunity to study codes designed for a massively parallel machine while physically executing them on a workstation cluster. However, we encountered significant performance problems when moving to a workstation cluster, due primarily to high communication and context-switching costs. To reduce these costs, we implemented the virtual processors of the simulated system using light-weight threads rather than heavy-weight Unix processes. In this paper, we discuss the issues involved in moving from a process-based to a thread-based simulator, and demonstrate up to a fourfold increase in performance by doing so.
ISBN (print): 0818680970
The scheduling of tasks in distributed real-time systems has attracted many researchers in the recent past. The distributed real-time system considered here consists of uniprocessor or multiprocessor nodes connected through a multihop network. Scheduling in such a system involves scheduling of dynamically arriving tasks within a node (local scheduling) and migration of tasks across the network (global scheduling) if it is not possible to schedule them locally. Most of the existing schemes for distributed real-time task scheduling ignore the underlying message scheduling required for global scheduling of tasks. These schemes consider the load on the processors at a node as the basis for migrating tasks from a heavily loaded node (sender) to a lightly loaded node (receiver). We believe that the identification of a receiver node should be based not only on the load on its processors, but also on the availability of a lightly loaded path from the sender to that receiver. In this paper, we present an integrated framework for distributed real-time dynamic task scheduling (i) by proposing algorithms for transfer, location, and information policies which take into account the states of both the processors and the links, and (ii) by proposing interactions among these policies and schedulers so that the guarantee ratio (ratio of the number of tasks guaranteed to the number of tasks arrived) is improved as compared to algorithms where only local scheduling is done. For local scheduling, we use a variation of the myopic algorithm [10]. The effectiveness of the proposed framework has been evaluated through simulation.
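The guarantee ratio defined above, and the idea of choosing a receiver by both processor load and path load, can be sketched in Python. The ratio follows the definition in the abstract; the receiver rule is only an assumed illustration, not the location policy of the paper, and the names `select_receiver`, `max_proc_load`, and `max_path_load` are hypothetical.

```python
def guarantee_ratio(guaranteed, arrived):
    """Ratio of the number of tasks guaranteed to the number of tasks arrived."""
    if arrived <= 0:
        raise ValueError("at least one task must have arrived")
    return guaranteed / arrived


def select_receiver(candidates, max_proc_load=0.5, max_path_load=0.5):
    """Pick a receiver node using processor load and path load together.

    candidates is a list of (node, proc_load, path_load) tuples with loads
    in [0, 1]. Nodes whose processors or whose path from the sender are too
    loaded are skipped; among the rest, the smallest combined load wins
    (assumed rule). Returns None when no node qualifies.
    """
    eligible = [
        c for c in candidates
        if c[1] <= max_proc_load and c[2] <= max_path_load
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda c: c[1] + c[2])[0]
```

With candidates `[("A", 0.1, 0.9), ("B", 0.3, 0.2), ("C", 0.4, 0.4)]`, node A has the lightest processors but is rejected because its path is heavily loaded, so B is selected. A scheme that looks only at processor load would have picked A.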
Load balancing is an important component in improving the efficiency of distributed systems because it distributes the workload evenly over all processors. This paper considers the problem of load balancing a conservative parallel simulation for execution on a multicomputer. The synchronization protocol makes use of Chandy-Misra null-messages. An earlier study conducted by Boukerche and Tropper showed that static load balancing for conservative parallel simulation is effective when the workload can be sufficiently well characterized beforehand. In this paper, we present a dynamic load balancing algorithm which assumes no compile-time knowledge about the workload parameters. It is based upon a process migration mechanism and the notion of CPU-queue length, which indicates the workload at each processor. We discuss the algorithm and its implementation, and report on the performance results of simulating FCFS queueing network models on an Intel Paragon A4.
Presents an algorithm for computing a sum of products, realizing a fundamental compound multiply-and-add operation of high-speed arithmetic. Two new cellular pipelined algorithms and architectures (2D and 3D) are prop...
ISBN (print): 9780897919661
With rapid advances in computer and communication technologies, there is an increasing demand to build and maintain large image repositories. In order to reduce the demands on I/O and network resources, multiresolution representations are being proposed for the storage organization of images. Image decomposition techniques such as wavelets can be used to provide these multiresolution images. The original image is represented by several coefficients, one of them with visual similarity to the original image, but at a lower resolution. These visually similar coefficients can be thought of as thumbnails or icons of the original image. This paper addresses the problem of storing these multiresolution coefficients on disks so that thumbnail browsing as well as image reconstruction can be performed efficiently. Several strategies for storing the image coefficients on parallel disks are evaluated. These strategies fall into two broad classes, depending on whether the access pattern of the images is used in the placement. Disk simulation is used to evaluate the performance of these strategies. Simulation results are validated against results from experiments with real disks and are found to be in good agreement. The results indicate that significant performance improvements can be achieved with as few as four disks by placing image coefficients based upon browsing access patterns.
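A simple placement that does not use access patterns can be sketched in Python. It is not one of the strategies evaluated in the paper: the staggered round-robin rule and the names `place_coefficients`, `num_coeffs`, and `num_disks` are assumptions, with coefficient index 0 standing for the thumbnail.

```python
def place_coefficients(num_images, num_coeffs, num_disks):
    """Map each (image, coefficient) pair to a disk index.

    Staggered round-robin (assumed rule): coefficient c of image i goes to
    disk (i + c) mod num_disks. Coefficient 0 is treated as the thumbnail.
    """
    if num_disks <= 0:
        raise ValueError("num_disks must be positive")
    placement = {}
    for image in range(num_images):
        for coeff in range(num_coeffs):
            placement[(image, coeff)] = (image + coeff) % num_disks
    return placement
```

Under this rule the thumbnails of consecutive images land on different disks, so a page of thumbnails can be fetched in parallel, and the coefficients of a single image are also spread over the disks, so reconstruction reads proceed in parallel. An access-pattern-aware placement would instead group coefficients according to how images are actually browsed.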
Presents an experimental approach toward designing a language interface for fine-grained parallel algorithms simulation. The deep integration of graphical and textual elements at the level of the source code is its ma...
The proceedings contain 66 papers. Topics discussed include computer system requirement analysis and specifications, parallel and distributed systems, model-based system design, software engineering, computer simulation, object-oriented design and development, real-time systems and mechatronics, computer-based information systems, and system design methodologies.