ISBN: 9780818679650 (print)
This paper presents the results of an experimental study to evaluate the effectiveness of multiple synchronization protocols and partitioning algorithms in reducing the execution time of switch-level models of VLSI circuits. Specific contributions of this paper include: (i) parallelizing an existing switch-level simulator such that the model can be executed using conservative and optimistic simulation protocols with minor changes, (ii) evaluating the effectiveness of several partitioning algorithms for parallel simulation, and (iii) demonstrating speedups with both conservative and optimistic simulation protocols for seven circuits, ranging in size from 3K transistors to about 87K transistors.
ISBN: 9780769519708 (print)
Internet data traffic is doubling each year, yet bandwidth does not appear to be growing as fast as expected, and thus shortfalls in available bandwidth, particularly at the "last mile", may result. To address these bandwidth allocation and congestion problems, researchers are proposing new overlay networks that provide a high quality of service and a near lossless guarantee. However, the central question raised by these new services is what impact they will have in the large. To address these and other network engineering research questions, high performance simulation tools are required. However, to date, optimistic techniques have been viewed as operating outside of the performance envelope for Internet protocols, such as TCP, OSPF and BGP. We dispel those views and demonstrate that optimistic protocols are able to efficiently simulate large scale TCP scenarios for realistic network topologies using a single Hyper-Threaded computing system costing less than $7000 USD. For our real world topology, we use the core AT&T US network. Our optimistic simulator yields extremely high efficiency and many of our performance runs produce zero rollbacks. Our compact modelling framework reduces the amount of memory required per TCP connection, and thus our memory overhead per connection for one of our largest experimental network topologies was 2.6 KB. That value comprised all events used to model TCP packets, TCP connection state and routing information.
ISBN: 9780818679650 (print)
Large-scale ecological simulations are natural candidates for distributed discrete event simulation. In optimistic simulation of spatially explicit models, a difficult problem arises when individuals migrate between physical regions simulated by different logical processes. We present a solution to this problem that uses shared object states. Shared states allow for efficient communication between LPs and for early detection of canceled events. We briefly describe an optimistic simulation environment called EcoKit, which operates on top of the WarpKit implementation of Time Warp. Our experiments with this system on a shared memory multiprocessor show that EcoKit promises to scale well both with the number of processors and the number of individuals simulated.
ISBN: 076951104X (print)
This paper addresses the issue of efficient and accurate performance prediction of large-scale message-passing applications on high performance architectures using simulation. Such simulators are often based on parallel discrete event simulation, typically using the conservative protocol to synchronize the simulation threads. The paper considers how a compiler can be used to automatically extract information about the lookahead present in the application and how this can be used to improve the performance of the null protocol used for synchronization. These techniques are implemented in the MPI-Sim simulator and the dHPF compiler, which had previously been extended to work together for optimizing the simulation of local computational components of an application. The results show that the availability of lookahead improves the runtime of the simulator by factors ranging from 9% up to two orders of magnitude, with 30-60% improvements being typical for the real-world codes. The experiments also show that these improvements are directly correlated with reductions in the number of null messages required by the simulations.
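The link between lookahead and null-message count described in this abstract can be illustrated with a minimal sketch. It is a generic toy model of a conservative null-message protocol, not the MPI-Sim or dHPF implementation; the `LogicalProcess` class, the `null_messages_needed` helper, and the two-LP idle scenario are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """A toy logical process (LP) in a conservative simulation (hypothetical model)."""
    name: str
    lookahead: float  # assumed minimum delay before this LP can affect a neighbour
    clock: float = 0.0
    # Latest timestamp promised on each input channel (by real or null messages).
    channel_clocks: dict = field(default_factory=dict)

    def safe_time(self) -> float:
        """Events up to this bound are safe: no input channel can deliver anything earlier."""
        return min(self.channel_clocks.values()) if self.channel_clocks else float("inf")

    def null_message_timestamp(self) -> float:
        """Promise sent to neighbours: this LP will send nothing earlier than this time."""
        return self.clock + self.lookahead


def null_messages_needed(lookahead: float, target_time: float) -> int:
    """Count the null-message rounds two idle LPs exchange to reach target_time."""
    if lookahead <= 0:
        raise ValueError("zero lookahead would never let the LPs advance")
    a = LogicalProcess("a", lookahead, channel_clocks={"b": 0.0})
    b = LogicalProcess("b", lookahead, channel_clocks={"a": 0.0})
    rounds = 0
    while min(a.clock, b.clock) < target_time:
        # Each LP promises clock + lookahead to the other, then both advance
        # to the bound their single input channel now guarantees.
        a.channel_clocks["b"] = b.null_message_timestamp()
        b.channel_clocks["a"] = a.null_message_timestamp()
        a.clock, b.clock = a.safe_time(), b.safe_time()
        rounds += 1
    return rounds
```

In this toy setting, larger lookahead lets each null message advance the clocks further, so fewer rounds are needed to cover the same simulated interval; this is the qualitative effect the abstract reports, not a reproduction of its measurements.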
Scalability is recognised as a primary factor to be considered in the design of distributed systems. The scalability of object-oriented middleware CORBA is becoming a major concern as it has emerged as a standard architecture for distributed object computing. In this paper, a systematic scalability analysis of the basic components of the CORBA specification is attempted. From this analysis, the Portable Object Adapter (POA) and the Implementation Repository (IR) are identified to influence the scale of a CORBA-based system. The specification of the POA provides enough feasibility for the application designer to handle scalability. The existing implementations of IR have a tradeoff between scalability and object migration. A scalable design of the IR is proposed which allows individual objects to migrate without compromising scalability. A performance comparison of the proposed model with existing IR designs is made using a simulation study.
ISBN: 9781565550551 (print)
Recent experiments have shown that conservative methods can achieve good performance by exploiting the characteristics of the system being simulated. In this paper we focus on the interrelationship between run time and synchronization requirements of a distributed simulation. A metric that considers the effect of lookahead and the physical rate of transmission of messages, and an arrival approximation that models the effect of synchronization requirements on the run time, are developed. It is shown that even when good lookahead is exploited in the system, poor run-time performance is achieved if an inefficient mapping of LPs to processors is used.
The scheduling of tasks in distributed real-time systems has attracted many researchers in the recent past. The distributed real-time system considered here consists of uniprocessor or multiprocessor nodes connected through a multihop network. Scheduling in such a system involves scheduling of dynamically arriving tasks within a node (local scheduling) and migration of tasks across the network (global scheduling) if it is not possible to schedule them locally. Most of the existing schemes on distributed real-time task scheduling ignore the underlying message scheduling required for global scheduling of tasks. These schemes consider the load on the processors at a node as the basis to migrate tasks from a heavily loaded node (sender) to a lightly loaded node (receiver). We believe that the identification of a receiver node should be based not only on the load on its processors, but also on the availability of a lightly loaded path from the sender to that receiver. In this paper we present an integrated framework for distributed real-time dynamic task scheduling (i) by proposing algorithms for transfer, location, and information policies which take into account the states of both the processors and the links, and (ii) by proposing interactions among these policies and schedulers so that the guarantee ratio (the ratio of the number of tasks guaranteed to the number of tasks that arrived) is improved as compared to algorithms where only local scheduling is done. For local scheduling, we use a variation of the myopic algorithm. The effectiveness of the proposed framework has been evaluated through simulation.
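The guarantee ratio defined in this abstract, and the idea of choosing a receiver by both processor load and path load, can be sketched as follows. This is only an illustration: the `pick_receiver` function, the `max_path_load` threshold, and the tie-breaking rule are assumptions and are not taken from the paper's location policy.

```python
from typing import Optional, Sequence, Tuple

# (node name, processor load, load on the path from the sender); loads are
# assumed to be normalised to the range 0..1 for this illustration.
Candidate = Tuple[str, float, float]


def guarantee_ratio(guaranteed: int, arrived: int) -> float:
    """Ratio of the number of tasks guaranteed to the number of tasks that arrived."""
    if arrived <= 0:
        raise ValueError("guarantee ratio is undefined when no tasks arrived")
    return guaranteed / arrived


def pick_receiver(candidates: Sequence[Candidate],
                  max_path_load: float) -> Optional[str]:
    """Pick a receiver using both processor state and link state.

    Hypothetical rule: discard nodes whose path from the sender is too loaded,
    then prefer the lowest processor load, breaking ties by path load.
    """
    eligible = [c for c in candidates if c[2] <= max_path_load]
    if not eligible:
        return None  # no suitable receiver; the task stays with the sender
    return min(eligible, key=lambda c: (c[1], c[2]))[0]
```

For example, with candidates `[("n1", 0.2, 0.9), ("n2", 0.4, 0.3), ("n3", 0.5, 0.1)]` and `max_path_load=0.5`, a load-only scheme would choose `n1`, whereas this sketch rejects `n1` because its path is congested and selects `n2`.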