ISBN: 3540296395 (print)
If-conversion and predicated execution are widely adopted to eliminate branch misprediction penalties. Previous predicated-execution schemes depend on the compiler to generate explicit predicated instructions. In this paper, a trace-based predication mechanism named RIMP (Runtime IMplicit Predication) is discussed. Candidates for if-conversion are identified during dynamic execution. A conventional trace cache has been modified to store RIMP traces, which include instructions from both the fall-through block and the target block following a conditional branch. A hardware extension adds predication to each RIMP trace automatically. With the help of RIMP, legacy applications can benefit from predication without recompiling source code. Simulation of a RIMP implementation under diverse microarchitecture configurations is presented in the paper. Results show promising performance improvements. In general, RIMP with 64 kB of trace storage delivers an average 10.3% IPC improvement while speeding up execution by over 7%.
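As a rough illustration of what if-conversion means in general (not of RIMP's hardware mechanism, which the abstract does not detail), the sketch below contrasts a branch-based computation with a predicated one in which both the target and fall-through paths are evaluated and a predicate selects the result. The function names are hypothetical and chosen only for this example.

```python
def branchy_abs_diff(a: int, b: int) -> int:
    """Control-dependent version: a conditional branch picks one path."""
    if a > b:
        return a - b      # target block
    return b - a          # fall-through block


def predicated_abs_diff(a: int, b: int) -> int:
    """If-converted version: both paths execute, a predicate selects."""
    p = int(a > b)            # predicate: 1 if the branch would be taken
    taken = a - b             # result of the target block
    fall_through = b - a      # result of the fall-through block
    # Select without branching: exactly one of p and (1 - p) is 1.
    return p * taken + (1 - p) * fall_through
```

Both functions return the same value for every input; the predicated form trades extra work on the unused path for the absence of a hard-to-predict branch.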
ISBN: 9783540321323 (digital)
ISBN: 3540297391 (print)
Compute-intensive simulations are currently good candidates for execution on distributed computers and Grids, in particular applications with a large amount of input data whose values change throughout the simulation time and in which communication is not a critical factor. Although the number of computations usually depends on the volume of input data, there are applications in which the computational load depends on the particular values of some input data. We propose a general methodology for improving load balance in these cases. It is divided into two main stages. The first is an exhaustive study of the parallel code structure, using performance tools, with the aim of establishing a relationship between the values of the input data and the computational effort. The second uses this information to provide a mechanism for distributing the load of any particular simulation scenario among the computational nodes. A load balancing strategy has been developed for the particular case of STEM-II, a compute-intensive application that simulates the behavior of pollutant factors in the air, yielding a significant improvement in execution time.
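The second stage described above can be pictured with a minimal sketch, assuming the first stage has already produced a cost model that maps an input value to an estimated computational effort. The greedy longest-processing-time heuristic used here is a stand-in chosen for illustration, not the strategy the authors developed for STEM-II; `balance_load` and `estimate_cost` are hypothetical names.

```python
import heapq
from typing import Callable, List, Sequence


def balance_load(
    items: Sequence[float],
    n_nodes: int,
    estimate_cost: Callable[[float], float],
) -> List[List[float]]:
    """Assign work items to nodes so that estimated loads stay even.

    Items are visited from most to least expensive, and each one goes
    to the node with the smallest accumulated estimated load.
    """
    heap = [(0.0, node) for node in range(n_nodes)]  # (load, node id)
    heapq.heapify(heap)
    assignment: List[List[float]] = [[] for _ in range(n_nodes)]
    for item in sorted(items, key=estimate_cost, reverse=True):
        load, node = heapq.heappop(heap)             # least-loaded node
        assignment[node].append(item)
        heapq.heappush(heap, (load + estimate_cost(item), node))
    return assignment
```

For example, `balance_load([8, 7, 6, 5, 4], 2, lambda v: v)` splits the items into groups with estimated loads of 17 and 13, whereas a contiguous split of the same list into `[8, 7, 6]` and `[5, 4]` would give 21 and 9.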
ISBN: 9783540240761 (print)
The proceedings contain 69 papers. The special focus in this conference is on Distributed Computing. The topics include: performance of fair distributed mutual exclusion algorithms; a framework for automatic identification of the best checkpoint and recovery protocol; distributed computation for swapping a failing edge; flexible cycle synchronized algorithm in parallel and distributed simulation; rule mining for dynamic databases; a novel P2P based e-learn heuristic-based scheduling to maximize throughput of data-intensive grid applications; failure recovery in grid database systems; on design of cluster and grid computing environment toolkit for bioinformatics applications; study of scheduling strategies in a dynamic data grid environment; virtual molecular computing – emulating DNA molecules; complexity of compositional model checking of computation tree logic on simple structures; a multi-agent framework based on communication and concurrency; statistical analysis of a P2P query graph based on degrees and their time-evolution; t-UNITY – a formal framework for modeling and reasoning about timing constraints in real-time systems; finding pareto-optimal set of distributed vectors with minimum disclosure; a fair medium access protocol using adaptive flow-rate control through cooperative negotiation among contending flows in Ad Hoc; an adaptive transmission power control protocol for mobile ad hoc networks; a macro-mobility scheme for reduction in handover delay and signaling traffic in MIPv6; path stability based adaptation of MANET routing protocols; agent-based distributed intrusion alert system; a soft computing intrusion detection system; and effect of data encryption on wireless ad hoc network performance.
This paper presents the design and implementation of a reinforcement learning agent that automatically selects appropriate loop scheduling algorithms for parallel loops embedded in time-stepping scientific applications executing on clusters. An application may contain a number of such loops, and the loops may have different load balancing requirements. Further, loop characteristics may change as the application progresses. Following a model-free learning approach, the learning agent assigned to a loop selects the best scheduling algorithm for that loop from a library during the lifetime of the application. The utility of the learning agent is demonstrated by its successful integration into the simulation of wave packets, an application arising from quantum mechanics. Statistical analysis using pairwise comparison of means on the running time of the simulation with and without the learning agent validates the effectiveness of the agent in improving the parallel performance of the simulation.
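One simple way to picture a model-free selector of this kind is an epsilon-greedy agent that tracks the mean running time observed for each scheduling algorithm and mostly picks the fastest one so far. This is a deliberately simplified stand-in, not the agent described in the paper; the class name, the `epsilon` parameter, and the algorithm labels in the usage note are assumptions made for illustration.

```python
import random
from typing import Dict, Iterable, List


class SchedulerSelector:
    """Epsilon-greedy choice among loop scheduling algorithms."""

    def __init__(self, algorithms: Iterable[str],
                 epsilon: float = 0.1, seed: int = 0) -> None:
        self.algorithms: List[str] = list(algorithms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts: Dict[str, int] = {a: 0 for a in self.algorithms}
        self.mean_time: Dict[str, float] = {a: 0.0 for a in self.algorithms}

    def choose(self) -> str:
        """Pick the algorithm to use for the next time step."""
        untried = [a for a in self.algorithms if self.counts[a] == 0]
        if untried:
            return untried[0]                      # try everything once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.algorithms)  # explore
        # Exploit: lowest mean running time observed so far.
        return min(self.algorithms, key=lambda a: self.mean_time[a])

    def update(self, algorithm: str, elapsed: float) -> None:
        """Record the running time measured for one loop execution."""
        self.counts[algorithm] += 1
        n = self.counts[algorithm]
        # Incremental mean, so no history needs to be stored.
        self.mean_time[algorithm] += (elapsed - self.mean_time[algorithm]) / n
```

In a time-stepping application, each step would call `choose()`, run the loop with the returned algorithm (for instance one of the placeholder labels `"static"`, `"guided"`, or `"factoring"`), and pass the measured time to `update()`. Keeping `epsilon` above zero lets the selector notice when loop characteristics drift and a different algorithm becomes faster.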
ISBN: 0769521118 (print)
The proceedings contain 23 papers from the 18th Workshop on Parallel and Distributed Simulation, PADS 2004. The topics discussed include: simulation validation using direct execution of wireless ad-hoc routing protocols; detailed OFDM modeling in network simulation of mobile ad-hoc networks; event reconstruction in time warp; just-in-time cloning; towards grid-aware time warp; and the effect of detail on Ethernet simulation.
Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries may photocopy beyond the limits of US copyright law, for private use of patrons, those articles in this volume that carry a code at the bottom of the first page, provided that the per-copy fee indicated in the code is paid through the Copyright Clearance Center. The papers in this book comprise the proceedings of the meeting mentioned on the cover and title page. They reflect the authors' opinions and, in the interests of timely dissemination, are published as presented and without change. Their inclusion in this publication does not necessarily constitute endorsement by the editors or the Institute of Electrical and Electronics Engineers, Inc.
ISBN: 0769521118 (print)
If a model is to be executed in a parallel, distributed manner instead of a sequential one, typically the entire simulation engine has to be exchanged. To adapt the simulation layer more easily to the requirements of a concrete model to be run in a specific environment, a component-based simulation layer has been developed for JAMES. A set of different simulator components demonstrates that a component-based design facilitates the exchange of simulators and their combination.