This tutorial will present an overview of techniques for architectural-level performance and power analysis of computer systems. It starts with a discussion of metrics for both performance and power, followed by an overview of some widely used benchmarks including SPEC, MediaBench, and MiBench. It then illustrates the use of these benchmarks with some published performance results. After this initial overview, the tutorial will focus on a discussion of architectural simulators to measure performance and power. These simulators model systems on a (clock) cycle-by-cycle basis. Their operation will be illustrated with two popular examples: SimpleScalar and M5. Besides performance analysis, these simulators can be extended to include power estimation. Full simulations of complete applications can be extremely time consuming. The tutorial will explain how sampling techniques can be used to reduce simulation time. Finally, it will conclude with a discussion on the accuracy that can be expected from architectural simulators.
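As a rough illustration of the sampling idea mentioned above, the sketch below estimates a mean metric (here cycles per instruction, CPI) from every `period`-th measurement interval instead of from all of them. The function name `sampled_mean`, the synthetic CPI values, and the choice of systematic sampling with a normal-approximation confidence interval are assumptions made for illustration; the abstract does not say which sampling technique the tutorial covers.

```python
import math
import random
import statistics


def sampled_mean(values, period, z=1.96):
    """Estimate the mean of `values` from every `period`-th element.

    Returns (estimate, half_width), where half_width is the half-width of an
    approximate 95% confidence interval (normal approximation, z = 1.96).
    """
    sample = values[::period]
    mean = statistics.fmean(sample)
    if len(sample) < 2:
        # A single observation gives no information about the spread.
        return mean, float("inf")
    half_width = z * statistics.stdev(sample) / math.sqrt(len(sample))
    return mean, half_width


# Synthetic per-interval CPI values standing in for detailed simulation output.
rng = random.Random(0)
cpi_per_interval = [1.0 + 0.5 * rng.random() for _ in range(10_000)]

full_mean = statistics.fmean(cpi_per_interval)  # "full simulation" reference
estimate, half_width = sampled_mean(cpi_per_interval, period=100)
print(f"full mean {full_mean:.4f}, sampled {estimate:.4f} +/- {half_width:.4f}")
```

Only 1% of the intervals are inspected here, which is the source of the time savings; the confidence interval gives a hedge on how far the sampled estimate may sit from the full-run value.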
The proceedings contain 55 papers from SIGMETRICS 2004/Performance 2004: Joint International Conference on Measurement and Modeling of Computer Systems. The topics discussed include: flow classification by histograms, or how to go on safari in the Internet; two-level processor-sharing scheduling disciplines: mean delay analysis; performance-aware tasking for environmentally powered sensor networks; the impact of BGP dynamics on intra-domain traffic; wireless data performance in multi-cell scenarios; a quantitative analysis of partitioning in mobile ad hoc networks; and isolating the performance impacts of network interface cards through microbenchmarks.
The PeerPressure algorithms for automatic troubleshooting are discussed. Troubleshooting misconfigured applications is an important part of technical support. Troubleshooting effectiveness and automation are two essential goals in designing a troubleshooting system. A PeerPressure troubleshooting system is used to correct sick machines through the use of "App Tracer".
A method for controlling multiple-tiered Web site performance, both by bounding response times and preventing overload, was discussed. A completely self-tuning admission controller for 3-tiered websites based on classical control theoretic ideas was designed. The controller was implemented in the form of a proxy, called Yaksha. It was shown that Yaksha is able to bound response times for requests while maintaining a high throughput under overload.
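To make the control-theoretic idea concrete, here is a minimal sketch of a feedback admission controller that adjusts the probability of accepting a request so that measured response time tracks a target. The proportional-integral (PI) form, the class name `AdmissionController`, and the gains `kp` and `ki` are assumptions chosen for illustration; this is not Yaksha's actual design or tuning, which the abstract does not specify.

```python
class AdmissionController:
    """Hypothetical PI controller mapping response-time error to an accept probability."""

    def __init__(self, target_rt, kp=0.1, ki=0.05):
        self.target_rt = target_rt  # desired response time (any consistent unit)
        self.kp = kp                # proportional gain (illustrative value)
        self.ki = ki                # integral gain (illustrative value)
        self.integral = 0.0
        self.accept_prob = 1.0

    def update(self, measured_rt):
        """Feed one response-time measurement and return the new accept probability."""
        # Normalized error: positive when the site is faster than the target.
        error = (self.target_rt - measured_rt) / self.target_rt
        tentative = self.integral + error
        raw = 1.0 + self.kp * error + self.ki * tentative
        if 0.0 <= raw <= 1.0:
            # Accumulate the integral only while the output is unsaturated
            # (a simple anti-windup rule).
            self.integral = tentative
        self.accept_prob = min(1.0, max(0.0, raw))
        return self.accept_prob


controller = AdmissionController(target_rt=1.0)
for rt in (0.5, 2.0, 2.0, 1.2):
    print(f"measured {rt:.1f} -> accept probability {controller.update(rt):.3f}")
```

A proxy in front of the site would then admit each incoming request with the current `accept_prob` and reject the rest, which is how such a controller can bound response times under overload.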
The various approaches to estimating the distribution of queuing delays on a particular remote network link are discussed. One technique requires only the ability to send a packet over a given link, to the tail of the link, without any restrictions on the path taken to get to that link. Another alternative is to indirectly infer internal network statistics by injecting probe packets from one source to multiple destinations and correlating the observed packet behavior on the resulting tree topology. The third approach, implemented by tools such as cing and Tulip, is based on direct measurement using ICMP Timestamp packets, which provide accurate one-way per-hop delay estimates, and use only existing infrastructure.
The effect of network connectivity on the performance of distributed algorithms in mobile ad hoc networks (MANETs) is discussed. Network partitioning occurs in cases where connectivity is low due to node mobility. An extensive simulation study has been conducted to show the impact of node mobility, density and transmission range on the metrics describing the partitioning of a MANET. The simulation shows that as the size of the network increases, the difference between the node rate and the system-wide rate becomes smaller.
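One simple partitioning metric is the number of connected components in a snapshot of the network. The sketch below counts them for a set of node positions, assuming two nodes are linked whenever their distance is at most the transmission range. The function name `count_partitions` and this unit-disk link model are illustrative assumptions, not the metrics or radio model used in the study.

```python
import math


def count_partitions(positions, tx_range):
    """Count connected components among nodes linked when within `tx_range`."""
    n = len(positions)
    parent = list(range(n))  # union-find forest, one set per node initially

    def find(i):
        # Follow parent links to the set representative, halving paths as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= tx_range:
                root_i, root_j = find(i), find(j)
                if root_i != root_j:
                    parent[root_i] = root_j  # merge the two partitions

    return len({find(i) for i in range(n)})


snapshot = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]
print(count_partitions(snapshot, tx_range=1.5))
```

Running this over successive snapshots of a mobility trace would show how partition counts change with density and transmission range.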
The performance of binary search tree (BST) variants in different real-world scenarios was investigated. It was shown that BST data structures and node representations should be chosen based on expected patterns in the input and the mix of operations to be performed. In selecting data structures, unbalanced BSTs were found to be best when randomly ordered input can be relied upon. For node representation, parent pointers were found to be generally fastest, which makes them preferable as long as the cost of an additional pointer field per node is not important.
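The node representation with parent pointers can be sketched as follows: each node stores a link to its parent, so in-order traversal can step from a node to its successor without a stack or recursion. The class and function names (`Node`, `BST`, `successor`, `in_order`) are hypothetical, and this is a plain unbalanced BST written for illustration, not the implementation evaluated in the paper.

```python
class Node:
    __slots__ = ("key", "left", "right", "parent")

    def __init__(self, key, parent=None):
        self.key = key
        self.left = None
        self.right = None
        self.parent = parent  # the extra pointer field per node


class BST:
    """Unbalanced binary search tree whose nodes carry parent pointers."""

    def __init__(self):
        self.root = None

    def insert(self, key):
        """Insert `key` if absent and return the node holding it."""
        if self.root is None:
            self.root = Node(key)
            return self.root
        node = self.root
        while True:
            if key < node.key:
                if node.left is None:
                    node.left = Node(key, node)
                    return node.left
                node = node.left
            elif key > node.key:
                if node.right is None:
                    node.right = Node(key, node)
                    return node.right
                node = node.right
            else:
                return node  # key already present

    def minimum(self):
        """Return the node with the smallest key, or None for an empty tree."""
        node = self.root
        while node is not None and node.left is not None:
            node = node.left
        return node


def successor(node):
    """Return the in-order successor of `node`, or None if it is the maximum."""
    if node.right is not None:
        node = node.right
        while node.left is not None:
            node = node.left
        return node
    # Climb until we leave a left subtree; that ancestor is the successor.
    while node.parent is not None and node is node.parent.right:
        node = node.parent
    return node.parent


def in_order(tree):
    """Yield keys in sorted order using only parent pointers (no stack)."""
    node = tree.minimum()
    while node is not None:
        yield node.key
        node = successor(node)


tree = BST()
for key in (5, 2, 8, 1, 9, 3):
    tree.insert(key)
print(list(in_order(tree)))
```

The trade-off the abstract points to is visible here: traversal needs no auxiliary stack, at the cost of one more field in every `Node`.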
This paper describes a toolkit for semi-automatically measuring and modeling static and dynamic characteristics of applications in an architecture-neutral fashion. For predictable applications, models of dynamic characteristics have a convex and differentiable profile. Our toolkit operates on application binaries and succeeds in modeling key application characteristics that determine program performance. We use these characterizations to explore the interactions between an application and a target architecture. We apply our toolkit to SPARC binaries to develop architecture-neutral models of computation and memory access patterns of the ASCI Sweep3D and the NAS SP, BT and LU benchmarks. From our models, we predict the L1, L2 and TLB cache miss counts as well as the overall execution time of these applications on an Origin 2000 system. We evaluate our predictions by comparing them against measurements collected using hardware performance counters.
Balancing peer-to-peer graphs, including zone-size distributions, has recently become an important topic of peer-to-peer (P2P) research. To bring analytical understanding into the various peer-join mechanisms, we study how zone-balancing decisions made during the initial sampling of the peer space affect the resulting zone sizes and derive several asymptotic results for the maximum and minimum zone sizes that hold with high probability.
Optimization of quality of scalable video streams on peer-to-peer (P2P) networks is discussed. The video can be encoded into layers using scalable coding to obtain a high fidelity copy. On-line algorithms have been developed to coordinate the pre-fetching of scalably-coded variable bit-rate video components. The on-line algorithms pre-fetch the layers of future portions of video in small chunks and reduce the waste, smoothness and variability.