We report the results of a large-scale empirical study of web traffic. Our study is based on over 500 GB of TCP/IP protocol-header traces collected in 1999 and 2000 (approximately one year apart) from the high-speed link connecting The University of North Carolina at Chapel Hill to its Internet service provider. We also use a set of smaller traces from the NLANR repository taken at approximately the same times for comparison. The principal results from this study are: (1) empirical data suitable for constructing traffic generating models of contemporary web traffic, (2) new characterizations of TCP connection usage showing the effects of HTTP protocol improvement, notably persistent connections (e.g., about 50% of web objects are now transferred on persistent connections), and (3) new characterizations of web usage and content structure that reflect the influences of "banner ads," server load balancing, and content distribution. A novel aspect of this study is a demonstration that a relatively light-weight methodology based on passive tracing of only TCP/IP headers and off-line analysis tools can provide timely, high-quality data about web traffic. We hope this will encourage more researchers to undertake ongoing data collection and provide the research community with data about the rapidly evolving characteristics of web traffic.
We consider large cellular networks. The traffic entering the network is assumed to be correlated in both space and time. The space dependency captures the possible correlation between the arrivals to different nodes in the network, while the time dependency captures the time correlation between arrivals to each node. We model such traffic with a Markov-Modulated Poisson Process (MMPP). It is shown that even in the single node environment, the problem is not mathematically tractable. A model with an infinite number of circuits is used to approximate the finite model. A novel recursive methodology is introduced for finding the joint moments of the number of busy circuits in different cells in the network, leading to accurate determination of blocking probability. A simple mixed-Poisson distribution is introduced as an accurate approximation of the distribution of the number of busy circuits. We show that for certain cases, in the system with an infinite number of circuits in each cell, there is no effect of mobility on the performance of the system. Our numerical results indicate that the traffic burstiness has a major impact on the system performance. The mixed-Poisson approximation is found to be a very good fit to the exact finite model. The performance of this approximation using few moments is affected by traffic burstiness and average load. We find that in a reasonable range of traffic burstiness, the mixed-Poisson distribution provides a close approximation.
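As a rough illustration of the mixed-Poisson idea mentioned in this abstract, the sketch below evaluates a finite mixture of Poisson distributions and its first two moments. It is not the paper's recursive method: the function names, the two-component mixture, and the numeric weights and means are all assumptions chosen for illustration. It only shows why a mixture can represent bursty load, since its variance exceeds its mean whenever the component means differ.

```python
import math
from typing import Sequence, Tuple


def mixed_poisson_pmf(k: int, weights: Sequence[float], means: Sequence[float]) -> float:
    """P(N = k) for a finite mixture of Poisson distributions.

    weights[i] is the probability of mixture component i (hypothetical values),
    means[i] is the Poisson mean of that component.
    """
    return sum(
        w * math.exp(-m) * m ** k / math.factorial(k)
        for w, m in zip(weights, means)
    )


def mean_and_variance(weights: Sequence[float], means: Sequence[float]) -> Tuple[float, float]:
    """Mean and variance of the mixture, using E[N^2] = m + m^2 for Poisson(m)."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (m + m * m) for w, m in zip(weights, means))
    return mean, second_moment - mean ** 2


if __name__ == "__main__":
    # Assumed example: the load alternates between a quiet and a busy regime.
    weights, means = [0.5, 0.5], [2.0, 10.0]
    mean, variance = mean_and_variance(weights, means)
    print(f"mean={mean}, variance={variance}")
```

With a single component the variance equals the mean (plain Poisson); with unequal component means the variance is strictly larger, which is one simple way to read the abstract's remark that burstiness affects performance.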
This paper studies the memory behavior of important Java workloads used in benchmarking Java Virtual Machines (JVMs), based on instrumentation of both application and library code in a state-of-the-art JVM, and provides structured information about these workloads to help guide systems' design. We begin by characterizing the inherent memory behavior of the benchmarks, such as information on the breakup of heap accesses among different categories and on the hotness of references to fields and methods. We then provide detailed information about misses in the data TLB and caches, including the distribution of misses over different kinds of accesses and over different methods. In the process, we make interesting discoveries about TLB behavior and limitations of data prefetching schemes discussed in the literature in dealing with pointer-intensive Java codes. Throughout this paper, we develop a set of recommendations to computer architects and compiler writers on how to optimize computer systems and system software to run Java programs more efficiently. This paper also makes the first attempt to compare the characteristics of SPECjvm98 to those of a server-oriented benchmark, pBOB, and explain why the current set of SPECjvm98 benchmarks may not be adequate for a comprehensive and objective evaluation of JVMs and just-in-time (JIT) compilers. We discover that the fraction of accesses to array elements is quite significant, demonstrate that the number of "hot spots" in the benchmarks is small, and show that field reordering cannot yield significant performance gains. We also show that even a fairly large L2 data cache is not effective for many Java benchmarks. We observe that instructions used to prefetch data into the L2 data cache are often squashed because of high TLB miss rates and because the TLB does not usually have the translation information needed to prefetch the data into the L2 data cache. We also find that co-allocation of frequently used method tables can red
ISBN: 9781581133349 (print)
Application performance tuning is a complex process that requires correlating many types of information with source code to locate and analyze performance bottlenecks. Existing performance tools don't adequately support this process in one or more dimensions. We describe two performance tools, MHsim and HPCView, that we built to support our own work on data layout and optimizing compilers. Both tools report their results in scope-hierarchy views of the corresponding source code and produce their output as HTML databases that can be analyzed portably and collaboratively using a commodity browser.
ISBN: 9780780372498 (print)
This paper presents a new methodology for system-level power and performance analysis of wireless multimedia systems. More precisely, we introduce an analytical approach based on concurrent processes modeled as Stochastic Automata Networks (SANs) that can be effectively used to integrate power and performance metrics in system-level design. We show that 1) under various input traces and wireless channel conditions, the average-case behavior of a multimedia system consisting of a video encoder/decoder pair is characterized by very different probability distributions and power consumption values and 2) in order to identify the best trade-off between power and performance figures, one must take into consideration the entire environment (i.e., encoder, decoder and channel) for which the system is being designed. Compared to using simulation, our analytical technique reduces the time needed to find the steady-state behavior by orders of magnitude, with some limited loss in accuracy compared to the exact solution. We illustrate the potential of our methodology using the MPEG-2 video as the driver application.
ISBN: 9780780372498 (print)
As system integration evolves and tighter design constraints must be met, it becomes necessary to account for the non-ideal behavior of all the elements in a system. Certain devices common in high-frequency integrated circuit applications, such as spiral inductors, SAW filters, etc., are often described and studied in the frequency domain. Models take the form of frequency domain data obtained through measurement or through physical simulation. Usually the available data is sampled, incomplete, noisy, and covers only a finite range of the spectrum. In this paper we present a methodology for generating guaranteed passive time-domain models of frequency-described subsystems. The methodology presented is based on convex programming based algorithms for fixed denominator system identification. The algorithm is guaranteed to produce a passive system model that is optimal in the sense of having minimum weighted square error in the frequency band of interest over all models with a prescribed set of system poles. An incremental-fitting reformulation of the problem is also introduced that trades optimality for efficiency while still guaranteeing passivity. Results of the application of the proposed methodologies to the modeling of a variety of subsystems are presented and discussed.
ISBN: 9781581133349 (print)
Recent efforts to add new services to the Internet have increased the interest in software-based routers that are easy to extend and evolve. This paper describes our experiences implementing a software-based router, with a particular focus on the main difficulty we encountered: how to schedule the router's CPU cycles. The scheduling decision is complicated by the desire to differentiate the level of service for different packet flows, which leads to two fundamental conflicts: (1) assigning processor shares in a way that keeps the processes along the forwarding path in balance while meeting QoS promises, and (2) adjusting the level of batching in a way that minimizes overhead while meeting QoS promises.
ISBN: 9781581133349 (print)
In this paper we perform a statistical analysis of an Internet communication channel. Our study is based on a Hidden Markov Model (HMM). The channel switches between different states; each state has an associated probability that a packet sent by the transmitter will be lost. The transition between the different states of the channel is governed by a Markov chain; this Markov chain is not observed directly, but the received packet flow provides some probabilistic information about the current state of the channel, as well as some information about the parameters of the model. In this paper we detail some useful algorithms for the estimation of the channel parameters, and for making inference about the state of the channel. We discuss the relevance of the Markov model of the channel; we also discuss how many states are required to adequately model a real communication channel.
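The state-inference step described in this abstract can be sketched with the standard HMM forward (filtering) recursion. This is a generic textbook recursion, not necessarily the algorithm the paper details; the function name, the two-state "good/bad" channel, and all numeric parameters below are assumptions for illustration. Observations are coded as 1 for a lost packet and 0 for a received one.

```python
import math
from typing import List, Sequence, Tuple


def forward_filter(
    observations: Sequence[int],
    transition: List[List[float]],
    loss_prob: List[float],
    initial: List[float],
) -> Tuple[float, List[List[float]]]:
    """Normalized forward recursion for a packet-loss HMM.

    observations[t] is 1 if packet t was lost, 0 if it was received.
    transition[i][j] is P(next state = j | current state = i).
    loss_prob[j] is the loss probability while the channel is in state j.
    Returns the log-likelihood of the observations and, for each step,
    the posterior over states given the observations so far.
    Assumes every observation has nonzero probability under the model.
    """
    n = len(initial)
    belief = list(initial)
    log_likelihood = 0.0
    posteriors: List[List[float]] = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Predict: push the previous posterior through the Markov chain.
            belief = [
                sum(belief[i] * transition[i][j] for i in range(n))
                for j in range(n)
            ]
        # Update: weight each state by how well it explains the observation.
        weighted = [
            belief[j] * (loss_prob[j] if obs else 1.0 - loss_prob[j])
            for j in range(n)
        ]
        norm = sum(weighted)
        log_likelihood += math.log(norm)
        belief = [w / norm for w in weighted]
        posteriors.append(belief)
    return log_likelihood, posteriors


if __name__ == "__main__":
    # Assumed parameters: state 0 is "good" (1% loss), state 1 is "bad" (50% loss).
    transition = [[0.95, 0.05], [0.20, 0.80]]
    loss_prob = [0.01, 0.50]
    log_lik, post = forward_filter([0, 0, 1, 1, 1], transition, loss_prob, [0.5, 0.5])
    print(log_lik, post[-1])
```

The returned log-likelihood is the quantity a parameter-estimation procedure would maximize, and comparing it across models with different numbers of states is one informal way to approach the question of how many states a real channel needs.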