The authors describe and evaluate a watermark-based lazy control policy of the buddy system for managing dynamic memory space. This policy achieves low operational costs by capitalizing on the steady-state behavior of...
详细信息
The authors describe and evaluate a watermark-based lazy control policy of the buddy system for managing dynamic memory space. This policy achieves low operational costs by capitalizing on the steady-state behavior of memory demands. The lazy control policy employs two watermark indices, the lazy ratio and the speedup ratio, to track the use of buffers for each size (class). They built a prototype lazy buddy system to manage UNIX(R) System STREAMS buffers and used it to measure and compare the results of the lazy policy against those from the regular buddy system.
This paper considers performance issues associated with a class of interconnection networks called hierarchical interconnection networks. It focuses specifically on an unbuffered Delta network. The performance of such...
详细信息
This paper considers performance issues associated with a class of interconnection networks called hierarchical interconnection networks. It focuses specifically on an unbuffered Delta network. The performance of such networks under a nonuniform (hot-spot) traffic pattern is analyzed. Particular attention is given to characterizing the overflow traffic of unsuccessfully transmitted packets. It is shown that the overflow traffic from a packet-switched Delta network is (fractionally) hotter than the offered load. Using simulation techniques, this result is extended to a more realistic circuit-switched network in discrete time with nonnegligible set-up times and limited retrials.
We describe a practical mathematical formulation and solution of the so-called 'File Assignment Problem' (FAP) for computer disks. Our FAP solution has been implemented in a PL/I program known as the Placement...
详细信息
We describe a practical mathematical formulation and solution of the so-called 'File Assignment Problem' (FAP) for computer disks. Our FAP solution has been implemented in a PL/I program known as the Placement Optimization Program (POP). The algorithm consists of three major components -- two heuristic optimization models and a queueing network model. POP has been used in validation studies to assign files to disks in two IBM MVS complexes. The resulting savings in I/O response times were 22% and 25%, respectively. Throughout the paper we shall emphasize the real-world nature of our approach to the disk FAP, which we believe sets it apart from previous attempts.
Trace-driven simulation is an important aid in performance analysis of computersystems. Capturing address traces for these simulations is a difficult problem for single processors and particularly for multicomputers....
详细信息
Trace-driven simulation is an important aid in performance analysis of computersystems. Capturing address traces for these simulations is a difficult problem for single processors and particularly for multicomputers. A new technique is presented which modifies the executable code to dynamically collect the address trace from the user code and analyzes this trace during the execution of the program. This method helps resolve the I/O and storage problems and facilitates parallel analysis of the address trace. If a trace stored on disk is desired, the generated trace information can also be written to files during execution, with a resultant drop in program execution speed. An initial implementation on the Intel iPSC/2 hypercube multicomputer is detailed, and sample simulation results are presented. The effect of this trace collection method on execution time is illustrated.
This paper describes a statistical approach to diagnosing intermittent performance problems when the relationships among measurement variables are expressed qualitatively as monotone relationships (e.g., paging delays...
详细信息
This paper describes a statistical approach to diagnosing intermittent performance problems when the relationships among measurement variables are expressed qualitatively as monotone relationships (e.g., paging delays increase with the number of logged-on users. We present a nonparametric test for monotonicity (NTM) that evaluates monotone relationships based on FA, the fraction of observation-pairs that agree with the monotone relationship. An interpretation of FA in terms of statistical significance levels is presented, and NTM is compared to least-squares regression. Based on NTM, an algorithm for diagnosing intermittent performance-problems is presented. NTM and our diagnosis algorithm are applied to measurements of four similarly configured IBM 9370 model 60s running IBM's operating-system Virtual Machine System Product (VM SP).
Markov models are widely used for the analysis of availability of computer/communication systems. Realistic models often involve state space cardinalities that are so large that it is impractical to generate the trans...
ISBN:
(纸本)9780897913157
Markov models are widely used for the analysis of availability of computer/communication systems. Realistic models often involve state space cardinalities that are so large that it is impractical to generate the transition rate matrix let alone solve for availability measures. Various state space reduction methods have been developed, particularly for transient analysis. In this paper we present an approximation technique for determining steady state availability. Of particular interest is that the method also provides bounds on the error. Examples are given to illustrate the method.
The authors describe a different approach to the study of concurrent algorithms. They use generalized stochastic Petri Nets (GSPNs) and their extension to colored Petri nets (CGSPNs) as a means for constructing realis...
详细信息
The authors describe a different approach to the study of concurrent algorithms. They use generalized stochastic Petri Nets (GSPNs) and their extension to colored Petri nets (CGSPNs) as a means for constructing realistic models of a concurrent algorithm. A detailed colored GSPN model is derived directly from the code of the algorithm by applying straightforward statement translation rules. The structural properties of the resulting CGSPN model can be used as a first step in proving correctness. The (exact) solution of the CGSPN model can provide insights into the actual performance of the concurrent algorithm. L. Lamport's fast mutual exclusion algorithm is used as an illustration of the proposed technique.
Execution traces can be significantly compressed using their referencing locality. A simple observation leads to a technique capable of compressing execution traces by an order of magnitude; instruction-only traces ar...
ISBN:
(纸本)9780897913157
Execution traces can be significantly compressed using their referencing locality. A simple observation leads to a technique capable of compressing execution traces by an order of magnitude; instruction-only traces are compressed by two orders of magnitude. This technique is unlike previously reported trace compression techniques in that it compresses without loss of information and, therefore, does not affect trace-driven simulation time or accuracy.
This paper examines the performance implications of several data structure and algorithm alternatives for thread management in shared-memory multiprocessors. Both experimental measurements and analytical model project...
详细信息
This paper examines the performance implications of several data structure and algorithm alternatives for thread management in shared-memory multiprocessors. Both experimental measurements and analytical model projections are presented. For applications with fine-grained parallelism, small differences in thread management are shown to have significant performance impact, often posing a tradeoff between throughput and latency. Per-processor data structures can be used to improve throughput, and in some circumstances to avoid locking, improving latency as well. The method used by processors to queue for locks is also shown to affect performance significantly. Normal methods of critical resource waiting can substantially degrade performance with moderate numbers of waiting processors. We present an Ethernet-style backoff algorithm that largely eliminates this effect.
This paper gives a simple, accurate first order asymptotic analysis of the transient and steady state behavior of a network which is closed, not product-form and has multiple classes. One of the two nodes of the netwo...
详细信息
This paper gives a simple, accurate first order asymptotic analysis of the transient and steady state behavior of a network which is closed, not product-form and has multiple classes. One of the two nodes of the network is an infinite server and the discipline in the other node is discriminatory processor-sharing. This work has applications to data networks. For the asymptotic regime of high loading of the processor and high processing capacity, we derive the explicit first order transient behavior of the means of queue lengths. We also give explicit expressions for the steady state mean values and a simple procedure for finding the time constants (eigenvalues) that govern the approach to steady state. Some numerical experiments show that the analysis is quite accurate.
暂无评论