The complexity of integer sorting is investigated on two random access machine (RAM) models. The main results show that (i) on a RAM with addition, subtraction, multiplication, and integer division, n integers in the range [0, 2^(cn)) can be sorted in O(n(1 + log c)) steps; (ii) on a RAM with addition, subtraction, and left and right shifts, n integers in any range can be sorted in linear time; and (iii) on either of the above models, n integers in the range [0, n^c) can be sorted in O(n(1 + log c)) steps, even if all register addresses and capacities are bounded above by n^c.
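For orientation, the sketch below shows the textbook baseline for the range [0, n^c): a least-significant-digit radix sort in base n, which takes c stable counting passes and therefore O(n·c) steps. It is not the algorithm of the abstract, whose bound of O(n(1 + log c)) is stronger; the function name `radix_sort_base_n` and its parameters are illustrative assumptions.

```python
def radix_sort_base_n(values, c):
    """Sort non-negative integers below n**c, where n = len(values).

    Baseline only: c stable counting-sort passes in base n, O(n * c) steps.
    This is NOT the O(n(1 + log c)) method described in the abstract.
    """
    n = len(values)
    if n <= 1:
        return list(values)
    base = n  # assumption: one digit per pass, digits in base n
    out = list(values)
    place = 1
    for _ in range(c):
        # Count how many keys have each digit at the current place.
        counts = [0] * base
        for v in out:
            counts[(v // place) % base] += 1
        # Turn counts into starting offsets (exclusive prefix sums).
        total = 0
        for d in range(base):
            counts[d], total = total, total + counts[d]
        # Stable scatter into the next array.
        nxt = [0] * n
        for v in out:
            d = (v // place) % base
            nxt[counts[d]] = v
            counts[d] += 1
        out = nxt
        place *= base
    return out
```

Each pass touches every key a constant number of times, so the total cost grows linearly in c, which is the gap the abstract's O(n(1 + log c)) result narrows.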
This paper discusses a systematic new methodology for analyzing and predicting the performance of computing systems. The approaches considered address analysis of the computing system viewed in terms of an architectural framework consisting of the applications, the system software, and the underlying hardware. The methodology discussed enables analysis of the interdependent effects of these layers on the behavior of the system. To enable that, the layers and components of the system are described at multiple levels of detail and in multiple modes (analytical and simulation approaches as well as integrated performance measurements). A key capability will be the use of these multilevel and multimodal methods to describe such systems and their subcomponents and to incorporate the descriptions into performance frameworks in a plug-and-play fashion, so that system behavior can be described. In addition, it is envisioned that this technology will not only support the design of complex systems but also be used in the runtime (control) and management cycles of these systems. These new methods will allow the design of individual components, provide capabilities for system analysis and design, and provide a path from understanding component behavior to understanding and predicting system behavior. The term performance engineering technology is used here to describe the technology that supports these capabilities.
We present a simple deterministic parallel algorithm that runs on a CRCW PRAM and sorts n integers of size polynomial in n in time O(log n) using O(n log log n / log n) processors. It is closer to optimality than any previously known deterministic algorithm that solves the stated restricted sorting problem in polylog time.
This paper considers the premise that, in addition to trying to solve the virtual-memory-system performance problem by devising a storage management strategy suitable for the broad spectrum of behavior exhibited by programs, efforts also be made to tailor the behavior of each program to the model underlying the storage management strategy under which the program will have to run. It is observed that a viable approach to program tailoring is offered by restructuring techniques. The application of dynamic off-line techniques to the tailoring problem is discussed, and an algorithm which may be used to fit program behavior to the working set model is described in detail as an example. The performance of this algorithm in dealing with two real-program traces is experimentally evaluated under a variety of conditions and found to be always satisfactory.
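As background for the working set model named above, the sketch below computes the working set size along a page-reference trace: at each reference, the number of distinct pages touched within the last tau references. It illustrates only the measure that a restructuring algorithm would try to reduce, not the algorithm described in the abstract; the function name `working_set_sizes` and the parameter `tau` (window length in references) are assumptions.

```python
from collections import Counter, deque


def working_set_sizes(trace, tau):
    """Return the working set size after each reference in `trace`.

    The working set at time t is taken to be the set of distinct pages
    among the most recent `tau` references (an illustrative definition).
    """
    window = deque()    # the last `tau` references, oldest first
    counts = Counter()  # page -> number of occurrences inside the window
    sizes = []
    for page in trace:
        window.append(page)
        counts[page] += 1
        if len(window) > tau:
            old = window.popleft()
            counts[old] -= 1
            if counts[old] == 0:
                del counts[old]  # page has left the working set
        sizes.append(len(counts))
    return sizes
```

A program whose references are clustered so that related code and data share pages yields smaller values from this function for the same tau, which is the effect that tailoring program behavior to the model aims for.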
Authors:
LIN, A
NASA, LEWIS RES CTR, INST COMPUTAT MECH PROPULT, CLEVELAND, OH 44135
In the present paper we discuss a general approach to solving boundary value problems numerically in a parallel environment. The basic algorithm consists of two steps: the local step, where all P available processors work in parallel, and the global step, where one processor solves a tridiagonal linear system of order P. The main advantages of this approach are twofold. First, the approach is very flexible, especially in the local step, and thus the algorithm can be used with any number of processors and with any SIMD or MIMD machine. Second, the communication complexity is very small, so the approach can be used just as easily with shared memory machines. Several examples using this strategy are discussed.
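The global step amounts to solving one tridiagonal system of order P on a single processor. A minimal sketch of that kind of solve, using the standard Thomas algorithm (forward elimination followed by back substitution), is given below. The abstract does not say which solver is used, so this choice, the function name `solve_tridiagonal`, and the array layout are assumptions; the sketch does no pivoting and so presumes a well-conditioned (for example, diagonally dominant) system.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm in O(P) steps.

    All four lists have length P. lower[i] and upper[i] are the sub- and
    super-diagonal entries of row i; lower[0] and upper[-1] are ignored.
    Assumes no zero pivots arise (no pivoting is performed).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    # Forward elimination: remove the sub-diagonal row by row.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution from the last unknown upward.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x
```

Because the system has only P unknowns, one per processor, this serial solve costs O(P) and stays small relative to the parallel local step when each processor handles many grid points.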
Recent publications suggest the use of decision tables in analysing conventional computer programs. In this paper, it is argued that the classical decision table format is not well suited to represent flow-chart-like *** alternative decision table conventions are investigated. Finally, a basic distinction is made between the use of decision tables in the problem statement phase and in programming.
A trace is a record of the execution of a computer program, showing the sequence of operations executed. Dynamic traces are obtained by executing the program and depend upon the input. Static traces, on the other hand, describe potential sequences of operations extracted statically from the source code. Static traces offer the advantage that they do not depend upon input data. This paper describes a new automatic technique to extract static traces for individual stack and heap objects. The extracted static traces can be used in many ways, such as protocol recovery and validation in particular and program understanding in general. In addition, this article describes four case studies we conducted to explore the efficiency of our algorithm, the size of the resulting static traces, and the influence of the underlying points-to analysis on this size. (c) 2004 Elsevier Inc. All rights reserved.
Measurement of interactive systems should be a continuous process. It need not be done on a twenty-four-hour basis, but it should be done regularly, such as by partial measurements every day or complete measurements every few days. Collection during all hours when large numbers of users are present is most desirable.
The pattern language described here helps select synchronization primitives for parallel programs, avoiding primitives that interact with a given program's locking design.
Megiddo introduced a technique for using a parallel algorithm for one problem to construct an efficient serial algorithm for a second problem. This paper provides a general method that trims a factor of O(log n) time (or more) for many applications of this technique.