This paper considers the programming of passive (cold and warm standbys) and active replicated systems in Ada 95. We show that it is relatively easy to develop systems which act as standbys using the facilities provid...
详细信息
This paper considers the programming of passive (cold and warm standbys) and active replicated systems in Ada 95. We show that it is relatively easy to develop systems which act as standbys using the facilities provided by the language and the Distributed systems Annex. Arguably, active replication in Ada 95 can be supported in a manner which is transparent to the application. However, this is implementation-dependent, requires a complex distributed consensus algorithm (or a carefully chosen subset of the language to be used) and has little flexibility. We therefore consider two extensions to the Distributed systems Annex to help give the application programmer more control. The first is a via a new categorization pragma which specifies that a RCI package can be replicated in more than one partition. The second is through the introduction of a coordinated type which has a single primitive operation. Objects which are created from extensions to coordinated types can be freely replicated across the distributed system. When the primitive operation is called, the call is posted to all sites where a replica resides, effectively providing a broadcast (multicast) facility. We also consider extensions to the partition communication subsystem which implement these new features.
For each value to be sorted in the process of the parallel bubble sort computation, we evaluate the exact time necessary to route the value to its final position. Using this evaluation we design some efficient paralle...
详细信息
For each value to be sorted in the process of the parallel bubble sort computation, we evaluate the exact time necessary to route the value to its final position. Using this evaluation we design some efficient parallel sorting algorithms that can be implemented on a mesh-connected processor array and analyze their time complexities. Our algorithms are some combinations of the parallel bubble sorts in different directions, and their control hardware is very simple. Although the time complexities of our algorithms are O(N 1 2 log N) , they are as fast as the implementations of Batcher's bitonic sort and odd-even merge sort on the mesh-connected processor array for practical values of N , 1 ≤ N ≤ 128 2 . We also show a parallel sort that is very fast in the average case for practical values of N , 1 ≤ N ≤ 128 2 .
SDEF, a systolic array programming system, is presented. It is intended to provide (1) systolic algorithm researchers/ developers with an executable notation, and (2) the software systems community with a target notat...
详细信息
SDEF, a systolic array programming system, is presented. It is intended to provide (1) systolic algorithm researchers/ developers with an executable notation, and (2) the software systems community with a target notation for the development of higher-level systolic software tools. The design issues associated with such a programming system are identified. A spacetime representation of systolic computations is described briefly in order to motivate SDEF's program notation. The programming system treats a special class of systolic computations, called atomic systolic computations, any one of which can be specified as a set of properties: the computation's (1) index set (S), (2) domain dependencies (D), (3) spacetime embedding (E), and nodal function (F). These properties are defined and illustrated. SDEF's user interface is presented. It comprises an editor, a translator, a domain type database, and a systolic array simulator used to test SDEF programs. The system currently runs on a Sun 3/50 operating under Unix and Xwindows. Key design choices affecting this implementation are described. SDEF is designed for portability. The problem of porting it to a Transputer array is discussed.
Discussed in this paper are the underlying principles of a programmer productivity measuring system. The key measures (or metrics) are people and lines of code. Definitions of these metrics are refined and qualified, ...
详细信息
Discussed in this paper are the underlying principles of a programmer productivity measuring system. The key measures (or metrics) are people and lines of code. Definitions of these metrics are refined and qualified, according to the conditions under which they are used. Presented also is a data base design for retaining and retrieving these metrics under a wide variety of applications and other circumstances. Depending on definitions, applications, and other circumstances, productivity measurements may differ widely. On the other hand, after suitable productivity metrics have been defined, consistency of application of the same metrics yields comparable results from project to project.
We prove that (1+√6/2)n ≈ 2.22n is a time lower bound independent of indexing schemes for sorting n2 items on an n × n mesh-connected processor array. We distinguish between indexing schemes by showing that the...
详细信息
We prove that (1+√6/2)n ≈ 2.22n is a time lower bound independent of indexing schemes for sorting n2 items on an n × n mesh-connected processor array. We distinguish between indexing schemes by showing that there exists an indexing scheme which is provably worse than the snake-like row-major indexing for sorting. We also derive lower bounds for various indexing schemes. All these results are obtained by using the chain argument which we provide in this paper.
A new approach to accelerating parallel sorting processes is introduced in this paper. This approach involves the design of a new type of memory chip with sorting functions. This type of sorting memory chip is feasibl...
详细信息
A new approach to accelerating parallel sorting processes is introduced in this paper. This approach involves the design of a new type of memory chip with sorting functions. This type of sorting memory chip is feasible with today's VLSI techniques. A memory module organizing several sorting memory chips associated with additional ECL or TTL control logic circuits is also presented. Using the sorting memory modules in a shared memory parallel processor machine, parallel sorting algorithms such as the column sort method can reduce the row access time significantly and avoid data collisions in the interconnection network. Experimental simulation results on the practical speedup achieved and the memory utilization for the proposed approach are described.
We consider the problem of deterministic sorting of integers on a parallel RAM (PRAM). The best previous result (T. Hagerup, 1987, Inform. and Comput.75, 39–51) states that n integers of size polynomial in n can be s...
详细信息
We consider the problem of deterministic sorting of integers on a parallel RAM (PRAM). The best previous result (T. Hagerup, 1987, Inform. and Comput.75, 39–51) states that n integers of size polynomial in n can be sorted in time O(log n) on a Priority CRCW PRAM with O(n log log nlog n) processors. We prove that n integers drawn from a set {0, …, m−1} can be sorted on an Arbitrary CRCW PRAM in time O(log nlog log n + log log m) with a time-processor product of O(n log log m). In particular, if m = n(log n)O(1), the time and number of processors used are O(log nlog log n) and O(n(log log n)2log n), respectively. This improves the previous result in several respects: The new algorithm is faster, it works on a weaker PRAM model, and it is closer to optimality for input numbers of superpolynomial size. If log log m = O(log nlog log n), the new algorithm is optimally fast, for any polynomial number of processors, and if log log m = (1 + Ω(1)) log log n and log log m = 0(log n), it has optimal speedup relative to the fastest known sequential algorithm. The space needed is O(nmε), for arbitrary but fixed ε > 0. The sorting algorithm derives its speed from a fast solution to a special list ranking problem of possible independent interest, the monotonic list ranking problem. In monotonic list ranking, each list element has an associated key, and the keys are known to increase monotonically along the list. We show that monotonic list ranking problems of size n can be solved optimally in time O(log nlog log n). We also discuss and attempt to solve some of the problems arising in the precise description and implementation of parallel recursive algorithms. As part of this effort, we introduce a new PRAM variant, the allocated PRAM.
The traditional approach to the implementation of process administration in multiprogrammed systems is to make it part of the run-time system or ‘kernel’. This implies first, that the implementation is written in as...
详细信息
The traditional approach to the implementation of process administration in multiprogrammed systems is to make it part of the run-time system or ‘kernel’. This implies first, that the implementation is written in assembler language, and secondly that the process administration will be very inflexible. This article outlines a high level language PoMP, a Pascal extension. Following the trend set by Concurrent Pascal and Modula towards integrating ever increasing parts of the ‘kernel’ in the individual application program, PoMP provides language constructs for implementing process administration. It is shown that the multiprogramming language constructs of a number of languages may be ‘imitated’ in PoMP.
In this paper a review of the approaches to modelling adopted by programming languages currently available for discrete-system simulation leads to a close look at some of the advantages of languages which employ an &q...
详细信息
In this paper a review of the approaches to modelling adopted by programming languages currently available for discrete-system simulation leads to a close look at some of the advantages of languages which employ an "activity scan." These advantages include faster run-times for highly interrelated systems, simpler event routines, and increased security for the model. Decision tables provide a clear and concise format for specifying a complex set of conditions and the various consequent courses of action. They are therefore ideal for describing the conditions for interaction between component parts of a model as specified in the user-defined event routines. A preprocessor to make decision tables computer- readable would greatly enhance the process of modeZ ling and would allow a considerable extension of the range of conditioned statements to be used in the condition stubs of the table. A decision-table facility would form a useful extension to the many event-oriented simulation programming languages.
BOTTOM-UP HEAPSORT is a variant of HEAPSORT which beats on average even the clever variants of QUICKSORT, if n is not very small. Up to now, the worst case complexity of BOTTOM-UP HEAPSORT has been able to be estimate...
详细信息
BOTTOM-UP HEAPSORT is a variant of HEAPSORT which beats on average even the clever variants of QUICKSORT, if n is not very small. Up to now, the worst case complexity of BOTTOM-UP HEAPSORT has been able to be estimated only by 1.5 n log n . McDiarmid and Reed (1989) have presented a variant of BOTTOM-UP HEAPSORT which needs extra storage for n bits. The worst case number of comparisons of this (almost internal) sorting algorithm is estimated by n log n + 1.1 n . It is discussed how many comparisons can be saved on average.
暂无评论