A software system is described for the compression of a large look-up table to a smaller one, consistent with a worst-case error predefined by the user. The tables and a suitable source code for accessing them are automatically generated, with very little user intervention. The techniques of linear interpolation and the partitioning of one table into several are shown to be particularly attractive for reducing the table size, especially when the considerable effort of manual generation to a known accuracy is removed. The use of linear interpolation incurs only a small speed penalty when executed on a digital signal processor and the large reductions in table size thus achieved can make the method a faster and more reliable alternative to either the exact or approximate evaluation of many functions.
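The idea of shrinking a table until interpolation just meets an error budget can be sketched as follows. This is a minimal illustration, not the system described in the abstract: the function names (`build_table`, `lookup`, `smallest_table`) are hypothetical, the table is uniformly spaced, and the worst-case error is only estimated by dense sampling rather than guaranteed.

```python
import math

def build_table(f, lo, hi, n):
    """Sample f at n + 1 evenly spaced points on [lo, hi]."""
    step = (hi - lo) / n
    return [f(lo + i * step) for i in range(n + 1)]

def lookup(table, lo, hi, x):
    """Linearly interpolate between the two table entries that bracket x."""
    n = len(table) - 1
    pos = (x - lo) / (hi - lo) * n
    i = min(int(pos), n - 1)          # clamp so that i + 1 stays in range
    frac = pos - i
    return table[i] + frac * (table[i + 1] - table[i])

def smallest_table(f, lo, hi, max_err, probes=10_000):
    """Double the number of intervals until the sampled error fits max_err.

    Assumption: the error is checked at `probes` sample points only, so the
    result is an estimate of the worst case, not a proven bound.
    """
    n = 1
    while True:
        table = build_table(f, lo, hi, n)
        worst = 0.0
        for k in range(probes + 1):
            x = lo + k * (hi - lo) / probes
            worst = max(worst, abs(lookup(table, lo, hi, x) - f(x)))
        if worst <= max_err:
            return table, worst
        n *= 2

# Example: a quarter-wave sine table accurate to about 1e-4.
table, worst = smallest_table(math.sin, 0.0, math.pi / 2, 1e-4)
```

Doubling the interval count keeps the search short; a finer search, or the partitioning of one table into several that the abstract mentions, could shrink the table further.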
This paper considers the programming of passive (cold and warm standbys) and active replicated systems in Ada 95. We show that it is relatively easy to develop systems which act as standbys using the facilities provided by the language and the Distributed Systems Annex. Arguably, active replication in Ada 95 can be supported in a manner which is transparent to the application. However, this is implementation-dependent, requires a complex distributed consensus algorithm (or a carefully chosen subset of the language to be used) and has little flexibility. We therefore consider two extensions to the Distributed Systems Annex to help give the application programmer more control. The first is via a new categorization pragma which specifies that an RCI package can be replicated in more than one partition. The second is through the introduction of a coordinated type which has a single primitive operation. Objects which are created from extensions to coordinated types can be freely replicated across the distributed system. When the primitive operation is called, the call is posted to all sites where a replica resides, effectively providing a broadcast (multicast) facility. We also consider extensions to the partition communication subsystem which implement these new features.
SDEF, a systolic array programming system, is presented. It is intended to provide (1) systolic algorithm researchers/developers with an executable notation, and (2) the software systems community with a target notation for the development of higher-level systolic software tools. The design issues associated with such a programming system are identified. A spacetime representation of systolic computations is described briefly in order to motivate SDEF's program notation. The programming system treats a special class of systolic computations, called atomic systolic computations, any one of which can be specified as a set of properties: the computation's (1) index set (S), (2) domain dependencies (D), (3) spacetime embedding (E), and (4) nodal function (F). These properties are defined and illustrated. SDEF's user interface is presented. It comprises an editor, a translator, a domain type database, and a systolic array simulator used to test SDEF programs. The system currently runs on a Sun 3/50 operating under Unix and X Windows. Key design choices affecting this implementation are described. SDEF is designed for portability. The problem of porting it to a Transputer array is discussed.
For each value to be sorted in the process of the parallel bubble sort computation, we evaluate the exact time necessary to route the value to its final position. Using this evaluation we design some efficient parallel sorting algorithms that can be implemented on a mesh-connected processor array and analyze their time complexities. Our algorithms are combinations of parallel bubble sorts in different directions, and their control hardware is very simple. Although the time complexities of our algorithms are $O(N^{1/2} \log N)$, they are as fast as the implementations of Batcher's bitonic sort and odd-even merge sort on the mesh-connected processor array for practical values of $N$, $1 \le N \le 128^2$. We also show a parallel sort that is very fast in the average case for practical values of $N$, $1 \le N \le 128^2$.
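For orientation, the one-dimensional building block, a parallel bubble sort (odd-even transposition sort) on a linear array, can be simulated sequentially as below. This is only a sketch of that basic step under the assumption of one value per processor; the mesh algorithms of the abstract, which combine such sorts in different directions, are not reproduced here, and the function name is hypothetical.

```python
def odd_even_transposition_sort(values):
    """Simulate a parallel bubble sort on a linear array of processors.

    Each phase compares disjoint neighbouring pairs, so on real hardware
    all compare-exchanges of one phase could run at the same time.
    n phases are known to be sufficient for n values.
    """
    a = list(values)
    n = len(a)
    for phase in range(n):
        start = phase % 2              # even phases: (0,1),(2,3),...; odd: (1,2),(3,4),...
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

Because each phase touches every processor at most once, the number of phases (not the number of comparisons) is the relevant time measure on a processor array.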
Discussed in this paper are the underlying principles of a programmer productivity measuring system. The key measures (or metrics) are people and lines of code. Definitions of these metrics are refined and qualified according to the conditions under which they are used. Presented also is a database design for retaining and retrieving these metrics under a wide variety of applications and other circumstances. Depending on definitions, applications, and other circumstances, productivity measurements may differ widely.
A new approach to accelerating parallel sorting processes is introduced in this paper. This approach involves the design of a new type of memory chip with sorting functions. This type of sorting memory chip is feasible with today's VLSI techniques. A memory module organizing several sorting memory chips associated with additional ECL or TTL control logic circuits is also presented. Using the sorting memory modules in a shared memory parallel processor machine, parallel sorting algorithms such as the column sort method can reduce the row access time significantly and avoid data collisions in the interconnection network. Experimental simulation results on the practical speedup achieved and the memory utilization for the proposed approach are described.
Author:
KUNDE, M
UNIV KIEL, INST INFORMAT & PRAKT MATH, D-2300 KIEL 1, FED REP GER
Lower bounds for sorting on mesh-connected arrays of processors are presented. For sorting $N = n_1 n_2 \cdots n_r$ elements on an $n_1 \times n_2 \times \cdots \times n_r$ array, $2(n_1 + \cdots + n_{r-1}) + n_r$ data interchange steps are needed asymptotically. For two dimensions these bounds are asymptotically best possible provided that $n_1$ and $n_2$ are powers of 2. In this case the generalized $s^2$-way merge sort of Thompson and Kung turns out to be asymptotically optimal. The minimal asymptotic bound of $2\sqrt{2N}$ interchange steps can be obtained only by sorting algorithms suitable for $\sqrt{N/2} \times \sqrt{2N}$ meshes. For $r \ge 3$ dimensions an analysis of aspect ratios also demonstrates that there exist mesh-connected architectures which are better suited for sorting than simple $r$-dimensional cubes.
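As a worked check of the two-dimensional figures quoted in this abstract, the bound can be minimized over the mesh shape. This is a sketch that treats the side lengths as real numbers and ignores the power-of-2 requirement:

```latex
\begin{align*}
  T(n_1) &= 2 n_1 + n_2 = 2 n_1 + \frac{N}{n_1}
      && \text{since } N = n_1 n_2, \\
  \frac{dT}{dn_1} &= 2 - \frac{N}{n_1^{2}} = 0
      \;\Longrightarrow\; n_1 = \sqrt{N/2}, \quad n_2 = \frac{N}{n_1} = \sqrt{2N}, \\
  T_{\min} &= 2\sqrt{N/2} + \sqrt{2N} = 2\sqrt{2N} \approx 2.83\,\sqrt{N}, \\
  T_{\text{square}} &= 2\sqrt{N} + \sqrt{N} = 3\sqrt{N}
      && \text{for } n_1 = n_2 = \sqrt{N}.
\end{align*}
```

So a mesh with aspect ratio 1:2 gives a smaller asymptotic bound than a square mesh, which matches the shape named in the abstract.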
In this paper a review of the approaches to modelling adopted by programming languages currently available for discrete-system simulation leads to a close look at some of the advantages of languages which employ an "activity scan." These advantages include faster run-times for highly interrelated systems, simpler event routines, and increased security for the model. Decision tables provide a clear and concise format for specifying a complex set of conditions and the various consequent courses of action. They are therefore ideal for describing the conditions for interaction between component parts of a model as specified in the user-defined event routines. A preprocessor to make decision tables computer-readable would greatly enhance the process of modelling and would allow a considerable extension of the range of conditioned statements to be used in the condition stubs of the table. A decision-table facility would form a useful extension to the many event-oriented simulation programming languages.
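To illustrate what a machine-readable decision table might look like, here is a minimal sketch. The condition names, rules, and actions are invented for a single-server queue and do not come from the paper; `None` marks a "don't care" entry.

```python
# Condition stubs: each maps a (hypothetical) model state to True/False.
CONDITIONS = {
    "machine_idle": lambda s: s["machine_idle"],
    "queue_nonempty": lambda s: s["queue"] > 0,
}

# Rules: one expected outcome per condition, in the order above, plus an action.
RULES = [
    ((True, True), "start_service"),
    ((True, False), "wait"),
    ((False, None), "continue_service"),   # None = condition is irrelevant
]

def decide(state):
    """Return the action of the first rule whose entries match the state."""
    outcome = tuple(test(state) for test in CONDITIONS.values())
    for pattern, action in RULES:
        if all(p is None or p == o for p, o in zip(pattern, outcome)):
            return action
    raise LookupError("no rule matches this state")
```

An activity scan could call `decide` on every pass over the model state; keeping the rules as data rather than nested `if` statements is what makes the table easy to inspect and extend.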
The traditional approach to the implementation of process administration in multiprogrammed systems is to make it part of the run-time system or ‘kernel’. This implies, first, that the implementation is written in assembler language and, secondly, that the process administration will be very inflexible. This article outlines a high-level language, PoMP, a Pascal extension. Following the trend set by Concurrent Pascal and Modula towards integrating ever-increasing parts of the ‘kernel’ in the individual application program, PoMP provides language constructs for implementing process administration. It is shown that the multiprogramming language constructs of a number of languages may be ‘imitated’ in PoMP.
BOTTOM-UP HEAPSORT is a variant of HEAPSORT which beats on average even the clever variants of QUICKSORT, if n is not very small. Up to now, the worst-case complexity of BOTTOM-UP HEAPSORT could be bounded only by 1.5n log n. McDiarmid and Reed (1989) have presented a variant of BOTTOM-UP HEAPSORT which needs extra storage for n bits. The worst-case number of comparisons of this (almost internal) sorting algorithm is bounded by n log n + 1.1n. It is discussed how many comparisons can be saved on average.
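For readers unfamiliar with the basic technique, a minimal sketch of plain bottom-up heapsort follows. It is not the McDiarmid and Reed variant with n extra bits, and the function names are illustrative. The sift first walks down along the larger children to a leaf, then climbs back up to find where the displaced element belongs.

```python
def leaf_search(a, i, end):
    """Follow the larger child from index i down to a leaf of a[:end]."""
    j = i
    while 2 * j + 2 < end:                 # node j has two children
        if a[2 * j + 2] > a[2 * j + 1]:
            j = 2 * j + 2
        else:
            j = 2 * j + 1
    if 2 * j + 1 < end:                    # last level may have a single child
        j = 2 * j + 1
    return j

def bottom_up_sift(a, i, end):
    """Restore the max-heap property below index i within a[:end]."""
    x = a[i]
    j = leaf_search(a, i, end)
    while a[j] < x:                        # climb until x fits; stops at i at the latest
        j = (j - 1) // 2
    while j > i:                           # place x, shifting path elements up one level
        a[j], x = x, a[j]
        j = (j - 1) // 2
    a[i] = x

def bottom_up_heapsort(values):
    """Return a sorted copy of values using bottom-up heapsort."""
    a = list(values)
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):    # build the max-heap
        bottom_up_sift(a, i, n)
    for end in range(n - 1, 0, -1):        # move the maximum to the end, shrink the heap
        a[0], a[end] = a[end], a[0]
        bottom_up_sift(a, 0, end)
    return a
```

The downward walk costs one comparison per level instead of two, and the element moved to the root usually belongs near the bottom, so the upward climb tends to be short; this is the intuition behind the average-case savings.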