The HP MAP 3.0 program was a large multidivisional effort with project teams spread over different geographic locations and working under different organizations. To manage the integration of the hardware and software...
详细信息
The HP MAP 3.0 program was a large multidivisional effort with project teams spread over different geographic locations and working under different organizations. To manage the integration of the hardware and software components from these different project teams, a generic integration lifecycle was developed for the HP MAP 3.0 product.
The Array Management System (AMS) is an integrated set of array management tools designed to increase the productivity of technical programmers engaged in intensive matrix computational applications. These include ana...
详细信息
The Array Management System (AMS) is an integrated set of array management tools designed to increase the productivity of technical programmers engaged in intensive matrix computational applications. These include analog circuit simulator, statistical analysis, dense or sparse equation solving, simulation, and in particular, the finite element program development. AMS is composed of a set of easy-to-use in-core and out-of-core data management subroutines written in FORTRAN 77. The in-core array management subroutines of AMS allow dynamic storage allocation to be accomplished with integer, real, and complex data with a minimum of programming effort. The out-of-core array management subroutines of AMS support simple operations to allow array transfer between in-core and out-of-core systems and allow different programs to access the same data. The out-of-core data management provides for a direct access database file to speed up the input/output operations. Multiple databases are allowed to be accessed by a program; this provides an easy way to share data and restart. This integrated database environment is suitable to be the kernel of a software project with several programmers and data communications among them.
The paper describes a set of programming facilities for multiprocessor systems, directed toward a transputer architecture with asynchronous connections. The computational model is described, and the basic rules for de...
详细信息
The paper describes a set of programming facilities for multiprocessor systems, directed toward a transputer architecture with asynchronous connections. The computational model is described, and the basic rules for determining objects, data and conditions are introduced. Examples are given to illustrate the development of parallel programs and the processing of dynamic data structures of varying complexity.
The performance of two basic external sorting algorithms, distributive sorting and mergesort, is compared in an environment where even the main memory usage involves a cost. Performance is measured by total execution ...
详细信息
The performance of two basic external sorting algorithms, distributive sorting and mergesort, is compared in an environment where even the main memory usage involves a cost. Performance is measured by total execution time and main memory space-time integral. For optimal behavior, both algorithms prefer a small block size and a similar order of external sort. Their memory requirement is similar during the external phase, but during the internal phase distributive sorting requires a larger working space. For small records, the optimal behavior of distributive sorting is obtained with less external passes and its space-time integral is smaller. For large records, the number of passes at the optimal point of mergesort is similar or even less than that of distributive sorting, resulting in a smaller space-time integral. In all cases, the performance of distributive sorting degrades more mildly around the local optima.
The Trie Hashing (TH), defined by Litwin, is one of the fastest access methods for dynamic and ordered files. The hashing function is defined in terms of a trie, which is basically a binary tree where a character stri...
详细信息
The Trie Hashing (TH), defined by Litwin, is one of the fastest access methods for dynamic and ordered files. The hashing function is defined in terms of a trie, which is basically a binary tree where a character string is associated implicitly with each node. This string is compared with a prefix of the given key in the search process, and depending on the result either the left or the right child is chosen as the next node to visit. The leaf nodes point to buckets which contain the records. The buckets are on a disk, whereas the trie itself is in the core memory. In this paper we consider concurrent execution of the TH operations. In addition to the usual search, insertion and deletion operations, we also include range queries among the concurrent operations. Our algorithm locks only leaf nodes and at most two nodes need to be locked simultaneously by any operation regardless of the number of buckets being accessed. The modification required in the basic data structure in order to accommodate concurrent operations is very minor.
External sorting is usually accomplished by first creating sorted runs, then merging the runs. In the merge phase, writing and calculating can be overlapped by reading if two input buffers are used for each sorted run...
详细信息
External sorting is usually accomplished by first creating sorted runs, then merging the runs. In the merge phase, writing and calculating can be overlapped by reading if two input buffers are used for each sorted run. If the memory is very large, the input buffers will be large and using two input buffers per sorted run will be more efficient than using only one input buffer per run and risking reduced overlap of reading and writing. In many cases, merging time can be cut in half. We derive a formula for estimating the total time for merging for a given memory size, file size, number of merging passes and for a given disk drive. We present an extreme example where in spite of having two buffers per run, significant non-overlap occurs. However, in realistic problems, we show that making one merge pass with two input buffers per run is near optimal. This contradicts earlier results on merging which do not take large memory into account.
SDEF, a systolic array programming system, is presented. It is intended to provide (1) systolic algorithm researchers/ developers with an executable notation, and (2) the software systems community with a target notat...
详细信息
SDEF, a systolic array programming system, is presented. It is intended to provide (1) systolic algorithm researchers/ developers with an executable notation, and (2) the software systems community with a target notation for the development of higher-level systolic software tools. The design issues associated with such a programming system are identified. A spacetime representation of systolic computations is described briefly in order to motivate SDEF's program notation. The programming system treats a special class of systolic computations, called atomic systolic computations, any one of which can be specified as a set of properties: the computation's (1) index set (S), (2) domain dependencies (D), (3) spacetime embedding (E), and nodal function (F). These properties are defined and illustrated. SDEF's user interface is presented. It comprises an editor, a translator, a domain type database, and a systolic array simulator used to test SDEF programs. The system currently runs on a Sun 3/50 operating under Unix and Xwindows. Key design choices affecting this implementation are described. SDEF is designed for portability. The problem of porting it to a Transputer array is discussed.
A new approach to accelerating parallel sorting processes is introduced in this paper. This approach involves the design of a new type of memory chip with sorting functions. This type of sorting memory chip is feasibl...
详细信息
A new approach to accelerating parallel sorting processes is introduced in this paper. This approach involves the design of a new type of memory chip with sorting functions. This type of sorting memory chip is feasible with today's VLSI techniques. A memory module organizing several sorting memory chips associated with additional ECL or TTL control logic circuits is also presented. Using the sorting memory modules in a shared memory parallel processor machine, parallel sorting algorithms such as the column sort method can reduce the row access time significantly and avoid data collisions in the interconnection network. Experimental simulation results on the practical speedup achieved and the memory utilization for the proposed approach are described.
The problem of merging two large ordered sets A and B is considered. A detailed description of the development of an efficient parallel algorithm is given, starting with a version of the algorithm that demonstrates th...
详细信息
The problem of merging two large ordered sets A and B is considered. A detailed description of the development of an efficient parallel algorithm is given, starting with a version of the algorithm that demonstrates the idea of partitioning the merge problem into independent subproblems and ending with an optimal parallel algorithm for the EREW P- RAM model of computation.
A new measure of presortedness is presented which we call Par. We prove that it is distinct from other common measures of presortedness and we design sorting algorithms that sort a sequence X in O(|X| log Par(X)) comp...
详细信息
A new measure of presortedness is presented which we call Par. We prove that it is distinct from other common measures of presortedness and we design sorting algorithms that sort a sequence X in O(|X| log Par(X)) comparisons. Moreover, we prove that this is Par-optimal in a comparison-based model of computation.
暂无评论