The paper presents a parallel natural language processing system implemented on a marker-passing parallel AI computer, the Semantic Network Array Processor (SNAP). The system uses a memory-based parsing approach in which ...
ISBN: (print) 0818649208
This paper presents a divide-and-conquer ray-traced volume rendering algorithm and a parallel image compositing method, along with their implementation and performance on the Connection Machine CM-5 and networked workstations. The algorithm distributes both the data and the computations to individual processing units to achieve fast, high-quality rendering of high-resolution data. The volume data, once distributed, is left intact. The processing nodes perform local ray tracing of their subvolumes concurrently. No communication between processing units is needed during this local ray-tracing process. Each processing unit generates a subimage, and the final image is obtained by compositing the subimages in the proper order, which can be determined a priori. Test results on the CM-5 and a group of networked workstations demonstrate the practicality of our rendering algorithm and compositing method.
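The ordered compositing step can be illustrated with a minimal sketch. It assumes premultiplied (color, alpha) pixels and the standard "over" operator; the abstract does not say which operator or pixel format the authors use, and the names `over`, `composite`, `near`, `mid`, and `far` are hypothetical.

```python
from functools import reduce

def over(front, back):
    """Composite one premultiplied (color, alpha) pixel over another."""
    cf, af = front
    cb, ab = back
    return (cf + (1.0 - af) * cb, af + (1.0 - af) * ab)

def composite(subimages):
    """Combine equal-sized subimages, listed front to back, pixel by pixel."""
    return [reduce(over, pixels) for pixels in zip(*subimages)]

# Three hypothetical 2-pixel grayscale subimages, nearest subvolume first.
near = [(0.5, 0.5), (0.0, 0.0)]
mid = [(0.25, 0.5), (1.0, 1.0)]
far = [(1.0, 1.0), (0.5, 1.0)]

final = composite([near, mid, far])
print(final)
```

Because "over" is associative, neighboring subimages can be merged pairwise in any grouping, provided the front-to-back order is preserved. That property is what lets a compositing step run in parallel once the order is known.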
We present an algorithm to map the nodes of a 3-dimensional grid to the nodes of its optimal hypercube on a one-to-one basis with dilation at most 5.
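For context, the sketch below measures the dilation of the textbook Gray-code embedding, which is not the algorithm of this paper. It assumes the usual definitions: dilation is the largest hypercube distance between the images of two adjacent grid nodes, and the optimal hypercube is the smallest one with at least as many nodes as the grid. The function names are hypothetical.

```python
from itertools import product

def gray(i):
    """Reflected binary Gray code: consecutive integers differ in one bit."""
    return i ^ (i >> 1)

def bits_for(n):
    """Number of bits needed to label n positions along one axis."""
    return (n - 1).bit_length()

def embed(x, y, z, dims):
    """Hypercube label: the per-axis Gray codes placed in separate bit fields."""
    by, bz = bits_for(dims[1]), bits_for(dims[2])
    return (gray(x) << (by + bz)) | (gray(y) << bz) | gray(z)

def dilation(dims):
    """Largest Hamming distance between the images of adjacent grid nodes."""
    worst = 0
    for x, y, z in product(*(range(d) for d in dims)):
        here = embed(x, y, z, dims)
        for dx, dy, dz in ((1, 0, 0), (0, 1, 0), (0, 0, 1)):
            nx, ny, nz = x + dx, y + dy, z + dz
            if nx < dims[0] and ny < dims[1] and nz < dims[2]:
                there = embed(nx, ny, nz, dims)
                worst = max(worst, bin(here ^ there).count("1"))
    return worst

dims = (3, 3, 3)
used_dimension = sum(bits_for(d) for d in dims)
optimal_dimension = (dims[0] * dims[1] * dims[2] - 1).bit_length()
print(dilation(dims), used_dimension, optimal_dimension)
```

For a 3 x 3 x 3 grid the Gray-code embedding keeps every grid edge on a single hypercube edge, but it needs a 6-dimensional hypercube (64 nodes), whereas the optimal hypercube for 27 nodes has 5 dimensions (32 nodes). Fitting into the smaller cube is what forces some grid edges to stretch, which is the trade-off a bounded-dilation result addresses.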
An SIMD implementation of a method for approximating the stationary distribution vector of a Markov chain is presented. A key feature of the implementation is the simultaneous computation of several matrix inverses. Computational results from a MasPar MP-1 system are discussed.
Data-parallel programs on MIMD machines are often structured as alternating phases of local computation and global communication, such as reduction, synchronization, and broadcast. This paper compares the performance of data-parallel operations on the CM-5 and Intel Touchstone Delta multiprocessors and models these primitives.
We investigate synchronization activities in applications executing on distributed-memory MIMD architectures. Three applications are used to quantify the performance impact of synchronization as the number of processors is increased. We also investigate the performance improvement possible when synchronization is supported in hardware. The results show that significant performance improvement can be achieved. The hardware support should include barrier synchronization, operate-and-broadcast, and operations over subsets of processors.
Debugging distributed programs is much more difficult than debugging sequential programs. One of the reasons is the communication among programs (processes), which may happen concurrently and nondeterministically. Being able to analyze such communication events is therefore an essential task for any distributed program debugger. The paper describes the design and preliminary implementation of a layered distributed program debugger. The debugger helps a user locate bugs, analyze a distributed program, and fix bugs.
The primary purpose of Cray Research computer systems is the timely solution of complex problems in science and engineering. A few examples illustrate that the CRAY C90 is currently the world's most powerful tool ...
This report presents our experiences parallelizing and implementing search problems. We parallelize the sequential A* search by combining it with bidirectional search; the result is called parallel bidirectional A* search (PBiA*S). To assess the effectiveness of PBiA*S, we implement two search problems, the Eight Puzzle and the Tower of Hanoi, on a Symmetry multiprocessor. Execution results demonstrate that PBiA*S can be an effective parallel search method, as it gives a two- to 12-fold speedup over the unidirectional A* search for the two search problems.
The dramatic improvements in the processing rates of parallel computers are turning many compute-bound jobs into IO-bound jobs. Parallel file systems have been proposed to better match IO throughput to processing powe...