Speedup is commonly used to measure the effectiveness of parallel processing systems. However, existing speedup models do not account for the effect of cache, so this paper analyzes the effect of cache on several speedup models.
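For orientation, the sketch below computes two widely known speedup models, Amdahl's fixed-size speedup and Gustafson's scaled speedup. The abstract does not say which models the paper analyzes or how it models cache, so this is only an illustration of what a "speedup model" is; the function names and parameters are assumptions made for this example.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Fixed-size (Amdahl) speedup: 1 / (f + (1 - f) / p)."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def gustafson_speedup(serial_fraction: float, processors: int) -> float:
    """Scaled (Gustafson) speedup: f + (1 - f) * p."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return serial_fraction + (1.0 - serial_fraction) * processors
```

Neither formula has a term for memory hierarchy effects, which is the kind of gap the abstract points to.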
The Bulk-Synchronous Parallel (BSP) model was proposed by Valiant as a model for general-purpose parallel computation. The objective of the model is to allow the design of parallel programs that can be executed efficiently on a variety of architectures. While many theoretical arguments in support of the BSP model have been presented, the degree to which the model can be efficiently utilized on existing parallel machines remains unclear. To explore this question, we implemented a small library of BSP functions, called the Green BSP library, on several parallel platforms. We also created a number of parallel applications based on this library. Here, we report on the performance of six of these applications on three different parallel platforms. Our preliminary results suggest that the BSP model can be used to develop efficient and portable programs for a range of machines and applications.
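As background to the abstract above, the standard BSP cost model charges each superstep w + h·g + l, where w is the largest amount of local work on any processor, h is the largest number of messages any processor sends or receives, g is the communication gap, and l is the barrier synchronization cost. The sketch below is a minimal illustration of that formula. The `Superstep` type and `superstep_cost` function are hypothetical names for this example and are not part of the Green BSP library.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Superstep:
    """Per-processor measurements for one superstep (hypothetical type)."""
    local_work: Sequence[int]  # computation steps on each processor
    sent: Sequence[int]        # messages sent by each processor
    received: Sequence[int]    # messages received by each processor


def superstep_cost(step: Superstep, g: float, l: float) -> float:
    """Standard BSP cost of one superstep: w + h * g + l."""
    w = max(step.local_work)
    # h-relation: no processor sends or receives more than h messages.
    h = max(max(s, r) for s, r in zip(step.sent, step.received))
    return w + h * g + l
```

The cost of a whole program under this model is the sum of its superstep costs, which is what makes performance predictions portable across machines with different g and l.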
ISBN: 0818674601 (print)
This article assesses the state-of-the-art technology in massively parallel processors (MPPs) and clusters of workstations (COWs) for scalable parallel computing. We evaluate the IBM SP2, the Intel Paragon, the Cray T3D/T3E, and the ASCI TeraFLOPS system recently proposed by Intel. The concept of scalability is characterized along several orthogonal dimensions. Scalable performance attributes are discussed in the context of a newly proposed parallel computer model.
This paper takes a critical look at the following three maxims: 1. Parallel architecture is converging on a design based on commodity microprocessor chips. 2. Wormhole routing is decidedly more efficient than store-and-forward routing. 3. The PRAM is an unrealistically ideal model of computation.
ISBN: 0818674601 (print)
Cost-sensitive applications of parallel computing require system designs that use commodity hardware. Off-the-shelf processing nodes have already been used in parallel systems. This article proposes the use of ATM (Asynchronous Transfer Mode) for interconnection networks. Because ATM was not designed as a communication technology for parallel systems, some adaptation is needed to meet their special requirements. This paper discusses the advantages and drawbacks of this approach and presents ways to adapt ATM technology for use in this environment while preserving some unique features of ATM.
ISBN: 0818674601 (print)
In this paper, we investigate the problem of computing minimal interval and circular arc overlap representations, and give several optimal algorithms. We show that, among other things, given an s×t interval or circular arc overlap representation matrix, a minimal interval overlap representation can be obtained in O(log(st)) time with O(st/log(st)) EREW PRAM processors, or in O(log t/log log t) time with O(st log log t/log t) Common CRCW PRAM processors, or in O(1) time with O(st) BSR processors; a minimal circular arc overlap representation can be obtained in O(st) time.
Virtual Reality (VR) is an exciting yet challenging area. In commercial VR systems especially, one of the main challenges is maintaining relatively constant performance under varying load and at low cost. This paper presents a parallel and distributed solution to this problem in the context of a commercial entertainment VR system. The architecture of the system is introduced, and the distribution strategies and the parallel processing mechanism are discussed.
This paper presents performance and feasibility analyses for important mesh-connected architectures that contain sparse broadcast buses. Two basic architectures, which implement bus intersections differently, are given special attention. The first architecture simply allows row/column bus crossovers. The second architecture has separable buses and implements such intersections with switches for further flexibility. Both architectures cost less than the mesh with multiple broadcast, which has buses spanning each row and each column, yet they retain to a large extent the powerful properties of that mesh. The architecture with separable buses is shown to often perform better than the higher-cost mesh with multiple broadcast. Architectures with separable buses that employ store-and-forward routing often perform better than architectures with contiguous buses that employ high-cost wormhole routing. All these architectures are evaluated with respect to cost and to efficiency in implementing several important operations and application algorithms. The results show that these architectures are very promising alternatives to the mesh with multiple broadcast; in addition, their implementation is cost-effective and feasible.
In this paper, we study a synchronous execution strategy for parallel join computation in multiprocessor systems. Through a further comprehensive investigation of the processor allocation problem and inter-operator parallelization problem, we present a new algorithm for producing an effective parallelization plan for processing multijoins. Besides theoretical analysis, the efficiency and effectiveness of our new algorithm are supported by our experiments.
Nondeterminacy is an important issue in testing and debugging parallel programs. For a message-passing program, inter-process communication is the main cause of nondeterminacy. From an event-based view, the execution of a message-passing parallel program can be modeled as a partially ordered set of events, and the nondeterminacy is reflected in this partially ordered set. In this paper, we present a method to analyze the message-wise nondeterminacy of a message-passing program based on an execution trace that preserves the partial order relations.
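One common way to represent such a partial order over trace events is with vector timestamps; the abstract does not state that the paper uses this technique, so the sketch below is an assumption for illustration only. It tests whether one event happened before another, or whether two events are concurrent, which is the situation in which two messages to the same receiver may arrive in either order. The function names are hypothetical.

```python
from typing import Sequence


def happened_before(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if vector timestamp a strictly precedes b in the partial order."""
    if len(a) != len(b):
        raise ValueError("timestamps must have one entry per process")
    return all(x <= y for x, y in zip(a, b)) and tuple(a) != tuple(b)


def concurrent(a: Sequence[int], b: Sequence[int]) -> bool:
    """True if neither event precedes the other (unordered in the trace)."""
    return (
        tuple(a) != tuple(b)
        and not happened_before(a, b)
        and not happened_before(b, a)
    )
```

For example, with three processes, a send stamped (1, 0, 0) and a send stamped (0, 1, 0) are concurrent, so a receiver that accepts messages from any source could observe them in either order.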