ISBN: 0818674601 (print)
The author parallelized a tertiary structure search algorithm on a distributed-memory parallel computer, the Cenju-3, using the standard Message Passing Interface (MPI) library. The parallelization follows a master-workers model. Tertiary structure search is a computationally intensive task. When the test sequences are distributed across all processors, a single key sequence can be tested independently on each processor, so a high degree of parallelism is anticipated. Because the database resides on the master processor, worker processors must acquire test sequences from it, and a worker sometimes has to wait while other workers obtain theirs. Because of this waiting, the total performance saturates once the number of processors exceeds a certain level. The author analyzed the total performance using an M/M/1 queueing model and Jackson's model; these models clearly explain the actual turnaround times.
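To make the saturation argument concrete, the following is a minimal sketch of the standard steady-state M/M/1 formulas applied to a master that serves sequence requests. It is not the author's model: the function names, the open-queue approximation (each worker issuing one request per `compute_time`), and all parameter values are assumptions made for illustration.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state M/M/1 metrics; valid only when arrival_rate < service_rate."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival_rate must be below service_rate")
    rho = arrival_rate / service_rate                 # server utilization
    mean_response = 1.0 / (service_rate - arrival_rate)  # W: queueing wait + service
    mean_wait = rho / (service_rate - arrival_rate)      # Wq: queueing wait only
    mean_in_system = rho / (1.0 - rho)                   # L: mean requests in system
    return {
        "utilization": rho,
        "mean_response": mean_response,
        "mean_wait": mean_wait,
        "mean_in_system": mean_in_system,
    }


def master_saturation_point(service_rate, compute_time):
    """Worker count at which the aggregate request rate reaches the master's capacity.

    Assumption (hypothetical): each worker asks for a new test sequence once per
    compute_time, so p workers generate requests at rate p / compute_time.
    """
    return service_rate * compute_time


if __name__ == "__main__":
    # Hypothetical numbers: master serves 4 requests/s, requests arrive at 2/s.
    print(mm1_metrics(2.0, 4.0))
    # With 10 s of computation per sequence, the master saturates near 40 workers.
    print(master_saturation_point(4.0, 10.0))
```

Under these assumptions, the mean wait at the master grows without bound as the request rate approaches the service rate, which is one way to read the saturation described in the abstract.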
Various tridiagonal solvers have been proposed in recent years for different parallel platforms. In this paper, the performance of three tridiagonal solvers, namely the parallel partition LU algorithm, the parallel diagonal dominant algorithm, and the reduced diagonal dominant algorithm, is studied. These algorithms are designed for distributed-memory machines and are tested on an Intel Paragon and an IBM SP2. Measured results are reported in terms of execution time and speedup, and they match the analytical results closely. In addition to implementation issues, performance considerations such as problem size and models of speedup are discussed.
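For orientation, here is a minimal sketch of the sequential Thomas algorithm, the textbook serial method for tridiagonal systems. It is not one of the three parallel algorithms studied in the paper; it is shown only as the kind of serial baseline such solvers are usually compared against. The function name and argument layout are assumptions, and the sketch assumes no pivoting is needed (for example, a diagonally dominant matrix).

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    All four lists have length n. lower[0] and upper[n-1] are ignored.
    Assumes the elimination never meets a zero pivot (e.g. the matrix
    is diagonally dominant).
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side

    # Forward elimination.
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom

    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


if __name__ == "__main__":
    # 3x3 system with diagonal 2 and off-diagonals 1.
    print(thomas_solve([0.0, 1.0, 1.0], [2.0, 2.0, 2.0], [1.0, 1.0, 0.0], [4.0, 8.0, 8.0]))
```

The forward sweep carries a dependency from row `i - 1` to row `i`, which is why parallel variants partition the system rather than parallelizing this loop directly.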
ISBN: 0818620528 (print)
In this paper, we present a mixed MIMD/SIMD execution model for a reconfigurable computer. The model is adapted to the use of a specialized associative coprocessor embedded in the host machine. A main characteristic of the model is that it uses four types of processes (decoding, calculus, coprocessor communication, and transaction manager) and that, in principle, one process of each type is allowed on each processor. Time intervals are allocated to operations on partitions of the set of processors. Transfers are usually limited to identifiers, logical addresses, and locks. Simulations show a high level of processor occupation, so the machine yield may be very high and operations should be very fast.
ISBN: 3540241280 (print)
In this paper, the Event Chain Clock synchronization algorithm is presented. The algorithm maintains a global physical clock that reflects both the partial order and the elapsed time of all events that have occurred. It works by repeating a few basic operations, has good convergence, and is suitable for performance debugging of parallel programs.
ISBN: 3540241280 (print)
In spatial join processing, a common method to minimize the I/O cost is to partition the spatial objects into clusters and then schedule the processing of the clusters so that the number of times the same objects are fetched into memory is minimized. The key issue in cluster scheduling is how to produce a better sequence of clusters to guide the scheduling. This paper describes strategies that apply the ant colony optimization (ACO) algorithm to produce a cluster scheduling sequence. Since the structure of ACO is highly suitable for parallelization, parallel algorithms are also developed to improve performance. Our evaluation shows that the scheduling sequence produced by the new method is much better than those of existing approaches.
ISBN: 3540241280 (print)
A new ZKp identity protocol is proposed in this paper. It is more appropriate than the traditional identity protocol in a distributed environment without a common trusted third party. The security of the protocol relies on the discrete logarithm problem on conics over finite fields. It can be designed and implemented more easily than protocols based on elliptic curves. A simple solution is proposed to prevent a potential leak in our protocol.
In this paper, the control architecture and the synchronization characteristics of an industrial application are presented. The control procedure is implemented on a loosely coupled distributed real-time system in which parallel processing is possible. There are five nodes in the network: one master actuator, three slave actuators, and a machine controller. All nodes are implemented using Motorola's 68332 controller. The position and velocity of the master are transmitted as a command to the slaves. The network protocol used is CAN (Controller Area Network). On-line correction and synchronization are done through the serial network. The paper introduces the synchronization methods, the characteristics of CAN, the control architecture, and the electronics used.
This paper presents an approach to parallel processing of test generation for logic circuits in a loosely-coupled distributed network of general-purpose computers. We first analyze the relation between the number of p...
ISBN: 3540241280 (print)
This paper presents an efficient parallel algorithm for volume rendering of large-scale datasets. Our algorithm focuses on an optimization technique, namely early ray termination (ERT), which aims to reduce the amount of computation by avoiding enumeration of invisible voxels in the volume being visualized. The novelty of the algorithm is that it incorporates this technique into a distributed volume rendering system with a global reduction of the computational amount. The algorithm is also capable of statically balancing the processor workloads. The experimental results show that our algorithm with global ERT achieves a further reduction of up to 33% compared to an earlier algorithm with local ERT. As a result, our load-balanced algorithm reduces the execution time to at least 66%, not only for dense objects but also for transparent objects.
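The idea behind early ray termination can be sketched on a single ray: samples are composited front to back, and traversal stops once the accumulated opacity makes the remaining voxels invisible. This is a generic, single-process illustration, not the paper's distributed algorithm; the function name, the scalar color model, and the `opacity_threshold` value of 0.99 are assumptions.

```python
def composite_ray(samples, opacity_threshold=0.99):
    """Front-to-back compositing of one ray with early ray termination.

    samples: iterable of (color, alpha) pairs ordered front to back, where
    color is a scalar intensity and alpha is in [0, 1] (both assumptions).
    Returns (color, alpha, samples_used).
    """
    color = 0.0
    alpha = 0.0
    used = 0
    for sample_color, sample_alpha in samples:
        # Each sample is attenuated by the opacity accumulated in front of it.
        color += (1.0 - alpha) * sample_alpha * sample_color
        alpha += (1.0 - alpha) * sample_alpha
        used += 1
        if alpha >= opacity_threshold:
            break  # everything behind this point is effectively invisible
    return color, alpha, used


if __name__ == "__main__":
    dense = [(1.0, 1.0), (0.5, 0.5), (0.2, 0.3)]
    print(composite_ray(dense))        # stops after the first, fully opaque sample
    transparent = [(1.0, 0.0)] * 3
    print(composite_ray(transparent))  # no termination: all samples are visited
```

In this sketch a dense object terminates the ray quickly while a transparent one does not, which suggests why per-ray cost varies across processors and why the abstract pairs ERT with load balancing.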
We present a simple and general parallel sorting scheme, ZZ-sort, which can be used to derive a class of efficient in-place sorting algorithms on realistic parallel machine models. We prove a tight bound for the worst-case performance of ZZ-sort. We also demonstrate the average performance of ZZ-sort with experimental results obtained on a MasPar parallel computer. Our experiments indicate that ZZ-sort can be incorporated into a distributed-memory parallel computer system as a standard routine and that this routine is useful in space-critical situations. Finally, we show that ZZ-sort can be used to convert a non-adaptive parallel sorting algorithm into an in-place, adaptive one by considering the problem of sorting an arbitrarily large input on fixed-size reconfigurable meshes.