A limited-area numerical weather prediction model specifically targeted at parallel computers has been successfully implemented on an IBM SP2 distributed-memory parallel computer. The model employs an explicit finite-difference scheme and was parallelised using a simple domain decomposition technique. On a twelve-processor SP2, a 24-hour forecast using archived operational data and including a sophisticated representation of physical processes was run at a range of resolutions between 150 km and 19 km, and near-linear speedups were achieved. Major weather centres have indicated a requirement for regional prediction models to be run at resolutions of approximately 5 km by the end of the decade. Based on this work, it appears that this target can be achieved through the use of scalable parallel computers.
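As a rough illustration of domain decomposition for an explicit finite-difference scheme, the sketch below splits a 1-D grid into contiguous subdomains, exchanges one halo cell with each neighbour before every step, and checks that the result matches a single-domain run. It is not the model from the abstract: the 1-D diffusion stencil, the function names and the parameter `alpha` are all assumptions chosen for brevity, and the subdomains are updated in a sequential loop rather than on separate processors.

```python
def step(u, alpha):
    """One explicit finite-difference step; the two end values are left unchanged."""
    new = u[:]
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def run_serial(u0, alpha, nsteps):
    """Reference run on the undivided grid."""
    u = u0[:]
    for _ in range(nsteps):
        u = step(u, alpha)
    return u


def run_decomposed(u0, alpha, nsteps, nparts):
    """Same computation with the grid split into nparts contiguous subdomains.

    Assumes len(u0) >= nparts so that every subdomain owns at least one cell.
    """
    n = len(u0)
    bounds = [(p * n // nparts, (p + 1) * n // nparts) for p in range(nparts)]
    parts = [u0[lo:hi] for lo, hi in bounds]
    for _ in range(nsteps):
        # Halo exchange: each subdomain receives one boundary value per neighbour.
        padded = []
        for p, own in enumerate(parts):
            left = [parts[p - 1][-1]] if p > 0 else []
            right = [parts[p + 1][0]] if p < nparts - 1 else []
            padded.append(left + own + right)
        # Local update: purely local work once the halos are in place.
        new_parts = []
        for p, local in enumerate(padded):
            updated = step(local, alpha)
            start = 1 if p > 0 else 0
            end = len(updated) - 1 if p < nparts - 1 else len(updated)
            new_parts.append(updated[start:end])  # drop halo cells again
        parts = new_parts
    return [x for part in parts for x in part]
```

In a distributed-memory setting the halo exchange would become messages between neighbouring processors, while the local update needs no communication, which is one reason such schemes tend to scale well.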
We develop and experiment with a new parallel algorithm to approximate the maximum weight cut in a weighted undirected graph. Our implementation is based on the recent algorithm of Goemans and Williamson for this problem. However, our work aims for an efficient, practical formulation of the algorithm with close to optimal parallelization. Our theoretical analysis and an implementation on the Connection Machine CM5 show that our parallelization achieves linear speedup. We test our implementation on several large graphs (more than 9000 vertices), particularly on large instances of the Ising model.
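For orientation, the Goemans and Williamson approach ends with a random-hyperplane rounding step, which can be sketched as follows. This is only the rounding stage: the unit vectors are assumed to come from a semidefinite-programming solver that is not shown, and the function names, the `trials` parameter and the seeding are hypothetical. It says nothing about how the paper itself parallelizes the algorithm.

```python
import random


def hyperplane_round(vectors, rng):
    """Assign each vertex to side +1 or -1 by the sign of its projection
    onto a random Gaussian direction."""
    dim = len(vectors[0])
    r = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [1 if sum(a * b for a, b in zip(v, r)) >= 0.0 else -1 for v in vectors]


def cut_weight(edges, sides):
    """Total weight of edges (u, v, w) whose endpoints land on opposite sides."""
    return sum(w for u, v, w in edges if sides[u] != sides[v])


def best_of_rounds(edges, vectors, trials, seed=0):
    """Repeat the rounding and keep the heaviest cut found."""
    rng = random.Random(seed)
    best_sides, best_w = None, float("-inf")
    for _ in range(trials):
        sides = hyperplane_round(vectors, rng)
        w = cut_weight(edges, sides)
        if w > best_w:
            best_sides, best_w = sides, w
    return best_sides, best_w
```

Each rounding trial is independent of the others, so in a sketch like this the trials could be distributed across processors without coordination.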
ISBN: 0818656026 (print)
Barrier algorithms are central to the performance of numerous algorithms on scalable, high-performance architectures. Numerous barrier algorithms have been suggested and studied for Non-Uniform Memory Access (NUMA) architectures, but less work has been done for Cache Only Memory Access (COMA) or attraction memory [1] architectures such as the KSR-1. In this paper, we present two new barrier algorithms that offer the best performance we have recorded on the KSR-1 distributed cache multiprocessor. We discuss the trade-offs and the performance of seven algorithms on two architectures. The new barrier algorithms adapt well to a hierarchical caching memory model and take advantage of parallel communication offered by most multiprocessor interconnection networks. Performance results are shown for a 256-processor KSR-1 and a 20-processor Sequent Symmetry.
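To make the notion of a barrier concrete, here is a minimal sketch of a textbook centralized sense-reversing barrier built on Python threads. It is a simple baseline only: it is not either of the two new algorithms in the paper, it is not claimed to be among the seven algorithms evaluated, and the class and function names are hypothetical.

```python
import threading


class SenseReversingBarrier:
    """Centralized barrier: the last thread to arrive flips a shared sense flag."""

    def __init__(self, nthreads):
        self.nthreads = nthreads
        self.count = nthreads
        self.sense = False
        self.cond = threading.Condition()

    def wait(self, local_sense):
        """Block until all threads have arrived; returns the caller's new sense."""
        local_sense = not local_sense
        with self.cond:
            self.count -= 1
            if self.count == 0:
                # Last arrival: reset the counter and release everyone.
                self.count = self.nthreads
                self.sense = local_sense
                self.cond.notify_all()
            else:
                while self.sense != local_sense:
                    self.cond.wait()
        return local_sense


def run_phases(nthreads, nphases):
    """Check that no thread leaves a phase before all threads have entered it."""
    barrier = SenseReversingBarrier(nthreads)
    arrived = [0] * nphases
    lock = threading.Lock()
    checks = []

    def worker():
        sense = False
        for phase in range(nphases):
            with lock:
                arrived[phase] += 1
            sense = barrier.wait(sense)
            with lock:
                checks.append(arrived[phase] == nthreads)

    threads = [threading.Thread(target=worker) for _ in range(nthreads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(checks) == nthreads * nphases and all(checks)
```

Every thread touches the same counter and flag here, which is exactly the kind of contention that hierarchical or tree-structured barriers are designed to reduce.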
A novel reconfigurable architecture based on a Multi-Ring Multiprocessor Network is described. The reconfigurable architecture is shown to combine low network diameter with a low degree of connectivity for each node i...
This work deals with the evaluation of hardware implementations of image processing algorithms for real-time applications, using SRAM-based Field Programmable Gate Arrays. We discuss a generic architectural model adapted ...
Matching is an important part of a model-based object recognition system. Matching is a difficult task, for a number of reasons. First, in a number of recognition systems matching is formulated as a combinatorial prob...
ISBN: 0819416207 (print)
The aim of the paper is to estimate the contribution of polarization diversity in high-frequency (3 - 30 MHz) direction-finding systems. We first describe the peculiarities of H.F. propagation and the resulting signal model used in computer simulations. Next, we analyze the behavior of some particular direction-finding systems using linear and circular geometries and polarization diversity. Some algorithms (non-linear frequency analysis, M.U.S.I.C.) are tested under several conditions (narrowband or broadband signals, polarization filtering reiterated or not, sub-sampling). Theoretical and experimental results show that polarization diversity based on knowledge of the antenna complex responses greatly improves the efficiency of direction finding.
ISBN: 0818656026 (print)
Segmentation and other image processing operations rely on convolution calculations with heavy computational and memory access demands. This paper presents an analysis of a texture segmentation application containing a 96x96 convolution. Sequential execution required several hours on single-processor systems, with over 99% of the time spent performing the large convolution. 70% to 75% of execution time is attributable to cache misses within the convolution. We implemented the same application on CM-5, iPSC/860 and PVM distributed-memory multicomputers, tailoring the parallel algorithms to each machine's architecture. Parallelization significantly reduced execution time, taking 49 seconds on a 512-node CM-5 and 6.5 minutes on a 32-node iPSC/860.
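One common way to parallelize a large convolution is to give each worker a block of output rows, and the sketch below shows that partitioning on a small example. It is an assumption-laden illustration rather than the paper's implementation: the function names are hypothetical, the kernel in any realistic test would be far smaller than 96x96, and the row blocks are processed in a sequential loop instead of on separate nodes.

```python
def convolve_rows(image, kernel, row_start, row_end):
    """Valid-mode 2-D convolution for output rows [row_start, row_end).

    Only image rows row_start .. row_end + kernel_height - 2 are read,
    so a worker needs just that slice of the image.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(row_start, row_end):
        row = []
        for c in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Flipped kernel indices give convolution rather than correlation.
                    acc += image[r + i][c + j] * kernel[kh - 1 - i][kw - 1 - j]
            row.append(acc)
        out.append(row)
    return out


def convolve_partitioned(image, kernel, nworkers):
    """Split the output rows into nworkers contiguous blocks and stitch the results."""
    out_h = len(image) - len(kernel) + 1
    blocks = [(w * out_h // nworkers, (w + 1) * out_h // nworkers)
              for w in range(nworkers)]
    result = []
    for lo, hi in blocks:  # each block is independent of the others
        result.extend(convolve_rows(image, kernel, lo, hi))
    return result
```

Because the blocks share no output cells, the only data shared between neighbouring workers in this scheme is the band of overlapping input rows at each block boundary.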
ISBN: 0819415472 (print)
The Accurate Automation Corporation (AAC) neural network processor (NNP) module is a fully programmable multiple instruction multiple data (MIMD) parallel processor optimized for the implementation of neural networks. The AAC NNP design fully exploits the intrinsic sparseness of neural network topologies. Moreover, by using a MIMD parallel processing architecture, one can update multiple neurons in parallel with efficiency approaching 100 percent as the size of the network increases. Each AAC NNP module has 8 K neurons and 32 K interconnections and is capable of 140,000,000 connections per second, with an eight-processor array capable of over one billion connections per second.
ISBN: 0818656026 (print)
Parallel algorithms developed for CAD problems today suffer from three important drawbacks. First, they are machine specific and tend to perform poorly on architectures other than the one for which they were designed. Second, they cannot use the latest advances in improved versions of the sequential algorithms for solving the problem. Third, the quality of results degrades significantly during parallel execution. In this paper we address these three problems for an important CAD application: standard cell placement. We have developed a new parallel placement algorithm that is portable across a range of MIMD parallel architectures. The algorithm is part of the ProperCAD project, which allows the development and implementation of a parallel algorithm such that it can be executed on a wide variety of parallel machines without any change to the source. The parallel placement algorithm is based on an existing implementation of the sequential simulated annealing algorithm, TimberWolfSC 6.0 [1].