ISBN: 3540664432 (print)
We describe a set of representations for polynomials and sparse matrices suited for use with fine-grain parallelism on a distributed memory multiprocessor system. Our aim is to support the use of supercomputers with this style of architecture to perform computations that would exceed the main memory capacity of more traditional computers: although such systems have very high-performance communication networks, it is still essential to avoid letting any one part of the network become a bottleneck. We use randomised data placement both to avoid hot-spots in the communication patterns and to balance (in a probabilistic sense) the memory load placed upon each processing element. The expected application areas for such a system will be those where intermediate expression swell means that the huge primary memory available on MPP systems will be needed if the smaller final result is to be successfully computed.
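The randomised placement idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the representation used in the paper: a polynomial is held as a dictionary from exponent tuples to coefficients, and the hypothetical `owner` function uses a salted hash to pick the processing element that stores each term.

```python
import hashlib
from itertools import product


def owner(exponents, num_pes, salt=b"demo-salt"):
    """Map a monomial (tuple of exponents) to a processing element.

    A salted hash stands in for random placement (illustrative assumption):
    structurally similar monomials land on unrelated processing elements.
    """
    key = salt + ",".join(map(str, exponents)).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_pes


def distribute(poly, num_pes):
    """Split a polynomial {exponents: coeff} into one dict per processing element."""
    parts = [dict() for _ in range(num_pes)]
    for exps, coeff in poly.items():
        parts[owner(exps, num_pes)][exps] = coeff
    return parts


# All monomials x^i y^j z^k with 0 <= i, j, k < 10, each with coefficient 1.
poly = {e: 1 for e in product(range(10), repeat=3)}
parts = distribute(poly, 8)
loads = [len(p) for p in parts]  # number of terms held by each processing element
```

Because the owner of a term depends only on its exponents, any processing element can compute where a term lives without consulting a directory, and the loads are balanced only in expectation rather than exactly.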
ISBN: 3540664432 (print)
A parallel Asset & Liability Management (ALM) code was developed by SMART and Prometeia Calcolo for the EC project PALMA (Parallel Asset and Liability MAnagement). The code implements a stochastic approach based on a dynamic ALM model specially tailored for the Italian financial market. This paper reports the performance obtained on the Cray T3E at CINECA, running the code on real data provided by Credito Italiano. Very good scalability and efficiency have been achieved. The code is also easily portable to other, possibly heterogeneous, high-performance computing platforms.
Domain decomposition methods are used for the parallelization of the LODYC ocean general circulation model. The local dependencies problem is solved by using a pencil splitting and an overlapping strategy. Two differen...
ISBN: 3540664432 (print)
Generic load balancing policies for irregular parallel applications may be efficiently implemented by integrating preemptive thread migration into the runtime support. In this context, a delicate issue is to manage pointer validity in a migration-safe way. In [1] we presented an iso-address approach to this problem. This paper discusses the impact of the iso-address migration scheme on the runtime of the Adaptor [4] HPF compiler. This runtime (previously modified so as to generate multithreaded code for our PM2 runtime system [3]) now provides generic support for dynamic load balancing, using preemptive thread migration. We report some encouraging results obtained with our system on an HPF flame simulation code, a motivating application of HPF 2.0 [7].
ISBN: 3540664432 (print)
The parallel implementation of unstructured adaptive tetrahedral meshes for the solution of transient flows requires many complex stages of communication. This is due to the irregular data sets and their dynamically changing distribution. This paper describes the use of Shared Abstract Data Types (SADTs) in the restructuring of such a code, called PTETRAD. SADTs are an extension of an ADT with the notion of concurrent access. The potential for increased performance and simplicity of code is demonstrated, while maintaining software portability. It is shown how SADTs can raise the programmer's level of abstraction away from the details of how data sharing is supported. Performance results are provided for the SGI Origin2000 and the Cray T3E machines.
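As a loose analogy for an ADT extended with concurrent access, the sketch below wraps a map of counters behind an interface that hides the locking. It is not taken from PTETRAD: the class `SharedCounterMap` and its methods are hypothetical, and it uses threads in a single address space, whereas the paper targets machines such as the Cray T3E.

```python
import threading


class SharedCounterMap:
    """Hypothetical shared ADT: a map of counters that many threads may update.

    Callers only see add() and snapshot(); how sharing is implemented
    (here, one lock) stays hidden behind the interface.
    """

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def add(self, key, amount=1):
        # The read-modify-write is done under the lock, so updates are not lost.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + amount

    def snapshot(self):
        # Return a copy so callers never touch the shared state directly.
        with self._lock:
            return dict(self._data)


def worker(shared, keys):
    for k in keys:
        shared.add(k)


shared = SharedCounterMap()
threads = [threading.Thread(target=worker, args=(shared, range(100))) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
result = shared.snapshot()
```

The point of the design is that the sharing mechanism could be swapped (for example, for message passing) without changing the code in `worker`.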
ISBN: 3540664432 (print)
High Performance Fortran (HPF) is a data-parallel language providing the user with a high-level interface for programming scientific applications, while delegating to the compiler the task of producing explicitly parallel code. In this paper, we give an overview of the motivation and the results of the ESPRIT project "HPF+". The project succeeded in demonstrating that HPF, with a small set of language extensions and an appropriate compiler and tool infrastructure, has the potential to be efficient for advanced industrial applications, sometimes approaching the performance of manually written message-passing code. We introduce the applications which were used to guide and evaluate the development work in the project, provide an overview of the HPF+ language, and discuss the Vienna Fortran Compiler (VFC) as well as the performance obtained for the project benchmarks.
We consider an approach to efficient parallel implementation of the high-order Control Volume Padé-type Differences (CVPD) applied to spatial time-dependent flow in mixing tanks. This numerical technology all...
Massively parallel computers consisting of a large number of processing elements have been developed and are expected to serve as high-performance computers in advanced science and technology. A practical parallel computation model is required to analyze parallel algorithms on massively parallel computers. We present a practical parallel computation model, LogPQ, which extends the LogP model by taking communication queues into account. The LogPQ model has three queues for each communication line and four supplementary parameters in addition to those of the LogP model. This paper addresses the performance of parallel matrix multiplication using the LogPQ model. Parallel performance on the CM-5 is compared between the LogP and LogPQ models. The LogPQ model is seen to predict execution times more accurately than the LogP model.
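For orientation, the baseline LogP model that LogPQ extends can be sketched as a small cost function. This covers only the standard LogP parameters (latency L, overhead o, gap g, processor count P); the queues and the four supplementary parameters of LogPQ are not given in the abstract and are not modelled here. The numeric values in the usage lines are hypothetical, not CM-5 measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogP:
    L: float  # network latency for one small message
    o: float  # processor overhead to send or to receive one message
    g: float  # minimum gap between consecutive messages at one processor
    P: int    # number of processors


def point_to_point(m: LogP, k: int = 1) -> float:
    """Time until the k-th small message from one sender has been received.

    The first send costs o; each later send can start max(g, o) after the
    previous one; the last message then needs L in the network plus o at
    the receiver.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    return m.o + (k - 1) * max(m.g, m.o) + m.L + m.o


# Hypothetical parameter values, in arbitrary time units.
model = LogP(L=6.0, o=2.0, g=4.0, P=8)
single = point_to_point(model)      # o + L + o
burst = point_to_point(model, k=3)  # o + 2 * max(g, o) + L + o
```

A model of this kind has no notion of messages waiting in queues, which is the gap the abstract says LogPQ is meant to address.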
In this paper, a practical postal numeral segmentation and recognition system for Chinese business letters is presented. Line information for the address blocks is gained from the envelope image by projection, then th...
ISBN: 3540664432 (print)
Diplectanum aequans, an ectoparasite of the sea bass, can cause pathological problems, especially in fish farms. A discrete mathematical model describes the demographic strategy of such fish and parasite populations. Numerical simulations based on this model mimic some of the observed dynamics, and supply hints about the global dynamics of this host-parasite system. Parallelisation is required because execution times of the simulator are too long. In this paper, we introduce the biological problem and the associated numerical simulator. Then, a parallel solution is presented, with experimental results on an IBM SP2. The increase in speed has allowed us to improve the accuracy of computation, and to observe new dynamics.