Search Results - Inner Mongolia University Library

International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2001)

Authors: Santos, EE Virginia Polytech Inst & State Univ Dept Comp Sci Blacksburg VA 24061 USA

Effective design of parallel matrix multiplication algorithms relies on the consideration of many interdependent issues based on the underlying parallel machine or network upon which such algorithms will be implemented, as well as the type of methodology utilized by an algorithm. In this paper, we determine the parallel complexity of multiplying two (not necessarily square) matrices on parallel distributed-memory machines and/or networks. In other words, we provide an achievable parallel run-time that cannot be beaten by any algorithm (known or unknown) for solving this problem. In addition, any algorithm that claims to be optimal must attain this run-time. In order to obtain results that are general and useful throughout a span of machines, we base our results on the well-known LogP model. Furthermore, three important criteria must be considered in order to determine the running time of a parallel algorithm; namely, (i) local computational tasks, (ii) the initial data layout, and (iii) the communication schedule. We provide optimality results by first proving general lower bounds on parallel run-time. These lower bounds lead to significant insights on (i)-(iii) above. In particular, we present what types of data layouts and communication schedules are needed in order to obtain optimal run-times. We prove that no one data layout can achieve optimal running times for all cases. Instead, optimal layouts depend on the dimensions of each matrix, and on the number of processors. Lastly, optimal algorithms are provided.

Keywords: parallel complexity matrix multiplication LogP model parallel algorithms lower bounds and optimality numerical algorithms linear algebra parallel models

Parallel calculation of accurate path lines using multi-block CFD datasets with changing geometry

International Conference on Parallel and Distributed Processing Techniques and Applications

Authors: Gerndt, A Schirski, M Kuhlen, T Bischof, C Rhein Westfal TH Aachen Ctr Comp & Commun Aachen Germany

ISBN: (print) 1892512459

In many cases, an elliptic system of partial differential equations (PDEs) has to be solved in order to compute a given flow problem. For domain decomposition, mainly the multi-block grid approach is used. A variety of flows are unsteady, thus the calculation of path lines is a common way of exploring the flow field. However, computing path lines is more complicated if the underlying grid geometry changes over time. We make use of a fragmented multi-block dataset for a parallelization approach to compute path lines. We describe our enhancements of VTK, the basic toolkit we use for scientific visualization, which supports neither multi-block nor time-dependent datasets. Our extensions include the handling of unsteady datasets as well as adaptive step-size control and time-position interpolation. Finally, we present the results of our efforts to speed up Computational Fluid Dynamics (CFD) explorations in Virtual Environments.

Keywords: computational fluid dynamics multi grid interpolation parallel algorithms high-performance computing virtual reality

The parallel computation of time-dependent Monte Carlo transport

International Conference on Parallel Processing

Authors: Deng, L Jie, L Zhang, WY Yuan, GX Huang, ZF Xu, HY Wang, RH Shu, L Inst Appl Phys & Computat Math Lab Com Phys Beijing 100088 Peoples R China

ISBN: (print) 0769520189

Parallel Monte Carlo methods are successful because particles are typically independent and easily distributed to multiple processors. For time-dependent Monte Carlo particle transport problems, however, the communication required at each time step for scattering source attributes and meshes reduces parallel efficiency and limits the achievable parallel scale. We study the parallel computation of two types of time-dependent particle transport problems. Adaptive processor assignment in parallel computation and three parallel I/O models with low-cost communication are presented, and the optimized processor choice is obtained. We propose a scheme based on the Monte Carlo layered sampling technique to handle communication of the scattering source. Parallel scalability is greatly improved, and larger speedups over the basic methods are obtained.

Keywords: Monte Carlo methods parallel algorithms input-output programs multiprocessing systems processor scheduling Monte Carlo method multiple processors particle transport problem parallel computation I/O models low-cost communication scattering source communication

On the topological properties of the arrangement-star network

JOURNAL OF SYSTEMS ARCHITECTURE, 2003, Vol. 48, No. 11-12, pp. 325-336

Authors: Awwad, AM Al-Ayyoub, A Ould-Khaoua, M Zarka Private Univ Dept Comp Sci Safat 13110 Kuwait Arab Open Univ Fac Comp Studies Amman 11953 Jordan Univ Glasgow Dept Comp Sci Glasgow G12 8RZ Lanark Scotland

This paper proposes a new interconnection network, referred to as the arrangement-star network, which is constructed from the product of the star and arrangement networks. Studying this new network is motivated by the good qualities it exhibits over its constituent networks, the star and arrangement networks. The star network has been a research focus for quite a long time until recently when the algorithm development on the star network turned out to be cumbersome. The arrangement network as a generalized class for the star network offers no solution in that direction. The arrangement-star network, on the other hand, makes it possible to efficiently embed grids, pipelines, as well as other computationally important topologies in a very natural manner. Furthermore, the fact that the product of the star and arrangement networks comes with little increase in the network diameter and a better result on communication cost, motivates further investigation for this new alternative, the arrangement-star network. (C) 2003 Elsevier Science B.V. All rights reserved.

Keywords: star network arrangement network product network hierarchical structure vertex symmetry parallel algorithms

The particle swarm optimization algorithm: convergence analysis and parameter selection

INFORMATION PROCESSING LETTERS, 2003, Vol. 85, No. 6, pp. 317-325

Authors: Trelea, IC INA PG UMR Genie & Microbiol Proc Alimentaires F-78850 Thiverval Grignon France

The particle swarm optimization algorithm is analyzed using standard results from the dynamic system theory. Graphical parameter selection guidelines are derived. The exploration-exploitation tradeoff is discussed and illustrated. Examples of performance on benchmark functions superior to previously published results are given. (C) 2002 Elsevier Science B.V. All rights reserved.

Keywords: particle swarm optimization stochastic optimization analysis of algorithms parallel algorithms
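
For readers unfamiliar with the method this record analyzes, the following is a minimal sketch of a standard particle swarm optimizer, not the author's code. The function name `pso`, the parameter names `a` (inertia) and `b` (attraction coefficient), their default values, and the `sphere` test function are all assumptions chosen for illustration.

```python
import random

def pso(f, dim, bounds, n_particles=20, iters=200, a=0.729, b=1.494, seed=0):
    """Minimize f over a box with a basic particle swarm (illustrative sketch).

    a: inertia weight, b: attraction coefficient (assumed, commonly used values).
    """
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]               # best position seen by each particle
    pbest_val = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia plus random pulls toward personal and global bests.
                vs[i][d] = (a * vs[i][d]
                            + b * r1 * (pbest[i][d] - xs[i][d])
                            + b * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]
            val = f(xs[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = xs[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = xs[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Simple benchmark function with its minimum of 0 at the origin."""
    return sum(t * t for t in x)

best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

Smaller `a` makes the swarm settle faster at the cost of exploration, which is the exploration-exploitation tradeoff the abstract refers to.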

Parallel algorithm for mining maximal frequent patterns

5th International Workshop on Advanced Parallel Processing Technologies

Authors: Wang, H Xiao, ZT Zhang, HJ Jiang, SY Huazhong Univ Sci & Technol Comp Sch Wuhan 430074 Peoples R China Wuhan Commun Coll Wuhan 430010 Peoples R China

ISBN: (print) 3540200541

We present a novel and powerful parallel algorithm for mining maximal frequent patterns, called Par-MinMax. It decomposes the search space by prefix-based equivalence classes, distributes work among the processors and selectively duplicates databases in such a way that each processor can compute the maximal frequent patterns independently. It uses a multi-level backtrack pruning strategy and other novel pruning strategies, along with a vertical database format in which frequency is counted by a simple tid-list intersection operation. These techniques eliminate the need for synchronization, drastically cutting down the I/O overhead. The analysis and experimental results demonstrate the superb efficiency of our approach in comparison with the existing work.

Keywords: parallel algorithms
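
The tid-list intersection mentioned in this abstract can be sketched as follows. This is a generic illustration of support counting in a vertical database format, not the Par-MinMax implementation; the helper names `to_vertical` and `support` and the toy transactions are hypothetical.

```python
def to_vertical(transactions):
    """Convert a horizontal database (list of item sets) to vertical format:
    each item maps to its tid-list, the set of transaction ids containing it."""
    tidlists = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists.setdefault(item, set()).add(tid)
    return tidlists

def support(itemset, tidlists):
    """Support of an itemset is the size of the intersection of its tid-lists."""
    lists = [tidlists.get(item, set()) for item in itemset]
    if not lists:
        return 0
    return len(set.intersection(*lists))

# Toy database (hypothetical data, for illustration only).
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"a", "d"},
    {"b", "c"},
]
tidlists = to_vertical(transactions)
support_ac = support({"a", "c"}, tidlists)        # transactions 0 and 1
support_abc = support({"a", "b", "c"}, tidlists)  # transaction 0 only
```

Because each support count needs only the tid-lists of the items involved, a processor holding those lists can count without consulting the others, which is consistent with the independence the abstract describes.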

Numerical performance of preconditioning techniques for the solution of complex sparse linear systems

COMMUNICATIONS IN NUMERICAL METHODS IN ENGINEERING, 2003, Vol. 19, No. 1, pp. 37-48

Authors: Mazzia, A Pini, G Univ Padua Dipartimento Metodi & Modelli Matemat Sci Applica I-35131 Padua Italy

Preconditioning techniques based on ILU decomposition, on Frobenius norm minimization and on factorized sparse approximate inverse are considered. These algorithms are applied with conjugate gradient-type methods, namely Bi-CGSTAB, QMR and TFQMR, for the solution of complex, large, sparse linear systems. The results of numerical experiments in a scalar environment with matrices arising from transport in porous media, quantum chemistry, structural dynamics and electromagnetism are analysed. The preconditioner that appears most significant in a parallel environment (based on factorized sparse approximate inverse) is then employed on a Cray T3E supercomputer. The experimental results show the satisfactory parallel performance of the proposed algorithm. Copyright (C) 2003 John Wiley & Sons, Ltd.

Keywords: complex sparse linear systems preconditioned iterative methods incomplete factorizations approximate inverses parallel algorithms

Parallel Delaunay triangulation based on circum-circle criterion 03

Spring Conference on Computer Graphics, SCCG 2003 - Conference Proceedings

Authors: Kohout, Josef Kolingerová, Ivana Ctr. Comp. Graphics/Data V. Dept. of Comp. Sci. and Engineering University of West Bohemia Pilsen Czech Republic

ISBN: (print) 158113861X

This paper describes a newly proposed simple and efficient parallel algorithm for the construction of the Delaunay triangulation (DT) in E^2 by randomized incremental insertion. The construction of the DT is one of the fundamental problems in computer graphics. The proposed algorithm is designed for parallel systems with shared memory and several processors. Such hardware (especially with two processors) has become available in the last few years thanks to low prices, yet there is still a lack of parallel algorithms that are simple to implement and efficient enough to be an attractive alternative to long-existing serial algorithms. The designed algorithm incorporates a new method for synchronization among PEs based on a simple geometric test (i.e., if no other points lie in the circum-circle of the accessed triangle, this triangle can be modified independently of the other PEs). We implemented the algorithm in C++ and tested it on workstations with up to four processors, where we reached relatively good speed-up over our serial implementation. When only two processors were used, we reached even super-linear speed-up.

Keywords: parallel algorithms
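
The circum-circle test that this abstract uses for synchronization can be sketched with the standard in-circle determinant. This is an illustrative sketch, not the authors' C++ code: the function names `orient`, `in_circumcircle` and `can_modify_independently` are hypothetical, and plain floating-point arithmetic is assumed, so nearly degenerate inputs would need a robust predicate.

```python
def orient(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, p):
    """True if point p lies strictly inside the circum-circle of triangle abc."""
    if orient(a, b, c) < 0:
        b, c = c, b  # the determinant sign below assumes counter-clockwise order
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0

def can_modify_independently(triangle, other_points):
    """Hypothetical helper: a triangle is safe to modify if no point handled
    by another processing element falls inside its circum-circle."""
    a, b, c = triangle
    return not any(in_circumcircle(a, b, c, p) for p in other_points)

tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
inside = in_circumcircle(*tri, (0.5, 0.5))   # the circum-centre itself
outside = in_circumcircle(*tri, (2.0, 2.0))  # well outside the circle
```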

Parallel split-step Fourier methods for the CMKdV equation

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications

Authors: Taha, Thiab R. Liu, Ruihua Department of Computer Science University of Georgia Athens GA 30602 United States

ISBN: (print) 1892512416

The class of complex modified Korteweg-de Vries (CMKdV) equations has many applications. One form of the CMKdV equation has been used to create models for the nonlinear evolution of plasma waves [5] and for the propagation of transverse waves in a molecular chain [3]. Another form of the CMKdV equation has been used for the traveling-wave and for a double homoclinic orbit [4]. In this paper we introduce sequential and parallel split-step Fourier methods for numerical simulations of the above equation. The parallel methods are implemented on the Origin 2000 multiprocessor computer. Our numerical experiments have shown that these methods give considerable speedup.

Keywords: parallel algorithms

Parallelized three-dimensional unstructured Euler solver for unsteady aerodynamics

JOURNAL OF AIRCRAFT, 2003, Vol. 40, No. 2, pp. 348-354

Authors: Oktay, E Akay, HU Uzun, A Indiana Univ Purdue Univ Dept Mech Engn Indianapolis IN 46202 USA Missiles Ind Inc Roketsan Aerodynam Dept TR-06780 Ankara Turkey

A parallel algorithm for the solution of unsteady Euler equations on unstructured and moving meshes is developed. A cell-centered finite volume scheme is used. The temporal discretization involves an implicit time-integration scheme based on backward-Euler time differencing. The movement of the computational mesh is accomplished by means of a dynamically deforming mesh algorithm. The parallelization is based on decomposition of the domain into a series of subdomains with overlapped interfaces. The scheme is computationally efficient, time accurate, and stable for large time increments. Detailed descriptions of the solution algorithm are given, and computations for airflow around a NACA0012 airfoil and a missile configuration are presented to demonstrate the applications.

Keywords: parallel algorithms Solvers Scheme Euler equations unsteady aerodynamics time-discrete Euler MOVING MESH computational grids
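
The abstract's claim of stability for large time increments is a general property of backward-Euler time differencing, which can be illustrated on a scalar model problem. The sketch below is not the solver described in this record: it assumes the test equation y' = -lam * y, and the function names and parameter values are hypothetical.

```python
def forward_euler(y0, lam, dt, steps):
    """Explicit update: y_new = y + dt * (-lam * y)."""
    y = y0
    for _ in range(steps):
        y = y + dt * (-lam * y)
    return y

def backward_euler(y0, lam, dt, steps):
    """Implicit update: solve y_new = y + dt * (-lam * y_new) for y_new."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + lam * dt)
    return y

# A time step far beyond the explicit stability limit dt < 2 / lam (assumed values).
lam, dt, steps = 100.0, 0.1, 10
explicit = forward_euler(1.0, lam, dt, steps)   # grows in magnitude every step
implicit = backward_euler(1.0, lam, dt, steps)  # decays, like the exact solution
```

For a system of equations the implicit step requires solving a linear or nonlinear system at each time level instead of the single division used here.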
