Search results - Inner Mongolia University Library

Fast parallel algorithms for matrix reduction to normal forms

APPLICABLE ALGEBRA IN ENGINEERING COMMUNICATION AND COMPUTING, 1997, Vol. 8, No. 6, pp. 511-537

Author: Villard, G. Imag Lab, Grenoble LMC, F-38041 Grenoble 9, France

We investigate fast parallel algorithms to compute normal forms of matrices and the corresponding transformations. Given a matrix $B$ in $M_{n,n}(K)$, where $K$ is an arbitrary commutative field, we establish that computing a similarity transformation $P$ such that $F = P^{-1}BP$ is in Frobenius normal form can be done in $NC^2_K$. Using a reduction to this first problem, a similar fact is then proved for the Smith normal form $S(x)$ of a polynomial matrix $A(x)$ in $M_{n,m}(K[x])$; to compute unimodular matrices $U(x)$ and $V(x)$ such that $S(x) = U(x)A(x)V(x)$ can be done in $NC^2_K$. We get that over concrete fields such as the rationals, these problems are in $NC^2$. Using our previous results we have thus established that the problems of computing transformations over a field extension for the Jordan normal form, and transformations over the input field for the Frobenius and the Smith normal form are all in $NC^2_K$. As a corollary we establish a polynomial-time sequential algorithm to compute transformations for the Smith form over $K[x]$.

Keywords: parallel algorithm; $NC^2_K$; matrix normal forms; unimodular matrices; similarity matrices

An efficient implementation of the finite-element time-domain algorithm on parallel computers using a finite-element tearing and interconnecting algorithm

MICROWAVE AND OPTICAL TECHNOLOGY LETTERS, 1997, Vol. 16, No. 4, pp. 204-208

Authors: Navsariwala, UD; Gedney, SD. Department of Electrical Engineering, University of Kentucky, Lexington, Kentucky 40506-0046

An efficient algorithm for implementing the finite-element time-domain (FETD) method on parallel computers is presented. An unconditionally stable implicit FETD algorithm is combined with the finite-element tearing and interconnecting (FETI) method. This domain decomposition algorithm converges at a rate dependent solely on the subdomain size, resulting in a highly scalable algorithm. (C) 1997 John Wiley & Sons, Inc.

Keywords: FETD; FEM; FETI; parallel algorithm; domain decomposition

A fast parallel Cholesky decomposition algorithm for tridiagonal symmetric matrices

SIAM JOURNAL ON MATRIX ANALYSIS AND APPLICATIONS, 1997, Vol. 18, No. 2, pp. 403-418

Authors: BarOn, I; Codenotti, B; Leoncini, M. TECHNION ISRAEL INST TECHNOL, DEPT COMP SCI, IL-32000 HAIFA, ISRAEL; CNR, IST MATEMAT COMPUTAZ, I-56126 PISA, ITALY; UNIV PISA, DIPARTIMENTO INFORMAT, I-56125 PISA, ITALY

In this paper we present a new parallel algorithm for computing the $LL^T$ decomposition of real symmetric positive-definite tridiagonal matrices. The algorithm consists of a preprocessing and a factoring stage. In the preprocessing stage it determines a rank-$(p-1)$ correction to the original matrix ($p$ = number of processors) by precomputing selected components $x_k$ of the $L$ factor, $k = 1,\ldots,p-1$. In the factoring stage it performs independent factorizations of $p$ matrices of order $n/p$. The algorithm is especially suited for machines with both vector and processor parallelism, as confirmed by the experiments carried out on a Connection Machine CM5 with 32 nodes. Let $\hat{x}_k$ and $\hat{x}'_k$ denote the components computed in the preprocessing stage and the corresponding values (re)computed in the factorization stage, respectively. Assuming that [formula missing in source] is small, $k = 1,\ldots,p-1$, we are able to prove that the algorithm is stable in the backward sense. The above assumption is justified both experimentally and theoretically. In fact, we have found experimentally that [formula missing in source] is small even for ill-conditioned matrices, and we have proven by an a priori analysis that the above ratios are small provided that preprocessing is performed with suitably larger precision.

Keywords: parallel algorithm; Cholesky decomposition; LR and QR algorithms; eigenvalues; symmetric, tridiagonal and band matrices; CM5
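
For orientation, the $LL^T$ factorization that this abstract refers to can be sketched in its plain sequential form. The snippet below is only an illustrative sketch, not the authors' parallel algorithm: it omits the preprocessing stage and the split into $p$ independent blocks, and the function name `tridiagonal_cholesky` and its argument layout are assumptions made here.

```python
import math


def tridiagonal_cholesky(diag, off):
    """Sequential LL^T factorization of a symmetric positive-definite
    tridiagonal matrix (illustrative sketch, hypothetical helper).

    diag: main diagonal, length n
    off:  sub-diagonal, length n - 1
    Returns (l_diag, l_off): diagonal and sub-diagonal of the lower
    bidiagonal factor L.
    """
    n = len(diag)
    if len(off) != n - 1:
        raise ValueError("off must have length len(diag) - 1")

    l_diag = [0.0] * n
    l_off = [0.0] * (n - 1)
    for i in range(n):
        # Subtract the contribution of the previous sub-diagonal entry of L.
        pivot = diag[i] - (l_off[i - 1] ** 2 if i > 0 else 0.0)
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        l_diag[i] = math.sqrt(pivot)
        if i < n - 1:
            l_off[i] = off[i] / l_diag[i]
    return l_diag, l_off


if __name__ == "__main__":
    # 3x3 example: diagonal (4, 5, 5), sub-diagonal (2, 2).
    print(tridiagonal_cholesky([4.0, 5.0, 5.0], [2.0, 2.0]))
```

Each step depends on the previous entry of $L$, which is the sequential dependency a parallel scheme has to break.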

A parallel Spectral Fourier-Nonlinear Galerkin algorithm for Simulation of Turbulence

Numerical Methods for Partial Differential Equations, 1997, Vol. 13, No. 6, pp. 699-715

Authors: Averbuch, A.; Ioffe, L.; Israeli, M.; Vozovoi, L. School of Mathematical Sciences, Tel Aviv University, Tel Aviv 69978, Israel; Faculty of Computer Science, Technion, Haifa 32000, Israel

We present a high-order parallel algorithm, which requires only the minimum interprocessor communication dictated by the physical nature of the problem at hand. The parallelization is achieved by domain decomposition. The discretization in space is performed using the Local Fourier Basis method. The continuity conditions on the interfaces are enforced by adding homogeneous solutions. Such solutions often have fast decay properties, which can be utilized to minimize interprocessor communication. In effect, the predominant part of the computation is performed independently in the subdomains (processors) or using only local communication. A novel element of the present parallel algorithm is the incorporation of a Nonlinear Galerkin strategy to accelerate the computation and stabilize the time integration process. The basic idea of this approach consists of decomposition of the variables into large scale and small scale components with different treatment of these large and small scales. The combination of the Multidomain Fourier techniques with the Nonlinear Galerkin (NLG) algorithm is applied here to solve incompressible Navier-Stokes equations. Results are presented on direct numerical simulation of two-dimensional homogeneous turbulence using the NLG method. © 1997 John Wiley & Sons, Inc.

Keywords: Domain decomposition; Local Fourier Basis; Nonlinear Galerkin method; parallel algorithm

A parallel sort-balance mutual range-join algorithm on hypercube computers

3rd International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Authors: Wong, R; Topor, R; Shen, H. Griffith Univ, Sch Comp & Informat Technol, Nathan, Qld 4111, Australia

ISBN: (print) 0780342291

This paper presents an efficient parallel algorithm for computing the mutual range-join of N sets of numbers on shared-nothing hypercube computers. The algorithm iteratively joins each set to the mutual range-join of the preceding sets. Each join is performed on all processors of the hypercube in parallel. The algorithm uses a global sorting method to distribute the elements of the first set evenly across all processors in increasing order, a new data balancing technique to distribute the elements of subsequent sets to match the intermediate set at each processor and to compensate for join skew, and a new efficient local range-join procedure. We analyse the performance of this algorithm and demonstrate that it improves on the best previously published algorithm for this problem when the join selectivity factor is small. The method can also be applied to similar problems such as band-join and equi-join.

Keywords: relational database; query evaluation; range-join; hypercube; parallel algorithm

Scalable massively parallel algorithms for computational nanoelectronics

PARALLEL COMPUTING, 1997, Vol. 22, No. 14, pp. 1931-1963

Authors: Wang, XD; Roychowdhury, VP; Balasingam, P. UNIV CALIF LOS ANGELES, DEPT ELECT ENGN, LOS ANGELES, CA 90095, USA; META SOFTWARE CORP, CAMPBELL, CA 95008, USA

There is at present a worldwide effort to overcome the technological barriers to nanoelectronics. Microscopic simulation can significantly enhance our understanding of the physics of nanoscale structures, and constitutes a valuable tool for designing nanoelectronic functional devices. In nanodevices, novel physics effects are used to attain logic functionality which conventional technology cannot achieve. Therefore it is necessary to develop quantum-transport simulation methods which include novel physical effects. Moreover, simulation of realistic nanodevices requires enormous computing resources, necessitating parallel supercomputing. In this paper, we present massively parallel algorithms for simulating large-scale nanoelectronic networks based on the single-electron tunneling effect, which is arguably the quantum effect of greatest significance to nanoelectronic technology. A MIMD implementation of our simulation algorithm is carried out on a 64-processor nCUBE 2, and a SIMD implementation is carried out on a 16,384-processor MasPar MP-1. By exploiting massive parallelism, both parallel implementations achieve very high parallel efficiency and nearly linear scalability. The result of this work is that we are able to simulate large-scale nanoelectronic networks within a reasonable time period, which would be impractical on conventional workstations.

Keywords: nanoelectronic networks; electron dynamics; Monte Carlo simulation; parallel algorithm; MIMD; SIMD; scalability analysis

Serial and parallel algorithms for computing distances between unrooted and cyclically ordered trees

ELECTRONICS AND COMMUNICATIONS IN JAPAN PART II-ELECTRONICS, 1997, Vol. 80, No. 9, pp. 35-49

Authors: Yamamoto, M; Tanaka, E. Kobe Univ, Grad Sch Sci & Technol, Kobe, Hyogo 657, Japan; Kobe Univ, Fac Engn, Kobe, Hyogo 657, Japan

For unrooted and cyclically ordered trees (CO-trees), the distance based on the ancestor-descendant relation-preserving mapping (TD), the distance based on the structure-preserving mapping (SPD), and the distance based on the strongly structure-preserving mapping (SSPD) have been defined and their computing methods have been proposed. Let $T_a$ and $T_b$ be CO-trees. Let the numbers of their vertices be $N_a$ and $N_b$, and the numbers of their leaves be $L_a$ and $L_b$, respectively. Let the maximum orders of the vertices of $T_a$ and $T_b$ be $m_a$ and $m_b$, respectively. In this paper, improved serial computing methods for TD and SPD are presented. The time complexities of those methods are $O(N_a N_b L_a L_b)$ and $O(m_a N_a N_b L_b)$, respectively. Based on the improved serial computing methods for TD and SPD, the parallel computing methods for TD, SPD, and SSPD are proposed. The time complexities of those methods are $O(\max\{N_a, N_b\})$. The numbers of required processors are $O(N_a L_a L_b)$, $O(m_a N_a L_b)$, and $O(m_a m_b \max\{N_a, N_b\})$, respectively. Each of those parallel methods is optimal in the sense that the cost is equal to the time complexity of the serial computing method. (C) 1998 Scripta Technica.

Keywords: tree distance; parallel algorithm; serial algorithm; pattern matching

New parallel algorithms for direct solution of sparse linear systems: Part I - Symmetric coefficient matrix

INTERNATIONAL JOURNAL OF HIGH SPEED COMPUTING, 1997, Vol. 9, No. 4, pp. 259-290

Authors: Gopalan, K; Murthy, CSR. SUNY Stony Brook, Dept Comp Sci, Stony Brook, NY 11794, USA; Indian Inst Technol, Dept Comp Sci & Engn, Madras 600036, Tamil Nadu, India

In this paper, we propose a new parallel bidirectional algorithm, based on Cholesky factorization, for the solution of sparse symmetric systems of linear equations. Unlike the existing algorithms, the numerical factorization phase of our algorithm is carried out in such a manner that the entire back substitution component of the substitution phase is replaced by a single-step division. Since there is a substantial reduction in the time taken by the repeated execution of the substitution phase, our algorithm is particularly suited for the solution of systems with multiple b-vectors. The effectiveness of our algorithm is demonstrated by comparing it with the existing parallel algorithm, based on Cholesky factorization, using extensive simulation studies on two-dimensional problems discretized by FEM.

Keywords: linear equation; sparse symmetric system; Cholesky factorization; parallel algorithm; bidirectional scheme; multiprocessor

PARALLEL MULTIPLICATIVE ITERATIVE METHODS FOR CONVEX PROGRAMMING

Acta Mathematica Scientia, 1997, Vol. 17, No. 2, pp. 205-210

Authors: 陈忠 (Chen Zhong); 费浦生 (Fei Pusheng). WUHAN UNIV, DEPT MATH, WUHAN 430072, PEOPLES R CHINA

In this paper, we present two parallel multiplicative algorithms for convex programming. If the objective function has compact level sets and has a locally Lipschitz continuous gradient, we discuss convergence of the ...

Keywords: parallel algorithm; convex programming

A row based parallel Gaussian elimination algorithm for the connection machine CM-2

INTERNATIONAL JOURNAL OF HIGH SPEED COMPUTING, 1997, Vol. 9, No. 1, pp. 13-24

Authors: Hoh, SH; Moon, SM. SEOUL NATL UNIV, SCH ELECT ENGN, SEOUL 151742, SOUTH KOREA

This paper presents an algorithm for the Gaussian elimination problem that reduces the length of the critical path compared to the algorithm of Lord et al. This is done by redefining the notion of a task. For all practical purposes, the issues of communication overhead and pivoting cannot be overlooked. We consider these issues for the new algorithm as well. Timing results of this algorithm as executed on the CM-2 model of the Connection Machine are presented. Another contribution of this paper is the use of logical pivoting for stable computation of the Gaussian elimination algorithm. Pivoting is essential in producing stable results. When pivoting occurs, an interchange of two rows is required. A physical interchange of the values can be avoided by providing a permutation vector in a globally accessible location. We show experimental results that substantiate the use of logical pivoting.

Keywords: Gaussian elimination; parallel algorithm; Connection Machine
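
The logical pivoting idea mentioned in this abstract, where a permutation vector stands in for physical row interchanges, can be illustrated with a small sequential sketch. This is not the paper's CM-2 implementation; the function name `solve_logical_pivoting`, the use of partial pivoting by largest absolute value, and the dense list-of-lists layout are all assumptions made here for illustration.

```python
def solve_logical_pivoting(a, b):
    """Solve a x = b by Gaussian elimination with logical pivoting.

    Rows are never swapped in memory. Instead, perm[k] records which
    physical row currently plays the role of logical row k.
    (Hypothetical helper; sequential sketch only.)
    """
    n = len(a)
    a = [row[:] for row in a]  # work on copies, leave inputs untouched
    b = b[:]
    perm = list(range(n))

    for k in range(n):
        # Pick the logical row with the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(a[perm[r]][k]))
        if a[perm[p]][k] == 0:
            raise ValueError("matrix is singular")
        # Logical interchange: swap two indices, not two rows of data.
        perm[k], perm[p] = perm[p], perm[k]
        pk = perm[k]
        for r in range(k + 1, n):
            pr = perm[r]
            factor = a[pr][k] / a[pk][k]
            for c in range(k, n):
                a[pr][c] -= factor * a[pk][c]
            b[pr] -= factor * b[pk]

    # Back substitution, reading rows through the permutation vector.
    x = [0.0] * n
    for k in range(n - 1, -1, -1):
        pk = perm[k]
        s = sum(a[pk][c] * x[c] for c in range(k + 1, n))
        x[k] = (b[pk] - s) / a[pk][k]
    return x, perm


if __name__ == "__main__":
    # The zero in the top-left corner forces a (logical) row interchange.
    print(solve_logical_pivoting([[0.0, 2.0], [1.0, 1.0]], [4.0, 3.0]))
```

In a parallel setting the appeal is that only the small index vector changes when a pivot is chosen, while the row data stays where it is.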
