Search Results - Inner Mongolia University Library

A NOTE ON THE parallel computation THESIS

INFORMATION PROCESSING LETTERS, 1983, Vol. 17, No. 4, pp. 203-205

Author: BLUM, N (Fachbereich 10 der Universität des Saarlandes, Angewandte Mathematik und Informatik, 6600 Saarbrücken, Fed. Rep. Germany)

The parallel computation thesis states that time-bounded parallel machines are polynomially related to space-bounded computers. Dymond and Cook (1980) state an extended parallel computation thesis: 1. parallel time and hardware requirements are simultaneously polynomially related to sequential (TM) reversal and space requirements. 2. parallel time and hardware are polynomially related. It is proved that every set that can be accepted by a TM in time T(n) can be accepted by a parallel machine in time log T(n), which gives evidence that both the parallel computation thesis and the extended parallel computation thesis are incorrect. The proof of the theorem crucially depends on the properties that: 1. An arbitrary finite number of processors can be activated in one parallel step. 2. Each memory cell of an arbitrarily large finite global memory can be accessed from each processor. However, in many papers, at least the first property does not hold.

Keywords: parallel computation; Turing machines; complexity; simulation

THE parallel computation OF EIGENVALUES AND EIGENVECTORS OF LARGE HERMITIAN MATRICES USING THE AMT DAP 510

CONCURRENCY-PRACTICE AND EXPERIENCE, 1991, Vol. 3, No. 3, pp. 179-185

Authors: WESTON, JS; CLINT, M; BLEAKNEY, CW (UNIV ULSTER COLERAINE, DEPT COMP SCI, COLERAINE BT52 1SA, NORTH IRELAND; QUEENS UNIV BELFAST, CTR PARALLEL COMP, BELFAST BT7 1NN, ANTRIM, NORTH IRELAND; QUEENS UNIV BELFAST, DEPT COMP SCI, BELFAST BT7 1NN, ANTRIM, NORTH IRELAND)

The solution of the algebraic eigenvalue problem is an important component of many applications in science and engineering. With the advent of novel architecture machines, much research effort is now being expended in the search for parallel algorithms for the computation of eigensystems which can gainfully exploit the processing power which these machines provide. Among important recent work, References 1-4 address the real symmetric eigenproblem in both its dense and sparse forms, Reference 5 treats the unsymmetric eigenproblem, and Reference 6 investigates the solution of the generalized eigenproblem. In this paper two algorithms for the parallel computation of the eigensolution of Hermitian matrices on an array processor are presented. These algorithms are based on the parallel Orthogonal Transformation algorithm (POT) for the solution of real symmetric matrices [7,8]. POT was developed to exploit the SIMD parallelism supported by array processors such as the AMT DAP 510. The new algorithms use the highly efficient implementation strategies devised for use in POT. The implementations of the algorithms permit the computation of the eigensolution of matrices whose order exceeds the mesh size of the array processor used. A comparison of the efficiency of the two algorithms for the solution of a variety of matrices is given.

Keywords: LINEAR ALGEBRA; HERMITIAN MATRICES; parallel computation; ORTHOGONAL TRANSFORMATIONS; ARRAY PROCESSORS

Optimal equi-partition of rectangular domains for parallel computation

JOURNAL OF GLOBAL OPTIMIZATION, 1996, Vol. 8, No. 1, pp. 15-34

Authors: Christou, IT; Meyer, RR (UNIV WISCONSIN, DEPT COMP SCI, CTR PARALLEL OPTIMIZAT, MADISON, WI 53706)

We present an efficient method for the partitioning of rectangular domains into equi-area sub-domains of minimum total perimeter. For a variety of applications in parallel computation, this corresponds to a load-balanced distribution of tasks that minimizes interprocessor communication. Our method is based on utilizing, to the maximum extent possible, a set of optimal shapes for sub-domains. We prove that for a large class of these problems, we can construct solutions whose relative distance from a computable lower bound converges to zero as the problem size tends to infinity. PERIX-GA, a genetic algorithm employing this approach, has successfully solved to optimality million-variable instances of the perimeter-minimization problem and for a one-billion-variable problem has generated a solution within 0.32% of the lower bound. We report on the results of an implementation on a CM-5 supercomputer and make comparisons with other existing codes.

Keywords: graph partitioning; parallel computation; genetic algorithms
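
The perimeter objective in the abstract above can be illustrated with a small calculation. The sketch below is not the authors' PERIX-GA method; it only compares two hand-picked partitions of a grid of unit cells against a per-region lower bound. The bound used, 2 * ceil(2 * sqrt(A)) for a connected region of A unit cells, is the standard minimum-perimeter result for such regions and is assumed here rather than taken from the abstract. The function names and the 64 x 64 grid with 16 parts are hypothetical.

```python
from math import isqrt

def min_perimeter(area):
    """Smallest perimeter of a connected region of `area` unit cells:
    2 * ceil(2 * sqrt(area)), computed in exact integer arithmetic."""
    return 2 * (isqrt(4 * area - 1) + 1)

def strip_partition_perimeter(rows, cols, parts):
    """Sum of sub-domain perimeters when the grid is cut into vertical strips."""
    assert cols % parts == 0
    width = cols // parts
    return parts * 2 * (rows + width)

def block_partition_perimeter(rows, cols, pr, pc):
    """Sum of sub-domain perimeters for a pr x pc array of rectangular blocks."""
    assert rows % pr == 0 and cols % pc == 0
    h, w = rows // pr, cols // pc
    return pr * pc * 2 * (h + w)

# Hypothetical instance: 64 x 64 grid shared among 16 processors.
rows, cols, parts = 64, 64, 16
area = rows * cols // parts                    # cells per sub-domain
lower_bound = parts * min_perimeter(area)      # best conceivable total
strips = strip_partition_perimeter(rows, cols, parts)
blocks = block_partition_perimeter(rows, cols, 4, 4)

print(lower_bound, strips, blocks)
```

In this instance the square blocks meet the lower bound exactly while the strips have more than twice the total perimeter. The hard cases, which the abstract addresses, are those where equal-area near-square shapes do not tile the rectangle so neatly.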

A load balancing strategy for parallel computation of sparse permanents

NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS, 2012, Vol. 19, No. 6, pp. 1017-1030

Authors: Wang, Lei; Liang, Heng; Bai, Fengshan; Huo, Yan (Tsinghua Univ, Dept Math Sci, Beijing 100084, Peoples R China; China Cit Bank, Beijing 100027, Peoples R China)

Research on parallel machine scheduling in combinatorial optimization suggests that desirable parallel efficiency can be achieved when the jobs are sorted in non-increasing order of processing time. In this paper, we find that the time spent computing the permanent of a sparse matrix by a hybrid algorithm is strongly correlated with its permanent value. A strategy is introduced to improve a parallel algorithm for sparse permanents. Methods for approximating permanents, which have been studied extensively, are used to approximate the permanent values of submatrices in order to decide the processing order of jobs. This gives an improved load balancing method. Numerical results show that the parallel efficiency is improved remarkably for the permanents of fullerene graphs, which are of great interest in nanoscience. Copyright (c) 2012 John Wiley & Sons, Ltd.

Keywords: sparse matrix; approximate algorithm; permanent; parallel computation; load balancing; accelerated ratio
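
The scheduling rule this abstract builds on, processing jobs in non-increasing order of processing time, can be sketched as a greedy list scheduler. This is a minimal illustration and not the authors' implementation: the function names are hypothetical, and the cost list stands in for the estimated job times that the paper derives from approximate permanent values.

```python
import heapq

def lpt_schedule(costs, workers):
    """Sort jobs by non-increasing estimated cost, then hand each job to
    the worker with the smallest load so far."""
    heap = [(0.0, w) for w in range(workers)]   # (current load, worker id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(workers)]
    order = sorted(range(len(costs)), key=lambda j: costs[j], reverse=True)
    for j in order:
        load, w = heapq.heappop(heap)
        assignment[w].append(j)
        heapq.heappush(heap, (load + costs[j], w))
    loads = [sum(costs[j] for j in jobs) for jobs in assignment]
    return assignment, loads

def makespan_in_given_order(costs, workers):
    """Baseline: the same greedy rule, but jobs are taken as they arrive."""
    loads = [0.0] * workers
    for c in costs:
        loads[loads.index(min(loads))] += c
    return max(loads)

# Hypothetical estimated costs; the expensive job happens to arrive last.
estimated_costs = [2, 2, 2, 2, 8]
assignment, loads = lpt_schedule(estimated_costs, workers=2)
print(max(loads), makespan_in_given_order(estimated_costs, 2))
```

With these numbers the sorted schedule finishes in 8 time units and the unsorted one in 12, which shows why a cheap estimate of each job's cost is worth computing before the jobs are dispatched.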

SEARCHING, MERGING, AND SORTING IN parallel computation

IEEE TRANSACTIONS ON COMPUTERS, 1983, Vol. 32, No. 10, pp. 942-946

Author: KRUSKAL, CP (Department of Computer Science, University of Illinois)

We study the number of comparison steps required for searching, merging, and sorting with P processors. We present a merging algorithm that is optimal up to a constant factor when merging two lists of equal size (independent of the number of processors); as a special case, with N processors it merges two lists, each of size N, in 1.893 lg lg N + 4 comparison steps. We use the merging algorithm to obtain a sorting algorithm that, in particular, sorts N values with N processors in 1.893 lg N lg lg N / lg lg lg N (plus lower order terms) comparison steps. The algorithms can be implemented on a shared memory machine that allows concurrent reads from the same location with constant overhead at each comparison step.

Keywords: Comparison problems; computational complexity; merging; parallel computation; searching; sorting
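
To give a feel for the bounds quoted in this abstract, the short sketch below evaluates them for one list size. It only plugs numbers into the stated formulas and does not implement the merging or sorting algorithms. The function names are hypothetical, and the sorting figure is the leading term only, since the abstract does not give the lower-order terms.

```python
import math

def merge_steps(n):
    """Comparison steps quoted for merging two lists of size n with n processors."""
    return 1.893 * math.log2(math.log2(n)) + 4

def sort_steps_leading_term(n):
    """Leading term quoted for sorting n values with n processors
    (lower-order terms omitted, so the value is only indicative)."""
    lg = math.log2(n)
    lglg = math.log2(lg)
    return 1.893 * lg * lglg / math.log2(lglg)

n = 2 ** 16
parallel_merge = merge_steps(n)
parallel_sort = sort_steps_leading_term(n)
sequential_merge = 2 * n - 1   # worst-case comparisons for a one-processor merge

print(round(parallel_merge, 3), round(parallel_sort, 3), sequential_merge)
```

For N = 65536 the parallel merge bound is about a dozen comparison steps, against 131071 comparisons in the worst case for a sequential merge of two lists of that size.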

ON ONE-SIDED JACOBI METHODS FOR parallel computation

SIAM JOURNAL ON ALGEBRAIC AND DISCRETE METHODS, 1987, Vol. 8, No. 4, pp. 790-796

Author: EBERLEIN, PJ

Convergence proofs are given for one-sided Jacobi/Hestenes methods for the singular value problem. The limiting form of the matrix iterates for the Hestenes method with optimization when the original matrix is normal is derived; this limiting matrix is block diagonal, where the blocks are multiples of unitary matrices. A variation in the algorithm to guarantee convergence to a diagonal matrix for the symmetric eigenvalue problem is shown. Implementation techniques for parallel computation, in particular, on the hypercube are indicated.

Keywords: 65F10; 65H20; 65F05; 15; parallel computation; one-sided Jacobi methods; Hestenes method; multiprocessors; hypercube; singular values; eigenvalue problem
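
For readers unfamiliar with the method this abstract analyses, the following is a minimal serial sketch of a one-sided (Hestenes-style) Jacobi iteration for singular values. It is a textbook cyclic sweep, not the paper's variant with optimization and not a hypercube implementation; the function name, tolerance, and sweep limit are assumptions made for illustration.

```python
import math

def one_sided_jacobi_singular_values(a, tol=1e-12, max_sweeps=50):
    """Rotate pairs of columns until all columns are mutually orthogonal;
    the column norms are then the singular values.
    `a` is a list of rows and is not modified."""
    m, n = len(a), len(a[0])
    # Work on columns, since each plane rotation mixes exactly two columns.
    cols = [[a[r][c] for r in range(m)] for c in range(n)]
    for _ in range(max_sweeps):
        rotated = False
        for i in range(n - 1):
            for j in range(i + 1, n):
                alpha = sum(x * x for x in cols[i])
                beta = sum(y * y for y in cols[j])
                gamma = sum(x * y for x, y in zip(cols[i], cols[j]))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue  # pair already orthogonal to working precision
                rotated = True
                # Choose the rotation that makes the two new columns orthogonal.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                cols[i], cols[j] = (
                    [c * x - s * y for x, y in zip(cols[i], cols[j])],
                    [s * x + c * y for x, y in zip(cols[i], cols[j])],
                )
        if not rotated:
            break  # a full sweep without rotations means convergence
    return sorted((math.sqrt(sum(x * x for x in col)) for col in cols), reverse=True)

singular_values = one_sided_jacobi_singular_values([[3.0, 0.0], [4.0, 5.0]])
print(singular_values)
```

Rotations that act on disjoint column pairs do not interfere with one another, which is what makes this family of methods attractive for parallel machines: each processor can hold a few columns and exchange them with its neighbours between rotation steps.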

LOWER BOUNDS FOR THRESHOLD AND SYMMETRICAL FUNCTIONS IN parallel computation

SIAM JOURNAL ON COMPUTING, 1992, Vol. 21, No. 2, pp. 329-338

Author: AZAR, Y (STANFORD UNIV, DEPT COMP SCI, STANFORD, CA 94305)

The family of decision problems of the threshold languages L(g) is considered. A threshold language L(g) is the set of n bit vectors having at least g(n) "1"s. Using a new technique for controlling the size and structure of a hypergraph by a potential function, lower bounds are proven for these decision problems on a PRIORITY PRAM with m shared memory cells and any polynomial number of processors. The lower bounds are almost tight for the admissible range (m ≤ n^ε). By combining these results with the results of Vishkin and Wigderson and the results of Li and Yesha, this paper is able to show a complexity gap between an m cell PRIORITY PRAM having an exponential (or unlimited) number of processors and one having only a polynomial number. A consequence of these results is that PRIORITY PRAM and ARBITRARY PRAM with m shared memory cells and any given polynomial number of processors have the same power (up to a small factor) for computing symmetric functions.

Keywords: THRESHOLD; PRAM; LOWER BOUNDS; SYMMETRICAL FUNCTIONS; parallel computation

A time cost model for distributed objects parallel computation

FUTURE GENERATION COMPUTER SYSTEMS, 2002, Vol. 18, No. 6, pp. 807-812

Authors: Shevchenko, R; Doroshenko, A (NASU, Inst Software Syst, UA-13187 Kiev, Ukraine; Gradsoft, Kiev, Ukraine)

A time cost model for parallel computation in CORBA-distributed objects is introduced and a methodology for enhancing the performance of distributed applications is proposed. A new four-tiered architecture, as opposed to the traditional three-tiered one, is derived from the constructed cost model for Internet distributed applications. (C) 2002 Elsevier Science B.V. All rights reserved.

Keywords: distributed objects; parallel computation; CORBA

CUDA-Based Parallel Computation Model for State Estimation

39th Chinese Control Conference (CCC)

Authors: Liu, Yueqi; Kong, Xiangyu; Jin, Yao (Tianjin Univ, Key Lab Smart Grid, Minist Educ, Tianjin 300072, Peoples R China; State Grid Tianjin Chengnan Elect Power Supply Co, Tianjin 300201, Peoples R China)

ISBN: (print) 9789881563903

With the development of the active distribution network (ADN), distributed state estimation (DSE) has become an inevitable trend for state estimation (SE). An efficient partitioning strategy is an essential prerequisite for DSE. However, the existing methods only consider the basic requirements of equilibrium connectivity, and the similarity of buses within a sub-region is ignored. A new partitioning strategy is proposed in this paper. Borrowing the idea of hierarchical clustering, the method aggregates the buses of the distribution network to form sub-regions, and the CUDA platform is used to realize the parallel computation of SE. The improved IEEE 33-node system and a real distribution network are analyzed as case studies. The results show that the estimation accuracy of the proposed DSE method is higher than that of traditional CSE. Besides, compared with serial computation, this method effectively reduces the running time of DSE in each sub-region and improves the overall calculation efficiency.

Keywords: DSE; network splitting; parallel computation; CUDA

Development of high performance casting analysis software by coupled parallel computation

China Foundry, 2007, Vol. 4, No. 3, pp. 215-219

Authors: Sang Hyun CHO; Jeong Kil CHOI (Center for e-Design, Korea Institute of Industrial Technology, 994-32 Dongchun-Dong, Yeonsu-Gu, Incheon 406-800, Korea)

Up to now, much casting analysis software has continued to develop new ways of approaching real casting processes. These include melt flow analysis, heat transfer analysis for solidification calculation, mechanical property prediction, and microstructure prediction. These trials succeeded in obtaining good results in comparison with real situations, so that CAE technologies became indispensable for designing or developing new casting processes. In manufacturing fields, however, CAE technologies are not used so frequently, because the software is difficult to use or computing performance is insufficient. To introduce CAE technologies to the manufacturing field, high-performance analysis is essential to shorten the gap between product design time and prototyping time. Software code optimization can help, but it is not enough, because the codes developed by software experts are already well optimized. As an alternative for high-performance computation, parallel computation technologies are eagerly being applied to CAE to shorten the analysis time. In this research, SMP (Shared Memory Processing) and MPI (Message Passing Interface) methods for parallelization were applied to the commercial software "Z-Cast" to calculate casting processes. During code parallelization, network stabilization and core optimization were also carried out on the Microsoft Windows platform, and the performance and results were compared with those of the normal linear analysis codes.

Keywords: parallel computation; message passing interface; casting analysis; SMP; performance improvement
