Search Results - Inner Mongolia University Library

The queue-read queue-write PRAM model: Accounting for contention in parallel algorithms

SIAM JOURNAL ON COMPUTING, 1998, Vol. 28, No. 2, pp. 733-769

Authors: Gibbons, PB; Matias, Y; Ramachandran, V. Affiliations: AT&T Bell Labs, Lucent Technol, Murray Hill, NJ 07974, USA; Univ Texas, Dept Comp Sci, Austin, TX 78712, USA

This paper introduces the queue-read queue-write (QRQW) parallel random access machine (PRAM) model, which permits concurrent reading and writing to shared-memory locations, but at a cost proportional to the number of readers/writers to any one memory location in a given step. Prior to this work there were no formal complexity models that accounted for the contention to memory locations, despite its large impact on the performance of parallel programs. The QRQW PRAM model reflects the contention properties of most commercially available parallel machines more accurately than either the well-studied CRCW PRAM or EREW PRAM models: the CRCW model does not adequately penalize algorithms with high contention to shared-memory locations, while the EREW model is too strict in its insistence on zero contention at each step. The QRQW PRAM is strictly more powerful than the EREW PRAM. This paper shows a separation of $\sqrt{\log n}$ between the two models, and presents faster and more efficient QRQW algorithms for several basic problems, such as linear compaction, leader election, and processor allocation. Furthermore, we present a work-preserving emulation of the QRQW PRAM with only logarithmic slowdown on Valiant's BSP model, and hence on hypercube-type noncombining networks, even when latency, synchronization, and memory granularity overheads are taken into account. This matches the best-known emulation result for the EREW PRAM, and considerably improves upon the best-known efficient emulation for the CRCW PRAM on such networks. Finally, the paper presents several lower bound results for this model, including lower bounds on the time required for broadcasting and for leader election.

Keywords: models of parallel computation; parallel algorithms; PRAM; memory contention; work-time framework

MASSIVELY PARALLEL ALGORITHMS FOR SCATTERING IN OPTICAL LITHOGRAPHY

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 1991, Vol. 10, No. 9, pp. 1091-1100

Authors: GUERRIERI, R; TADROS, KH; GAMELIN, J; NEUREUTHER, AR. Affiliations: UNIV CALIF BERKELEY, DEPT ELECT ENGN & COMP SCI, BERKELEY, CA 94720

A promising new massively parallel technique for rigorous simulation of topography scattering in optical lithography has been developed and tested. The method is equivalent to the time-domain finite-difference method (TDFDM) used in electromagnetic scattering simulations, but exploits the parallel nature of wave propagation and the power of recent massively parallel architectures such as the Connection Machine. A working code called TEMPEST has been implemented on a Connection Machine CM-2 having 1 to 32 K processors with up to 1 M virtual processors. Numerical accuracy comparable with that of other fully rigorous methods was achieved. The convergence in the time domain for lithographic problems was found to be dominated by the physical process of multiple scattering of incident radiation. A very significant finding was that the solution required constant time per iteration for problems ranging from a few thousand unknowns up to one million, providing the ratio between the problem size and the number of processors is kept constant. The suitability of TEMPEST for solving a large class of topography structures important in alignment, metrology, and lithography is illustrated by examples from linewidth measurement with thin-film interference effects and projection printing in the presence of specular reflections from curved substrates.

Keywords: Optical scattering; parallel algorithms; Lithography; Electromagnetic scattering; Surfaces; Testing; Time domain analysis; Finite difference methods; Electromagnetic propagation; Optical propagation

PROCESSOR-TIME OPTIMAL PARALLEL ALGORITHMS FOR DIGITIZED IMAGES ON MESH-CONNECTED PROCESSOR ARRAYS

ALGORITHMICA, 1991, Vol. 6, No. 5, pp. 698-733

Authors: ALNUWEIRI, HM; KUMAR, VKP. Affiliations: EEB-244, Department of Electrical Engineering Systems, University of Southern California, 90089-2562 Los Angeles, CA, USA

We present processor-time optimal parallel algorithms for several problems on $n \times n$ digitized image arrays, on a mesh-connected array having $p$ processors and a memory of size $O(n^2)$ words. The number of processors $p$ can vary over the range $[1, n^{3/2}]$ while providing optimal speedup for these problems. The class of image problems considered here includes labeling the connected components of an image; computing the convex hull, the diameter, and a smallest enclosing box of each component; and computing all closest neighbors. Such problems arise in medium-level vision and require global operations on image pixels. To achieve optimal performance, several efficient data-movement and reduction techniques are developed for the proposed organization.

Keywords: DIGITIZED IMAGE PROBLEMS; PARALLEL ALGORITHMS; PROCESSOR-TIME TRADEOFFS; MESH ARRAYS

Very fast parallel algorithms for approximate edge coloring

DISCRETE APPLIED MATHEMATICS, 2001, Vol. 108, No. 3, pp. 227-238

Authors: Han, YJ; Liang, WF; Shen, XJ. Affiliations: Univ Missouri, Comp Sci Telecommun Program, Kansas City, MO 64110, USA; Elect Data Syst Inc, CPS, Troy, MI 48098, USA; Univ Queensland, CRC Distributed Syst Technol, Brisbane, Qld 4072, Australia

This paper presents very fast parallel algorithms for approximate edge coloring. Let $\log^{(1)} n = \log n$, $\log^{(k)} n = \log(\log^{(k-1)} n)$, and $\log^*(n) = \min\{k \mid \log^{(k)} n < 1\}$. It is shown that a graph with $n$ vertices and $m$ edges can be edge colored with $(2^{\lceil \log^{1/4} \log^*(n) \rceil})^c \cdot (\lceil \Delta / \log^{c/4} \log^*(n) \rceil)^2$ colors in $O(\log \log^*(n))$ time using $O(m + n)$ processors on the EREW PRAM, where $\Delta$ is the maximum vertex degree of the graph and $c$ is an arbitrarily large constant. It is also shown that the graph can be edge colored using at most $\lceil 4 \Delta^{1 + 4/\log \log \log^*(\Delta)} \log^{1/2} \log^*(\Delta) \rceil$ colors in $O(\log \Delta \log \log^*(\Delta) / \log \log \log^*(\Delta) + \log \log^*(n))$ time using $O(m + n)$ processors on the same model. © 2001 Elsevier Science B.V. All rights reserved.

Keywords: analysis of algorithms; graph algorithms; parallel algorithms; edge coloring; PRAM

HARDWARE IMPLEMENTATION OF PARTITIONED-PARALLEL ALGORITHMS IN LINEAR PREDICTION

SIGNAL PROCESSING, 1991, Vol. 24, No. 3, pp. 253-269

Authors: CARAYANNIS, G; KOUKOUTSIS, E; HALKIAS, CC. Affiliations: National Technical University of Athens, Division of Computer Science, Zografou, GR-157 73 Athens, Greece

In this paper, a partitioned-parallel strategy for the solution of the Toeplitz equations appearing in the linear prediction case is introduced. From the one extreme, i.e., the use of the Schur recursions for an order recursive implementation with $O(1)$ processors which perform $O(p^2)$ MADs (multiplications and divisions), to the other extreme, i.e., the use of the same recursions for a fully parallel implementation with $O(p)$ processors which perform $O(p)$ MADs, there exists a compromise: The hardware designer can 'cut' the computational scheme into suitable partitions, which are executed one after the other, with all the computations of each partition organized in a parallel manner. This way he can achieve increased flexibility, especially in relation to the model order, which can become totally independent of the available number of processors. Moreover, in this paper an abatement methodology is introduced which significantly reduces the number of multiplications of the above computational schemes, as well as the overall algorithm complexity in the case of the parallel design.

Keywords: LINEAR PREDICTION; PARCOR COEFFICIENTS; PARALLEL ALGORITHMS; PARTITIONED-PARALLEL ALGORITHMS

SYNERGY IN PARALLEL ALGORITHMS

PARALLEL COMPUTING, 1989, Vol. 11, No. 1, pp. 17-35

Authors: HENDERSON, ME; MIRANKER, WL. Affiliations: Department of Mathematical Sciences, IBM Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598, U.S.A.

A property of algorithms called synergy is introduced, and a quantity s, of synergy is defined. When synergized, both parallel and serial algorithms run faster, the parallel algorithms benefiting from a cooperation b...

Keywords: parallel algorithms; parallel processing; performance analysis; synergy of an iterative algorithm

The alternating group explicit parallel algorithms for convection dominated diffusion problem of variable coefficient

INTERNATIONAL JOURNAL OF COMPUTER MATHEMATICS, 2004, Vol. 81, No. 7, pp. 823-834

Authors: Zhang, ZY; Wang, TK. Affiliations: Chinese Acad Sci, Inst Atmospher Phys, LASG, Beijing 100029, Peoples R China; Nanjing Normal Univ, Sch Math & Comp Sci, Nanjing 210097, Peoples R China; Tianjin Normal Univ, Sch Math Sci, Tianjin 300074, Peoples R China

A new alternating group explicit method for solving the convection-dominated diffusion problem with variable coefficient is presented, and the energy method is used to prove the stability of the method.

Keywords: parallel algorithms; alternating group explicit; variable coefficient; convection-dominated diffusion problem

EFFICIENT PARALLEL ALGORITHMS FOR GRAPH PROBLEMS

ALGORITHMICA, 1990, Vol. 5, No. 1, pp. 43-64

Authors: KRUSKAL, CP; RUDOLPH, L; SNIR, M. Affiliations: UNIV ILLINOIS, URBANA, IL 61801; CARNEGIE MELLON UNIV, PITTSBURGH, PA 15213; NYU, COURANT INST MATH SCI, NEW YORK, NY 10012; HEBREW UNIV JERUSALEM, INST MATH & COMP SCI, JERUSALEM, ISRAEL

We present an efficient technique for parallel manipulation of data structures that avoids memory access conflicts. That is, this technique works on the Exclusive Read/Exclusive Write (EREW) model of computation, which is the weakest shared memory, MIMD machine model. It is used in a new parallel radix sort algorithm that is optimal for keys whose values are over a small range. Using the radix sort and known results for parallel prefix on linked lists, we develop parallel algorithms that efficiently solve various computations on trees and “unicycular graphs.” Finally, we develop parallel algorithms for connected components, spanning trees, minimum spanning trees, and other graph problems. All of the graph algorithms achieve linear speedup for all but the sparsest graphs.

Keywords: Biconnected components; Connected components; Minimum spanning trees; parallel algorithms; parallel processing; PRAM; Radix sort; Spanning trees; Tree computations

Designing irregular parallel algorithms with mutual exclusion and lock-free protocols

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2006, Vol. 66, No. 6, pp. 854-866

Authors: Cong, Guojing; Bader, David A. Affiliations: Georgia Inst Technol, Coll Comp, Atlanta, GA 30332, USA; IBM Corp, TJ Watson Res Ctr, Yorktown Hts, NY, USA

Irregular parallel algorithms pose a significant challenge for achieving high performance because of the difficulty of predicting memory access patterns or execution paths. Within an irregular application, fine-grained synchronization is one technique for managing the coordination of work; but in practice the actual performance for irregular problems depends on the input, the access pattern to shared data structures, the relative speed of processors, and the hardware support of synchronization primitives. In this paper, we focus on lock-free and mutual exclusion protocols for handling fine-grained synchronization. Mutual exclusion and lock-free protocols have received a fair amount of attention in coordinating accesses to shared data structures from concurrent processes. Mutual exclusion offers a simple programming abstraction, while lock-free data structures provide better fault tolerance and eliminate problems associated with critical sections such as priority inversion and deadlock. These synchronization protocols, however, are seldom used in parallel algorithm designs, especially for algorithms under the SPMD paradigm, as their implementations are highly hardware dependent and their costs are hard to characterize. Using graph-theoretic algorithms for illustrative purposes, we show experimental results on two shared-memory multiprocessors, the IBM pSeries 570 and the Sun Enterprise 4500, that irregular parallel algorithms with efficient fine-grained synchronization may yield good performance. © 2006 Elsevier Inc. All rights reserved.

Keywords: parallel algorithms; irregular algorithm; shared memory; high-performance algorithm engineering

A CLASS OF PARALLEL ALGORITHMS FOR COMPUTATION OF THE MANIPULATOR INERTIA MATRIX

IEEE TRANSACTIONS ON ROBOTICS AND AUTOMATION, 1989, Vol. 5, No. 5, pp. 600-615

Authors: FIJANY, A; BEJCZY, AK. Affiliations: Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA

A class of parallel and parallel/pipeline algorithms for computation of the manipulator inertial matrix is presented. An algorithm based on the composite rigid-body spatial inertia method, which results in less data dependency and hence better parallelization efficiency, is used for computation of the inertia matrix. Two parallel algorithms are developed which achieve the time lower bound of $O(\log_2 n) + O(1)$ in the computation with $O(n^2)$ processors. The architectural features required for perfect mapping of these algorithms and their communication complexity are analyzed. The performance of the algorithms when mapped on two- and one-dimensional (linear) processor arrays with nearest-neighbor connection is investigated. Mapping on the linear array results in new algorithms with a computational complexity of $k_1 n (\log_2 n) + k_2 (\log_2 n) + k_3$. A parallel/pipeline algorithm is also presented which achieves the computation time of $k_1 n + k_2 (\log_2 n) + k_3$ on the linear array. An architecture-oriented approach is used in the design of the algorithms.

Keywords: parallel algorithms; Concurrent computing; Aerodynamics; Computational modeling; Forward contracts; Algorithm design and analysis; Manipulator dynamics; Space technology; Pipelines; Complexity theory
