Results are presented showing parallel implementations of domain-based preconditioners used in conjunction with a Newton-Krylov solver for calculating natural convection in a square cavity. Newton-Krylov techniques are based on the use of Newton's method to linearize the discrete equations and a Krylov projection method to solve the resulting linear systems. The calculations are based on a finite volume discretization of the incompressible Navier-Stokes equations and an energy equation in primitive-variable form on a staggered grid. Viability of the Newton-Krylov technique often depends on the effectiveness of the preconditioner. Consequently, effective preconditioning can be the most CPU- and memory-intensive operation within the solution algorithm. For these reasons, domain-decomposition-based preconditioners are used because of their inherent parallelism. Results are presented for strip-wise, domain-based preconditioners on two different computational architectures: a single CPU and a distributed-computing cluster. These parallel results are compared and contrasted with the use of global, incomplete lower-upper (ILU) factorization-type preconditioners in a serial implementation.
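As a concrete illustration of the solver structure described in this abstract, the sketch below shows a Newton iteration whose linear systems are solved by preconditioned GMRES. It is a minimal Python sketch: the residual function, Jacobian-vector product, and preconditioner solve are placeholder callables supplied by the caller, not the paper's finite-volume discretization or its domain-based preconditioners.

    # Minimal Newton-Krylov sketch: Newton linearization + preconditioned GMRES.
    # F, J_matvec, and M_solve are illustrative placeholders, not the paper's code.
    import numpy as np
    from scipy.sparse.linalg import gmres, LinearOperator

    def newton_krylov(F, J_matvec, M_solve, u0, tol=1e-8, max_newton=20):
        u = u0.copy()
        for _ in range(max_newton):
            r = F(u)
            if np.linalg.norm(r) < tol:
                break
            n = u.size
            J = LinearOperator((n, n), matvec=lambda v, u=u: J_matvec(u, v))
            M = LinearOperator((n, n), matvec=M_solve)   # e.g. ILU or domain-based solve
            du, _ = gmres(J, -r, M=M)                    # Krylov projection step
            u = u + du
        return u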
Parallel computing on clusters of workstations is receiving much attention from the research community. Unfortunately, many aspects of parallel computing on this kind of computing engine are not very well understood. Some of these issues include the workstation architectures, the network protocols, the communication-to-computation ratio, the load-balancing strategies, and the data-partitioning schemes. The aim of this paper is to assess the strengths and limitations of a cluster of workstations by capturing the effects of the above issues. This has been achieved by evaluating the performance of this computing environment in the execution of a parallel ray tracing application, through analytical modeling and extensive experimentation.
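The communication-to-computation ratio mentioned above can be captured with a back-of-the-envelope model of the kind sketched below. The parameters (per-pixel render time, per-tile message cost) and the formula are illustrative assumptions, not the authors' analytical model.

    # Toy speedup model for image-tile-parallel ray tracing on a cluster.
    def predicted_speedup(n_workers, n_pixels, t_pixel, n_tiles, t_msg):
        serial = n_pixels * t_pixel                  # single-workstation render time
        compute = serial / n_workers                 # perfectly balanced tiles
        comm = (n_tiles / n_workers) * t_msg         # one result message per tile
        return serial / (compute + comm)

    # Example: 16 workstations, 1e6 pixels at 5 us each, 512 tiles, 2 ms per message.
    print(predicted_speedup(16, 1_000_000, 5e-6, 512, 2e-3))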
For a parallelizing compiler, mainly based on loop transformations, dependence information that is as complete and precise as possible is required. In this paper, we propose a generalized method for computing, in any multi-dimensional loop, information which has proved to be useful in the case of irregular dependences. First, we solve the basic problem of the existence of a dependence with an algorithm composed of a preprocessing phase of reduction and an integer simplex resolution. If a solution exists, we compute by integer simplex the bounds of the distances associated with the loop indices. Depending on the values of these bounds, we finally define problems consisting of evaluating the bounds of the slopes of the dependence vectors, which we solve by integer linear fractional programming. The amount of computation for each new problem is very low. This algorithm has been implemented as an extension of the Janus Test, which was presented in a previous work.
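To make the dependence-existence question concrete, the toy example below checks it for a made-up pair of array subscripts by brute-force enumeration and also reports the bounds of the distances along each loop index. The subscripts and bounds are illustrative only; the paper answers these questions exactly with an integer simplex and integer linear fractional programming rather than enumeration.

    # Do a write to A[2*i1 + j1] and a read of A[i2 + 2*j2] ever touch the
    # same element within 0 <= i, j < N?  (Illustrative subscripts only.)
    N = 10
    hits = [(i1, j1, i2, j2)
            for i1 in range(N) for j1 in range(N)
            for i2 in range(N) for j2 in range(N)
            if 2 * i1 + j1 == i2 + 2 * j2]
    if hits:
        di = [i2 - i1 for i1, _, i2, _ in hits]   # distance along the i loop
        dj = [j2 - j1 for _, j1, _, j2 in hits]   # distance along the j loop
        print("dependence exists, di in", (min(di), max(di)),
              "dj in", (min(dj), max(dj)))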
Recent advances in Internet connectivity and implementations of safer distributed computing through languages such as Java provide the foundation for transforming computing resources into tradable commodities. We have developed Javelin, a Java-based prototype of a globally distributed, heterogeneous, high-performance computational infrastructure that conveniently enables rapid execution of massively parallel applications. Our infrastructure consists of three entities: hosts, clients, and brokers. Our goal is to allow users to buy and sell computational power, using supply-and-demand market mechanisms to marshal computational power far beyond what can be achieved via conventional techniques. Several research issues must be worked out to make this vision a reality: allocating resources between computational objects via market mechanisms; expressing and enforcing scheduling and quality-of-service constraints; modeling programming in a global computing ecosystem; supporting heterogeneous execution without sacrificing computational speed; ensuring host security; global naming and communication; and client privacy.
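A schematic sketch of the three-role structure (hosts offering cycles, clients submitting work, a broker matching the two) is given below. It illustrates the roles only, with hypothetical names, and is not Javelin's actual Java protocol or market mechanism.

    # Roles only: hosts register idle capacity, clients submit tasks,
    # the broker pairs them first-come first-served.
    from collections import deque

    class Broker:
        def __init__(self):
            self.idle_hosts = deque()
            self.pending_tasks = deque()

        def register_host(self, host_id):
            self.idle_hosts.append(host_id)
            self._match()

        def submit_task(self, task):
            self.pending_tasks.append(task)
            self._match()

        def _match(self):
            while self.idle_hosts and self.pending_tasks:
                host = self.idle_hosts.popleft()
                task = self.pending_tasks.popleft()
                print(f"dispatching {task!r} to {host}")

    broker = Broker()
    broker.register_host("host-A")
    broker.submit_task("piecework job 17")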
The molecular mechanical potential function is widely used in molecular modeling and simulation research. The most CPU-time-consuming parts of the molecular mechanical potential are the nonbonding interaction terms. An efficient parallel algorithm for nonbonding energy calculation is outlined and its implementation is tested on a variety of parallel and distributed processing elements. Because only minimal parallel constructs are added, the current implementation neither modifies nor slows down the serial algorithm. Load balancing flexible enough to accommodate the local load of each PE, and optimization of the list-updating procedure, are desired and under development.
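The kind of decomposition described above can be pictured as in the sketch below: the nonbonded pair list is split across processing elements and the partial energies are summed. The reduced-unit Lennard-Jones term and the process-pool "PEs" are illustrative assumptions, not the paper's potential function or hardware.

    import math
    from multiprocessing import Pool

    def block_energy(args):
        coords, pairs = args
        e = 0.0
        for i, j in pairs:
            r = math.dist(coords[i], coords[j])
            e += 4.0 * ((1.0 / r) ** 12 - (1.0 / r) ** 6)   # reduced-unit LJ term
        return e

    def nonbonded_energy(coords, pairs, n_pe=4):
        # Round-robin split of the pair list; a real code would balance per-PE load.
        blocks = [pairs[k::n_pe] for k in range(n_pe)]
        with Pool(n_pe) as pool:
            return sum(pool.map(block_energy, [(coords, b) for b in blocks]))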
The multithreaded processor, called Rhamma, uses a fast context switch to bridge latencies caused by memory accesses or by synchronization operations. Load/store, synchronization, and execution operations of different threads of control are executed simultaneously by appropriate functional units. A fast context switch is performed whenever a functional unit comes across an operation that is destined for another unit. The overall performance depends on the speed of the context switch. We present two techniques to reduce the context switch cost to at most one processor cycle: a context switch is explicitly coded in the opcode, and a context switch buffer is used. The load/store unit shows up as the principal bottleneck. We evaluate four implementation alternatives of the load/store unit to increase processor performance.
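A toy cycle-count model of the switch-on-operation idea is sketched below; the latency values and the optimistic overlap assumption are illustrative, not Rhamma's microarchitecture, but they show why a one-cycle context switch pays off when memory operations can be handed to the load/store unit.

    MEM_LATENCY = 20   # cycles a load/store occupies the load/store unit (assumed)
    SWITCH_COST = 1    # context switch cost targeted by the two techniques above

    def cycles_blocking(trace):
        # One thread that stalls on every memory operation.
        return sum(MEM_LATENCY if op == "mem" else 1 for op in trace)

    def cycles_switching(traces):
        # Optimistic bound: memory latency fully overlapped with other threads'
        # execution operations, at the price of one switch per memory operation.
        exec_ops = sum(op == "exec" for t in traces for op in t)
        mem_ops = sum(op == "mem" for t in traces for op in t)
        return exec_ops + mem_ops * SWITCH_COST

    trace = ["exec"] * 10 + ["mem"] + ["exec"] * 10 + ["mem"]
    print(cycles_blocking(trace) * 4, cycles_switching([trace] * 4))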
Fast and efficient communication is one of the major design goals not only for parallel systems but also for clusters of workstations. The proposed model of the high-performance communication device ATOLL features very low latency for the start of communication operations and reduces the software overhead for communication-specific functions. To close the gap between off-the-shelf microprocessors and the communication system, a highly sophisticated processor interface implements atomic start of communication, MMU support, and a flexible event-scheduling scheme. The interconnectivity of ATOLL, provided by four independent network ports combined with cut-through routing, allows the configuration of a large variety of network topologies. A software-transparent error correction mechanism significantly reduces the required protocol overhead. The presented simulation results promise high-performance and low-latency communication.
Armstrong III is a multi-node multicomputer designed and built at the Laboratory for Engineering Man/Machine Systems (LEMS) at Brown University. Each node contains a RISC processor and reconfigurable resources implemented with FPGAs. The primary benefit of using FPGAs is that the resulting hardware is neither rigid nor permanent but is in-circuit reprogrammable. This allows each node to be tailored to the computational requirements of an application. This paper describes the Armstrong III architecture and concludes with a substantive example application that performs HMM training for speech recognition on the reconfigurable platform.
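The abstract does not spell out the training kernel; as an illustration of the kind of computation that such an application maps onto reconfigurable hardware, the sketch below gives the forward recursion that dominates Baum-Welch HMM training. It is a plain software version, not Armstrong III's hardware mapping.

    import numpy as np

    def forward(pi, A, B, obs):
        """alpha[t, s] = P(o_1..o_t, state_t = s) for an HMM with start
        probabilities pi (S,), transitions A (S, S), emissions B (S, V)."""
        T, S = len(obs), len(pi)
        alpha = np.zeros((T, S))
        alpha[0] = pi * B[:, obs[0]]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        return alpha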
ISBN (print): 0818682272
Two parallel sorting algorithms, GENERAL-BS and MINIMIZING-BS, which are implemented on shared-memory parallel computers, are presented in this paper. A parity strategy is introduced which gives an idea for the efficient usage of the local memory associated with each processor. The number of network accesses (or communications) of the algorithm MINIMIZING-BS is reduced by approximately one half compared with the algorithm GENERAL-BS. By decreasing communication in this way, the algorithm MINIMIZING-BS achieves a significant improvement in performance.
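Assuming "BS" abbreviates bitonic sort, the sketch below shows the compare-exchange network that such shared-memory algorithms parallelize: every pass of the inner loop is a round of independent compare-exchanges that can be distributed over processors. The parity strategy and the local-memory optimization of MINIMIZING-BS are not reproduced here.

    def bitonic_sort(a):
        """In-place bitonic sorting network; len(a) must be a power of two."""
        n = len(a)
        k = 2
        while k <= n:
            j = k // 2
            while j > 0:
                # Each iteration over i below is an independent compare-exchange.
                for i in range(n):
                    partner = i ^ j
                    if partner > i:
                        ascending = (i & k) == 0
                        if (a[i] > a[partner]) == ascending:
                            a[i], a[partner] = a[partner], a[i]
                j //= 2
            k *= 2
        return a

    print(bitonic_sort([7, 3, 6, 0, 5, 1, 4, 2]))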
This paper describes the integration of nested data parallelism into Fortran 90. Unlike flat data parallelism, nested data parallelism directly provides means for handling irregular data structures and certain forms of control parallelism, such as divide-and-conquer algorithms, thus enabling the programmer to express such algorithms far more naturally. Existing work deals with nested data parallelism in a functional setting, which helps avoid a set of problems but makes efficient implementation more complicated. Moreover, functional languages are not readily accepted by programmers used to languages such as Fortran and C, which are currently predominant in programming parallel machines. In this paper, we introduce the imperative data-parallel language Fortran 90V and give an overview of its implementation.
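The divide-and-conquer pattern mentioned above can be illustrated as follows. This is a Python sketch of the nested-parallel semantics only (an elementwise operation whose body contains further parallel work on irregularly sized pieces), not Fortran 90V syntax.

    def quicksort(xs):
        # Each comprehension is an elementwise ("apply-to-each") operation, and the
        # map over the two recursive calls could itself run in parallel in a nested
        # data-parallel language, despite the sublists' irregular sizes.
        if len(xs) <= 1:
            return xs
        pivot = xs[len(xs) // 2]
        lesser = [x for x in xs if x < pivot]
        equal = [x for x in xs if x == pivot]
        greater = [x for x in xs if x > pivot]
        sorted_parts = [quicksort(part) for part in (lesser, greater)]
        return sorted_parts[0] + equal + sorted_parts[1]

    print(quicksort([5, 3, 8, 1, 9, 2, 3]))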