ISBN (print): 0769509363
This paper proposes two parallel meta-heuristics. One is a cooperative parallel tabu search that incorporates historical information exchange among processors in addition to each processor's own search. The other is a cooperative parallel search between genetic algorithm and tabu search processes. Through computational experiments we observe the improvement of solutions obtained by our proposed methods.
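The abstract does not specify the exchange protocol or the target problem; the following minimal Python sketch (with invented names such as cooperative_search and a toy bit-vector objective) only illustrates the general idea of several tabu searchers sharing their best-so-far solutions through a common pool.

    # Minimal sketch of cooperative parallel tabu search (illustrative only).
    import random

    def cost(x):                      # toy objective: number of 1-bits (minimize)
        return sum(x)

    def neighbors(x):                 # single-bit flips
        for i in range(len(x)):
            y = list(x)
            y[i] ^= 1
            yield i, tuple(y)

    def tabu_step(x, tabu, best):
        # forbid recently flipped bits unless the move beats the best (aspiration)
        candidates = [(cost(y), i, y) for i, y in neighbors(x)
                      if i not in tabu or cost(y) < cost(best)]
        c, i, y = min(candidates)
        tabu.append(i)
        if len(tabu) > 5:             # fixed tabu tenure
            tabu.pop(0)
        return y, (y if c < cost(best) else best)

    def cooperative_search(n_searchers=4, n_bits=20, rounds=30):
        random.seed(0)
        states = [tuple(random.randint(0, 1) for _ in range(n_bits))
                  for _ in range(n_searchers)]
        tabus = [[] for _ in range(n_searchers)]
        bests = list(states)
        pool_best = min(bests, key=cost)          # shared historical information
        for _ in range(rounds):
            for k in range(n_searchers):
                states[k], bests[k] = tabu_step(states[k], tabus[k], bests[k])
            pool_best = min(bests + [pool_best], key=cost)
            # cooperation: the worst searcher restarts from the shared best
            worst = max(range(n_searchers), key=lambda k: cost(states[k]))
            states[worst] = pool_best
        return pool_best

    print(cost(cooperative_search()))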
ISBN (print): 0769509363
Prefetching brings data into the cache before it is expected by the processor, thereby eliminating a potential cache miss. There are two major prefetching schemes. In a software scheme, the compiler predicts the memory access pattern and places prefetch instructions into the code. In a hardware scheme, the hardware predicts the memory access pattern and brings data into the cache before it is required by the processor. This paper proposes a hardware prefetching scheme in which a second processor is used for prefetching data for the primary processor. The scheme does not predict memory access patterns, but rather uses the second processor to run ahead of the primary processor so as to detect future memory accesses and prefetch these references.
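The abstract describes a hardware mechanism; the Python sketch below is only a software analogy with an assumed fixed run-ahead distance: a helper loop executes the same address computation a few iterations ahead of the primary loop and pulls those addresses into a model cache instead of predicting them.

    # Illustrative software model of run-ahead prefetching (not the paper's design).
    def run_ahead_prefetch(index_array, data, distance=8):
        cache = set()                       # stand-in for the data cache
        hits = misses = 0
        for i, idx in enumerate(index_array):
            # helper "processor": runs the same address computation 'distance'
            # iterations ahead and brings that line into the cache
            if i + distance < len(index_array):
                cache.add(index_array[i + distance])
            # primary "processor": the real access
            if idx in cache:
                hits += 1
            else:
                misses += 1
                cache.add(idx)
            _ = data[idx]                   # the actual load
        return hits, misses

    data = list(range(1000))
    indices = [(i * 37) % 1000 for i in range(500)]   # irregular access pattern
    print(run_ahead_prefetch(indices, data))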
ISBN (print): 0769507840
This paper describes Uintah, a component-based visual problem solving environment (PSE) that is designed to specifically address the unique problems of massively parallel computation on terascale computing platforms. Uintah supports the entire life cycle of scientific applications by allowing scientific programmers to quickly and easily develop new techniques, debug new implementations, and apply known algorithms to solve novel problems. Uintah is built on three principles: 1) as much as possible, the complexities of parallel execution should be handled for the scientist; 2) software should be reusable at the component level; and 3) scientists should be able to dynamically steer and visualize their simulation results as the simulation executes. To provide this functionality, Uintah builds upon the best features of the SCIRun PSE and the DoE Common Component Architecture (CCA).
ISBN (print): 0769509363
Merging is one of the most fundamental problems in computer science. It is well known that Ω(N/p + log log N) time is required to merge two sorted sequences, each of length N, on a CRCW PRAM with p processors, where p ≤ N log^α N for any constant α. In this paper, we describe two optimal O(1)-time solutions to the problem for p = N on BSR (Broadcasting with Selective Reduction). They are the first constant-time solutions to the problem on any model of computation. We also give an optimal O(N/p)-time solution to the problem for p < N on BSR, which is the first improvement with non-constant time that is still better than the lower bound for the PRAM.
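The BSR algorithm itself is not given in the abstract; the sequential Python sketch below only illustrates the rank-based placement that constant-time merging typically relies on, namely that each element's output position is its index in its own sequence plus its rank in the other sequence (each loop iteration standing in for one conceptual processor).

    # Sketch of rank-based merging; the BSR broadcast/reduction machinery is not modelled.
    from bisect import bisect_left, bisect_right

    def merge_by_ranks(a, b):
        out = [None] * (len(a) + len(b))
        for i, x in enumerate(a):           # one conceptual processor per element
            out[i + bisect_left(b, x)] = x  # rank of x in b (ties: a first)
        for j, y in enumerate(b):
            out[j + bisect_right(a, y)] = y # rank of y in a (ties: after equal a's)
        return out

    print(merge_by_ranks([1, 3, 5, 7], [2, 3, 6, 8]))
    # [1, 2, 3, 3, 5, 6, 7, 8]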
Author: Grosz, L. (Massey Univ., Inst. Informat. & Math. Sci., N. Shore Mail Ctr., Auckland, New Zealand)
We consider the algebraic multilevel iteration (AMLI) for the solution of systems of linear equations as they arise from a finite-difference discretization on a rectangular grid. The key operation is the matrix-vector product, which can be executed efficiently on vector and parallel-vector computer architectures if the nonzero entries of the matrix are concentrated in a few diagonals. In order to maintain this structure for all matrices on all levels, coarsening in alternating directions is used. In some cases it is necessary to introduce additional dummy grid hyperplanes. The data movements in the restriction and prolongation are crucial, as they produce massive memory conflicts on vector architectures. By using a simple performance model, the best of the possible vectorization strategies is automatically selected at runtime. Examples show that on a Fujitsu VPP300 the presented implementation of AMLI reaches about 85% of the useful performance, and scalability with respect to computing time can be achieved.
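As an illustration of the diagonal structure the method depends on, here is a small sketch of a matrix-vector product in diagonal storage; the offsets and the example operator are invented for the sketch and are not taken from the paper.

    # Sketch of a matrix-vector product for a matrix stored by diagonals,
    # which runs as long unit-stride loops that vectorize well.
    import numpy as np

    def dia_matvec(diagonals, offsets, x):
        """y = A @ x, where diagonal k of A (offset offsets[k]) is stored
        densely in diagonals[k]; entries falling outside the matrix are ignored."""
        n = x.size
        y = np.zeros(n)
        for d, off in zip(diagonals, offsets):
            if off >= 0:
                y[:n - off] += d[:n - off] * x[off:]
            else:
                y[-off:] += d[:n + off] * x[:n + off]
        return y

    # 1D Laplacian-like operator on 8 points: main diagonal 2, off-diagonals -1
    n = 8
    diags = [np.full(n, 2.0), np.full(n, -1.0), np.full(n, -1.0)]
    offs = [0, 1, -1]
    x = np.arange(1.0, n + 1.0)
    print(dia_matvec(diags, offs, x))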
ISBN (print): 0769509363
We show that within the paradigm of real-time computation, some classes of problems have the property that a solution to a problem in the class, when computed in parallel, is far superior in quality to the best one obtained on a sequential computer. Examples from these classes are presented. In each case, the solution obtained in parallel is significantly, provably, and consistently better than a sequential one. It is important to note that the purpose of this paper is not to demonstrate merely that a parallel computer can obtain a better solution to a computational problem than one derived sequentially. The latter is an interesting (and often surprising) observation in its own right, but we wish to go further. It is shown here that the improvement in quality can be arbitrarily high (and certainly superlinear in the number of processors used by the parallel computer). This result is akin to superlinear speedup, a phenomenon itself originally thought to be impossible.
ISBN (print): 0769509363
Matrix operations are the core of many linear systems. Efficient matrix multiplication is critical to many numerical applications, such as climate modeling, molecular dynamics, and computational fluid dynamics. Much research work has been done to improve the performance of matrix operations; however, the majority of this work is focused on two-dimensional (2D) matrices, and very little has been done on three- or higher-dimensional matrices. Recently, a new structure called Extended Karnaugh Map Representation (EKMR) for n-dimensional (nD) matrix representation has been proposed, which provides better matrix operation performance compared to the traditional matrix representation (TMR). The main idea of EKMR is to represent any nD matrix by 2D matrices; hence, efficient algorithm design for nD matrices becomes less complicated. Parallel matrix operation algorithms based on EKMR and TMR are presented, and analysis and experiments are conducted to assess their performance. Both our analysis and experimental results show that parallel algorithms based on EKMR outperform those based on TMR.
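The exact EKMR index mapping is not reproduced in the abstract; the sketch below uses an assumed (i, j, k, l) -> (i*n3 + k, j*n4 + l) packing merely to illustrate how an nD operation can be carried out as a single 2D operation on the packed layout.

    # Illustrative packing of a 4D array into a 2D layout (stand-in mapping,
    # not the published EKMR scheme).
    import numpy as np

    def pack_4d_to_2d(a):
        n1, n2, n3, n4 = a.shape
        m = np.empty((n1 * n3, n2 * n4))
        for i in range(n1):
            for j in range(n2):
                for k in range(n3):
                    for l in range(n4):
                        m[i * n3 + k, j * n4 + l] = a[i, j, k, l]
        return m

    a = np.arange(2 * 3 * 2 * 3, dtype=float).reshape(2, 3, 2, 3)
    b = np.ones_like(a)
    # 4D addition carried out as a single 2D operation on the packed layout
    c2d = pack_4d_to_2d(a) + pack_4d_to_2d(b)
    print(np.allclose(c2d, pack_4d_to_2d(a + b)))   # True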
The present paper investigates parallelization of general purpose numerical optimization algorithms, where the optimization algorithm is coupled with an existing analysis program. Since these optimization algorithms m...
ISBN (print): 0769509363
We propose a new topology for multicomputer networks: Parametrically described, Regular, and based on Semigroups (PRS) networks (or R-s(N, v, g) graphs with order N, degree v, girth g, and number of equivalence classes s). Many classes of networks, such as hypercubes, circulants, and cube-connected cycles, are shown to be special cases of the proposed network. Here we explore the basic structure, topological properties, optimization of parameters, and synthesis of optimal networks having the minimal diameter for the given parameters of the graph. Correspondingly, we examine the optimal characteristics with respect to transit delays and structural survival in such networks. PRS networks reaching the lower bounds on the diameter were synthesized. In some cases, we found that the new network has a better diameter than classes of networks described in the literature, provided they have the same vertex and edge complexity.
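The PRS construction itself is not detailed in the abstract; as a small illustration of one of the special cases mentioned (circulants) and of the diameter figures being optimized, the following sketch computes the diameter of a circulant graph by breadth-first search.

    # Diameter of a circulant graph C(N; S); not the PRS construction itself.
    from collections import deque

    def circulant_diameter(N, steps):
        nbrs = lambda v: [(v + s) % N for s in steps] + [(v - s) % N for s in steps]
        dist = [-1] * N
        dist[0] = 0
        q = deque([0])
        while q:
            v = q.popleft()
            for w in nbrs(v):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
        return max(dist)          # vertex-transitive, so BFS from node 0 suffices

    print(circulant_diameter(16, [1, 4]))   # degree-4 circulant on 16 nodes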
ISBN (print): 0769509363
Three-dimensional (3D) stacked implementation has been proposed as a new technology for massively parallel computers. However, two major limitations have hindered progress in this direction: the technology of vertical interconnects and the cost in terms of area for these vertical interconnects. Each vertical interconnect requires a 300 μm × 300 μm area, so their liberal use is prohibitive. Clearly, an interconnection philosophy that minimizes these vertical links can contribute to the success of a 3D implementation. A hierarchical 3D-torus network, called H3D-torus, has been proposed to reduce the number of vertical links in a 3D stacked implementation while keeping good network features. This paper addresses the architectural details of the H3D-torus network, and explores aspects such as the network diameter, the peak number of vertical links, and the VLSI layout area for the H3D-torus network as well as for several commonly used networks for parallel computers.
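The H3D-torus structure is not reproduced here; as an assumed baseline for the quantities the paper evaluates, the sketch below counts the diameter and the inter-layer (vertical) links of a plain 3D torus whose z-planes are mapped to silicon layers.

    # Baseline figures for a plain kx x ky x kz 3D torus (not the H3D-torus).
    def torus_3d_stats(kx, ky, kz):
        diameter = kx // 2 + ky // 2 + kz // 2      # wrap-around links in every dimension
        vertical_links = kx * ky * kz if kz > 2 else kx * ky * (kz - 1)
        return diameter, vertical_links

    print(torus_3d_stats(8, 8, 4))    # (10, 256)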