Search results - Inner Mongolia University Library

Proceedings of the 1996 10th International Parallel Processing Symposium

Authors: Palaniappan, K.; Faisal, Mohammad; Kambhamettu, Chandra; Hasler, A. Frederick (Universities Space Research Assoc, Greenbelt, United States)

The implementation of a parallel algorithm for estimating non-rigid motion vectors using a semi-fluid motion model applied to time-varying satellite imagery is described. Deformable motion tracking of non-rigid biological objects and of remotely sensed objects such as clouds, atmospheric aerosols and gases, polar sea ice, or ocean currents is an important application domain for the Semi-fluid Motion Analysis (SMA) algorithm. The focus of this paper is on the parallelization of the SMA algorithm for the MasPar MP-2 architecture. Implementation issues that were evaluated in order to make it feasible to explore dense semi-fluid motion estimates of rapid-scan time-varying geostationary satellite imagery of clouds and weather patterns are described. Cloud motion vectors from the SMA algorithm can be used to estimate the wind field, which would be useful in a variety of meteorological applications. Comparisons between the parallel and sequential implementations of the SMA algorithm, and with manual results, are briefly discussed.

Keywords: parallel algorithms

A deterministic parallel algorithm for the homing sequence problem

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: B. Ravikumar (Department of Computer Science, University of Rhode Island, Kingston, RI, USA)

Homing sequences play an important role in the testing of finite state systems and have been used in a number of applications such as hardware fault detection, protocol verification, and learning algorithms. Recent applications of homing sequences involve large DFAs with thousands of states. Such applications motivate the design of a parallel algorithm for this problem. The author presents a deterministic parallel algorithm of time complexity $O(\sqrt{n}\,\log^{2} n)$ using a polynomial number of processors on the CREW PRAM model. No faster deterministic parallel algorithm is known for this problem. The author also discusses the parallel complexity of some related problems.

Keywords: parallel algorithms; Algorithm design and analysis; Doped fiber amplifiers; Polynomials; Automata; Application software; Hardware; Fault detection; Protocols; Computer science
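
To make the notion of a homing sequence from the abstract above concrete, here is a minimal sequential sketch in Python. It is a brute-force search over input words, not the parallel algorithm of the paper, and the three-state Mealy machine (`DELTA`, `LAMBDA`) is a hypothetical example invented for illustration. A word is homing if any two start states that produce the same output sequence also end in the same state.

```python
from itertools import product

def run(delta, lam, state, word):
    """Feed `word` to the machine from `state`; return (final_state, outputs)."""
    outputs = []
    for symbol in word:
        outputs.append(lam[state, symbol])
        state = delta[state, symbol]
    return state, tuple(outputs)

def is_homing(delta, lam, states, word):
    """True if equal output sequences always imply equal final states."""
    final_for_output = {}
    for s in states:
        final, out = run(delta, lam, s, word)
        # setdefault returns the final state already recorded for this output
        if final_for_output.setdefault(out, final) != final:
            return False
    return True

def shortest_homing_sequence(delta, lam, states, alphabet, max_len=6):
    """Brute-force search by increasing length (exponential; small machines only)."""
    for length in range(max_len + 1):
        for word in product(alphabet, repeat=length):
            if is_homing(delta, lam, states, word):
                return word
    return None

# Hypothetical 3-state Mealy machine: DELTA gives next states, LAMBDA gives outputs.
STATES = [0, 1, 2]
ALPHABET = ["a", "b"]
DELTA = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 2,
         (1, "b"): 0, (2, "a"): 0, (2, "b"): 1}
LAMBDA = {(0, "a"): 0, (0, "b"): 0, (1, "a"): 0,
          (1, "b"): 1, (2, "a"): 1, (2, "b"): 0}

print(shortest_homing_sequence(DELTA, LAMBDA, STATES, ALPHABET))  # ('a', 'a')
```

For this machine no word of length 0 or 1 is homing, while `aa` yields three distinct output sequences and therefore identifies the final state.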

Efficient algorithms for block-cyclic redistribution of arrays

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Young Won Lim; P.B. Bhat; V.K. Prasanna (Department of EE-Systems, University of Southern California, Los Angeles, CA, USA)

We present new algorithmic techniques for a classical research problem, runtime redistribution of an array from one block-cyclic layout to another. Our methodology for reducing communication overheads is based on a generalized circulant matrix formalism. Using this formalism, we derive direct, indirect, and hybrid communication schedules for the cyclic redistribution problem when the block size changes by an integer factor $K$. We have also developed formulae to estimate the timing performance of each of these schedules for a given parallel machine and redistribution problem. In our indirect communication schedule, blocks are moved from a source processor to a destination processor through intermediate "relay" processors. This reduces the number of communication steps by an order of magnitude, in comparison with previous approaches. This algorithm performs cyclic($x$) to cyclic($Kx$) redistribution on $P$ processors in $[\log_{2} K] + 2$ steps. Implementations of these algorithms on the Cray T3D and on the IBM SP-2 show superior performance over previous approaches. Since our algorithms are developed using MPI, they can be easily ported to different application environments. Our techniques can be used in the design of scalable redistribution libraries, in efficient implementations of the REDISTRIBUTE directive of HPF and in developing parallel algorithms for various HPC applications.

Keywords: Signal processing algorithms; Processor scheduling; parallel machines; Algorithm design and analysis; Libraries; parallel algorithms; High performance computing; Program processors; Programming profession; Search problems
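
The redistribution problem in the abstract above can be illustrated with a small Python sketch. It only computes which processor owns each array element under a cyclic(x) layout and which source/destination pairs must exchange data when moving to cyclic(Kx). It does not implement the circulant-matrix schedules of the paper, and the function names `owner` and `communication_pairs` are assumptions made for this example.

```python
def owner(i, block, P):
    """Processor that owns element i under a cyclic(block) layout on P processors."""
    return (i // block) % P

def communication_pairs(n, x, K, P):
    """Set of (source, destination) processor pairs for cyclic(x) -> cyclic(K*x)."""
    pairs = set()
    for i in range(n):
        src = owner(i, x, P)
        dst = owner(i, K * x, P)
        if src != dst:          # elements that stay put need no message
            pairs.add((src, dst))
    return pairs

# 16 elements on 4 processors, block size 1 redistributed to block size 2.
print([owner(i, 1, 4) for i in range(8)])   # [0, 1, 2, 3, 0, 1, 2, 3]
print([owner(i, 2, 4) for i in range(8)])   # [0, 0, 1, 1, 2, 2, 3, 3]
print(sorted(communication_pairs(16, 1, 2, 4)))
```

A direct schedule would send one message per pair in this set; an indirect schedule, as described in the abstract, routes blocks through relay processors instead.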

Efficient run-time support for irregular task computations with mixed granularities

International Symposium on Parallel Processing

Authors: Cong Fu; Tao Yang (Department of Computer Science, University of California, Santa Barbara, CA, USA)

ISBN: (print) 0818672552

Many irregular scientific computing problems can be modeled by directed acyclic task graphs (DAGs). We present an efficient run-time system for executing general asynchronous DAG schedules on distributed memory machines. Our solution tightly integrates the run-time scheme with a fast communication mechanism to eliminate unnecessary overhead in message buffering and copying, and takes advantage of task dependence properties to ensure the correctness of execution. We demonstrate the applications of this scheme in sparse LU and Cholesky factorizations for which actual speedups have been hard to obtain in the literature because parallelism in these problems is irregular and limited. Our experiments on Meiko CS-2 show the promising results of our approach in exploiting irregular task parallelism with mixed granularities.

Keywords: Runtime; parallel processing; Scientific computing; Processor scheduling; Mechanical factors
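
As a rough illustration of what executing a task DAG means in the abstract above, the following Python sketch runs tasks in a dependence-respecting order using Kahn's topological sort. It is a single-process toy, not the distributed run-time system of the paper, and the task names (`factor1`, `update12`, and so on) are hypothetical.

```python
from collections import deque

def execute_dag(tasks, deps):
    """Run each task once all of its predecessors have finished.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> list of predecessor task names
    Returns the order in which tasks were executed.
    """
    indegree = {t: len(deps.get(t, [])) for t in tasks}
    successors = {t: [] for t in tasks}
    for task, preds in deps.items():
        for p in preds:
            successors[p].append(task)

    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        task = ready.popleft()
        tasks[task]()                      # execute the task body
        order.append(task)
        for s in successors[task]:         # release dependents
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    if len(order) != len(tasks):
        raise ValueError("dependence graph contains a cycle")
    return order

# Hypothetical task graph loosely shaped like a small factorization.
log = []
tasks = {name: (lambda n=name: log.append(n))
         for name in ["factor1", "update12", "update13", "factor2"]}
deps = {"update12": ["factor1"], "update13": ["factor1"], "factor2": ["update12"]}
print(execute_dag(tasks, deps))
```

In a distributed setting the `ready` queue would be split across processors and releasing a dependent would involve a message, which is where the communication overhead discussed in the abstract arises.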

DAG-consistent distributed shared memory

International Symposium on Parallel Processing

Authors: R.D. Blumofe; M. Frigo; C.F. Joerg; C.E. Leiserson; K.H. Randall (MIT Laboratory for Computer Science, Cambridge, MA, USA)

Introduces DAG (directed acyclic graph) consistency, a relaxed consistency model for distributed shared memory which is suitable for multithreaded programming. We have implemented DAG consistency in software for the Cilk multithreaded runtime system running on a CM5 Connection Machine. Our implementation includes a DAG-consistent distributed cactus stack for storage allocation. We provide empirical evidence of the flexibility and efficiency of DAG consistency for applications that include blocked matrix multiplication, Strassen's (1969) matrix multiplication algorithm and a Barnes-Hut code. Although Cilk schedules the executions of these programs dynamically, their performance is competitive with statically scheduled implementations in the literature. We also prove that the number $F_P$ of page faults incurred by a user program running on $P$ processors can be related to the number $F_1$ of page faults running serially by the formula $F_P \le F_1 + 2Cs$, where $C$ is the cache size and $s$ is the number of thread migrations executed by Cilk's scheduler.

Keywords: Processor scheduling; Dynamic scheduling; Application software; Yarn

Scheduling from the perspective of the application 96

International Symposium on High Performance Distributed Computing

Authors: F. Berman; R. Wolski (Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA)

ISBN: (print) 9780818675829

Metacomputing is the aggregation of distributed and high-performance resources on coordinated networks. With careful scheduling, resource-intensive applications can be implemented efficiently on metacomputing systems at the sizes of interest to developers and users. In this paper, we focus on the problem of scheduling applications on metacomputing systems. We introduce the concept of application-centric scheduling in which everything about the system is evaluated in terms of its impact on the application. Application-centric scheduling is used by virtually all metacomputer programmers to achieve performance on metacomputing systems. We describe two successful metacomputing applications to illustrate this approach, and describe AppLeS (Application-Level Scheduling) agents which generalize the application-centric scheduling approach. Finally, we show preliminary results which compare AppLeS-derived schedules with conventional strip and blocked schedules for a 2D Jacobi code.

Keywords: Metacomputing; Processor scheduling; Concurrent computing; Application software; parallel processing; High performance computing; Strips; Jacobian matrices; Computer networks; Contracts

A new design approach and VLSI implementations of recursive digital filters

IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Yin-Tsung Hwang; Ching-Long Sue (Department of Electronic Engineering, National Yunlin Institute of Technology, Yunlin, Taiwan)

In this paper we address the design problem of high speed recursive digital filters for real time applications. In contrast to the nonrecursive case, there exist feedback loops in recursive filters which often become the performance bottleneck and prevent the circuits from high speed operation. Design tactics such as pipelining and parallel processing offer little help in this case. To achieve more parallelism, a look-ahead transformation applied at the algorithm level may be employed. This, however, often causes a drastic increase in the hardware complexity. In this paper we propose a new design approach based on Distributed Arithmetic (DA) for the recursive filters. It can outperform the traditional bit parallel design in both pipelining period and initiation interval. To illustrate this new approach, we present two different systolic array designs for the ARMA filter. Finally, both designs are implemented in 0.8 μm SPDM CMOS technology. For the 4-tap 8-bit wide designs, the simulation results show they can operate at 142.6 and 142.8 MHz, respectively.

Keywords: Very large scale integration; Pipeline processing; CMOS technology; Digital filters; Feedback loop; Feedback circuits; parallel processing; Hardware; Arithmetic; Systolic arrays
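
The look-ahead transformation mentioned in the abstract above can be shown on the simplest recursive filter. The Python sketch below assumes a first-order recursion y[n] = a*y[n-1] + x[n] with y[-1] = 0, chosen purely for illustration; it is not the ARMA design or the Distributed Arithmetic approach of the paper. Substituting the recursion into itself once gives y[n] = a^2*y[n-2] + a*x[n-1] + x[n], so the feedback loop now spans two samples at the cost of an extra multiply-add in the feed-forward path.

```python
def direct(a, x):
    """First-order recursion y[n] = a*y[n-1] + x[n], with y[-1] = 0."""
    y, prev = [], 0
    for xn in x:
        prev = a * prev + xn
        y.append(prev)
    return y

def look_ahead(a, x):
    """Same filter after one look-ahead step:
    y[n] = a^2*y[n-2] + a*x[n-1] + x[n] for n >= 2."""
    y = direct(a, x[:2])          # first two outputs seed the recursion
    a2 = a * a                    # coefficient precomputed once
    for n in range(2, len(x)):
        y.append(a2 * y[n - 2] + a * x[n - 1] + x[n])
    return y

# Integer data keeps the comparison exact.
a = 3
x = [1, 2, 3, 4, 5]
print(direct(a, x))       # [1, 5, 18, 58, 179]
print(look_ahead(a, x))   # [1, 5, 18, 58, 179]
```

Each further look-ahead step adds more feed-forward terms, which is the growth in hardware complexity the abstract refers to.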

MIXED EXPLICIT/IMPLICIT TIME INTEGRATION OF COUPLED AEROELASTIC PROBLEMS - 3-FIELD FORMULATION, GEOMETRIC CONSERVATION AND DISTRIBUTED SOLUTION

INTERNATIONAL JOURNAL FOR NUMERICAL METHODS IN FLUIDS, 1995, Vol. 21, No. 10, pp. 807-835

Authors: FARHAT, C; LESOINNE, M; MAMAN, N (UNIV COLORADO, CTR AEROSP STRUCT, BOULDER, CO 80309)

A three-field arbitrary Lagrangian-Eulerian (ALE) finite element/volume formulation for coupled transient aeroelastic problems is presented. The description includes a rigorous derivation of a geometric conservation law for flow problems with moving boundaries and unstructured deformable meshes. The solution of the coupled governing equations with a mixed explicit (fluid)/implicit (structure) staggered procedure is discussed with particular reference to accuracy, stability, distributed computing, I/O transfers, subcycling and parallel processing. A general and flexible framework for implementing partitioned solution procedures for coupled aeroelastic problems on heterogeneous and/or parallel computational platforms is described. This framework and the explicit/implicit partitioned procedures are demonstrated with the numerical investigation on an iPSC-860 massively parallel processor of the instability of flat panels with infinite aspect ratio in supersonic airstreams.

Keywords: CFD; STRUCTURES; AEROELASTICITY; parallel processing

Irregular applications in PROMOTER

Proceedings of the 1995 2nd International Conference on Programming Models for Massively Parallel Computers

Authors: Schramm, A. (RWCP Massively Parallel Systems, Berlin, Germany)

Parallel computers with distributed memory are gaining popularity on account of their optimal scalability. However, their efficient use requires a locality-preserving mapping of the application's underlying graph structure onto the physical topology of the target platform. PROMOTER is a parallel programming model that supports automatic mapping by the compiler: it makes the graph structures explicit and thus processable by the implementation. This article describes how this is done for applications with irregular and dynamic spatial structures.

Keywords: parallel processing systems

Scalable Tuple space model for structured parallel programming

Proceedings of the 1995 2nd International Conference on Programming Models for Massively Parallel Computers

Authors: Corradi, Antonio; Zambonelli, Franco; Leonardi, Letizia (Universita di Bologna, Bologna, Italy)

The paper proposes and analyzes a scalable model of an associative distributed shared memory for massively parallel architectures. The proposed model is hierarchical and fits the modern style of structured parallel programming. If parallel applications are composed of a set of modules with a well-defined scope of interaction, the proposed model can induce a memory access latency time that only logarithmically increases with the number of nodes. Experimental results show the effectiveness of the model with a transputer-based implementation.

Keywords: parallel processing systems
