Search Results - Inner Mongolia University Library

5th International Conference on High Performance Computing, HiPC 1998

ISBN: (print) 0818691948

The proceedings contain 61 papers. The topics discussed include: new number representation and conversion techniques on reconfigurable mesh; precise control of instruction caches; more on arbitrary boundary packed arithmetic; PERL - a registerless architecture; design alternatives for shared memory multiprocessors; a simple optimal list ranking algorithm; a parallel skeletonization algorithm and its VLSI architecture; improving error bounds for multipole-based treecodes; computation of penetration measures for convex polygons and polyhedra for graphics applications; extrapolation in distributed adaptive integration; and Java data parallel extensions with runtime system support.

An empirical comparison of runtime systems for conservative parallel simulation

10 Workshops held in conjunction with the 12th International Parallel Symposium and 9th Symposium on Parallel and Distributed Processing, IPPS/SPDP 1998

Authors: Lim, Chu-Cheow; Low, Yoke-Hean; Cai, Wentong; Hsu, Wen Jing; Huang, Shell Ying; Turner, Stephen J. Affiliations: Gintic Institute of Manufacturing, 71 Nanyang Drive, Singapore 639798, Singapore; School of Applied Science, Nanyang Technological University, Singapore 639798, Singapore; Dept. of Computer Science, University of Exeter, Exeter EX4 4PT, United Kingdom

ISBN: (print) 3540643591

A main consideration when implementing a parallel simulation application is the choice of the parallel simulation protocol (conservative vs. optimistic). Given a particular protocol, the application programmer then has to determine a suitable parallel runtime system to implement the application. If the choice is an optimistic protocol, there are several parallel simulation libraries intended for application programmers (e.g. GTW, Warped). For a conservative protocol, the most effective approach is for the programmer to use a general parallel runtime library, and implement optimizations specific to the simulation application and/or model. In this paper, we selected four general parallel runtime libraries potentially relevant to parallel simulations, and implemented a conservative protocol on each of them. We study the four libraries on three main aspects: (a) programmability; (b) performance; and (c) mechanisms for performance tuning. Our target platforms are machines supporting shared address spaces (e.g. SGI Origin200, Sun Enterprise 3000), and we obtained performance figures from a 4-CPU Ultra2 Sun Enterprise 3000. From our experiments, we find that POSIX, though an industry standard, still has relatively high overheads, and cannot efficiently support a protocol with fine-grain LPs. The research libraries all show speedups on 4 processors, but to different extents. Cilk speedup curves improve with larger thread granularity, while Active Threads show relatively good speedup even for small thread granularity. BSP processes are naturally coarse-grained, and thus good speedup is achieved in our simulation application. © Springer-Verlag Berlin Heidelberg 1998.

Keywords: Libraries

A shape-adaptive partitioning method for MPEG-4 video encoding

5th IEEE International Conference on Electronics, Circuits and Systems, ICECS 1998

Authors: He, Yong; Ahmad, Ishfaq; Liou, Ming L. Affiliations: Department of EEE, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong; Department of Computer Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong

ISBN: (print) 0780350081

MPEG-4 is a new standard for multimedia applications. Due to the flexible and extensible features of MPEG-4, a software-based implementation seems to be a natural and viable option. While such approaches usually require huge computing power, we can overcome this problem by using parallel and distributed processing. Because the behaviour of MPEG-4 objects may vary with time and such variation cannot be predicted in advance, the issues of data partitioning and load balancing in multiprocessor systems need to be addressed carefully in order to achieve real-time performance. In this paper, we propose a shape-adaptive data partitioning method to guarantee load balancing among the processors. The effectiveness of this method has been demonstrated by experimental results. © 1998 IEEE.

Keywords: Data handling
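
The abstract above does not spell out the partitioning method, so the following is only a rough sketch of how shape-driven partitioning might balance load. It assumes that an object's shape is available as a binary macroblock mask and that every active macroblock costs about the same to encode. The function name `partition_columns`, the mask layout, and the contiguous column-strip strategy are all hypothetical and are not taken from the paper.

```python
def partition_columns(mask, workers):
    """Split macroblock columns into contiguous strips with similar load.

    mask    -- list of rows; 1 marks a macroblock inside the object's shape
               (assumed input format, not from the paper).
    workers -- number of processors to share the work.
    Returns a list of (start, end) column ranges, end exclusive.
    """
    # Work per column = number of active macroblocks in that column.
    col_load = [sum(col) for col in zip(*mask)]
    total = sum(col_load)
    strips, start, acc = [], 0, 0
    for c, load in enumerate(col_load):
        acc += load
        # Close a strip once the running load reaches the next equal share.
        if len(strips) < workers - 1 and acc >= total * (len(strips) + 1) / workers:
            strips.append((start, c + 1))
            start = c + 1
    # The last strip takes whatever columns remain.
    strips.append((start, len(col_load)))
    return strips


if __name__ == "__main__":
    shape = [
        [0, 1, 1, 1, 0, 0],
        [0, 1, 1, 1, 1, 0],
    ]
    print(partition_columns(shape, 2))
```

Because the strips follow the mask rather than the frame geometry, they would need to be recomputed whenever the object's shape changes between frames, which matches the abstract's point that the variation cannot be predicted in advance.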

Jacobi orderings for multi-port hypercubes

International Symposium on Parallel Processing

Authors: D. Royo; A. Gonzalez; M. Valero-Garcia. Affiliation: Department of Computer Architecture, Universitat Politecnica de Catalunya, Barcelona, Spain

The communication cost plays a key role in the performance of many parallel algorithms. In the particular case of the one-sided Jacobi method for symmetric eigenvalue and eigenvector computation, the communication cost of previously proposed algorithms is mainly determined by the particular ordering being used. We propose two novel Jacobi orderings, the permuted-BR ordering and the degree-4 ordering, aimed at efficiently exploiting the multi-port capability of a hypercube. It is shown that the former is nearly optimal for some scenarios and the latter outperforms previously known orderings by a factor of two.

Keywords: Jacobian matrices; Hypercubes; Computer architecture; Eigenvalues and eigenfunctions; Symmetric matrices; Concurrent computing; Proposals; Computer applications; Distributed computing; Pattern matching
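
To make the notion of a "Jacobi ordering" in the abstract above concrete, here is a minimal sketch of the textbook round-robin (chess tournament) ordering. It is not the permuted-BR or degree-4 ordering proposed by the authors, whose details the abstract does not give. The function name `round_robin_sweep` is illustrative only. An ordering groups all column pairs of one sweep into steps of disjoint pairs, so that the rotations within a step can run in parallel.

```python
def round_robin_sweep(n):
    """Return one sweep as n-1 steps, each holding n/2 disjoint column pairs."""
    if n % 2:
        raise ValueError("n must be even")
    cols = list(range(n))
    steps = []
    for _ in range(n - 1):
        # Pair the k-th column from the front with the k-th from the back.
        pairs = [tuple(sorted((cols[k], cols[n - 1 - k]))) for k in range(n // 2)]
        steps.append(pairs)
        # Keep the first column fixed and rotate the others by one position.
        cols = [cols[0]] + [cols[-1]] + cols[1:-1]
    return steps


if __name__ == "__main__":
    for step in round_robin_sweep(4):
        print(step)
```

Which columns meet in consecutive steps determines which data must move between processors, and that is the cost the abstract says the ordering controls.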

Foreword

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 1998, vol. 1388, p. V

Author: Rolim, José D. P. Affiliation: University of Geneva, Computer Science Center, 23 Rue Général Dufour, Geneva 4, Switzerland

Compile-time synchronization optimizations for software DSMs

International Symposium on Parallel Processing

Authors: Hwansoo Han; Chau-Wen Tseng. Affiliation: Department of Computer Science, University of Maryland, College Park, MD, USA

Software distributed-shared-memory (DSM) systems provide a desirable target for parallelizing compilers due to their flexibility. However, studies show synchronization and load imbalance are significant sources of overhead. The authors investigate the impact of compilation techniques for eliminating synchronization overhead in software DSMs, developing new algorithms to handle situations found in practice. They evaluate the contributions of synchronization elimination algorithms based on 1) dependence analysis, 2) communication analysis, 3) exploiting coherence protocols in software DSMs, and 4) aggressive expansion of parallel SPMD regions. They also found suppressing expensive parallelism to be useful for one application. Experiments indicate these techniques eliminate almost all parallel task invocations, and reduce the number of barriers executed by 66% on average. On a 16-processor IBM SP-2, speedups are improved on average by 35%, and are tripled for some applications.

Keywords: Parallel processing; Application software; Software systems; Algorithm design and analysis; Program processors; Time measurement; Computer science; Educational institutions; Software algorithms; Programming profession

Adaptive quality equalizing: high-performance load balancing for parallel branch-and-bound across applications and computing systems

International Symposium on Parallel Processing

Authors: N.R. Mahapatra; S. Dutt. Affiliations: Department of Electrical & Computer Engineering, State University of New York, University at Buffalo, Buffalo, NY, USA; Department of Electrical Engineering & Computer Science, University of Illinois Chicago, Chicago, IL, USA

In this paper we present an adaptive version of our previously proposed quality equalizing (QE) load balancing strategy that attempts to maximize the performance of parallel branch-and-bound (B&B) by adapting to application and target computing system characteristics. Adaptive QE (AQE) incorporates the following salient adaptive features: (1) anticipatory quantitative and qualitative load balancing mechanisms; (2) regulation of load information exchange overhead; (3) deterministic load balancing in extended neighborhoods instead of just immediate neighborhoods as in non-adaptive QE; (4) randomized global load balancing to fetch work from outside the extended neighborhood. AQE yields speedup improvements of up to 80%, and 15% on average, compared to those provided by QE for several real-world mixed-integer programming (MIP) problems, and near-ideal speedups for two of the largest problems in the MIPLIB benchmark suite on an IBM SP2 system.

Keywords: Adaptive equalizers; Load management; Concurrent computing; Application software; Computer applications; Optimization methods; Cost function; Space exploration; Search problems; Parallel processing

Real-time distributed and parallel processing for MPEG-4

IEEE International Symposium on Circuits and Systems (ISCAS)

Authors: Yong He; I. Ahmad; M.L. Liou. Affiliation: Department of EEE, Hong Kong University of Science and Technology, Hong Kong, China

MPEG-4 is currently being developed by MPEG to specify the technologies for supporting current and emerging multimedia applications. Because of its object-based features and flexible toolbox approach, it is much more complex than previous video coding standards. We believe that software-based implementation on parallel and distributed computing systems is a natural and viable option. In this paper, we describe such an approach for the MPEG-4 video encoder using a cluster of workstations. We propose to use hierarchical Petri nets as a modeling tool to describe the temporal relations and time constraints among various video objects at different levels. This would allow us to perform scheduling with a guarantee of synchronization among multiple objects. A dynamic shape-adaptive data parallel approach is used in the spatial domain for further speed-up gain. Our preliminary results indicate that real-time MPEG-4 encoding using distributed and parallel computing is achievable.

Keywords: Parallel processing; MPEG 4 Standard; Application software; Video coding; Distributed computing; Workstations; Petri nets; Time factors; Processor scheduling; Encoding

Optimal all-to-some personalized communication on hypercubes

International Symposium on Parallel Processing

Author: Y.C. Hu. Affiliation: Department of Computer Science, Rice University, Houston, TX, USA

In a hypercube multiprocessor with distributed memory, each data element has a street address and an apartment number (i.e., a hypercube node address and a local memory address). We describe an optimal algorithm for performing the all-to-some personalized communication (ASPC) on Boolean n-cubes, defined as $(i|j) \rightarrow (i \pm 2^{j}|j)$, $i \in [0, 2^{n}-1]$, $j \in [0, n-1]$, where $(i|j)$ denotes the data element on node $i$ at location $j$. The algorithm also gives an optimal schedule for emulating PM2I networks on hypercubes under the binary-reflected Gray code encoding. We also study an important class of parallel algorithms, called $\pm 2^{b}$-descend, which perform $\log M$ iterations on an $M$-element input $a[0:M-1]$. For $b = \log M - 1, \ldots, 0$, iteration $b$ computes new values of each $a[i]$ as a function of $a[i]$, $a[i+2^{b}]$, and $a[i-2^{b}]$. For large applications, the problem size $M$ is typically much larger than the number of nodes $N$. We show that on hypercubes, the optimal ASPC algorithm devised in this paper can be used in combination with pipelining communication and computation in $\pm 2^{b}$-descend computations to reduce the communication steps from *** N.M/N to $4(\log M + M/N - 1)$. At one communication step, a hypercube node can send n elements along its n links, one per link.

Keywords: Hypercubes; Electronic switching systems; Computer science; Pipeline processing; Genetic mutations; Joining processes
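
The definitions in the abstract above can be written out as a small sketch. This only illustrates the ASPC mapping and the shape of a ±2^b-descend loop; it does not reproduce the paper's optimal routing schedule. Two assumptions are not stated in the abstract: index arithmetic wraps around (mod 2^n for nodes, mod M for array positions, as is usual for PM2I networks), and logarithms are base 2. The names `aspc_target`, `descend`, and `pipelined_steps` are illustrative.

```python
from math import log2


def aspc_target(i, j, n, sign=1):
    """Destination (node, location) of element (i|j): (i +/- 2^j mod 2^n | j)."""
    return ((i + sign * (1 << j)) % (1 << n), j)


def descend(a, f):
    """Run a +/-2^b-descend computation on a list whose length M is a power of two.

    f(x, plus, minus) is a caller-supplied combining function (hypothetical).
    """
    m = len(a)
    iterations = m.bit_length() - 1  # log2(M)
    for b in range(iterations - 1, -1, -1):
        stride = 1 << b
        # Every new a[i] depends on a[i], a[i + 2^b] and a[i - 2^b] (wrapping).
        a = [f(a[i], a[(i + stride) % m], a[(i - stride) % m]) for i in range(m)]
    return a


def pipelined_steps(m, n_nodes):
    """Communication steps quoted in the abstract: 4(log M + M/N - 1)."""
    return 4 * (log2(m) + m / n_nodes - 1)


if __name__ == "__main__":
    print(aspc_target(7, 0, 3))            # node 7 + 1 wraps to node 0
    print(descend([3, 1, 4, 1, 5, 9, 2, 6], max))
    print(pipelined_steps(1024, 16))
```

With `max` as the combining function, the three iterations for M = 8 spread the largest element to every position, which is a quick way to check that the strides 4, 2, 1 together reach all indices.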

Shape-adaptive partitioning method for MPEG-4 video encoding

Proceedings of the IEEE International Conference on Electronics, Circuits, and Systems, 1998, vol. 2, pp. 239-242

Authors: He, Yong; Ahmad, Ishfaq; Liou, Ming L. Affiliation: Hong Kong Univ of Science and Technology, Kowloon, Hong Kong

MPEG-4 is a new standard for multimedia applications. Due to the flexible and extensible features of MPEG-4, a software-based implementation seems to be a natural and viable option. While such approaches usually require huge computing power, we can overcome this problem by using parallel and distributed processing. Because the behaviour of MPEG-4 objects may vary with time and such variation cannot be predicted in advance, the issues of data partitioning and load balancing in multiprocessor systems need to be addressed carefully in order to achieve real-time performance. In this paper, we propose a shape-adaptive data partitioning method to guarantee load balancing among the processors. The effectiveness of this method has been demonstrated by experimental results.

Keywords: Image coding
