Search results - Inner Mongolia University Library

Annual ACM Symposium on Parallel Algorithms and Architectures, 2000, pp. 109-118

Authors: auf der Heide, F. Meyer; Raecke, H.; Westermann, M. Affiliation: Paderborn Univ, Paderborn, Germany

A hierarchical bus network T = (V, E) uses buses connected hierarchically in a tree-like fashion as a communication network. New communication technologies such as SCI (Scalable Coherent Interface) make such networks very attractive, because they are easy to construct and guarantee reasonable communication performance. Such networks can be modeled as tree networks: leaves correspond to processors, inner nodes to buses, and edges to switches, and the bandwidths of inner nodes and edges correspond to the bandwidths of buses and switches, respectively. In this paper we address the problem of static data management. Given a set of shared data objects X and the read and write frequencies from the processors to the shared data objects, the goal is to compute a (possibly redundant) placement of the shared data objects on the processors such that the congestion is minimized. The congestion is the maximum, over all edges and inner nodes, of the load induced by the read and write frequencies divided by the bandwidth of that edge or inner node. It is known that this problem can be solved optimally in linear time if inner nodes are allowed to hold copies of shared data objects. In our model, inner nodes correspond to buses and therefore cannot store copies of shared data objects. We show that this restriction increases the complexity of the placement problem drastically: it becomes NP-hard. On the other hand, the main contribution of our paper is an approximation algorithm with runtime O(|X|·|V|·height(T)·log(degree(T))) that increases the congestion by a factor of at most 7.

Keywords: parallel processing systems

Infinite parallel job allocation

Annual ACM Symposium on Parallel Algorithms and Architectures, 2000, pp. 99-108

Authors: Berenbrink, Petra; Czumaj, Artur; Friedetzky, Tom; Vvedenskaya, Nikita D. Affiliation: Paderborn Univ, Paderborn, Germany

In recent years, the task of allocating jobs to servers has been studied with the 'balls and bins' abstraction. Results in this area exploit the large decrease in maximum load that can be achieved by allowing each job (ball) a little freedom in choosing its destination server (bin). In this paper we examine an infinite and parallel allocation process (see [ABS98]) which is related to the 'balls and bins' abstraction. The simple process can be used to model many problems arising in applications like load balancing, data accesses for parallel data servers, hashing, and PRAM simulations. Unfortunately, the parallel allocation process behaves in a highly non-uniform manner, which makes its analysis challenging. Even the typically simple question of which arrival rates keep the process stable is highly non-trivial. In order to cope with this non-uniform behavior we introduce a new sequential process and show (via simulations) that the sequential process models the behavior of the parallel one very accurately. We develop a system of ordinary differential equations to describe the behavior of our sequential process and present a thorough analysis of the performance of this process. For example, we show that the queue length distribution decreases double-exponentially. Finally, we present simulation results indicating that the solutions to the differential equations predict very well the queue length distribution of our sequential process and the largest injection rate for which it is stable. In summary, in all the performance characteristics we have measured experimentally, the parallel and the sequential process are closely related. This indicates that the obtained solution of the differential equations and the results presented above are applicable to the parallel process, too.
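The drop in maximum load that the abstract attributes to giving each ball "a little freedom" can be illustrated with a small simulation. This is only a sketch of the classic static, sequential balls-and-bins game, not of the infinite parallel process studied in the paper. The function name `max_load`, the parameter `d` (number of candidate bins per ball), and the problem sizes are assumptions chosen for illustration.

```python
import random


def max_load(n_balls, n_bins, d, seed=0):
    """Throw n_balls into n_bins one at a time.

    Each ball samples d bins uniformly at random (with replacement)
    and joins the least loaded of them. Returns (max load, loads).
    """
    rng = random.Random(seed)
    loads = [0] * n_bins
    for _ in range(n_balls):
        candidates = [rng.randrange(n_bins) for _ in range(d)]
        target = min(candidates, key=lambda b: loads[b])
        loads[target] += 1
    return max(loads), loads


# d=1: no freedom of choice; d=2: each ball picks the better of two bins.
one_choice, loads1 = max_load(10_000, 10_000, d=1)
two_choice, loads2 = max_load(10_000, 10_000, d=2)
print("max load, one choice:", one_choice)
print("max load, two choices:", two_choice)
```

With as many balls as bins, the two-choice run would be expected to show a noticeably smaller maximum load than the one-choice run, which is the effect the abstract refers to.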

Keywords: parallel processing systems

Ordering unstructured meshes for sparse matrix computations on leading parallel systems

IEEE International Parallel and Distributed Processing Symposium (IPDP 2000)

Authors: Oliker, L; Li, XY; Heber, G; Biswas, R. Affiliations: Lawrence Berkeley Natl Lab, NERSC, Berkeley, CA 94720, USA; Cornell Univ, Cornell Theory Ctr, Ithaca, NY 14853, USA; NASA Ames Res Ctr, MRJ, Moffett Field, CA 94035, USA

ISBN: (print) 354067442X

Computer simulations of realistic applications usually require solving a set of non-linear partial differential equations (PDEs) over a finite region. The process of obtaining numerical solutions to the governing PDEs involves solving large sparse linear or eigen systems over the unstructured meshes that model the underlying physical objects. These systems are often solved iteratively, where the sparse matrix-vector multiply (SPMV) is the most expensive operation within each iteration. In this paper, we focus on the efficiency of SPMV using various ordering/partitioning algorithms. We examine different implementations using three leading programming paradigms and architectures. Results show that ordering greatly improves performance, and that cache reuse can be more important than reducing communication. However, a multithreaded implementation indicates that ordering and partitioning are not required on the Tera MTA to obtain an efficient and scalable SPMV.
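For readers unfamiliar with the SPMV kernel discussed above, here is a minimal serial sketch using the compressed sparse row (CSR) layout. The CSR format and the names `csr_spmv`, `values`, `col_idx`, and `row_ptr` are assumptions for illustration; the abstract does not say which storage format or implementation the authors used.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row r owns nonzeros row_ptr[r] .. row_ptr[r + 1] - 1
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # Indirect read of x: its locality depends on the ordering
            # of rows and columns.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# A = [[4, 0, 1],
#      [0, 3, 0],
#      [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
y = csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
print(y)  # [7.0, 6.0, 17.0]
```

The indirect access `x[col_idx[k]]` is the part of the loop where a mesh reordering would plausibly change cache behavior, which fits the abstract's remark that cache reuse can matter more than reducing communication.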

Keywords: Iterative methods

An algorithm for finding three dimensional symmetry in series parallel digraphs

11th Annual International Symposium on Algorithms and Computation, ISAAC 2000

Authors: Hong, Seok-Hee; Eades, Peter. Affiliation: Department of Computer Science, University of Sydney, Australia

ISBN: (print) 3540412557

Symmetry is one of the most important aesthetic criteria, as it clearly reveals the structure of a graph. However, previous work on symmetric graph drawing has focused on two dimensions. In this paper, we extend symmetric graph drawing into three dimensions. Symmetry in three dimensions is much richer than in two dimensions. We present a linear time algorithm for finding the maximum number of three dimensional symmetries in series parallel digraphs. © Springer-Verlag Berlin Heidelberg 2000.

Keywords: Clustering algorithms

Approximation algorithms for data placement on parallel disks

11th Annual ACM/SIAM Symposium on Discrete Algorithms

Authors: Golubchik, L; Khanna, S; Khuller, S; Thurimella, R; Zhu, A. Affiliation: Univ Maryland, Dept Comp Sci, College Pk, MD 20742, USA

ISBN: (print) 0898714532

We study an optimization problem that arises in the context of data placement in multimedia storage systems. We are given a collection of $M$ multimedia data objects that need to be assigned to a storage system consisting of $N$ disks $d_1, d_2, \ldots, d_N$. We are also given sets $U_1, U_2, \ldots, U_M$ such that $U_i$ is the set of clients requesting the $i$th data object. Each disk $d_j$ is characterized by two parameters, namely, its storage capacity $C_j$, which indicates the maximum number of data objects that may be assigned to it, and a load capacity $L_j$, which indicates the maximum number of clients that it can serve. The goal is to find a placement of data objects on disks and an assignment of clients to disks so as to maximize the total number of clients served, subject to the capacity constraints of the storage system. We study this data placement problem for two natural classes of storage systems, namely, homogeneous and uniform ratio. Our first main result is a tight upper and lower bound on the number of items that can always be packed for any input instance to homogeneous as well as uniform ratio storage systems. We show that an algorithm given in [11] for data placement achieves this bound. Our second main result is a polynomial time approximation scheme for the data placement problem in homogeneous and uniform ratio storage systems, answering an open question of [11]. Finally, we also study the problem from an empirical perspective.

Keywords: Multimedia systems

Efficient conditional operations for data-parallel architectures

IEEE/ACM International Symposium on Microarchitecture (MICRO)

Authors: U.J. Kapasi; W.J. Dally; S. Rixner; P.R. Mattson; J.D. Owens; B. Khailany. Affiliation: Computer Systems Laboratory, University of Stanford, Stanford, CA, USA

Many data-parallel applications, including emerging media applications, have regular structures that can easily be expressed as a series of arithmetic kernels operating on data streams. Data-parallel architectures are designed to exploit this regularity by performing the same operation on many data elements concurrently. However, applications containing data-dependent control constructs perform poorly on these architectures. Conditional streams convert these constructs into data-dependent data movement. This allows data-parallel architectures to efficiently execute applications with data-dependent control flow. Essentially, conditional streams extend the range of applications that a data-parallel architecture can execute efficiently. For example, polygon rendering speeds up by a factor of 1.8 with the use of conditional streams.
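The idea of turning data-dependent control into data-dependent data movement can be sketched in software as follows. This is a loose illustration only: the function names (`split_stream`, `kernel_double`, `kernel_negate`) and the example predicate are hypothetical and do not describe the hardware mechanism from the paper. Instead of branching per element inside one kernel, the stream is first routed into two sub-streams, and each sub-stream is then processed by a branch-free kernel.

```python
def split_stream(stream, predicate):
    """Route each element to one of two output streams.

    The data-dependent decision happens here, as data movement,
    rather than as a branch inside the arithmetic kernels.
    """
    taken, not_taken = [], []
    for item in stream:
        (taken if predicate(item) else not_taken).append(item)
    return taken, not_taken


def kernel_double(stream):
    """Branch-free kernel: same operation on every element."""
    return [2 * v for v in stream]


def kernel_negate(stream):
    """Branch-free kernel: same operation on every element."""
    return [-v for v in stream]


data = [3, -1, 4, -1, 5, -9]
pos, neg = split_stream(data, lambda v: v >= 0)
result = kernel_double(pos) + kernel_negate(neg)
print(result)  # [6, 8, 10, 1, 1, 9]
```

Note that the output order differs from the input order; an application that needs the original order would have to carry indices along with the data.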

Keywords: Kernel; Streaming media; Computer architecture; Application software; Laboratories; Digital arithmetic; Rendering (computer graphics); Graphics; Video signal processing; Data mining

Preemptive parallel task scheduling in O(n) + poly(m) time

11th Annual International Symposium on Algorithms and Computation, ISAAC 2000

Authors: Jansen, Klaus; Porkolab, Lorant. Affiliations: Institut für Informatik und praktische Mathematik, Christian Albrechts University of Kiel, Germany; Department of Computing, Imperial College, London, United Kingdom

ISBN: (print) 3540412557

We study the problem of scheduling a set of n independent parallel tasks on m processors, where in addition to the processing time there is a size associated with each task indicating that the task can be processed on any subset of processors of the given size. Based on a linear programming formulation, we propose an algorithm for computing a preemptive schedule with minimum makespan, and show that the running time of the algorithm depends polynomially on m and only linearly on n. Thus for any fixed m, an optimal preemptive schedule can be computed in O(n) time. We also present extensions of this approach to other (more general) scheduling problems with malleable tasks, release times, due dates and maximum lateness minimization. © Springer-Verlag Berlin Heidelberg 2000.

Keywords: Scheduling

A no-busy-wait balanced tree parallel algorithmic paradigm

Proceedings of the Twelfth Annual ACM Symposium on Parallel Algorithms and Architectures

Author: Uzi Vishkin. Affiliation: University of Maryland

ISBN: (print) 9781581131857

Suppose that a parallel algorithm can include any number of parallel threads. Each thread can proceed without ever having to busy-wait on another thread. A thread can proceed until its termination, but no new threads can be formed. What kind of problems can such restrictive algorithms solve and still be competitive in the total number of operations they perform with the fastest serial algorithm for the same problem? Intrigued by this informal question, we considered one of the most elementary parallel algorithmic paradigms, that of balanced binary trees. The main contribution of this paper is a new balanced (not necessarily binary) tree no-busy-wait paradigm for parallel algorithms; applications of the basic paradigm to two problems are presented: building heaps, and executing parallel tree contraction (assuming a preparatory stage); the latter is known to be applicable to evaluating a family of general arithmetic *** putting things in context, we also discuss our "PRAM-on-chip" vision (actually a small update to it), presented at SPAA98.

Defect tolerant molecular electronics: algorithms, architectures, and atoms

Proceedings of the 33rd Annual ACM/IEEE International Symposium on Microarchitecture

Author: Phil Kuekes. Affiliation: Hewlett-Packard

No abstract available.

ISBN: (print) 9781581131963

Algorithmic foundations for a parallel vector access memory system

Proceedings of the Twelfth Annual ACM Symposium on Parallel Algorithms and Architectures

Authors: Binu K. Mathew; Sally A. McKee; John B. Carter; Al Davis. Affiliation: Department of Computer Science, University of Utah, Salt Lake City, UT

ISBN: (print) 9781581131857

This paper presents mathematical foundations for the design of a memory controller subcomponent that helps to bridge the processor/memory performance gap for applications with strided access patterns. The Parallel Vector Access (PVA) unit exploits the regularity of vectors or streams to access them efficiently in parallel on a multi-bank SDRAM memory system. The PVA unit performs scatter/gather operations so that only the elements accessed by the application are transmitted across the system bus. Vector operations are broadcast in parallel to all memory banks, each of which implements an efficient algorithm to determine which vector elements it holds. Earlier performance evaluations have demonstrated that our PVA implementation loads elements up to 32.8 times faster than a conventional memory system and 3.3 times faster than a pipelined vector unit, without hurting the performance of normal cache-line fills. Here we present the underlying PVA algorithms for both word-interleaved and cache-line-interleaved memory systems.
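The question each bank has to answer, namely which elements of a strided vector it holds, can be made concrete with a brute-force enumeration. This is not the efficient per-bank algorithm the paper presents; it is only a reference sketch under the assumption of a word-interleaved layout in which word address `a` lives in bank `a % n_banks`. The function name `elements_per_bank` and its parameters are hypothetical.

```python
def elements_per_bank(base, stride, length, n_banks):
    """Map each bank to the vector element indices it holds.

    Assumes word interleaving: word address a is stored in bank
    a % n_banks. Element i of the vector sits at base + i * stride.
    """
    banks = {b: [] for b in range(n_banks)}
    for i in range(length):
        addr = base + i * stride
        banks[addr % n_banks].append(i)
    return banks


hits = elements_per_bank(base=2, stride=3, length=8, n_banks=4)
for bank, elems in hits.items():
    print("bank", bank, "holds elements", elems)
```

In this example every bank ends up with the same number of elements because the stride and the number of banks share no common factor; a stride equal to the number of banks would instead send every element to a single bank.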
