检索结果-内蒙古大学图书馆

6th annual acm-SIAM symposium on Discrete algorithms, SODA 1995

作者： Fix, James D. Ladner, Richard E. Department of Computer Science and Engineering University of Washington SeattleWA98195 United States

ISBN: (纸本)0898713498

The problem of sorting on a one-dimensional sub-bus array of processors is addressed. A one-dimensional sub-bus array has a bus connecting the processors which can be segmented into sub-buses on which an active processor can broadcast. The sub-bus broadcast capability is implemented on the MasPar MP-1 and MP-2 parallel computers. The sub-bus broadcast operation makes possible a new class of parallel sorting algorithms which can be characterized by parallel insertion steps. We restrict our attention to the class of sorting methods where parallel insertion steps are in the same direction, say left. For this class of one-way sorting strategies we prove a lower bound and give two strategies, the left greedy sort and the left adaptive insertion sort, both of which achieve the lower bound. Because our parallel insertion model is quite general, it is not necessarily the case that a sorting strategy in the model can be efficiently implemented on a real sub-bus array of processors. The left greedy sort cannot be efficiently implemented, but the left adaptive insertion sort can be efficiently implemented. The two sorting strategies have different properties and each is interesting in its own right.

关键词： parallel processing systems

来源：评论

学校读者我要写书评

暂无评论

Proceedings of the annual acm-SIAM symposium on Discrete algorithms

Proceedings of the Annual ACM-SIAM Symposium on Discrete Alg...

引用

6th annual acm-SIAM symposium on Discrete algorithms, SODA 1995

ISBN: (纸本)0898713498

The proceedings contain 70 papers. The topics discussed include: selecting the median;algorithms for graphic polymatroids and parametric s-sets;design of practical and provably good random number generators;a combinatorial algorithm for minimizing symmetric submodular functions;on the entropy of DNA: algorithms and measurements based on memory and rapid convergence;improved algorithms for protein motif recognition;from valid inequalities to heuristics: a unified view of primal-dual approximation algorithms in covering problems;doing two-level logic minimization 100 times faster;on the statistical dependencies of coalesced hashing and their implications for both full and limited independence;efficient parallel computations for singular band matrices;chaining multiple-alignment fragments in sub-quadratic time;finding optimal edge-rankings of trees;dihedral bounds for mesh generation in high dimensions;approximating discrete collections via local improvements;randomized rounding without solving the linear program;finding subsets maximizing minimum structures;average-case analysis of off-line and on-line knapsack problems;Voronoi diagrams of lines in 3-space under polyhedral convex distance functions;and path optimization and near-greedy analysis for graph partitioning: an empirical study.

关键词：

来源：评论

学校读者我要写书评

暂无评论

OPTIMAL parallel algorithms FOR TRIANGULATED SIMPLE POLYGONS

引用

INTERNATIONAL JOURNAL OF COMPUTATIONAL GEOMETRY & APPLICATIONS 1995年第1-2期5卷 145-170页

作者： HERSHBERGER, J DIGITAL EQUIPMENT CORP SYST RES CTR PALO ALTO CA 94301 USA

We provide optimal parallel solutions to several shortest path and visibility problems set in triangulated simple polygons. Let P be a triangulated simple polygon with n vertices, preprocessed to support shortest path queries. We can find the shortest path tree from any point inside P in O(log n) time using O(n/log n) processors. In the same bounds, we can preprocess P for shooting queries (a query can be answered in O(log n) time by a uniprocessor). Given a set S of m points inside P, we can find an implicit representation of the relative convex hull of S in O(log(nm)) time with O(m) processors. If the relative convex hull has k edges, we can explicitly produce these edges in O(log(nm)) time with O(k/log(nm)) processors. All of these algorithms are deterministic and use the CREW PRAM model.

关键词： parallel algorithms SIMPLE POLYGONS SHORTEST PATHS SHORTEST PATH QUERIES SHORTEST PATH TREES RAY SHOOTING RELATIVE CONVEX HULL

来源：评论

学校读者我要写书评

暂无评论

Optimal average case sorting on arrays 12th

引用

12th annual symposium on Theoretical Aspects of Computer Science, STACS 1995

作者： Kunde, Manfred Niedermeier, Rolf Reinhardt, Klaus Rossmanith, Peter Wilhelm-Schickard-Institut für Informatik Universität Tübingen Sand 13 TübingenD-72076 Germany Technische Universität München Arcisstr. 21 MünchenD-80290 Germany

ISBN: (纸本)3540590420

We present algorithms for sorting and routing on two-dimensional mesh-connected parallel architectures that are optimal on average. If one processor has many packets then we asymptotically halve the up to now best running times. For a load of one optimal algorithms are known for the mesh. We improve this to a load of eight without increasing the running time. For tori no optimal algorithms were known even for a load of one. Our algorithm is optimal for every load. Other architectures we consider include meshes with diagonals and reconfigurable meshes. Furthermore, the method applies to meshes of arbitrary higher dimensions and also enables optimal solutions for the routing problem. © Springer-Verlag Berlin Heidelberg 1995.

关键词： parallel architectures

来源：评论

学校读者我要写书评

暂无评论

Optimal parallel dictionary matching and compression (extended abstract) 95

Optimal parallel dictionary matching and compression (extend...

引用

Proceedings of the seventh annual acm symposium on parallel algorithms and architectures

作者： Martin Farach S. Muthukrishnan Rutgers University DIMACS

来源：评论

学校读者我要写书评

暂无评论

Real-time volume rendering on shared memory multiprocessors using the shear-warp factorization 95

Real-time volume rendering on shared memory multiprocessors ...

引用

Proceedings of the 1995 parallel Rendering symposium

作者： Lacroute, Philippe Stanford Univ Stanford CA United States

ISBN: (纸本)9780897917742

This paper presents a new parallel volume rendering algorithm that can render 2563 voxel medical data sets at over 10 Hz and 1283 voxel data sets at over 30 Hz on a 16-processor Silicon Graphics Challenge. The algorithm achieves these results by minimizing each of the three components of execution time: computation time, synchronization time, and data communication time. Computation time is low because the parallel algorithm is based on the recently-reported shear-warp serial volume rendering algorithm which is over five times faster than previous serial algorithms. Synchronization time is minimized by using dynamic load balancing and a task partition that minimizes synchronization events. Data communication costs are low because the algorithm is implemented for shared-memory multiprocessors, a class of machines with hardware support for low-latency fine-grain communication and hardware caching to hide latency. We draw two conclusions from our implementation. First, we find that on shared-memory architectures data redistribution and communication costs do not dominate rendering time. Second, we find that cache locality requirements impose a limit on parallelism in volume rendering algorithms. Specifically, our results indicate that shared-memory machines with hundreds of processors would be useful only for rendering very large data sets.

关键词： parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

Comparison of full and partial predicated execution support for ILP processors

Comparison of full and partial predicated execution support ...

引用

Proceedings of the 22nd annual International symposium on Computer Architecture

作者： Mahlke, Scott A Hank, Richard E. McCormick, James E. August, David I. Hwu, Wen-mei W. Univ of Illinois Urbana-Champaign United States

ISBN: (纸本)0780330005

One can effectively utilize predicated execution to improve branch handling in instruction-level parallel processors. Although the potential benefits of predicated execution are high, the tradeoffs involved in the design of an instruction set to support predicated execution can be difficult. On one end of the design spectrum, architectural support for full predicated execution requires increasing the number of source operands for all instructions. Full predicate support provides for the most flexibility and the largest potential performance improvements. On the other end, partial predicated execution support, such as conditional moves, requires very little change to existing architectures. This paper presents a preliminary study to qualitatively and quantitatively address the benefit of full and partial predicated execution support. With our current compiler technology, we show that the compiler can use both partial and full predication to achieve speedup in large control-intensive programs. Some details of the code generation techniques are shown to provide insight into the benefit of going from partial to full predication. Preliminary experimental results are very encouraging: partial predication provides an average of 33% performance improvement for an 8-issue processor with no predicate support while full predication provides an additional 33% improvement.

关键词： parallel processing systems

来源：评论

学校读者我要写书评

暂无评论

Optimal trade-offs between size and slowdown for universal parallel networks 95

Optimal trade-offs between size and slowdown for universal p...

引用

Proceedings of the seventh annual acm symposium on parallel algorithms and architectures

作者： Friedhelm Meyer auf der Heide Martin Storch Rolf Wanka Dept. of Mathematics and Computer Science and Heinz Nixdorf Institute University of Paderborn D-33095 Paderborn Germany

来源：评论

学校读者我要写书评

暂无评论

LogGP: incorporating long messages into the LogP model—one step closer towards a realistic model for parallel computation 95

LogGP: incorporating long messages into the LogP model—one ...

引用

Proceedings of the seventh annual acm symposium on parallel algorithms and architectures

作者： Albert Alexandrov Mihai F. Ionescu Klaus E. Schauser Chris Scheiman Department of Computer Science University of California Santa Barbara Santa Barbara CA

来源：评论

学校读者我要写书评

暂无评论

A randomized parallel 3D convex hull algorithm for coarse grained multicomputers 95

A randomized parallel 3D convex hull algorithm for coarse gr...

引用

Proceedings of the seventh annual acm symposium on parallel algorithms and architectures

作者： Frank Dehne Xiaotie Deng Patrick Dymond Andreas Fabri Ashfaq A. Khokhar School of Computer Science Carleton University Ottawa Canada K1S Dept. of Computer Science York University North York Canada M3J 1P3 Dept. of Computer Science Utrecht University 3508 TB Utrecht The Netherlands School of EE and Dept. of Computer Sci. Purdue University West Lafayette IN

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：