Search Results - Inner Mongolia University Library

PARALLEL SORTING WITH COOPERATING HEAPS IN A LINEAR-ARRAY OF PROCESSORS

PARALLEL COMPUTING, 1990, vol. 16, no. 2-3, pp. 273-278

Authors: LIN, YC; LIN, FC. NATL TAIWAN UNIV, DEPT COMP SCI & INFORMAT ENGN, TAIPEI 10764, TAIWAN

A parallel sorting algorithm using cooperating heaps in a linear array of processors is presented. It can sort a sequence whose length is much larger than the number of processors. Because the output begins one step after all the items have been input, sorting n items requires 2n + 1 steps. Two independent modifications of the algorithm are possible: one tries to reduce the number of processors used, and the other can sort more items on the same array.

Keywords: LINEAR ARRAY parallel sorting ZERO-TIME


Parallel sorting with limited bandwidth


SIAM JOURNAL ON COMPUTING, 2000, vol. 29, no. 6, pp. 1997-2015

Authors: Adler, M; Byers, JW; Karp, RM. Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003, USA; Boston Univ, Dept Comp Sci, Boston, MA 02215, USA; Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720, USA; Int Comp Sci Inst, Berkeley, CA 94720, USA

We study the problem of sorting on a parallel computer with limited communication bandwidth. By using the PRAM(m) model, where p processors communicate through a globally shared memory which can service m requests per unit time, we focus on the trade-off between the amount of local computation and the amount of interprocessor communication required for parallel sorting algorithms. Our main result is a lower bound of Omega(n log m / (m log n)) on the time required to sort n numbers on the exclusive-read and queued-read variants of the PRAM(m). We also show that Leighton's Columnsort can be used to give an asymptotically matching upper bound in the case where m grows as a fractional power of n. The bounds are of a surprising form in that they have little dependence on the parameter p. This implies that attempting to distribute the workload across more processors while holding the problem size and the size of the shared memory fixed will not improve the optimal running time of sorting in this model. We also show that both the lower and the upper bounds can be adapted to bridging models that address the issue of limited communication bandwidth: the LogP model and the bulk-synchronous parallel (BSP) model. The lower bounds provide further convincing evidence that efficient parallel algorithms for sorting rely strongly on high communication bandwidth.

Keywords: parallel sorting limited bandwidth PRAM LogP BSP
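To make the shape of the lower bound in this abstract concrete, the following minimal sketch evaluates n log m / (m log n) for a few values of m. The function name `sorting_lower_bound` is hypothetical, the constant hidden by the Omega notation is taken to be 1, and grouping the denominator as (m log n) is an assumption about how the abstract's formula is meant to be read. The processor count p is deliberately not a parameter, which mirrors the abstract's remark that the bounds have little dependence on p.

```python
import math


def sorting_lower_bound(n: int, m: int) -> float:
    """Evaluate n*log(m) / (m*log(n)), ignoring the constant hidden by Omega."""
    if n < 2 or m < 2:
        raise ValueError("n and m must both be at least 2")
    return (n * math.log(m)) / (m * math.log(n))


if __name__ == "__main__":
    n = 1_000_000
    # Larger shared-memory bandwidth m lowers the bound; p plays no role here.
    for m in (16, 256, 4096):
        print(f"m={m:5d}  bound ~ {sorting_lower_bound(n, m):.1f} time units")
```

Under these assumptions the value shrinks as m grows for fixed n, which is the bandwidth dependence the abstract emphasizes.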


PARALLEL SORTING WITH SERIAL MEMORIES


IEEE TRANSACTIONS ON COMPUTERS, 1985, vol. 34, no. 4, pp. 379-383

Authors: OWENS, RM; JA, JJ. UNIV MARYLAND, DEPT ELECT ENGN, COLLEGE PK, MD 20742

This correspondence examines the problem of sorting on a network of processors, where each processor consists of a single storage register and a small control unit capable of comparing two numbers and has a single serial memory attached to it. We show how to sort optimally on one- or two-dimensional arrays of p processors in time Θ(n + n²/p²) and Θ(n/√p + n²/p²), respectively. Because of the implementational advantages of serial memories, we feel that our architecture will be attractive for several applications.

Keywords: Bitonic sorting parallel sorting serial memories shuffle/exchange


Communication-efficient parallel sorting


SIAM JOURNAL ON COMPUTING, 1999, vol. 29, no. 2, pp. 416-432

Author: Goodrich, MT. Johns Hopkins Univ, Dept Comp Sci, Baltimore, MD 21218, USA

We study the problem of sorting n numbers on a p-processor bulk-synchronous parallel (BSP) computer, which is a parallel multicomputer that allows for general processor-to-processor communication rounds provided each processor sends and receives at most h items in any round. We provide parallel sorting methods that use internal computation time that is O((n log n)/p) and a number of communication rounds that is O(log n / log(h+1)) for h = Theta(n/p). The internal computation bound is optimal for any comparison-based sorting algorithm. Moreover, the number of communication rounds is bounded by a constant for the (practical) situations when p ≤ n^(1-1/c) for a constant c ≥ 1. In fact, we show that our bound on the number of communication rounds is asymptotically optimal for the full range of values for p, for we show that just computing the "or" of n bits distributed evenly to the first O(n/h) of an arbitrary number of processors in a BSP computer requires Omega(log n / log(h+1)) communication rounds.

Keywords: parallel algorithms parallel sorting parallel processing
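The constant-round claim in this abstract follows from a short calculation. The sketch below assumes n > 1, takes h = n/p exactly (the abstract states only h = Theta(n/p)), and reads the processor condition as p at most n raised to the power 1 - 1/c:

```latex
\[
p \le n^{1-1/c}
\;\Longrightarrow\;
h = \frac{n}{p} \ge n^{1/c}
\;\Longrightarrow\;
\frac{\log n}{\log(h+1)} < \frac{\log n}{\log n^{1/c}} = c .
\]
```

So under these assumptions the round count O(log n / log(h+1)) is bounded by a constant depending only on c.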


Sequential in-core sorting performance for a SQL data service and for parallel sorting on heterogeneous clusters


FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2006, vol. 22, no. 7, pp. 776-783

Authors: Cerin, Christophe; Koskas, Michel; Fkaier, Hazem; Jemni, Mohamed. Univ Paris 13, LIPN, CNRS UMR 7030, F-93420 Villetaneuse, France; Univ Picardie, LaMFA, CNRS UMR 6140, F-80039 Amiens 1, France; Ecole Super Sci & Tech Tunis, Unite Rech UTIC, Tunis 1008, Tunisia

The aim of the paper is to introduce techniques in order to tune sequential in-core sorting algorithms in the frameworks of two applications. The first application is parallel sorting when the processor speeds are not identical in the parallel system. The second application is the Zeta-Data Project [M. Koskas, A hierarchical database management algorithm, in: Annales 67 du Lamsade, vol. 2, 2004, pp. 277-317. [9]] whose aim is to develop novel algorithms for database issues. About 50% of the work done in building indexes is devoted to sorting sets of integers. We develop and compare algorithms built to sort with equal keys. Algorithms are variations of the 3Way-Quicksort of Sedgewick. In order to observe performances and to fully exploit functional units in processors, and also in order to optimize the use of the memory system and the different functional units, we use hardware performance counters that are available on most modern microprocessors. We also develop analytical results for one of our algorithms and compare expected results with the measures. For the two applications, we show, through fine experiments on an Athlon processor (a three-way superscalar x86 processor), that L1 data cache misses are not the central problem, but a subtle proportion of independent retired instructions should be advised to get performance for in-core sorting. (C) 2006 Elsevier B.V. All rights reserved.

Keywords: hardware performance counters in-core sorting algorithms with equal keys two-level memory hierarchy optimizing memory accesses parallelism at the chip level data structures for databases parallel sorting
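For readers unfamiliar with three-way partitioning, here is a minimal Python sketch of a quicksort that gathers all keys equal to the pivot in a single pass, which is the general idea behind sorting inputs with many equal keys. It is an illustration only and not one of the tuned variants the paper evaluates; the function name `quicksort_3way` and the random pivot choice are assumptions.

```python
import random


def quicksort_3way(a, lo=0, hi=None):
    """Sort a[lo..hi] in place using less / equal / greater partitioning."""
    if hi is None:
        hi = len(a) - 1
    while lo < hi:
        pivot = a[random.randint(lo, hi)]  # assumed pivot rule: uniform random
        lt, i, gt = lo, lo, hi
        # Invariant: a[lo:lt] < pivot, a[lt:i] == pivot, a[gt+1:hi+1] > pivot.
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        # Keys equal to the pivot now occupy a[lt..gt] and need no more work.
        # Recurse on the smaller side and loop on the larger to limit depth.
        if lt - lo < hi - gt:
            quicksort_3way(a, lo, lt - 1)
            lo = gt + 1
        else:
            quicksort_3way(a, gt + 1, hi)
            hi = lt - 1
```

A call such as `quicksort_3way(keys)` sorts the list `keys` in place; runs of duplicate keys are excluded from further recursion as soon as they are chosen as the pivot.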


TIGHT COMPARISON BOUNDS ON THE COMPLEXITY OF PARALLEL SORTING


SIAM JOURNAL ON COMPUTING, 1987, vol. 16, no. 3, pp. 458-464

Authors: AZAR, Y; VISHKIN, U. NYU, COURANT INST MATH SCI, DEPT COMP SCI, NEW YORK, NY 10012

The problem of sorting n elements using p processors in a parallel comparison model is considered. Lower and upper bounds which imply that for p ≥ n, the time complexity of this problem is Θ...


OPTIMAL AND SUBLOGARITHMIC TIME RANDOMIZED PARALLEL SORTING ALGORITHMS


SIAM JOURNAL ON COMPUTING, 1989, vol. 18, no. 3, pp. 594-607

Authors: RAJASEKARAN, S; REIF, JH. Harvard Univ, MA, United States

This paper assumes a parallel RAM (random access machine) model which allows both concurrent reads and concurrent writes of a global memory.

Keywords: 68Q25 randomized algorithms parallel sorting parallel random access machines random permutations radix sort prefix sum optimal algorithms


A CONSTANT-TIME PARALLEL SORTING ALGORITHM AND ITS OPTICAL IMPLEMENTATION


IEEE MICRO, 1995, vol. 15, no. 3, pp. 60-71

Authors: LOURI, A; HATCH, JA; NA, JH. Opt. Comput. & Parallel Process. Lab., Arizona Univ., Tucson, AZ, USA; Trimble Navigation, Texas; Optical Computing and Parallel Processing Laboratory, University of Arizona

High-speed electronic sorting networks are difficult to implement with VLSI technology because of the dense and global connectivity required. Optics eliminates this bottleneck by offering global interconnections, massive parallelism, and noninterfering communications. We present a parallel sorting algorithm and its efficient optical implementation using currently available optical hardware. The algorithm sorts n data elements in a few steps, independent of the number of elements to be sorted. Thus, it is a constant-time sorting algorithm, that is, O(1) time.

Keywords: sorting parallel sorting Optical Computing parallel Processing


ODD-EVEN, COMPARE-EXCHANGE PARALLEL SORTING


MICROPROCESSING AND MICROPROGRAMMING, 1994, vol. 40, no. 7, pp. 487-497

Authors: NIKOLOPOULOS, SD; DANIELOPOULOS, SD. UNIV IOANNINA, DIV APPL MATH & INFORMAT, GR-45110 IOANNINA, GREECE

We present a parallel sorting algorithm and its proof which sorts a sequence of n elements in time O(log² n) with n/2 processors on an EREW-PRAM computational model. A sorting network directly implements the algorithm using O(*** n) PEs. The algorithm is based on the elementary Compare-Exchange operation and has the advantage that it does not require a powerful computational model, uses the least amount of space for the sorting problem, has small constants and can be implemented directly on a sorting network. Furthermore, the architecture of the network is simple and makes no unrealistic technological assumptions.

Keywords: parallel sorting COMPARE-EXCHANGE SCHEMES EREW-PRAM sorting NETWORKS COMPLEXITY
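The Compare-Exchange primitive, and the idea of sorting in rounds of n/2 independent comparisons, can be illustrated with Batcher's bitonic sorting network, simulated sequentially below. This is a well-known stand-in chosen for illustration and is not the algorithm of the paper; the function names are hypothetical and the input length is assumed to be a power of two. Each simulated round touches every element exactly once, so it corresponds to one step with n/2 processors.

```python
def compare_exchange(a, i, j, ascending):
    """Put a[i] and a[j] into the requested order, swapping if necessary."""
    if (ascending and a[i] > a[j]) or (not ascending and a[i] < a[j]):
        a[i], a[j] = a[j], a[i]


def bitonic_sort(a):
    """Sort a in place (ascending); return the number of parallel rounds."""
    n = len(a)
    if (n & (n - 1)) != 0:
        raise ValueError("length must be a power of two")
    rounds = 0
    k = 2  # size of the bitonic sequences being merged
    while k <= n:
        j = k // 2  # distance between compared partners
        while j >= 1:
            # The n/2 compare-exchanges in this loop are mutually independent.
            for i in range(n):
                partner = i ^ j
                if partner > i:
                    compare_exchange(a, i, partner, ascending=(i & k) == 0)
            rounds += 1
            j //= 2
        k *= 2
    return rounds
```

For n a power of two the network uses log n (log n + 1) / 2 rounds, for example 6 rounds for n = 8.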


OPTIMAL PARALLEL SORTING SCHEME BY ORDER-STATISTICS


SIAM JOURNAL ON COMPUTING, 1987, vol. 16, no. 6, pp. 990-1003

Authors: YANG, MCK; HUANG, JS; CHOW, YC. UNIV FLORIDA, DEPT COMP & INFORMAT SCI, GAINESVILLE, FL 32611; ACAD SINICA, INST INFORMAT SCI, TAIPEI 114, TAIWAN

This paper presents a detailed analysis of a sampling approach used in the partitioning of a data file for the parallel balanced tree sort in a local area network or a multiprocessor environment. The average overall time complexity for sorting N data on a k processor system is derived. The performance of the parallel sorting rests upon how evenly the file can be partitioned into k ordered subfiles. A data partition scheme by sampling is proposed and analyzed. Formulas for computing the optimal sampling size are obtained. The results also show the computational improvement of the sorting as a function of k and the sampling overhead. The performance of the sampling method is studied and found to be approaching the absolute optimal in some cases.

Keywords: 62G30 68E05 parallel sorting algorithms complexity samples order statistics optimization
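The partition-by-sampling idea described in this abstract can be sketched as follows. This is a generic sample-based splitter selection written for illustration; it is not the parallel balanced tree sort or the optimal sample-size formulas from the paper. The names `partition_by_sampling`, `sample_sort`, `sample_size`, and the fixed seed are assumptions, and the k buckets stand in for the k processors.

```python
import bisect
import random


def partition_by_sampling(data, k, sample_size, seed=0):
    """Split data into k buckets with ordered key ranges using sampled splitters."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not data or k == 1:
        return [list(data)] + [[] for _ in range(k - 1)]
    rng = random.Random(seed)
    size = min(max(sample_size, k), len(data))
    sample = sorted(rng.sample(data, size))
    # Take k-1 splitters at evenly spaced ranks of the sorted sample.
    splitters = [sample[(i * len(sample)) // k] for i in range(1, k)]
    buckets = [[] for _ in range(k)]
    for x in data:
        buckets[bisect.bisect_right(splitters, x)].append(x)
    return buckets


def sample_sort(data, k, sample_size):
    """Sort each bucket independently (one per processor) and concatenate."""
    buckets = partition_by_sampling(data, k, sample_size)
    return [x for bucket in buckets for x in sorted(bucket)]
```

How evenly the buckets fill depends on the sample size, which is the trade-off against sampling overhead that the abstract says the paper analyzes.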
