Although several algorithms have been developed for the parallel disk model (PDM), few have been implemented. Consequently, little has been known about the accuracy of the PDM in measuring I/O time and total running time to perform an out-of-core computation. This paper analyzes timing results on multiple-disk platforms for two PDM algorithms, out-of-core radix sort and BMMC permutations, to determine the strengths and weaknesses of the PDM. The results indicate the following. First, good PDM algorithms are usually not I/O bound. Second, of the four PDM parameters, one (problem size) is a good indicator of I/O time and running time, one (memory size) is a good indicator of I/O time but not necessarily running time, and the other two (block size and number of disks) do not necessarily indicate either I/O or running time. Third, because PDM algorithms tend not to be I/O bound, using asynchronous I/O can reduce I/O wait times significantly. The software interface to the PDM is part of the ViC* run-time library. The interface is a set of wrappers that are designed to be both efficient and portable across several underlying file systems and target machines.
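The asynchronous-I/O observation above can be illustrated with a double-buffering sketch: while the current block is being processed, the next block is prefetched on a worker thread, so compute and I/O wait overlap. This is a minimal illustration, not the ViC* interface; `process` and `overlapped_scan` are hypothetical names.

```python
import concurrent.futures
import io

def process(block):
    # Hypothetical compute step; here it just counts bytes.
    return len(block)

def read_block(f, size):
    return f.read(size)

def overlapped_scan(f, block_size=4096):
    """Scan a file with double buffering: prefetch the next block
    asynchronously while the current block is processed."""
    total = 0
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(read_block, f, block_size)
        while True:
            block = future.result()                          # wait for prefetch
            if not block:
                break
            future = pool.submit(read_block, f, block_size)  # prefetch next block
            total += process(block)                          # overlaps with the read
    return total

data = io.BytesIO(b"x" * 10000)
print(overlapped_scan(data))  # 10000
```

When the algorithm is compute-bound rather than I/O-bound, this overlap hides most of the read latency, which is the effect the abstract reports.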
In this paper,(1) we present efficient algorithms for sorting on the parallel disks model (PDM). Numerous asymptotically optimal algorithms have been proposed in the literature. However, many of these merge-based algorithms have large underlying constants in the time bounds, because they suffer from the lack of read parallelism on the PDM. The irregular consumption of the runs during the merge affects the read parallelism and contributes to the increased sorting time. In this paper, we first introduce a novel idea called dirty sequence accumulation that improves the read parallelism. Next, we show analytically that this idea can reduce the number of parallel I/Os required to sort the input close to the lower bound of Ω(log_{M/B}(N/M)). We verify experimentally our dirty sequence idea with the standard R-way merge and show that our idea can reduce the number of parallel I/Os to sort on the PDM significantly. (C) 2011 Elsevier Inc. All rights reserved.
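The standard R-way merge the authors benchmark against can be sketched in memory as follows; this shows only the merge itself, not the dirty sequence accumulation or the on-disk block layout where read parallelism is lost (`r_way_merge` is a hypothetical name).

```python
import heapq

def r_way_merge(runs):
    """Merge R sorted runs into one sorted output, as in external
    merge sort. On the PDM each run would be consumed block by block
    from disk, and the irregular rate at which runs drain is what
    hurts read parallelism across the D disks."""
    # Seed the heap with the head of each non-empty run: (value, run index, offset).
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        val, i, j = heapq.heappop(heap)
        out.append(val)
        if j + 1 < len(runs[i]):
            heapq.heappush(heap, (runs[i][j + 1], i, j + 1))
    return out

runs = [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
print(r_way_merge(runs))  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Note how the merge pulls from whichever run currently holds the smallest key: on disk this means some runs exhaust their buffered blocks much sooner than others, forcing non-parallel refill reads, which is the inefficiency the dirty sequence idea targets.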
ISBN: (print) 9781595939739
We consider the natural extension of the well-known single disk caching problem to the parallel disk I/O model (PDM) [17]. The main challenge is to achieve as much parallelism as possible and avoid I/O bottlenecks. We are given a fast memory (cache) of size M memory blocks along with a request sequence Σ = (b(1), b(2), ..., b(n)) where each block b(i) resides on one of D disks. In each parallel I/O step, at most one block from each disk can be fetched. The task is to serve Σ in the minimum number of parallel I/Os. Thus, each I/O is analogous to a page fault. The difference here is that during each page fault, up to D blocks can be brought into memory, as long as all of the new blocks entering the memory reside on different disks. The problem has a long history [18, 12, 13, 26]. Note that this problem is non-trivial even if all requests in Σ are unique. This restricted version is called read-once. Despite the progress in the offline version [13, 15] and the read-once version [12], the general online problem still remained open. Here, we provide comprehensive results with a full general solution for the problem with asymptotically tight competitive ratios. To exploit parallelism, any parallel disk algorithm needs a certain amount of lookahead into future requests. To provide effective caching, an online algorithm must achieve an o(D) competitive ratio. We show a lower bound that states that, for lookahead L ≤ M, any online algorithm must be Ω(D)-competitive. For lookahead L greater than M(1 + 1/ε), where ε is a constant, the tight upper bound of O(√(MD/L)) on the competitive ratio is achieved by our algorithm SKEW. The previous algorithm tLRU [26] was O((MD/L)^(2/3))-competitive, and this was also shown to be tight [26] for an LRU-based strategy. We achieve the tight ratio using a fairly different strategy than LRU. We also show tight results for randomized algorithms against an oblivious adversary and give an algorithm achieving better bounds in the resource augmentation model.
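The role of lookahead can be made concrete with a toy simulator: each miss costs one parallel I/O step, but the lookahead window lets the algorithm piggyback one future block per other disk onto the same step. This is a simplified LRU-based sketch for intuition only, not the SKEW algorithm from the paper; `serve`, `disk_of`, and the block names are hypothetical.

```python
from collections import OrderedDict

def serve(requests, disk_of, M, lookahead):
    """Count parallel I/O steps needed to serve `requests` when the cache
    holds M blocks and each parallel I/O fetches at most one block per disk.
    Eviction is plain LRU (a simplification; SKEW uses a different,
    skew-aware strategy). On a miss, the lookahead window is scanned and
    at most one uncached block per *other* disk rides along for free."""
    cache = OrderedDict()              # block -> None, kept in LRU order
    ios = 0
    for t, b in enumerate(requests):
        if b in cache:
            cache.move_to_end(b)       # refresh LRU position on a hit
            continue
        ios += 1                       # a miss costs one parallel I/O step
        batch = {disk_of[b]: b}        # the missed block itself
        for future in requests[t + 1 : t + 1 + lookahead]:
            d = disk_of[future]
            if future not in cache and d not in batch:
                batch[d] = future      # piggyback one block from disk d
        for blk in batch.values():
            if len(cache) >= M:
                cache.popitem(last=False)   # evict least recently used
            cache[blk] = None
        cache.move_to_end(b)
    return ios

disk_of = {"a1": 0, "a2": 0, "b1": 1, "b2": 1}   # two blocks on each of 2 disks
reqs = ["a1", "b1", "a2", "b2"]
print(serve(reqs, disk_of, M=4, lookahead=3))    # 2: misses pair up across disks
print(serve(reqs, disk_of, M=4, lookahead=0))    # 4: every miss is its own step
```

Even this naive policy halves the I/O count once the window is wide enough to batch blocks from different disks, which is why the bounds above improve as the lookahead L grows.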
ISBN: (print) 9781581139860
We consider the natural extension of the single disk caching problem to the parallel disk I/O model. We close the existing gap between lower and upper bounds and achieve an optimal competitive ratio of O(√D) when the lookahead is more than the memory size M. When the lookahead is smaller, we derive various upper and lower bounds on the competitive ratio under various adversarial models.