Search Results - Inner Mongolia University Library

Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms

Authors: Peter Sanders, Sebastian Egner, Jan Korst (Max-Planck-Institute for Computer Science, Im Stadtwald, 66123 Saarbrücken, Germany; Philips Research Laboratories, Prof. Holstlaan 4, 5656 AA Eindhoven, The Netherlands)

Optimal parallel sorting in multi-level storage 94

Proceedings of the Fifth Annual ACM-SIAM Symposium on Discrete Algorithms

Authors: Alok Aggarwal, C. Greg Plaxton (IBM Research Division, T. J. Watson Research Center, Yorktown Heights, NY; Department of Computer Science, University of Texas at Austin, Austin, TX)

HiDP: A hierarchical data parallel language

International Symposium on Code Generation and Optimization (CGO)

Authors: Yongpeng Zhang, Frank Mueller (North Carolina State University, Raleigh, NC, USA)

ISBN: (print) 9781467355247

Problem domains are commonly decomposed hierarchically to fully utilize parallel resources in modern microprocessors. Such decompositions can be provided as library routines, written by experienced experts, for general algorithmic patterns. But such APIs tend to be constrained to certain architectures or data sizes. Integrating them with application code is often an unnecessarily daunting task, especially when these routines need to be closely coupled with user code to achieve better performance. This paper contributes HiDP, a high-level hierarchical data parallel language. The purpose of HiDP is to improve the coding productivity of integrating hierarchical data parallelism without significant loss of performance. HiDP is a source-to-source compiler that converts a very concise data parallel language into CUDA C++ source code. Internally, it performs necessary analysis to compose user code with efficient and architecture-aware code snippets. This paper discusses various aspects of HiDP systematically: the language, the compiler and the run-time system with built-in tuning capabilities. They enable HiDP users to express algorithms in less code than low-level SDKs require for native platforms. HiDP also exposes abundant computing resources of modern parallel architectures. Improved coding productivity tends to come with a sacrifice in performance. Yet, experimental results show that the generated code delivers performance very close to handcrafted native GPU code.

Keywords: Shape; parallel processing; Graphics processing units; Arrays; Kernel; Synchronization; Libraries

A Parallel Algorithm for Subgraph Isomorphism (Brief Announcement) 19

The 31st ACM Symposium on Parallelism in Algorithms and Architectures

Authors: Rohan Yadav, Umut A. Acar (Carnegie Mellon University, Pittsburgh, PA, USA)

ISBN: (print) 9781450361842

Subgraph isomorphism is a fundamental property of graphs that requires checking whether the network structure of one graph can be found (embedded) within another graph. It has numerous applications and is a computationally challenging problem: it is NP-complete and known algorithms explore an exponentially large search space. Even though it has been studied extensively, relatively little is known about whether subgraph isomorphism accepts a theoretically and practically efficient parallel solution. In this paper, we present our ongoing work on designing a parallel algorithm for the subgraph (and graph) isomorphism problem, which addresses challenges commonly faced when attempting to obtain a parallel algorithm for isomorphism. Our algorithm appears to scale well up to the 70 cores of our empirical machine.

Keywords: parallel computing; algorithms; graphs

Partitioned register file for TTAs

IEEE/ACM International Symposium on Microarchitecture (MICRO)

Authors: J. Janssen, H. Corporaal (Department of Electrical Engineering, Delft University of Technology, Delft, Netherlands)

A practical implementation of high performance instruction level parallel architectures is constrained by the difficulty to build a large monolithic multi-ported register file (RF). A solution is to partition the RF into smaller RFs while keeping the total number of registers and ports equal. This paper applies RF partitioning to transport triggered architectures (TTAs); these architectures are of the VLIW type. One may expect that partitioning increases the number of executed cycles because it constrains the number of ports per RF. It is shown that these performance losses are small; e.g. partitioning an RF with 24 registers and four read and four write ports into four RFs with 6 registers and one read and one write port gives a performance loss of only 5.8%. Partitioned RFs consume less area than monolithic RFs with the same number of ports and registers. Experiments show that, if the area saved by partitioning is spent on extra registers, partitioning does, on average, not reduce the performance; it may even result in a small performance gain.

Keywords: Radio frequency; Registers; Performance loss; VLIW; Vector processors; parallel architectures; Performance gain; Delay; Energy consumption; Hardware

A sublinear-time randomized parallel algorithm for the maximum clique problem in perfect graphs 91

Proceedings of the Second Annual ACM-SIAM Symposium on Discrete Algorithms

Author: Farid Alizadeh (Computer Science Department, University of Minnesota, Minneapolis, MN)

Parallel algorithms and concentration bounds for the Lovász Local Lemma via witness-DAGs 17

Annual ACM-Society for Industrial and Applied Mathematics Symposium on Discrete Algorithms

Authors: Bernhard Haeupler, David G. Harris (School of Computer Science, Carnegie Mellon University; Department of Computer Science, University of Maryland)

ISBN: (print) 9781510836358

The Lovász Local Lemma (LLL) is a cornerstone principle in the probabilistic method of combinatorics, and a seminal algorithm of Moser & Tardos (2010) provides an efficient randomized algorithm to implement it. This algorithm can be parallelized to give an algorithm that uses polynomially many processors and runs in O(log^3 n) time, stemming from O(log n) adaptive computations of a maximal independent set (MIS). Chung et al. (2014) developed faster local and parallel algorithms, potentially running in time O(log^2 n), but these algorithms work under significantly more stringent conditions than the LLL. We give a new parallel algorithm that works under essentially the same conditions as the original algorithm of Moser & Tardos but uses only a single MIS computation, thus running in O(log^2 n) time. This conceptually new algorithm also gives a clean combinatorial description of a satisfying assignment which might be of independent interest. Our techniques extend to the deterministic LLL algorithm given by Chandrasekaran et al. (2013) leading to an NC-algorithm running in time O(log^2 n) as well. We also provide improved bounds on the run-times of the sequential and parallel resampling-based algorithms originally developed by Moser & Tardos. Our bounds extend to any problem instance in which the tighter Shearer LLL criterion is satisfied. We also improve on the analysis of Kolipaka & Szegedy (2011) to give tighter concentration results.
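For orientation, the sequential resampling idea that the abstract attributes to Moser & Tardos can be sketched as follows. This is a minimal illustration, not the parallel algorithm of the paper: it applies resampling to CNF satisfiability, and the function name `moser_tardos_sat`, the signed-integer clause encoding, and the `max_resamples` cap are all assumptions introduced here.

```python
import random


def moser_tardos_sat(num_vars, clauses, seed=0, max_resamples=100_000):
    """Sequential resampling sketch for CNF formulas (hypothetical helper).

    clauses: list of clauses, each a list of nonzero ints; literal v > 0
    means variable v is true, v < 0 means variable |v| is false
    (variables are numbered from 1).
    Returns a list of booleans (index i holds variable i + 1), or None if
    the resample budget runs out.
    """
    rng = random.Random(seed)
    # Start from a uniformly random assignment; index 0 is unused.
    assign = [rng.random() < 0.5 for _ in range(num_vars + 1)]

    def violated(clause):
        # A clause is violated when every one of its literals is false.
        return all(assign[abs(lit)] != (lit > 0) for lit in clause)

    for _ in range(max_resamples):
        bad = next((c for c in clauses if violated(c)), None)
        if bad is None:
            return assign[1:]
        # Resample only the variables of the violated clause.
        for lit in bad:
            assign[abs(lit)] = rng.random() < 0.5
    return None


if __name__ == "__main__":
    example = [[1, 2, 3], [-1, 2, 4], [3, -4, 5], [-2, -3, -5]]
    print(moser_tardos_sat(5, example))
```

The parallel variants discussed in the abstract instead resample an independent set of violated events per round, which is where the MIS computations enter; that step is not shown here.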

Keywords: parallel algorithms low light algorithms Running in Lemma Bound Combinatorics Runtime

An efficient parallel algorithm for the row minima of a totally monotone matrix 91

Proceedings of the Second Annual ACM-SIAM Symposium on Discrete Algorithms

Authors: Mikhail J. Atallah, S. Rao Kosaraju (Dept. of Computer Science, Purdue University, West Lafayette, IN; Dept. of Computer Science, Johns Hopkins University, Baltimore, MD)

Optimal parallel selection 03

Proceedings of the Fourteenth Annual ACM-SIAM Symposium on Discrete Algorithms

Author: Yijie Han (University of Missouri at Kansas City, Kansas City, MO)

We present an optimal parallel selection algorithm on the EREW PRAM. This algorithm runs in O(log n) time with n/log n processors. This complexity matches the known lower bound for parallel selection on the EREW PRAM ...
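As a brief check of why these bounds count as optimal (a worked step added here, not taken from the truncated abstract): the work of a parallel algorithm is the product of its running time T(n) and its processor count P(n), and sequential selection is known to be solvable in linear time.

```latex
W(n) = T(n) \cdot P(n) = O(\log n) \cdot \frac{n}{\log n} = O(n)
```

The total work is therefore linear, matching the cost of sequential selection up to a constant factor.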

ISBN: (print) 9780898715385

Keywords: selection; EREW PRAM; parallel algorithms

Parallel-Correctness and Transferability for Conjunctive Queries

Journal of the ACM, 2017, Vol. 64, No. 5, pp. 36-36

Authors: Ameloot, Tom J.; Geck, Gaetano; Ketsman, Bas; Neven, Frank; Schwentick, Thomas (Hasselt Univ, Martelarenlaan 42, B-3500 Hasselt, Belgium; Transnat Univ Limburg, Martelarenlaan 42, B-3500 Hasselt, Belgium; TU Dortmund Univ, Dortmund, Germany; TU Dortmund, Fak Informat, Otto Hahn Str 12, D-42277 Dortmund, Germany)

A dominant cost for query evaluation in modern massively distributed systems is the number of communication rounds. For this reason, there is a growing interest in single-round multiway join algorithms where data are first reshuffled over many servers and then evaluated in a parallel but communication-free way. The reshuffling itself is specified as a distribution policy. We introduce a correctness condition, called parallel-correctness, for the evaluation of queries w.r.t. a distribution policy. We study the complexity of parallel-correctness for conjunctive queries as well as transferability of parallel-correctness between queries. We also investigate the complexity of transferability for certain families of distribution policies, including the Hyper-cube distribution policies.

Keywords: Distributed databases; parallel query evaluation; one-round evaluation; distribution policies
