The latest direction in cache-aware/cache-efficient algorithms is to use cache-oblivious algorithms, which are based on the cache-oblivious model, a refinement of the external-memory model. The cache-oblivious model exploits memory hierarchies without knowing the memory parameters in advance, since algorithms in this model tune themselves automatically to the actual parameters. As a result, cache-oblivious algorithms are particularly well suited to multi-level caches with varying parameters and to environments in which the amount of memory available to an algorithm can fluctuate. This paper surveys the state of the art in cache-oblivious algorithms and data structures, giving for each its complexity in terms of cache misses, known as its cache complexity. In addition, the paper introduces an extension that minimizes the cache complexity of neural networks by applying an appropriate cache-oblivious approach to them.
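As an illustration of the cache-oblivious idea (not drawn from the paper itself), the divide-and-conquer matrix transpose below never references a cache size or line size, yet its recursion eventually reaches subproblems that fit in every level of the hierarchy; under the usual tall-cache assumption it incurs O(nm/B) cache misses. The function names and the base-case cutoff are illustrative choices.

```cpp
#include <cstddef>
#include <vector>

// Cache-oblivious out-of-place transpose: B[j][i] = A[i][j].
// The recursion splits the larger dimension until the subproblem is tiny,
// so at some recursion level every subproblem fits in cache -- without the
// code ever knowing the cache size or line size.
void co_transpose(const double* A, double* B, std::size_t n, std::size_t m,
                  std::size_t rowA, std::size_t colA,
                  std::size_t rows, std::size_t cols) {
    const std::size_t CUTOFF = 16;  // illustrative base-case size
    if (rows <= CUTOFF && cols <= CUTOFF) {
        for (std::size_t i = 0; i < rows; ++i)
            for (std::size_t j = 0; j < cols; ++j)
                B[(colA + j) * n + (rowA + i)] = A[(rowA + i) * m + (colA + j)];
    } else if (rows >= cols) {
        co_transpose(A, B, n, m, rowA, colA, rows / 2, cols);
        co_transpose(A, B, n, m, rowA + rows / 2, colA, rows - rows / 2, cols);
    } else {
        co_transpose(A, B, n, m, rowA, colA, rows, cols / 2);
        co_transpose(A, B, n, m, rowA, colA + cols / 2, rows, cols - cols / 2);
    }
}

// Usage: A is n x m (row-major), B must be sized m x n (row-major).
void transpose(const std::vector<double>& A, std::vector<double>& B,
               std::size_t n, std::size_t m) {
    co_transpose(A.data(), B.data(), n, m, 0, 0, n, m);
}
```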
ISBN (Print): 9781450307437
For nested-parallel computations with low depth (span, critical path length), analyzing the work, depth, and sequential cache complexity suffices to attain reasonably strong bounds on the parallel runtime and cache complexity on machine models with either shared or private caches. These bounds, however, do not extend to general hierarchical caches, due to limitations in (i) the cache-oblivious (CO) model used to analyze cache complexity and (ii) the schedulers used to map computation tasks to processors. This paper presents the parallel cache-oblivious (PCO) model, a relatively simple modification to the CO model that can be used to account for costs on a broad range of cache hierarchies. The first change is to avoid capturing artificial data sharing among parallel threads, and the second is to account for parallelism-memory imbalances within tasks. Despite the more restrictive nature of PCO compared to CO, many algorithms have the same asymptotic cache complexity bounds. The paper then describes a new scheduler for hierarchical caches, which extends recent work on "space-bounded schedulers" to allow for computations with arbitrary work imbalance among parallel subtasks. This scheduler attains provably good cache performance and runtime on parallel machine models with hierarchical caches for nested-parallel computations analyzed using the PCO model. We show that under reasonable assumptions our scheduler is "work efficient" in the sense that the costs of the cache misses are evenly balanced across the processors; that is, the runtime can be determined within a constant factor by taking the total cost of the cache misses analyzed for a computation and dividing it by the number of processors. In contrast, to further support our model, we show that no scheduler can achieve such bounds (optimizing for both cache misses and runtime) if work, depth, and sequential cache complexity are the only parameters used to analyze a computation.
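The nested-parallel setting analyzed here is fork-join computation. A minimal sketch (not from the paper) is the divide-and-conquer reduction below, which has O(n) work, O(log n) depth, and O(n/B) sequential cache complexity, i.e. exactly the three parameters the paper argues are insufficient on their own for hierarchical caches. The grain size and the use of std::async are illustrative assumptions.

```cpp
#include <cstddef>
#include <future>
#include <numeric>

// Nested-parallel (fork-join) divide-and-conquer reduction.
// Work O(n), depth O(log n), sequential cache complexity O(n/B):
// the kind of computation a work/depth/cache-complexity analysis targets.
double parallel_sum(const double* a, std::size_t n) {
    const std::size_t GRAIN = 4096;  // illustrative grain size
    if (n <= GRAIN)
        return std::accumulate(a, a + n, 0.0);
    std::size_t half = n / 2;
    // Fork: the left half runs asynchronously, the right half in this task.
    auto left = std::async(std::launch::async, parallel_sum, a, half);
    double right = parallel_sum(a + half, n - half);
    return left.get() + right;  // Join.
}
```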
The objective of high performance computing (HPC) is to ensure that the computational power of hardware resources is well utilized to solve a problem. Various techniques are usually employed to achieve this goal. Improving an algorithm to reduce the number of arithmetic operations, modifying data-access patterns or rearranging data to reduce memory traffic, optimizing code at all levels, and designing parallel algorithms with smaller span or reduced overhead are some of the attractive areas that HPC researchers are working on. In this thesis, we investigate HPC techniques for the implementation of basic routines in computer algebra, targeting hardware acceleration technologies. We start with a sorting algorithm and its application to sparse matrix-vector multiplication, for which we focus on cache complexity issues. Since basic routines in computer algebra often expose a lot of fine-grained parallelism, we then turn our attention to many-core architectures, on which we consider dense polynomial and matrix operations ranging from plain to fast arithmetic. Most of these operations are combined within a bivariate system solver running entirely on a graphics processing unit (GPU).
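As context for the sparse matrix-vector multiplication mentioned above, the sketch below is a plain CSR kernel (an assumed layout, not the thesis's code): the indirect accesses x[col_idx[k]] are what make its cache complexity depend on how the nonzeros are ordered, which is where a cache-efficient sorting routine comes into play.

```cpp
#include <cstddef>
#include <vector>

// Minimal CSR (compressed sparse row) representation; names are illustrative.
struct CSRMatrix {
    std::size_t nrows;
    std::vector<std::size_t> row_ptr;  // size nrows + 1
    std::vector<std::size_t> col_idx;  // size nnz
    std::vector<double> values;        // size nnz
};

// y = A * x. The irregular reads x[col_idx[k]] make the cache behaviour of
// SpMV sensitive to the ordering of the nonzero entries, which is why
// sorting/reordering them can reduce memory traffic.
std::vector<double> spmv(const CSRMatrix& A, const std::vector<double>& x) {
    std::vector<double> y(A.nrows, 0.0);
    for (std::size_t i = 0; i < A.nrows; ++i) {
        double acc = 0.0;
        for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k)
            acc += A.values[k] * x[A.col_idx[k]];
        y[i] = acc;
    }
    return y;
}
```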
ISBN (Print): 9781509057078
We propose a new algorithm for multiplying dense polynomials with integer coefficients in a parallel fashion, targeting multi-core processor architectures. Complexity estimates and experimental comparisons demonstrate the advantage of this new approach.
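The paper's algorithm itself is not reproduced here; as a reference point only, the sketch below is the plain (schoolbook) integer-coefficient product, parallelized over disjoint ranges of output coefficients, i.e. the kind of baseline a multi-core polynomial multiplication algorithm is measured against. The chunking scheme and the use of std::async are illustrative assumptions.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <future>
#include <thread>
#include <vector>

// Baseline schoolbook product c = a * b over 64-bit integer coefficients,
// parallelized over disjoint ranges of output coefficients; each c[k] is
// written by exactly one task, so no synchronization is needed.
// (Reference baseline only, not the algorithm proposed in the paper.)
std::vector<std::int64_t> poly_mul(const std::vector<std::int64_t>& a,
                                   const std::vector<std::int64_t>& b) {
    if (a.empty() || b.empty()) return {};
    std::size_t out = a.size() + b.size() - 1;
    std::vector<std::int64_t> c(out, 0);
    std::size_t nworkers = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = (out + nworkers - 1) / nworkers;
    std::vector<std::future<void>> tasks;
    for (std::size_t lo = 0; lo < out; lo += chunk) {
        std::size_t hi = std::min(out, lo + chunk);
        tasks.push_back(std::async(std::launch::async, [&, lo, hi] {
            for (std::size_t k = lo; k < hi; ++k)
                for (std::size_t i = (k < b.size() ? 0 : k - b.size() + 1);
                     i < a.size() && i <= k; ++i)
                    c[k] += a[i] * b[k - i];
        }));
    }
    for (auto& t : tasks) t.get();
    return c;
}
```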