ISBN:
(Print) 9781450398688
Sparse computations, such as sparse matrix-dense vector multiplication, are notoriously hard to optimize due to their irregularity and memory-boundedness. Solutions to improve the performance of sparse computations have been proposed, ranging from hardware-based ones, such as gather-scatter instructions, to software ones, such as generalized and dedicated sparse formats used together with specialized executor programs for different hardware targets. These sparse computations are often performed on read-only sparse structures: while the data themselves are variable, the sparsity structure itself does not change. Indeed, sparse formats such as CSR typically have a high cost for inserting or removing nonzero elements in the representation. The typical use case is therefore not to modify the sparsity during possibly repeated computations on the same sparse structure. In this work, we exploit the possibility of generating a specialized executor program dedicated to the particular sparsity structure of an input matrix. This creates opportunities to remove indirection arrays and synthesize regular, vectorizable code for such computations. But, at the same time, it introduces challenges in code size and instruction generation, as well as in efficient SIMD vectorization. We present novel techniques and extensive experimental results to efficiently generate SIMD vector code for data-specific sparse computations, and study the limits in terms of applicability and performance of our techniques compared to state-of-practice high-performance libraries like Intel MKL.
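A minimal sketch of the contrast the abstract draws: a baseline CSR SpMV that must read an indirection array (`col_idx`) at run time, versus a hypothetical executor specialized to one fixed sparsity structure, in which the column indices are baked into the generated code. The matrix shape and nonzero positions below are illustrative, not from the paper.

```python
import numpy as np

def spmv_csr(vals, col_idx, row_ptr, x):
    """Baseline CSR SpMV: the gather x[col_idx[k]] is an indirection
    that hinders straightforward SIMD vectorization."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += vals[k] * x[col_idx[k]]
    return y

def spmv_specialized(vals, x):
    """Hypothetical specialized executor for one fixed 3x3 structure
    with nonzeros at (0,0), (0,2), (1,1), (2,0), (2,2): column indices
    are constants in the code, so no indirection array is read."""
    return np.array([
        vals[0] * x[0] + vals[1] * x[2],
        vals[2] * x[1],
        vals[3] * x[0] + vals[4] * x[2],
    ])
```

The specialized form trades generality (and, for large matrices, code size) for regular, indirection-free arithmetic, which is the tension the abstract describes.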
ISBN:
(Print) 9781450367127
Irregular data structures, as exemplified by sparse matrices, have proved to be essential in modern computing. Numerous sparse formats have been investigated to improve the overall performance of sparse Matrix-Vector multiply (SpMV). In this work we instead take a fundamentally different approach: we automatically build sets of regular sub-computations by mining for regular sub-regions in the irregular data structure. Our approach leads to code that is specialized to the sparsity structure of the input matrix, but which no longer needs any indirection arrays, thereby improving SIMD vectorizability. We particularly focus on small sparse structures (below 10M nonzeros), and demonstrate substantial performance improvements and compaction capabilities compared to a classical CSR implementation and Intel MKL IE's SpMV implementation, evaluating on 200+ different matrices from the SuiteSparse repository.
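One simple instance of "mining for regular sub-regions" is finding maximal runs of consecutive column indices within each CSR row: each run can then be executed as a dense, indirection-free dot product. This sketch is only an illustration of the idea; the function name and run encoding are assumptions, not the paper's actual mining algorithm.

```python
def mine_regular_runs(col_idx, row_ptr):
    """For each CSR row, find maximal runs of consecutive column
    indices. Each run is returned as (row, start_col, length, first_nz),
    where first_nz indexes into the values array. Within a run, the
    access pattern is dense and regular, so no indirection is needed."""
    runs = []
    for i in range(len(row_ptr) - 1):
        k = row_ptr[i]
        while k < row_ptr[i + 1]:
            start = k
            while k + 1 < row_ptr[i + 1] and col_idx[k + 1] == col_idx[k] + 1:
                k += 1
            runs.append((i, col_idx[start], k - start + 1, start))
            k += 1
    return runs
```

For example, a row with column indices [0, 1, 2, 5] yields one run of length 3 starting at column 0 plus a singleton at column 5.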
We present a newly developed version of our solvers for the verified solution of dense parametric linear systems, i.e., linear systems whose system matrix and right-hand side depend affine-linearly on parameters that vary inside prescribed intervals. The solvers use our C++ class library for reliable computing, C-XSC. The C-XSC library provides many features, especially easy-to-handle data types for dense and sparse matrices and vectors, and the ability to compute dot products and dot product expressions in arbitrary precision. The new solvers can use either sparse or dense matrices as the coefficient matrices for the parameters. The use of sparse coefficient matrices can result in huge improvements in both performance and memory consumption. BLAS and LAPACK routines are used where applicable, and OpenMP is used for parallelization on multi-core and multi-processor systems. The solvers also provide the ability to compute not only an outer but also a componentwise inner enclosure of the solution set of the system, and to choose between two versions of the algorithm, one being very fast and one giving sharp results and extending the range of solvable systems. We give some examples of parametric linear systems (including real-world examples such as worst-case tolerance analysis of linear electric circuits), give performance measurements of our solvers, and also demonstrate that they scale very well when using multiple cores or processors.
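The affine-linear parameter dependence the abstract refers to has the form A(p) = A0 + Σ_k p_k·Ak and b(p) = b0 + Σ_k p_k·bk, where the coefficient matrices Ak are often very sparse. This sketch only assembles and solves the system at one fixed parameter point to show that structure; it is not a verified interval solver, and the names A0/Aks/b0/bks are illustrative, not the C-XSC interface.

```python
import numpy as np

def assemble(A0, Aks, b0, bks, p):
    """Build A(p) and b(p) for an affine-linear parametric system:
    A(p) = A0 + sum_k p[k] * Aks[k],  b(p) = b0 + sum_k p[k] * bks[k].
    The verified solvers enclose the solution set over interval-valued
    p; here p is a plain point value for illustration."""
    A = A0.copy()
    b = b0.copy()
    for pk, Ak, bk in zip(p, Aks, bks):
        A = A + pk * Ak
        b = b + pk * bk
    return A, b
```

Because each Ak typically touches only a few entries, storing them sparsely (as the new solver version allows) saves both memory and the cost of the per-parameter accumulation.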
Direct volume rendering is a popular technique for scientific visualization. The computation cost of direct volume rendering increases exponentially as the size of the volume dataset increases. Hence, efficient volume rendering has become an important issue. In this work, we study parallel volume rendering algorithms based on sparse data structures. In order to exploit object-space coherence, we propose to employ two sparse-matrix representation schemes as spatial data structures. To further reduce the processing time, we employ data-parallel volume rendering algorithms based on sparse data structures. Two distinct features of our work are: (a) the sparse data structures enable us to reduce the processing time as well as the memory storage requirement; and (b) parallel processing allows us to further speed up the volume rendering process. Experiments were conducted to assess our proposed scheme. Results show that our proposed data-parallel algorithms performed well on two different parallel distributed-memory systems.
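One way a sparse-matrix scheme serves as a spatial data structure, as the abstract describes, is to store each 2D volume slice in CSR form so that empty voxels are skipped entirely during compositing. The function below is a hypothetical sketch of that idea (the abstract does not specify CSR as one of its two schemes), using a simple opacity threshold.

```python
import numpy as np

def slice_to_csr(slice2d, threshold=0.0):
    """Convert one volume slice to CSR arrays, keeping only voxels
    whose value exceeds threshold. Rendering then iterates over stored
    voxels only, exploiting object-space coherence in sparse volumes."""
    vals, col_idx, row_ptr = [], [], [0]
    for row in slice2d:
        for j, v in enumerate(row):
            if v > threshold:
                vals.append(float(v))
                col_idx.append(j)
        row_ptr.append(len(vals))
    return np.array(vals), np.array(col_idx), np.array(row_ptr)
```

For a mostly empty volume, both the memory footprint and the per-slice work scale with the number of stored voxels rather than the full slice size, matching feature (a) above.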