Search criteria: subject term = "communication-avoiding algorithms"
31 records; showing 11-20
SHIFTED CHOLESKY QR FOR COMPUTING THE QR FACTORIZATION OF ILL-CONDITIONED MATRICES
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2020, Vol. 42, No. 1, pp. A477-A503
Authors: Fukaya, Takeshi; Kannan, Ramaseshan; Nakatsukasa, Yuji; Yamamoto, Yusaku; Yanagisawa, Yuka. Affiliations: Hokkaido Univ, Sapporo, Hokkaido, Japan; Arup, 3 Piccadilly Pl, Manchester M1 3BN, Lancs, England; Univ Oxford, Math Inst, Oxford OX2 6GG, England; Univ Electrocommun, Tokyo, Japan; Waseda Univ, Waseda Res Inst Sci & Engn, Tokyo, Japan
The Cholesky QR algorithm is an efficient communication-minimizing algorithm for computing the QR factorization of a tall-skinny matrix $X \in \mathbb{R}^{m \times n}$, where $m \gg n$. Unfortunately it is inherently unstable and ...
Applying the swept rule for solving explicit partial differential equations on heterogeneous computing systems
JOURNAL OF SUPERCOMPUTING, 2021, Vol. 77, No. 2, pp. 1976-1997
Authors: Magee, Daniel J.; Walker, Anthony S.; Niemeyer, Kyle E. Affiliations: Oregon State Univ, Sch Mech Ind & Mfg Engn, Corvallis, OR 97331, USA; Los Alamos Natl Lab, Los Alamos, NM 87545, USA
Applications that exploit the architectural details of high-performance computing (HPC) systems have become increasingly invaluable in academia and industry over the past two decades. The most important hardware devel...
Accelerating solutions of one-dimensional unsteady PDEs with GPU-based swept time-space decomposition
JOURNAL OF COMPUTATIONAL PHYSICS, 2018, Vol. 357, pp. 338-352
Authors: Magee, Daniel J.; Niemeyer, Kyle E. Affiliations: Oregon State Univ, Sch Mech Ind & Mfg Engn, Corvallis, OR 97331, USA
The expedient design of precision components in aerospace and other high-tech industries requires simulations of physical phenomena often described by partial differential equations (PDEs) without exact solutions. Mod...
Orthogonal Layers of Parallelism in Large-Scale Eigenvalue Computations
ACM TRANSACTIONS ON PARALLEL COMPUTING, 2023, Vol. 10, No. 3, pp. 1-31
Authors: Alvermann, Andreas; Hager, Georg; Fehske, Holger. Affiliations: Univ Greifswald, Inst Phys, Felix Hausdorff Str 6, D-17489 Greifswald, Germany; Friedrich Alexander Univ Erlangen Nurnberg, Erlangen Natl High Performance Comp Ctr, Martensstr 1, D-91058 Erlangen, Germany
We address the communication overhead of distributed sparse matrix-(multiple)-vector multiplication in the context of large-scale eigensolvers, using filter diagonalization as an example. The basis of our study is a p...
Graph Expansion and Communication Costs of Fast Matrix Multiplication
JOURNAL OF THE ACM, 2012, Vol. 59, No. 6, pp. 1–23
Authors: Ballard, Grey; Demmel, James; Holtz, Olga; Schwartz, Oded. Affiliations: Univ Calif Berkeley, Berkeley, CA 94720, USA
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely related to the expansion properties of the corresponding computation graphs. We demonstrate this on Strassen's and other f...
Recent Developments in Iterative Methods for Reducing Synchronization  18
18th International Symposium on Distributed Computing and Applications for Business Engineering and Science (DCABES)
Authors: Zou, Qinmeng; Magoules, Frederic. Affiliations: Univ Paris Saclay, Cent Supelec, F-91190 Gif Sur Yvette, France
On modern parallel architectures, the cost of synchronization among processors can often dominate the cost of floating-point computation. Several modifications of the existing methods have been proposed in order to ke...
Graph Expansion and Communication Costs of Fast Matrix Multiplication  11
23rd Annual Symposium on Parallelism in Algorithms and Architectures
Authors: Ballard, Grey; Demmel, James; Holtz, Olga; Schwartz, Oded. Affiliations: Univ Calif Berkeley, Dept Comp Sci, Berkeley, CA 94720, USA
The communication cost of algorithms (also known as I/O-complexity) is shown to be closely related to the expansion properties of the corresponding computation graphs. We demonstrate this on Strassen's and other f...
A Supernodal All-Pairs Shortest Path Algorithm  20
25th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP)
Authors: Sao, Piyush; Kannan, Ramakrishnan; Gera, Prasun; Vuduc, Richard. Affiliations: Oak Ridge Natl Lab, Oak Ridge, TN 37830, USA; Georgia Inst Technol, Atlanta, GA 30332, USA
We show how to exploit graph sparsity in the Floyd-Warshall algorithm for the all-pairs shortest path (APSP) problem. Floyd-Warshall is an attractive choice for APSP on high-performing systems due to its structural si...
I/O-Optimal Algorithms for Symmetric Linear Algebra Kernels  22
34th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)
Authors: Beaumont, Olivier; Eyraud-Dubois, Lionel; Langou, Julien; Verite, Mathieu. Affiliations: Univ Bordeaux, Inria Ctr, Bordeaux, France; Univ Colorado Denver, Denver, CO, USA
In this paper, we consider two fundamental symmetric kernels in linear algebra: the Cholesky factorization and the symmetric rank-k update (SYRK), with the classical three nested loops algorithms for these kernels. In...
Tera-Scale 1D FFT with Low-Communication Algorithm and Intel® Xeon Phi™ Coprocessors  13
International Conference for High Performance Computing, Networking, Storage and Analysis (SC)
Authors: Park, Jongsoo; Bikshandi, Ganesh; Vaidyanathan, Karthikeyan; Tang, Ping Tak Peter; Dubey, Pradeep; Kim, Daehyun. Affiliations: Intel Corp, Parallel Comp Lab, Santa Clara, CA 95051, USA; Intel Corp, Software & Serv Grp, Santa Clara, CA 95051, USA
This paper demonstrates the first tera-scale performance of Intel® Xeon Phi™ coprocessors on 1D FFT computations. Applying a disciplined performance programming methodology of sound algorithm choice, valid perf...