Search criteria: "Subject term = SIMD vectorization"
32 records; showing 1-10
A novel ILU preconditioning method with a block structure suitable for SIMD vectorization
JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS, 2023, vol. 419
Authors: Suzuki, Kengo; Fukaya, Takeshi; Iwashita, Takeshi
Affiliations: Hokkaido Univ Grad Sch Informat Sci & Technol Kita Ku W9 N14 Sapporo Hokkaido 0600814 Japan; Hokkaido Univ Informat Initiat Ctr Kita Ku W5 N11 Sapporo Hokkaido 0600811 Japan
Incomplete LU (ILU) preconditioning is typically used when an iterative solver is applied on an asymmetric system of linear equations. A fill-in selection policy significantly affects the ILU preconditioned iterative ...

SIMD vectorization for the Lennard-Jones potential with AVX2 and AVX-512 instructions
COMPUTER PHYSICS COMMUNICATIONS, 2019, vol. 237, pp. 1-7
Authors: Watanabe, Hiroshi; Nakagawa, Koh M.
Affiliation: Univ Tokyo Inst Solid State Phys Kashiwanoha 5-1-5 Kashiwa Chiba 2778581 Japan
This work describes the SIMD vectorization of the force calculation of the Lennard-Jones potential with Intel AVX2 and AVX-512 instruction sets. Since the force-calculation kernel of the molecular dynamics method invo...

Algorithm 1039: Automatic Generators for a Family of Matrix Multiplication Routines with Apache TVM
ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE, 2024, vol. 50, no. 1, pp. 1-34
Authors: Alaejos, Guillermo; Castello, Adrian; Alonso-Jorda, Pedro; Igual, Francisco D.; Martinez, Hector; Quintana-Orti, Enrique S.
Affiliations: Univ Politecn Valencia Valencia 46022 Spain; Univ Complutense Madrid Madrid 28040 Spain; Univ Cordoba Cordoba 14071 Spain
We explore the utilization of the Apache TVM open source framework to automatically generate a family of algorithms that follow the approach taken by popular linear algebra libraries, such as GotoBLAS2, BLIS, and Open...

YFlows: Systematic Dataflow Exploration and Code Generation for Efficient Neural Network Inference using SIMD Architectures on CPUs
33rd ACM SIGPLAN International Conference on Compiler Construction (CC), 2024
Authors: Zhou, Cyrus; Hassman, Zack; Shah, Dhirpal; Richard, Vaughn; Li, Yanjing
Affiliation: Univ Chicago Chicago IL 60637 USA
We address the challenges associated with deploying neural networks on CPUs, with a particular focus on minimizing inference time while maintaining accuracy. Our novel approach is to use the dataflow (i.e., computatio...

TransLib: A Library to Explore Transprecision Floating-Point Arithmetic on Multi-Core IoT End-Nodes
Design, Automation and Test in Europe Conference and Exhibition (DATE)
Authors: Mirsalari, Seyed Ahmad; Tagliavini, Giuseppe; Rossi, Davide; Benini, Luca
Affiliations: Univ Bologna Bologna Italy; ETH Zurich Switzerland
Reduced-precision floating-point (FP) arithmetic is being widely adopted to reduce memory footprint and execution time on battery-powered Internet of Things (IoT) end-nodes. However, reduced precision computations mus...

Efficient Application of Hanging-Node Constraints for Matrix-Free High-Order FEM Computations on CPU and GPU
37th International Supercomputing Conference on High Performance Computing (ISC High Performance Computing)
Authors: Munch, Peter; Ljungkvist, Karl; Kronbichler, Martin
Affiliations: Helmholtz Zentrum Hereon Geesthacht Germany; Tech Univ Munich Munich Germany; Uppsala Univ Uppsala Sweden
This contribution presents an efficient algorithm for resolving hanging-node constraints on the fly for high-order finite-element computations on adaptively refined meshes, using matrix-free implementations. We concen...

A simple and efficient storage format for SIMD-accelerated SpMV
CLUSTER COMPUTING-THE JOURNAL OF NETWORKS SOFTWARE TOOLS AND APPLICATIONS, 2021, vol. 24, no. 4, pp. 3431-3448
Authors: Bian, Haodong; Huang, Jianqiang; Dong, Runting; Guo, Yuluo; Liu, Lingbin; Huang, Dongqiang; Wang, Xiaoying
Affiliations: Qinghai Univ Dept Comp Technol & Applicat Xining Peoples R China; Tsinghua Univ Dept Comp Sci & Technol Beijing Peoples R China
SpMV (Sparse matrix-vector multiplication) is an essential component in scientific computing and has attracted the attention of researchers in related fields at home and abroad. With the continuous expansion of matrix...

Automating Vectorized Distributed Graph Computation
Proceedings of the ACM on Management of Data, 2024, vol. 2, no. 6, pp. 1-27
Authors: Wenyue Zhao; Yang Cao; Peter Buneman; Jia Li; Nikos Ntarmos
Affiliations: University of Edinburgh Edinburgh UK; Edinburgh Research Center Central Software Institute Huawei Edinburgh UK
Multi-instance graph algorithms interleave the evaluation of multiple instances of the same algorithm with different inputs over the same graph. They have been shown to be significantly faster than traditional serial ...

VECTORIZATION OF A THREAD-PARALLEL JACOBI SINGULAR VALUE DECOMPOSITION METHOD
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2023, vol. 45, no. 3, pp. C73-C100
Author: Novakovic, Vedran
Address: Zagreb 10000 Croatia
The eigenvalue decomposition (EVD) of (a batch of) Hermitian matrices of order two has a role in many numerical algorithms, of which the one-sided Jacobi method for the singular value decomposition (SVD) is the prime ...

EFFICIENT MATRIX-FREE HIGH-ORDER FINITE ELEMENT EVALUATION FOR SIMPLICIAL ELEMENTS
SIAM JOURNAL ON SCIENTIFIC COMPUTING, 2020, vol. 42, no. 3, pp. C97-C123
Authors: Moxey, David; Amici, Roman; Kirby, Mike
Affiliations: Univ Exeter Coll Engn Math & Phys Sci Exeter EX17 1EJ Devon England; Univ Utah Sci Comp & Imaging Inst Salt Lake City UT 84112 USA
With the gap between processor clock speeds and memory bandwidth speeds continuing to increase, the use of arithmetically intense schemes, such as high-order finite element methods, continues to be of considerable int...