咨询与建议

看过本文的还看了

相关文献

该作者的其他文献

文献详情 >Efficient computation of posit... 收藏

Efficient computation of positional population counts using SIMD instructions

用 SIMD 指令的位置的人口计数的有效计算

作     者:Klarqvist, Marcus D. R. Mula, Wojciech Lemire, Daniel 

作者机构:Univ Cambridge Dept Genet Cambridge England Wellcome Sanger Inst Cambridge England Univ Quebec TELUQ 5800 St Denis St Quebec City PQ H2S 3L5 Canada 

出 版 物:《CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE》 (并行学和计算:实践与经验)

年 卷 期:2021年第33卷第17期

页      面:e6304-e6304页

核心收录:

学科分类:08[工学] 0835[工学-软件工程] 0812[工学-计算机科学与技术(可授工学、理学学位)] 

基  金:National Research Council Canada [RGPIN-2017-03910] 

主  题:bioinformatics genomics population counts sequencing SIMD instructions software performance vectorization 

摘      要:In several fields such as statistics, machine learning, and bioinformatics, categorical variables are frequently represented as one-hot encoded vectors. For example, given eight distinct values, we map each value to a byte where only a single bit has been set. We are motivated to quickly compute statistics over such encodings. Given a stream of k-bit words, we seek to compute k distinct sums corresponding to bit values at indexes 0, 1, 2, horizontal ellipsis , k - 1. If the k-bit words are one-hot encoded then the sums correspond to a frequency histogram. This multiple-sum problem is a generalization of the population-count problem where we seek the sum of all bit values. Accordingly, we refer to the multiple-sum problem as a positional population-count. Using SIMD (Single Instruction, Multiple Data) instructions from recent Intel processors, we describe algorithms for computing the 16-bit position population count using less than half of a CPU cycle per 16-bit word. Our best approach uses up to 400 times fewer instructions and is up to 50 times faster than baseline code using only regular (non-SIMD) instructions, for sufficiently large inputs.

读者评论 与其他读者分享你的观点

用户名:未登录
我的评分