Search results - Inner Mongolia University Library

Speed improvements of peptide-spectrum matching using single-instruction multiple-data instructions

PROTEOMICS, 2011, vol. 11, no. 19, pp. 3779-3785

Authors: Zhang, Jian; McQuillan, Ian; Wu, Fang-Xiang. Affiliations: Univ Saskatchewan, Dept Mech Engn, Saskatoon SK S7N 5A9, Canada; Univ Saskatchewan, Dept Comp Sci, Saskatoon SK S7N 5A9, Canada; Univ Saskatchewan, Div Biomed Engn, Saskatoon SK S7N 5A9, Canada

Peptide-spectrum matching is one of the most time-consuming portions of the database search method for assignment of tandem mass spectra to peptides. In this study, we develop a parallel algorithm for peptide-spectrum matching using single-instruction multiple-data (SIMD) instructions. Unlike other parallel algorithms in peptide-spectrum matching, our algorithm parallelizes the computation of matches between a single spectrum and a given peptide sequence from the database. It also significantly reduces the number of comparison operations. Extra improvements are obtained by using SIMD instructions to avoid conditional branches and unnecessary memory access within the algorithm. The implementation of the developed algorithm is based on the Streaming SIMD Extensions technology that is embedded in most Intel microprocessors. Similar technology also exists in other modern microprocessors. A simulation shows that the developed algorithm achieves an 18-fold speedup over the previous version of the Real-Time Peptide-Spectrum Matching algorithm [F. X. Wu et al., Rapid Commun. Mass Spectrom. 2006, 20, 1199-1208]. Therefore, the developed algorithm can be employed to develop real-time control methods for MS/MS.

Keywords: Bioinformatics; MS; Parallel computing; single-instruction multiple-data
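
The abstract above mentions replacing conditional branches with data-parallel comparisons. The following is a minimal sketch of that general idea only, not the authors' algorithm: the function name `count_matches`, the tolerance parameter, and the sample masses are all assumptions made for illustration. Python does not expose SSE instructions, so the loops here are still scalar; the point is that each comparison produces a 0/1 mask that is accumulated with OR and addition instead of an `if` statement, which is the formulation a SIMD compare instruction would apply to several values at once.

```python
def count_matches(peaks, fragments, tol):
    """Count fragments lying within tol of at least one peak.

    Hypothetical helper for illustration; the inner comparison uses
    no if/else, mirroring how a SIMD compare yields a mask.
    """
    total = 0
    for frag in fragments:
        hit = 0
        for peak in peaks:
            # The comparison yields True/False; OR-ing keeps it as a 0/1 mask.
            hit |= (abs(peak - frag) <= tol)
        # Adding the mask replaces "if hit: total += 1".
        total += hit
    return total


# Made-up example masses (not data from the paper).
observed_peaks = [100.0, 200.5, 300.2]
theoretical_fragments = [100.1, 250.0, 300.0]
matches = count_matches(observed_peaks, theoretical_fragments, tol=0.3)
print(matches)
```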

PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation

PARALLEL COMPUTING, 2012, vol. 38, no. 3, pp. 157-174

Authors: Kloeckner, Andreas; Pinto, Nicolas; Lee, Yunsup; Catanzaro, Bryan; Ivanov, Paul; Fasih, Ahmed. Affiliations: NYU, Courant Inst Math Sci, New York, NY 10012, USA; MIT, McGovern Inst Brain Res, Cambridge, MA 02139, USA; Harvard Univ, Rowland Inst, Cambridge, MA 02142, USA; Univ Calif Berkeley, Vis Sci Grad Program, Berkeley, CA 94720, USA; Univ Calif Berkeley, Redwood Ctr Theoret Neurosci, Berkeley, CA 94720, USA; Ohio State Univ, Dept Elect & Comp Engn, Columbus, OH 43210, USA

High-performance computing has recently seen a surge of interest in heterogeneous systems, with an emphasis on modern Graphics Processing Units (GPUs). These devices offer tremendous potential for performance and efficiency in important large-scale applications of computational science. However, exploiting this potential can be challenging, as one must adapt to the specialized and rapidly evolving computing environment currently exhibited by GPUs. One way of addressing this challenge is to embrace better techniques and develop tools tailored to their needs. This article presents one simple technique, GPU run-time code generation (RTCG), along with PyCUDA and PyOpenCL, two open-source toolkits that support this technique. In introducing PyCUDA and PyOpenCL, this article proposes the combination of a dynamic, high-level scripting language with the massive performance of a GPU as a compelling two-tiered computing platform, potentially offering significant performance and productivity advantages over conventional single-tier, static systems. The concept of RTCG is simple and easily implemented using existing, robust infrastructure. Nonetheless, it is powerful enough to support (and encourage) the creation of custom application-specific tools by its users. The premise of the paper is illustrated by a wide range of examples where the technique has been applied with considerable success. (C) 2011 Elsevier B.V. All rights

Keywords: GPU; Many-core; Code generation; Automated tuning; Software engineering; High-level languages; Massive parallelism; single-instruction multiple-data; CUDA; OpenCL
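
The run-time code generation idea in the abstract above can be sketched as a script that builds kernel source text from parameters known only at run time. This is a rough illustration under stated assumptions, not code from the paper: the template, the kernel name `scale_<dtype>`, and the helper `generate_kernel` are hypothetical. The sketch stops at producing the source string so that it runs without a GPU; with PyCUDA installed, such a string would typically be handed to `pycuda.compiler.SourceModule` for compilation.

```python
from string import Template

# Hypothetical CUDA C template; the kernel and parameter names are illustrative.
KERNEL_TEMPLATE = Template("""
__global__ void scale_${dtype}(${dtype} *data, int n)
{
    int base = (blockIdx.x * blockDim.x + threadIdx.x) * ${unroll};
${body}
}
""")


def generate_kernel(dtype: str, unroll: int, factor: float) -> str:
    """Return CUDA C source specialized for dtype, unroll count, and factor."""
    # One bounds-checked statement per unrolled element, with the
    # scaling factor baked into the source as a literal.
    lines = [
        f"    if (base + {i} < n) data[base + {i}] *= {factor!r};"
        for i in range(unroll)
    ]
    return KERNEL_TEMPLATE.substitute(
        dtype=dtype, unroll=unroll, body="\n".join(lines)
    )


source = generate_kernel("float", 4, 2.0)
print(source)
# With a GPU and PyCUDA available, the string could then be compiled, e.g.:
#   from pycuda.compiler import SourceModule
#   module = SourceModule(source)
```

Because the unroll count and scaling factor are substituted before compilation, a script can generate and time several variants and keep the fastest, which is the kind of automated tuning the keywords refer to.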

ARCHITECTURE-INDEPENDENT PARALLEL COMPUTATION

COMPUTER, 1990, vol. 23, no. 12, pp. 38-50

Authors: SKILLICORN, DB. Affiliation: Dept. of Comput. & Inf. Sci., Queen's Univ., Kingston, Ont.

The major parallel architecture classes are considered: single-instruction multiple-data (SIMD) computers, tightly coupled multiple-instruction multiple-data (MIMD) computers, hypercuboid computers and constant-valence MIMD computers. An argument that the PRAM model is universal over tightly coupled and hypercube systems, but not over constant-valence-topology, loosely coupled systems, is reviewed, showing precisely how the PRAM model is too powerful to permit broad universality. Ways in which a model of computation can be restricted to become universal over less powerful architectures are discussed. The Bird-Meertens formalism (R.S. Bird, 1989) is introduced, and it is shown how it is used to express computations in a compact way. It is also shown that the Bird-Meertens formalism is universal over all four architecture classes and that nontrivial restrictions of functional programming languages exist that can be efficiently executed on disparate architectures. The use of the Bird-Meertens formalism as the basis for a programming language is discussed, and it is shown that it is expressive enough to be used for general programming. Other models and programming languages with architecture-independent properties are reviewed.

Keywords: Bird-Meertens formalism; PRAM model; SIMD; architecture-independent properties; constant-valence MIMD computers; constant-valence-topology; functional programming languages; hypercube networks; hypercube systems; hypercuboid computers; loosely coupled-system; major parallel architecture classes; nontrivial restrictions; parallel architectures; parallel computation; parallel machines; parallel programming; random-access storage; single-instruction multiple-data; tightly coupled multiple-instruction multiple-data
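
The abstract above says the Bird-Meertens formalism expresses computations compactly. One common reading is that a computation is written as a composition of list operations such as map and reduce. The sketch below illustrates only that reading; the helper names `hom` and `hom_chunked` are assumptions, not notation from the article. When the combining operator is associative and has an identity, the list can be split into chunks, each chunk evaluated independently, and the partial results combined, which is what makes such expressions independent of a particular machine layout.

```python
from functools import reduce
from operator import add


def hom(f, op, identity, xs):
    """Evaluate reduce(op) after map(f) over xs, sequentially."""
    return reduce(op, map(f, xs), identity)


def hom_chunked(f, op, identity, xs, chunks):
    """Same result as hom, computed chunk by chunk.

    Valid only when op is associative and identity is its identity element.
    Each chunk could be assigned to a different processor.
    """
    size = max(1, -(-len(xs) // chunks))  # ceiling division
    parts = [xs[i:i + size] for i in range(0, len(xs), size)]
    partials = [hom(f, op, identity, part) for part in parts]
    return reduce(op, partials, identity)


def square(x):
    return x * x


data = list(range(1, 11))
sequential = hom(square, add, 0, data)
chunked = hom_chunked(square, add, 0, data, chunks=4)
print(sequential, chunked)
```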

IMAGE UNDERSTANDING ARCHITECTURE - EXPLOITING POTENTIAL PARALLELISM IN MACHINE VISION

COMPUTER, 1992, vol. 25, no. 2, pp. 65-68

Authors: WEEMS, CC; RISEMAN, EM; HANSON, AR. Affiliations: UNIV MASSACHUSETTS, DEPT COMP & INFORMAT SCI, AMHERST, MA 01003; UNIV MASSACHUSETTS, COMP VISION RES LAB, AMHERST, MA 01003

A hardware architecture that addresses at least part of the potential parallelism in each of the three levels of vision abstraction, low (sensory), intermediate (symbolic), and high (knowledge-based), is described. The machine, called the image understanding architecture (IUA), consists of three different, tightly coupled parallel processors: the content addressable array parallel processor (CAAPP) at the low level, the intermediate communication associative processor (ICAP) at the intermediate level, and the symbolic processing array (SPA) at the high level. The CAAPP and ICAP levels are controlled by an array control unit (ACU) that takes its directions from the SPA level. The SPA is a multiple-instruction multiple-data (MIMD) parallel processor, while the intermediate and low levels operate in multiple modes. The CAAPP operates in single-instruction multiple-data (SIMD) associative or multiassociative mode, and the ICAP operates in single-program multiple-data (SPMD) or MIMD mode.

Keywords: ACU; CAAPP; ICAP; IUA; MIMD; SIMD; SPA; SPMD; array control unit; computer vision; computerised picture processing; content addressable array parallel processor; hardware architecture; image understanding architecture; intermediate communication associative processor; machine vision; multiple-instruction multiple-data; parallel architectures; parallel machines; parallel processor; single-instruction multiple-data; single-program multiple-data; symbolic processing array; tightly coupled parallel processors; vision abstraction

Parsimony: Enabling SIMD/Vector Programming in Standard Compiler Flows 2023

21st ACM/IEEE International Symposium on Code Generation and Optimization (CGO)

Authors: Kandiah, Vijay; Lustig, Daniel; Villa, Oreste; Nellans, David; Hardavellas, Nikos. Affiliations: Northwestern Univ, Evanston, IL 60208, USA; NVIDIA, San Jose, CA, USA

ISBN: (print) 9798400701016

Achieving peak throughput on modern CPUs requires maximizing the use of single-instruction, multiple-data (SIMD) or vector compute units. Single-program, multiple-data (SPMD) programming models are an effective way to use high-level programming languages to target these ISAs. Unfortunately, many SPMD frameworks have evolved to have either overly-restrictive language specifications or under-specified programming models, and this has slowed the widescale adoption of SPMD-style programming. This paper introduces Parsimony (PARallel SIMd), an SPMD programming approach built with semantics designed to be compatible with multiple languages and to cleanly integrate into the standard optimizing compiler toolchains for those languages. We first explain the Parsimony programming model semantics and how they enable a standalone compiler IR-to-IR pass that can perform vectorization independently of other passes, improving the language and toolchain compatibility of SPMD programming. We then demonstrate an LLVM prototype of the Parsimony approach that matches the performance of ispc, a popular but more restrictive SPMD approach, and achieves 97% of the performance of hand-written AVX-512 SIMD intrinsics on over 70 benchmarks ported from the Simd Library. We finally discuss where Parsimony has exposed parts of existing language and compiler flows where slight improvements could further enable improved SPMD program vectorization.

Keywords: Parallel Computing; Vectorization; Code Translation; single-instruction multiple-data; Compiler Design
