Search Results - Inner Mongolia University Library

8th IEEE International Symposium on Cluster Computing and the Grid

Authors: Ganeshamoorthy, K.; Ranasinghe, D. N. (Univ Colombo, Sch Comp, Dept Computat & Intelligent Syst, Colombo 07, Sri Lanka)

ISBN: (print) 9781424442379

In this paper, we study the impact of multiprocessor memory systems, in particular distributed memory (DM) and virtual shared memory (VSM), on the implementation of parallel backpropagation neural network algorithms. In the first instance, the neural network is partitioned into sub neural networks by applying a hybrid partitioning scheme. In the second, each partitioned network is evaluated with matrix multiplication. Three different sizes of neural networks are used, with exchange rate prediction as the reference problem. Parallel implementations are obtained for each of the distributed memory and virtual shared memory scenarios. These algorithms are implemented on a high performance cluster, "Monolith", consisting of over 396 nodes. Programming is realized using the Message Passing Interface (MPI) library and C-Linda. The partitioned matrix multiplication has the fastest execution time, and the DM/MPI implementation is always faster than the VSM/Linda equivalent. However, in VSM/Linda it is possible to allow the parallel neural network to choose the optimum number of processors dynamically.

Keywords: VSM; DM; Hybrid partition; Parallel neural network

A parallel PCG solver for large-scale groundwater flow simulation based on OpenMP

2011 4th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2011

Authors: Li, Dandan; Ji, Xiaohui; Wang, Qun (Beijing, China)

ISBN: (print) 9780769545752

Groundwater flow simulation has become one of the top international issues in the new generation of environmental applications. When managing large-scale groundwater flow problems, the intensive computation and large amounts of memory required for modeling are the main bottlenecks for researchers. To solve three-dimensional large-scale groundwater flow problems more rapidly, this paper adopts OpenMP to parallelize the preconditioned conjugate gradient (PCG) algorithm. A numerical experiment with a three-dimensional groundwater flow model was carried out on a computer with four cores. The execution time of the original serial PCG program was found to be about 1.74 to 2.86 times that of the parallel PCG program, depending on the number of threads. The experimental results also demonstrate that the OpenMP-based PCG solver is an effective way of solving large-scale three-dimensional groundwater flow problems. © 2011 IEEE.
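For readers unfamiliar with PCG, the following is a minimal serial sketch in Python, not the authors' OpenMP code. It assumes a Jacobi (diagonal) preconditioner and a small dense symmetric positive definite matrix, neither of which is stated in the abstract; the names `pcg`, `dot` and `matvec` are illustrative. In a shared-memory version, the dot products and the matrix-vector product are the loops that would typically be split across threads.

```python
import math

def dot(u, v):
    # Inner product; a reduction loop in a threaded implementation.
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(A, v):
    # Dense matrix-vector product; each row is independent work.
    return [dot(row, v) for row in A]

def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for SPD A with Jacobi-preconditioned conjugate gradient."""
    n = len(b)
    x = [0.0] * n
    r = [float(bi) for bi in b]          # residual b - A x for x = 0
    if math.sqrt(dot(r, r)) < tol:
        return x
    m_inv = [1.0 / A[i][i] for i in range(n)]   # assumed preconditioner: inverse diagonal
    z = [m_inv[i] * r[i] for i in range(n)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        if math.sqrt(dot(r, r)) < tol:
            break
        z = [m_inv[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x

# Small tridiagonal test system whose exact solution is [1, 1, 1].
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
solution = pcg(A, [1.0, 0.0, 1.0])
```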

Keywords: Groundwater flow

A CMOS architecture allowing parallel DNA comparison for on-chip assembly

IEEE International Symposium on Circuits and Systems

Authors: Hu, Yuanqi; Liu, Yan; Toumazou, Christofer; Georgiou, Pantelis (Univ London Imperial Coll Sci Technol & Med, Dept Elect & Elect Engn, Ctr Bioinspired Technol, Inst Biomed Engn, London SW7 2AZ, England)

ISBN: (print) 9781467302197

This paper introduces a CMOS-based system that has been designed to allow parallel comparison of fragmented DNA sequences for on-chip assembly. The compatibility of different existing PC-based algorithms for implementation in CMOS is compared, and the overlap-layout-consensus approach is found to be the most suitable one. The designed system comprises a scalable processing array capable of parallel computation, which allows identification of overlaps in DNA fragments in addition to error tolerance through dynamic programming. Analysis shows that there is a "pixel area vs computation time" trade-off when implementing such a parallel architecture. Results from a hypothetical assembly confirm good overlap detection and error tolerance, with up to 94% similarity in the detected overlaps, when the error is as much as 10%.
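As a software illustration of error-tolerant overlap detection by dynamic programming, the sketch below finds the longest suffix of one fragment that matches a prefix of another within an edit-distance budget. It is a generic suffix-prefix alignment, not the CMOS array described in the paper; the function name `best_overlap`, the 10% default error rate and the minimum overlap length are assumptions chosen for the example.

```python
def best_overlap(a, b, max_error_rate=0.1, min_len=3):
    """Return (overlap_length, errors) for the longest prefix of b that
    matches some suffix of a with at most max_error_rate * length edits."""
    n, m = len(a), len(b)
    # Row 0: matching an empty suffix of a against b[:j] costs j insertions.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        # Column 0 stays 0: the overlap may start anywhere in a for free.
        cur = [0] * (m + 1)
        for j in range(1, m + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j - 1] + cost,  # match or substitution
                         prev[j] + 1,         # extra base in a
                         cur[j - 1] + 1)      # extra base in b
        prev = cur
    # prev now holds the last row: alignments that end at the end of a.
    for j in range(m, min_len - 1, -1):
        if prev[j] <= max_error_rate * j:
            return j, prev[j]
    return 0, 0

exact = best_overlap("ACGTACGTTT", "CGTTTAAGG")
tolerant = best_overlap("GGGGACGTACGTAC", "ACGTTCGTACTTTT")
```

Here `exact` is the error-free overlap "CGTTT", while `tolerant` accepts a ten-base overlap containing one substitution.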

Keywords: Parallel architectures

Bit-parallel approximate pattern matching: Kepler GPU versus Xeon Phi

Parallel Computing, 2016, vol. 54, pp. 128-138

Authors: Tuan Tu Tran; Liu, Yongchao; Schmidt, Bertil (Johannes Gutenberg Univ Mainz, Inst Informat, D-55128 Mainz, Germany; Georgia Inst Technol, Sch Computat Sci & Engn, Atlanta, GA 30332, USA)

Approximate pattern matching (APM) aims to find the occurrences of a pattern inside a subject text while allowing a limited number of errors. It has been widely used in many application areas such as bioinformatics and information retrieval. Bit-parallel APM takes advantage of the intrinsic parallelism of bitwise operations inside a machine word. This approach typically encodes non-deterministic finite automaton (NFA) states or value differences between adjacent cells of a dynamic programming matrix in the form of bit arrays. Wu-Manber (WM) is a well-known bit-parallel APM algorithm, which simulates an NFA and gains parallel efficiency by performing multiple state updates within a machine word. An important parameter is the machine word size (e.g. 32 or 64 bits for CPUs). Due to increasing vector capabilities, efficient mapping of bit-parallel APM algorithms onto modern high performance computing architectures is an interesting research topic. Prominent examples are Xeon Phi coprocessors and CUDA-enabled GPUs, which provide words of size 512 bits (by means of vector registers) and 1024 bits (by means of warps), respectively. In this paper, we investigate mappings of the WM algorithm onto these two accelerator types. Both architectures are able to achieve speedups of around two orders of magnitude compared to a single-threaded CPU implementation. Moreover, our tile-based implementation on a GeForce Titan graphics card runs up to 2.9x faster than our implementation on an Intel Xeon Phi 5110P. Source code is available at http://***. © 2015 Elsevier B.V. All rights reserved.
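The bit-parallel NFA simulation can be sketched in a few lines. The code below is a plain Shift-And formulation of the Wu-Manber recurrence for up to k edit errors, written for illustration rather than taken from the paper; the name `wu_manber_search` is hypothetical. Python integers are arbitrary precision, so the machine word size discussed in the abstract does not constrain this sketch; on real hardware each state vector would be packed into fixed-width words or vector registers.

```python
def wu_manber_search(pattern, text, k):
    """Return the text positions where an occurrence of pattern ends
    with at most k edit errors (substitution, insertion, deletion)."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    full = (1 << m) - 1          # keep state vectors m bits wide
    accept = 1 << (m - 1)        # bit set when the whole pattern has matched
    # Bit i of char_mask[c] is set when pattern[i] == c.
    char_mask = {}
    for i, ch in enumerate(pattern):
        char_mask[ch] = char_mask.get(ch, 0) | (1 << i)
    # R[d] bit i: pattern[:i+1] matches a suffix of the text read so far
    # with at most d errors. Before reading text, d deletions cover d chars.
    R = [((1 << d) - 1) & full for d in range(k + 1)]
    ends = []
    for j, ch in enumerate(text):
        mask = char_mask.get(ch, 0)
        prev_old = R[0]
        R[0] = ((R[0] << 1) | 1) & mask
        for d in range(1, k + 1):
            old = R[d]
            R[d] = ((((old << 1) | 1) & mask)   # exact match of this character
                    | prev_old                  # extra character in the text
                    | (prev_old << 1)           # substitution
                    | (R[d - 1] << 1)           # character missing from the text
                    | 1) & full
            prev_old = old
        if R[k] & accept:
            ends.append(j)
    return ends

exact_hits = wu_manber_search("abc", "xxabcxx", 0)
fuzzy_hits = wu_manber_search("abc", "xabdx", 1)
```

With one error allowed, "abc" is reported as ending at both "ab" (last character missing) and "abd" (one substitution).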

Keywords: Bit-parallel; Approximate pattern matching; Wu-Manber algorithm; CUDA; GPU; Xeon Phi

PROGRAMMING, ANALYSIS AND SYNTHESIS OF PARALLEL SIGNAL PROCESSORS.

1987 IEEE International Symposium on Circuits and Systems.

Authors: Thaler, M.; Loeffler, Ch; Moschytz, G.S. (ETH Zurich, Switz)

A method for the programming and evaluation of parallel signal-processor architectures based on a data-flow representation of signal-processing algorithms is described. The constant data flow, which is a special property of most signal-processing algorithms, allows the scheduling and resource allocation to be done at compile time, rather than at run time as in usual data-flow systems. It is therefore possible to describe arbitrary hardware configurations; a result that is closer to a realizable hardware solution is guaranteed. Hardware requirements can therefore be kept low. 7 refs.

Keywords: SIGNAL PROCESSING

The design and implementation of OMPit: An OpenMP compiler characterized by logs for parallel and work-sharing

2011 4th International Symposium on Parallel Architectures, Algorithms and Programming, PAAP 2011

Authors: Luo, Qiuming; Cai, Ye; Liu, Chengjian; Kong, Chang (College of Computer Science and Software Engineering, Shenzhen University, China)

ISBN: (print) 9780769545752

There are many tools for OpenMP benchmarking that measure various aspects of performance, such as the overheads of OpenMP directives and the characteristics of the whole system. However, tools that show the work-sharing details after an OpenMP program has finished running are lacking. OMPit (OMPi for tutoring) is designed to provide work-sharing information during execution, which can be used for tutoring and might help with debugging or tuning. The work-sharing log includes the work assignment and the timestamps for three different work-sharing behaviors. The logging information can be output as text files or visualized as figures. The design of OMPit is presented, and the details of how to insert the logging code into the OMPi compiler are also discussed. © 2011 IEEE.

Keywords: Benchmarking

A case study of SWIM: Optimization of memory intensive application on GPGPU

International Symposium on Parallel Architectures, Algorithms, and Programming

Authors: Yi, Wei; Tang, Yuhua; Wang, Guibin; Fang, Xudong (National Laboratory for Parallel and Distributed Processing, School of Computer, National University of Defense Technology, Changsha, China)

ISBN: (print) 9780769543123

Recently, GPGPU has been widely adopted in the High Performance Computing (HPC) field. The limited global memory bandwidth poses a great challenge to many GPGPU programmers trying to exploit parallelism within the CPU-GPU heterogeneous platform. In this paper, we choose SWIM, a typical memory intensive application from the SPEC OMP 2001 benchmark suite, as a case study. We attempt to optimize the performance and energy consumption of the application using different memory access mechanisms, and present optimization methods including matrix transposition and kernel fusion. The experimental results on the Intel Core™ i920 CPU plus GeForce GTX 295 platform show that the proposed optimization methods achieve a speedup of 8.7X over the original OpenMP program and reduce energy consumption by 83% for a problem size of 2048*2048. © 2010 IEEE.

Keywords: Energy utilization

Block sorting is hard

6th International Symposium on Parallel Architectures, Algorithms and Networks (I-SPAN 02)

Authors: Bein, WW; Larmore, LL; Latifi, S; Sudborough, IH (Univ Nevada, Dept Comp Sci, Las Vegas, NV 89154, USA)

ISBN: (print) 0769515797

Block sorting is used in connection with Optical Character Recognition (OCR). Recent work has focused on finding good strategies which work in practice. In this paper, we show that optimizing block sorting is NP-hard. Along with this result, we give new non-trivial lower bounds. These bounds can be computed efficiently. We define the concept of "Local Property algorithms" and show that several previously published block sorting algorithms fall into this class.

Keywords: Approximation algorithms; Character recognition; Computer science; Optical character recognition software; Parallel architectures; Performance evaluation; Sorting; Testing

Parallel evolutionary algorithms based on shared memory programming approaches

JOURNAL OF SUPERCOMPUTING, 2011, vol. 58, no. 2, pp. 270-279

Authors: Redondo, J. L.; Garcia, I.; Ortigosa, P. M. (Univ Almeria, Dpt Comp Architecture & Elect, Almeria, Spain)

In this work, two parallel techniques based on shared memory programming are presented. These models are especially suitable for evolutionary algorithms. To study their performance, the algorithm UEGO (U...

Keywords: Evolutionary algorithm; Shared memory programming; Computational experiment; UEGO

Exploiting shape in parallel programming

Proceedings of the 1996 IEEE 2nd International Conference on Algorithms & Architectures for Parallel Processing, ICA3PP

Authors: Barry Jay, C.; Clarke, David G.; Edwards, Jenny J. (Univ of Technology Sydney)

Shape theory is a new approach to data types and programming based on the separation of a data type into its 'shape' and 'data' parts. Shape is common in parallel computing. This paper identifies areas where the explicit use of shape reduces the burden of programming a parallel computer, using examples from an implementation of Cholesky decomposition.

Keywords: Computer programming languages
