Search Results - Inner Mongolia University Library

2nd IEEE International Conference on Parallel, Distributed and Grid Computing (PDGC)

Authors: Barbhuiya, Saki; Liang, Ying (Univ West Scotland, Sch Comp, Paisley, Renfrew, Scotland)

ISBN: (print) 9781467329255; 9781467329224

Weather Forecast Models (WFMs) are often implemented as sequential programs. This usually leads to longer execution times, greater use of computer resources and higher power consumption, because WFMs involve computationally intensive tasks that process large amounts of weather forecast data. These become performance problems for weather forecast companies. Companies have already tried to use multi-core systems to overcome them, but this does not always work because of the poor selection and implementation of programming strategies. To address these problems, a research project was conducted as a case study for the weather production company Weather2 Ltd. The case study applied multi-threaded programming on multi-core systems as a different implementation strategy for Weather2's WFM, as a solution to the problems of its sequential programs. The results showed that this new strategy could improve the performance of the WFM significantly by reducing the execution time and using fewer computer resources and less power. This paper presents the case study and its results.

Keywords: multi-threaded programming; weather forecast model; multi-core system; parallel computing

Finite element numerical integration for first order approximations on multi- and many-core architectures

COMPUTER METHODS IN APPLIED MECHANICS AND ENGINEERING, 2016, Vol. 305, pp. 827-848

Authors: Banas, Krzysztof; Kruzel, Filip; Bielanski, Jan (AGH Univ Sci & Technol, Dept Appl Comp Sci & Modelling, Mickiewicza 30, PL-30059 Krakow, Poland; Cracow Univ Technol, Inst Comp Sci, Warszawska 24, PL-31155 Krakow, Poland)

The paper presents investigations on the performance of the finite element numerical integration algorithm for first order approximations and three processor architectures, popular in scientific computing, classical x86_64 CPU, Intel Xeon Phi and NVIDIA Kepler GPU. We base the discussion on theoretical performance models and our own implementations for which we perform a range of computational experiments. For the latter, we consider a unifying programming model and portable OpenCL implementation for all architectures. Variations of the algorithm due to different problems solved and different element types are investigated and several optimizations aimed at proper optimization and mapping of the algorithm to computer architectures are demonstrated. The experimental results show the varying levels of performance for different architectures, but indicate that the algorithm can be effectively ported to all of them. The conclusions indicate the factors that limit the performance for different problems and types of approximation and the performance ranges that can be expected for FEM numerical integration on different processor architectures. (C) 2016 Elsevier B.V. All rights reserved.

Keywords: Finite element method; First order approximation; Numerical integration; multi-threaded programming; multi-core processors; Graphics processors

Computing Optimised Parallel Speeded-Up Robust Features (P-SURF) on Multi-Core Processors

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2010, Vol. 38, No. 2, pp. 138-158

Authors: Zhang, Nan (Xian Jiaotong Liverpool Univ, Dept Comp Sci & Software Engn, Suzhou, Peoples R China)

This article presents a novel CPU-based parallel algorithm (P-SURF) that computes the Speeded-Up Robust Features (SURF), a local descriptor that is able to find point correspondences between images in spite of scaling and rotation. The algorithm presented here parallelises all the seven major steps found in the original serial computation. The task in each of the steps is decomposed and the fractions are assigned to running threads bound onto distinctive processors. The implementation of the algorithm was tested using randomly selected images in regard to performance, scalability and stability. The results showed that its performance on mid-level Intel Core Duo processors was comparable to that of some fast GPU-based SURF implementations. For example, on a testing system equipped with an Intel Core Duo P8600 at 2.4 GHz, P-SURF was able to extract and represent features from a 640 x 480 image at a rate of 33 frames per second. The experimental results also revealed that, instead of leaving the threads to the kernel for processor assignment, assigning hard processor affinity by the algorithm produced better performance and stability.

Keywords: Parallel computing; multi-threaded programming; multi-core processing; Image feature extraction; Image local descriptor

Bounding the number of segment histories during data race detection

PARALLEL COMPUTING, 2002, Vol. 28, No. 9, pp. 1221-1238

Authors: Christiaens, M; Ronsse, M; De Bosschere, K (Univ Ghent, B-9000 Ghent, Belgium)

In this article we present a technique for reducing the memory overhead while performing data race detection. Data races occur when multiple threads modify the same memory location without proper synchronization. In order to detect data races, we need to check all read and write operations performed by the threads. We describe a method for efficiently storing these read and write operations called "merging of segment histories". This method improves upon known techniques by ensuring an upper limit to the amount of memory consumed for storing the read and write operations while maintaining the full accuracy of the data race detection. The method has been implemented in an existing data race detection tool called RecPlay for Solaris binaries. We show that it enables us to perform data race detection on benchmarks which were previously beyond our grasp. (C) 2002 Elsevier Science B.V. All rights reserved.

Keywords: data race detection; multi-threaded programming; debugging
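
The abstract above defines a data race as several threads modifying the same memory location without proper synchronization. The following is a minimal sketch of that definition only, not of the paper's segment-history technique or of RecPlay; the function name `run_counter` and its parameters are hypothetical and chosen for illustration.

```python
import threading


def run_counter(num_threads: int, increments: int, use_lock: bool) -> int:
    """Increment one shared counter from several threads; return its final value."""
    counter = {"value": 0}      # the shared memory location
    lock = threading.Lock()

    def worker() -> None:
        for _ in range(increments):
            if use_lock:
                with lock:      # synchronized: one read-modify-write at a time
                    counter["value"] += 1
            else:
                # Unsynchronized read-modify-write: two threads may read the
                # same old value, so one of the updates can be lost.
                counter["value"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    print("with lock:   ", run_counter(4, 50_000, use_lock=True))
    print("without lock:", run_counter(4, 50_000, use_lock=False))
```

With the lock, the result always equals `num_threads * increments`. Without it, the result may come out lower on some runs and interpreters, which is why detecting such races by testing alone is unreliable.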

From verified model to executable program: the PAT approach

INNOVATIONS IN SYSTEMS AND SOFTWARE ENGINEERING, 2016, Vol. 12, No. 1, pp. 1-26

Authors: Zhu, Huiquan; Sun, Jing; Dong, Jin Song; Lin, Shang-Wei (Natl Univ Singapore, Dept Comp Sci, Singapore 117548, Singapore; Univ Auckland, Dept Comp Sci, Auckland 1, New Zealand; Nanyang Technol Univ, Sch Comp Engn, 50 Nanyang Ave, Singapore 639798, Singapore)

CSP# is a formal modeling language that emphasizes the design of communication in concurrent systems. PAT framework provides a model checking environment for the simulation and verification of CSP# models. Although the desired properties can be formally verified at the design level, it is not always straightforward to ensure the correctness of the system's implementation conforms to the behaviors of the formal design model. To avoid human error and enhance productivity, it would be beneficial to have a tool support to automatically generate the executable programs from their corresponding formal models. In this paper, we propose such a solution for translating verified CSP# models into C# programs in the PAT framework. We encoded the CSP# operators in a C# library-"***", where the event synchronization is based on the "Monitor" class in C#. The precondition and choice layers are built on top of the CSP event synchronization to support language-specific features. We further developed a code generation tool to automatically transform CSP# models into multi-threaded C# programs. We proved that the generated C# program and original CSP# model are equivalent on the trace semantics. This equivalence guarantees that the verified properties of the CSP# models are preserved in the generated C# programs. Furthermore, based on the existing implementation of choice operator, we improved the synchronization mechanism by pruning the unnecessary communications among the choice operators. The experiment results showed that the improved mechanism notably outperforms the standard JCSP library.

Keywords: Model checking; CSP#; multi-threaded programming; C#

A Stack-Slicing Algorithm for Multi-Core Model Checking

ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2008, Vol. 198, No. 1, pp. 3-16

Authors: Holzmann, Gerard J. (NASA JPL, Lab Reliable Software, 4800 Oak Grove Dr, Pasadena, CA 91109, USA)

The broad availability of multi-core chips on standard desktop PCs provides strong motivation for the development of new algorithms for logic model checkers that can take advantage of the additional processing power. With a steady increase in the number of available processing cores, we would like the performance of a model checker to increase as well - ideally linearly. The new trend implies a change of focus away from cluster computers towards shared memory systems. In this paper we discuss the multi-core algorithms that are in development for the SPIN model checker.

Keywords: multi-core systems; Distributed systems; multi-threaded programming; Software verification; Logic model checking; Cluster computers

Fault Detection in Multi-threaded C++ Server Applications

ELECTRONIC NOTES IN THEORETICAL COMPUTER SCIENCE, 2007, Vol. 174, No. 9, pp. 5-22

Authors: Muehlenfeld, Arndt; Wotawa, Franz (Graz Univ Technol, Inst Software Technol, Graz, Austria)

Due to increasing demands for processing power on the one hand and the physical limit on CPU clock speed on the other, multi-threaded programming is becoming more important in current applications. Unfortunately, multi-threaded programs are prone to programming mistakes that result in hard-to-find defects, mainly race conditions and deadlocks. The need for tools that help find these faults is imminent, but currently available tools are either difficult to use because they require annotations, unable to cope with more than a few tens of kLOC, or issue too many false warnings. This paper describes experiments with the freely available tool Helgrind and the results obtained by using it to debug a server application comprising 500 kLOC. We present improvements to the runtime analysis of C++ programs that result in a dramatic reduction of false warnings.

Keywords: data races; race conditions; debugging parallel programs; synchronization; multi-threaded programming; object-oriented programming; static-dynamic co-analysis

Parallel Multi-View Graph Matrix Completion for Large Input Matrix

9th IEEE Annual Computing and Communication Workshop and Conference (CCWC)

Authors: Koohi, Arezou; Homayoun, Houman (George Mason Univ, Elect & Comp Engn, Fairfax, VA 22030, USA)

ISBN: (print) 9781728105543

We propose a method for parallel multi-view graph matrix completion for the prediction of ratings in recommender systems. The missing ratings are computed from both a similarity matrix and a rating matrix. The rating matrix is sparse, and some items might not have any rating information available. The similarity matrix can be calculated from different item attributes available from e-commerce websites. As the input matrix becomes large, the need for more computationally efficient matrix completion increases. The main contribution of this paper is to show a speed-up in calculating the missing ratings by using multi-threaded programming. Simulation results are based on a large input matrix and show a reduction in RMSE for the case of cold-start prediction.

Keywords: recommender systems; multi-view graph; multi-threaded programming; parallel computing; matrix decomposition

Scalable Data-Driven PageRank: Algorithms, System Issues, and Lessons Learned

21st International Conference on Parallel and Distributed Computing (Euro-Par)

Authors: Whang, Joyce Jiyoung; Lenharth, Andrew; Dhillon, Inderjit S.; Pingali, Keshav (Univ Texas Austin, Austin, TX 78712, USA)

ISBN: (print) 9783662480960; 9783662480953

Large-scale network and graph analysis has received considerable attention recently. Graph mining techniques often involve an iterative algorithm, which can be implemented in a variety of ways. Using PageRank as a model problem, we look at three algorithm design axes: work activation, data access pattern, and scheduling. We investigate the impact of different algorithm design choices. Using these design axes, we design and test a variety of PageRank implementations finding that data-driven, push-based algorithms are able to achieve more than 28x the performance of standard PageRank implementations (e.g., those in GraphLab). The design choices affect both single-threaded performance as well as parallel scalability. The implementation lessons not only guide efficient implementations of many graph mining algorithms, but also provide a framework for designing new scalable algorithms.

Keywords: Scalable computing; Graph analytics; PageRank; multi-threaded programming; Data-driven algorithm
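
To make the terms "data-driven" (work activation through a worklist) and "push-based" (a node writes to its out-neighbors) in the abstract above more concrete, here is a minimal sequential sketch of a residual-push PageRank. It is an illustration under stated assumptions, not the authors' implementation: the function name `push_pagerank`, the dictionary graph format, the damping factor `alpha` and the tolerance `eps` are all hypothetical choices, and the parallel scheduling studied in the paper is not modeled.

```python
from collections import deque


def push_pagerank(out_edges, alpha=0.85, eps=1e-10):
    """Residual-push PageRank on a graph given as {node: [out-neighbors]}.

    Assumption: every neighbor also appears as a key of out_edges.
    Nodes without out-edges simply absorb their residual (mass is dropped).
    """
    n = len(out_edges)
    rank = {v: 0.0 for v in out_edges}
    residual = {v: (1.0 - alpha) / n for v in out_edges}

    # Data-driven: only nodes whose residual may exceed eps are on the worklist.
    worklist = deque(out_edges)
    queued = set(out_edges)

    while worklist:
        v = worklist.popleft()
        queued.discard(v)
        r_v = residual[v]
        if r_v < eps:
            continue
        rank[v] += r_v              # absorb the residual into the rank
        residual[v] = 0.0
        if not out_edges[v]:
            continue
        share = alpha * r_v / len(out_edges[v])
        for w in out_edges[v]:      # push-based: write to out-neighbors
            residual[w] += share
            if residual[w] >= eps and w not in queued:
                worklist.append(w)  # activate w only when it has work to do
                queued.add(w)
    return rank


if __name__ == "__main__":
    graph = {0: [1], 1: [2], 2: [0]}    # a directed 3-cycle
    print(push_pagerank(graph))
```

On a directed cycle every node is equivalent, so the returned ranks should be close to uniform; the worklist empties once every residual has dropped below `eps`.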

Helgrind⁺: An Efficient Dynamic Race Detector

23rd IEEE International Parallel and Distributed Processing Symposium

Authors: Jannesari, Ali; Bao, Kalbin; Pankratius, Victor; Tichy, Walter F. (Univ Karlsruhe, D-76131 Karlsruhe, Germany)

ISBN: (print) 9781424437511

Finding synchronization defects is difficult due to non-deterministic orderings of parallel threads. Current tools for detecting synchronization defects tend to miss many data races or produce an overwhelming number of false alarms. In this paper, we describe Helgrind+, a dynamic race detection tool that incorporates correct handling of condition variables and a combination of the lockset algorithm and happens-before relation. We compare our techniques with Intel Thread Checker and the original Helgrind tool on two substantial benchmark suites. Helgrind+ reduces the number of both false negatives (missed races) and false positives. The additional accuracy incurs almost no performance overhead.

Keywords: Race detection; race conditions; debugging parallel programs; multi-threaded programming; dynamic analysis; happens-before; lockset
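
The abstract above mentions the lockset algorithm as one ingredient of Helgrind+. The sketch below shows only the basic lockset idea in a heavily simplified form: for each shared variable, keep the set of locks that were held on every access so far, and warn when that set becomes empty after more than one thread has touched the variable. It is not Helgrind+ and omits the happens-before relation and condition-variable handling; the event format and the name `lockset_check` are hypothetical.

```python
def lockset_check(events):
    """Return variables whose candidate lockset becomes empty.

    events: iterable of (kind, thread, target) tuples, where kind is
    "acquire" or "release" (target is a lock) or "access" (target is a variable).
    """
    held = {}         # thread -> locks currently held by that thread
    candidates = {}   # variable -> locks held on every access so far
    accessors = {}    # variable -> threads that have accessed it
    warnings = []

    for kind, thread, target in events:
        locks = held.setdefault(thread, set())
        if kind == "acquire":
            locks.add(target)
        elif kind == "release":
            locks.discard(target)
        elif kind == "access":
            if target not in candidates:
                candidates[target] = set(locks)   # first access: start from held locks
            else:
                candidates[target] &= locks       # refine by intersection
            accessors.setdefault(target, set()).add(thread)
            # Warn only once the variable is shared and no common lock remains.
            if (not candidates[target]
                    and len(accessors[target]) > 1
                    and target not in warnings):
                warnings.append(target)
    return warnings


if __name__ == "__main__":
    trace = [
        ("acquire", "T1", "L"), ("access", "T1", "x"), ("release", "T1", "L"),
        ("access", "T2", "x"),   # T2 touches x without holding L
    ]
    print(lockset_check(trace))
```

Because the check ignores ordering between threads, it can flag accesses that are in fact ordered by other synchronization, which is the kind of false alarm a happens-before analysis is meant to reduce.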
