Search Results - Inner Mongolia University Library

Mapping stream programs onto multicore platforms by local search and genetic algorithm

COMPUTER LANGUAGES SYSTEMS & STRUCTURES, 2016, vol. 46, pp. 182-205

Authors: Farhad, S. M.; Nayeem, Muhammad Ali; Rahman, Md. Khaledur; Rahman, M. Sohel (BUET Dept CSE Dhaka 1000 Bangladesh)

This paper presents a number of novel metaheuristic approaches that can efficiently map stream graphs onto multicores. A stream graph consists of a set of actors performing different functions and communicating through edges. Orchestrating stream graphs on multicores can be formulated as an Integer Linear Programming (ILP) problem, but an ILP solver takes exponential time to provide an optimal solution. We propose metaheuristic algorithms to achieve near-optimal solutions within a reasonable amount of time. We employ six different variants of the Hill-Climbing (HC) algorithm, using different tweak operators, that produce excellent results extremely quickly. We also propose six different variants of the Genetic Algorithm (GA) to examine how effective these variants can be in escaping local optima. We finally combine HC and GA techniques (also known as a 'memetic algorithm') to produce hybrid techniques that outperform the individual HC and GA techniques. We compare our results with those generated by the CPLEX optimization tool. Our best technique has achieved a geometric mean speedup of 7.42x across a range of StreamIt benchmarks on an eight-core processor. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Stream programming Metaheuristics Local search operator Compiler optimization parallel programming Genetic algorithm Hybrid genetic algorithm
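
The hill-climbing idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the only objective is the load of the busiest core (communication cost is ignored), and the tweak operator (move one randomly chosen actor to a random core), the function names, and the work estimates are all hypothetical.

```python
import random

def makespan(assign, work, cores):
    """Load of the busiest core, used here as the cost of a mapping."""
    load = [0] * cores
    for actor, core in enumerate(assign):
        load[core] += work[actor]
    return max(load)

def hill_climb(work, cores, iters=2000, seed=0):
    """Accept a random single-actor move whenever it does not worsen the cost."""
    rng = random.Random(seed)
    assign = [rng.randrange(cores) for _ in work]  # random initial mapping
    best = makespan(assign, work, cores)
    for _ in range(iters):
        actor = rng.randrange(len(work))
        old = assign[actor]
        assign[actor] = rng.randrange(cores)       # tweak: move one actor
        cost = makespan(assign, work, cores)
        if cost <= best:
            best = cost                            # keep the move
        else:
            assign[actor] = old                    # revert a worsening move
    return assign, best

work = [4, 7, 1, 3, 9, 2, 6, 5]  # hypothetical per-actor work estimates
assign, best = hill_climb(work, cores=4)
print(assign, best)
```

Because a move is only kept when it does not increase the cost, the search can stall in a local optimum, which is the limitation the abstract says the genetic and hybrid variants are meant to address.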

An efficient and flexible scanning of databases of protein secondary structures

JOURNAL OF INTELLIGENT INFORMATION SYSTEMS, 2016, vol. 46, no. 1, pp. 213-233

Authors: Mrozek, Dariusz; Socha, Bartek; Kozielski, Stanislaw; Malysiak-Mrozek, Bozena (Silesian Tech Univ Inst Informat Akad 16 PL-44100 Gliwice Poland)

Protein secondary structures describe protein construction in terms of regular spatial shapes, including alpha-helices, beta-strands, and loops, which a protein's amino acid chain can adopt in some of its regions. This information is supportive for protein classification, functional annotation, and 3D structure prediction. The relevance of this information and the scope of its practical applications create the need for its effective storage and processing. Relational databases, widely used in commercial systems in recent years, are one of the serious alternatives honed by years of experience, enriched with developed technologies, equipped with the declarative SQL query language, and accepted by the large community of programmers. Unfortunately, relational database management systems are not designed for efficient storage and processing of biological data, such as protein secondary structures. In this paper, we present a new search method implemented in the search engine of the PSS-SQL language. PSS-SQL allows the formulation of queries against a relational database in order to find proteins having secondary structures similar to the structural pattern specified by a user. In the paper, we show how the search process can be accelerated by multiple scanning of the Segment Index and parallel implementation of the alignment procedure using multiple threads working on multiple-core CPUs.

Keywords: Bioinformatics Proteins Secondary structure Query language Information retrieval parallel programming Alignment Structure matching SQL Databases

Contention in Structured Concurrency

ACM SIGPLAN Notices, 2017, vol. 52, no. 8, pp. 75-88

Authors: Acar, Umut A.; Ben-David, Naama; Rainey, Mike (Carnegie Mellon University United States; Inria France)

Over the past two decades, many concurrent data structures have been designed and implemented. Nearly all such work analyzes concurrent data structures empirically, omitting asymptotic bounds on their efficiency, partly because of the complexity of the analysis needed, and partly because of the difficulty of obtaining relevant asymptotic bounds: when the analysis takes into account important practical factors, such as contention, it is difficult or even impossible to prove desirable bounds. In this paper, we show that considering structured concurrency or relaxed concurrency models can enable establishing strong bounds, also for contention. To this end, we first present a dynamic relaxed counter data structure that indicates the non-zero status of the counter. Our data structure extends a recently proposed data structure, called SNZI, allowing our structure to grow dynamically in response to the increasing degree of concurrency in the system. Using the dynamic SNZI data structure, we then present a concurrent data structure for series-parallel directed acyclic graphs (sp-dags), a key data structure widely used in the implementation of modern parallel programming languages. The key component of sp-dags is an in-counter data structure that is an instance of our dynamic SNZI. We analyze the efficiency of our concurrent sp-dags and in-counter data structures under the nested-parallel computing paradigm. This paradigm offers a structured model for concurrency. Under this model, we prove that our data structures require amortized O(1) shared memory steps, including contention. We present an implementation and an experimental evaluation that suggests that the sp-dags data structure is practical and can perform well in practice. © 2017 ACM.

Keywords: parallel programming

Ebb: A DSL for Physical Simulation on CPUs and GPUs

ACM TRANSACTIONS ON GRAPHICS, 2016, vol. 35, no. 2, pp. 1–12

Authors: Bernstein, Gilbert Louis; Shah, Chinmayee; Lemire, Crystal; Devito, Zachary; Fisher, Matthew; Levis, Philip; Hanrahan, Pat (Stanford Univ Stanford CA 94305 USA)

ISBN: (print) 9781450342797

Designing programming environments for physical simulation is challenging because simulations rely on diverse algorithms and geometric domains. These challenges are compounded when we try to run efficiently on heterogeneous parallel architectures. We present Ebb, a Domain-Specific Language (DSL) for simulation, that runs efficiently on both CPUs and GPUs. Unlike previous DSLs, Ebb uses a three-layer architecture to separate (1) simulation code, (2) definition of data structures for geometric domains, and (3) runtimes supporting parallel architectures. Different geometric domains are implemented as libraries that use a common, unified, relational data model. By structuring the simulation framework in this way, programmers implementing simulations can focus on the physics and algorithms for each simulation without worrying about their implementation on parallel computers. Because the geometric domain libraries are all implemented using a common runtime based on relations, new geometric domains can be added as needed, without specifying the details of memory management, mapping to different parallel architectures, or having to expand the runtime's interface. We evaluate Ebb by comparing it to several widely used simulations, demonstrating comparable performance to handwritten GPU code where available, and surpassing existing CPU performance optimizations by up to 9x when no GPU code exists.

Keywords: Simulation Database Relations Domain-Specific Languages programming languages parallel programming geometric data structures GPU computing local computation

Scientific computations on multi-core systems using different programming frameworks

APPLIED NUMERICAL MATHEMATICS, 2016, vol. 104, pp. 62-80

Authors: Michailidis, Panagiotis D.; Margaritis, Konstantinos G. (Univ Macedonia Dept Balkan Slav & Oriental Studies 156 Egnatia Str Thessaloniki 54636 Greece; Univ Macedonia Dept Appl Informat 156 Egnatia Str Thessaloniki 54636 Greece)

Numerical linear algebra is one of the most important forms of scientific computation. The basic computations in numerical linear algebra are matrix computations and the solution of linear systems. These computations are used as kernels in many computational problems. This study demonstrates the parallelisation of these scientific computations using multi-core programming frameworks. Specifically, the frameworks examined here are Pthreads, OpenMP, Intel Cilk Plus, Intel TBB, SWARM, and FastFlow. A unified and exploratory performance evaluation and a qualitative study of these frameworks are also presented for parallel scientific computations with several parameters. The OpenMP and SWARM models produce good results running in parallel with compiler optimisation when implementing matrix operations at large and medium scales, whereas the remaining models do not perform as well for some matrix operations. The qualitative results show that the OpenMP, Cilk Plus, TBB, and SWARM frameworks require minimal programming effort, whereas the other models require advanced programming skills and experience. Finally, based on an extended study, general conclusions regarding the programming models and matrix operations for some parameters were obtained. (C) 2014 IMACS. Published by Elsevier B.V. All rights reserved.

Keywords: Scientific computations Linear algebra parallel computing Multi-core parallel programming
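
As a rough illustration of the kind of matrix kernel the abstract above parallelises, the sketch below splits a matrix product by rows and hands each row to a worker thread. It uses Python's standard `concurrent.futures` rather than any of the frameworks the study evaluates, and the function names are hypothetical. In CPython the global interpreter lock prevents a real speedup for pure-Python arithmetic, so this shows only the row-wise decomposition, not the performance behaviour reported in the paper.

```python
from concurrent.futures import ThreadPoolExecutor

def row_times_matrix(row, b):
    """Compute one row of the product A @ B."""
    return [sum(x * y for x, y in zip(row, col)) for col in zip(*b)]

def parallel_matmul(a, b, workers=4):
    """Multiply matrices by giving each row of A to a worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order, so rows come back in place.
        return list(pool.map(lambda row: row_times_matrix(row, b), a))

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(parallel_matmul(a, b))  # [[19, 22], [43, 50]]
```

Rows are independent of one another, which is why this kernel decomposes cleanly; the same row-wise split is the usual starting point in loop-parallel frameworks.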

A distributed neuro-genetic programming tool

SWARM AND EVOLUTIONARY COMPUTATION, 2016, vol. 27, pp. 145-155

Authors: Russo, Marco (Univ Catania Dept Phys & Astron Viale Andrea Doria 6 I-95125 Catania Italy)

This paper describes the performance of the Brain Project, a distributed software tool for the formal modeling of numerical data using a hybrid neural-genetic programming technique. One of the most interesting characteristics of the Brain Project is its distributed implementation. Unlike many other parallel and/or distributed solutions, the only requirement of the Brain Project is that the collaborating personal computers must be 64-bit Linux machines connected to the Internet via the transmission control protocol/internet protocol. The performance of the Brain Project is clearly enhanced with the very simple parallelization scheme illustrated in the paper. Although the Brain Project presents many innovative solutions for genetic programming research, this paper focuses mainly on its behavior in the distributed environment. (C) 2015 Elsevier B.V. All rights reserved.

Keywords: Genetic programming Neural networks Distributed computing parallel programming

Evaluating Large-Scale Biomedical Ontology Matching Over Parallel Platforms

IETE TECHNICAL REVIEW, 2016, vol. 33, no. 4, pp. 415-427

Authors: Amin, Muhammad Bilal; Khan, Wajahat Ali; Hussain, Shujaat; Bui, Dinh-Mao; Banos, Oresti; Kang, Byeong Ho; Lee, Sungyoung (Kyung Hee Univ Dept Comp Engn Ubiquitous Comp Lab Yongin South Korea; Univ Tasmania Sch Comp & Informat Syst Hobart Tas Australia)

Biomedical systems have been using ontology matching as a primary technique for heterogeneity resolution. However, the natural intricacy and vastness of biomedical data have compelled biomedical ontologies to become large-scale and complex; consequently, biomedical ontology matching has become a computationally intensive task. Our parallel heterogeneity resolution system, i.e., SPHeRe, is built to cater to the performance needs of ontology matching by exploiting the parallelism-enabled multicore nature of today's desktop PC and cloud infrastructure. In this paper, we present the execution and evaluation results of SPHeRe over large-scale biomedical ontologies. We evaluate our system by integrating it with the interoperability engine of a clinical decision support system (CDSS), which generates matching requests for large-scale NCI, FMA, and SNOMED-CT biomedical ontologies. Results demonstrate that our methodology provides an impressive performance speedup of 4.8 and 9.5 times over a quad-core desktop PC and a four virtual machine (VM) cloud platform, respectively.

Keywords: Biomedical informatics Multithreading Biomedical ontologies Ontology matching parallel processing parallel programming Semantic web

Using Machine Learning Techniques to Detect Parallel Patterns of Multi-threaded Applications

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2016, vol. 44, no. 4, pp. 867-900

Authors: Deniz, Etem; Sen, Alper (Bogazici Univ Dept Comp Engn Istanbul Turkey)

Multicore hardware and software are becoming increasingly complex. The programmability problem of multicore software has led to the use of parallel patterns. Parallel patterns reduce the effort and time required to develop multicore software by effectively capturing its thread communication and data sharing characteristics. Hence, detecting the parallel pattern used in a multi-threaded application is crucial for performance improvements and enables many architectural optimizations; however, this topic has not been widely studied. We apply machine learning techniques in a novel approach to automatically detect parallel patterns and compare these techniques in terms of accuracy and speed. We experimentally validate the detection ability of our techniques on benchmarks including PARSEC and Rodinia. Our experiments show that the k-nearest neighbor, decision trees, and naive Bayes classifier are the most accurate techniques. Overall, decision trees are the fastest technique with the lowest characterization overhead producing the best combination of detection results. We also show the usefulness of the proposed techniques on synthetic benchmark generation.

Keywords: parallel patterns parallel programming Multi-threaded applications Multicore software Pattern detection
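
One of the classifiers named in the abstract above, k-nearest neighbor, can be sketched in a few lines. This is a generic illustration, not the authors' pipeline: the two features (a data-sharing ratio and a thread-communication ratio), the pattern labels, and the training points are all hypothetical, chosen only to echo the "thread communication and data sharing characteristics" the abstract mentions.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical features: (data-sharing ratio, thread-communication ratio)
train = [
    ((0.9, 0.1), "data-parallel"),
    ((0.8, 0.2), "data-parallel"),
    ((0.2, 0.9), "pipeline"),
    ((0.1, 0.8), "pipeline"),
    ((0.3, 0.7), "pipeline"),
]
print(knn_predict(train, (0.85, 0.15)))  # data-parallel
```

k-NN needs no training phase, but every prediction scans the whole training set, which is one reason a comparison of both accuracy and speed across classifiers, as in the abstract, is informative.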

Automatic Generation of Unit Tests for Correlated Variables in Parallel Programs

INTERNATIONAL JOURNAL OF PARALLEL PROGRAMMING, 2016, vol. 44, no. 3, pp. 644-662

Authors: Jannesari, Ali; Wolf, Felix (German Res Sch Simulat Sci Aachen Germany; Rhein Westfal TH Aachen Aachen Germany)

A notorious class of concurrency bugs are race conditions related to correlated variables, which make up about 30% of all non-deadlock concurrency bugs. A solution to prevent this problem is the automatic generation of parallel unit tests. This paper presents an approach to generate parallel unit tests for variable correlations in multithreaded code. We introduce a hybrid approach for identifying correlated variables. Furthermore, we estimate the number of potentially violated correlations for methods executed in parallel. In this way, we are capable of creating unit tests that are suited for race detectors considering correlated variables. We were able to identify more than 85% of all race conditions on correlated variables in eight applications after applying our parallel unit tests. At the same time, we reduced the number of unnecessarily generated unit tests. In comparison to a test generator unaware of variable correlations, redundant unit tests are reduced by up to 50%, while maintaining the same precision and accuracy in terms of the number of detected races.

Keywords: Unit tests Automatic testing parallel programming Debugging Race detection Program analysis Correlated variables
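
To make the notion of correlated variables in the abstract above concrete, the sketch below shows two fields that must change together (`total` must equal the sum of `items`) and a hand-written parallel unit test that calls the update method from several threads. The class, the invariant, and the test are hypothetical and written by hand; the paper's contribution is generating such tests automatically, which this sketch does not attempt.

```python
import threading

class Inventory:
    """Two correlated variables: `total` must always equal sum(items)."""

    def __init__(self):
        self.items = []
        self.total = 0
        self._lock = threading.Lock()

    def add(self, value):
        # Both variables are updated in one critical section, so no thread
        # can observe `items` changed while `total` is still stale.
        with self._lock:
            self.items.append(value)
            self.total += value

    def consistent(self):
        with self._lock:
            return self.total == sum(self.items)

def run_parallel_test(threads=4, adds=1000):
    """Parallel unit test: call add() concurrently from several threads."""
    inv = Inventory()

    def worker():
        for _ in range(adds):
            inv.add(1)

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return inv

inv = run_parallel_test()
print(inv.consistent(), inv.total)  # True 4000
```

If the two updates in `add` were protected by separate locks, or by none, each variable could be individually race-free while the pair is still observable in an inconsistent state, which is why correlation-aware tests target the pair rather than each variable.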

Improving the performance of GPU-based genetic programming through exploitation of on-chip memory

SOFT COMPUTING, 2016, vol. 20, no. 2, pp. 661-680

Authors: Chitty, Darren M. (Univ Bristol Dept Comp Sci Merchant Venturers Bldg Woodland Rd Bristol BS8 1UB Avon England)

Genetic programming (GP) (Koza, Genetic programming, MIT Press, Cambridge, 1992) is well known as a computationally intensive technique. Consequently, faster parallel versions have been implemented that harness the highly parallel hardware provided by graphics cards, enabling significant gains in the performance of GP to be achieved. However, extracting the maximum performance from a graphics card for the purposes of GP is difficult. A key reason for this is that, in addition to the processor resources, the fast on-chip memory of graphics cards needs to be fully exploited. Techniques will be presented that improve the performance of a graphics card implementation of tree-based GP by better exploiting this faster memory. It will be demonstrated that both L1 cache and shared memory need to be considered for extracting the maximum performance. Better GP program representation and use of the register file are also explored to further boost performance. Using an NVidia Kepler 670GTX GPU, a maximum performance of 36 billion Genetic Programming Operations per Second is demonstrated.

Keywords: Genetic programming Many-core GPU parallel programming
