Search results - Inner Mongolia University Library

Parallel algorithms for moving Lagrangian data on block structured Eulerian meshes

PARALLEL COMPUTING, 2011, vol. 37, no. 2, pp. 101-113

Authors: Dubey, Anshu; Antypas, Katie; Daley, Christopher. Affiliations: Univ Chicago, ASC Flash Ctr, Chicago, IL 60637, USA; Univ Calif Berkeley, Lawrence Berkeley Lab, Berkeley, CA 94720, USA

We present a suite of algorithms for migrating Lagrangian data between processors in a parallel environment when the underlying mesh is Eulerian. The collection of algorithms applies to both uniform and adaptive meshes. The algorithms are implemented in, and distributed with, FLASH, a publicly available multiphysics simulation code. Migrating Lagrangian data on an Eulerian mesh is non-trivial because the Eulerian grid points are spatially fixed whereas Lagrangian entities move with the flow of a simulation. Thus, the movement of Lagrangian data cannot use the data migration methods associated with the Eulerian mesh. Additionally, when the mesh is adaptive, as the simulation progresses the grid resolution changes. The resulting regridding process can cause complex Lagrangian data migration. The algorithms presented in this paper describe Lagrangian data movement on a static uniform mesh and on an adaptive octree based block-structured mesh. Some of the algorithms are general enough to be applicable to any block structured mesh, while some others exploit the meta-data and structure of PARAMESH, the adaptive mesh refinement (AMR) package used in FLASH. We also present an analysis of the algorithms' comparative performances in different parallel environments, and different flow characteristics. (C) 2011 Elsevier B.V. All rights reserved.

Keywords: parallel algorithm; Lagrangian data; tracer particles; adaptive mesh; FLASH
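The abstract above notes that Lagrangian entities move while the Eulerian grid stays fixed, so particles must change owners as they cross block boundaries. The following is a minimal sketch of that idea for a static uniform mesh only. It is not the FLASH or PARAMESH implementation; the names `owner_block`, `plan_migration`, and `block_to_rank` are hypothetical, and the block-to-rank map is assumed to be known on every processor.

```python
from collections import defaultdict

def owner_block(pos, domain_min, block_size, blocks_per_dim):
    """Return the index tuple of the uniform-mesh block containing pos."""
    idx = []
    for x, lo, h, n in zip(pos, domain_min, block_size, blocks_per_dim):
        i = int((x - lo) // h)
        # Clamp so a position on or beyond the domain boundary maps to
        # the nearest edge block (an assumption of this sketch).
        idx.append(min(max(i, 0), n - 1))
    return tuple(idx)

def plan_migration(particles, my_rank, block_to_rank,
                   domain_min, block_size, blocks_per_dim):
    """Split particles into those that stay on my_rank and those that
    must be sent elsewhere, grouped by destination rank."""
    staying = []
    outgoing = defaultdict(list)
    for p in particles:
        block = owner_block(p, domain_min, block_size, blocks_per_dim)
        dest = block_to_rank[block]
        if dest == my_rank:
            staying.append(p)
        else:
            outgoing[dest].append(p)
    return staying, dict(outgoing)

# Hypothetical setup: unit square split into 2 x 2 blocks, one per rank.
block_to_rank = {(0, 0): 0, (1, 0): 1, (0, 1): 2, (1, 1): 3}
particles = [(0.1, 0.2), (0.7, 0.1), (0.6, 0.9)]
staying, outgoing = plan_migration(
    particles, 0, block_to_rank, (0.0, 0.0), (0.5, 0.5), (2, 2))
```

On an adaptive mesh the block lookup would no longer be a simple division, which is presumably where the mesh meta-data mentioned in the abstract comes in.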

Matrix Multiplication using r-Train Data Structure

The 2013 AASRI Conference on Parallel and Distributed Computing and Systems (DCS 2013)

Author: Bashir Alam. Affiliation: Department of Computer Engineering, Jamia Millia Islamia, New Delhi

A new dynamic data structure has been proposed recently in *** are several algorithms for matrix *** none of them has used r-train data structure for storing and multiplying the *** this paper algorithm for matrix mul...

Keywords: R-Train; SIMD; parallel algorithm

Parallel Implementation of Asynchronous Cellular Automata on a 32-Core Computer

NUMERICAL ANALYSIS AND APPLICATIONS, 2012, vol. 5, no. 1, pp. 45-53

Author: Kalgin, K. V. Affiliation: Russian Acad Sci, Inst Computat Math & Math Geophys, Siberian Branch, Pr Akad Lavrenteva 6, Novosibirsk 630090, Russia

An efficient way to map some parallel algorithms for asynchronous cellular automata simulation onto the architecture of a modern 32-core computer (4 x Intel Xeon X7560) is investigated. An example is a mode...

Keywords: parallel implementation; cellular automata; parallel algorithm; multicores

Parallel H-matrix arithmetic on distributed-memory systems

COMPUTING AND VISUALIZATION IN SCIENCE, 2012, vol. 15, no. 2, pp. 87-97

Author: Izadi, Mohammad. Affiliation: Shahid Bahonar Univ Kerman, Dept Math, POB 76169-14111, Kerman, Iran

In the last decade, the hierarchical matrix technique was introduced to deal with dense matrices in an efficient way. It provides a data-sparse format and allows an approximate matrix algebra of nearly optimal complexity. This paper is concerned with utilizing multiple processors to gain further speedup for the H-matrix algebra, namely matrix truncation, matrix-vector multiplication, matrix-matrix multiplication, and inversion. One of the most cost-effective solutions for large-scale computation is distributed computing. Distributed-memory architectures are increasingly popular and provide an inexpensive way for an organization to obtain parallel capabilities. In this paper, we introduce a new distribution scheme for H-matrices based on the corresponding index set. Numerical experiments applied to a BEM model will complement our complexity analysis.

Keywords: hierarchical matrices; parallel algorithm; distributed-memory systems

A PARALLEL METAHEURISTIC FRAMEWORK BASED ON HARMONY SEARCH FOR SCHEDULING IN DISTRIBUTED COMPUTING SYSTEMS

INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE, 2012, vol. 23, no. 2, pp. 445-464

Authors: Lee, Young Choon; Taheri, Javid; Zomaya, Albert Y. Affiliation: Univ Sydney, Sch Informat Technol, Ctr Distributed & High Performance Comp, Sydney, NSW 2006, Australia

A large number of optimization problems have been identified as computationally challenging and/or intractable to solve within a reasonable amount of time. Due to the NP-hard nature of these problems, in practice, heuristics account for the majority of existing algorithms. Metaheuristics are one very popular type of heuristic used for many of these optimization problems. In this paper, we present a novel parallel-metaheuristic framework that makes it possible to devise parallel metaheuristics effectively, particularly ones built from heterogeneous metaheuristics. The core component of the proposed framework is its harmony-search-based coordinator. Harmony search is a recent breed of metaheuristic that mimics the improvisation process of musicians. The coordinator helps heterogeneous metaheuristics (forming a parallel metaheuristic) to escape local optima. Specifically, the best solutions generated by these worker metaheuristics are maintained in the harmony memory of the coordinator, and they are used to form new, possibly better, harmonies (solutions) before actual solution sharing between workers occurs; hence, their solutions are harmonized with each other. To validate applicability and evaluate performance, we have implemented a parallel hybrid metaheuristic using the framework for the task scheduling problem on multiprocessor computing systems (e.g., computer clusters). Experimental results verify that the proposed framework is a compelling approach to parallelize heterogeneous metaheuristics.

Keywords: parallel algorithm; scheduling; metaheuristics
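The coordinator described in the abstract above keeps the workers' best solutions in a harmony memory and combines them into new candidate solutions. The sketch below illustrates that mechanism in its simplest textbook form. It is not the paper's framework: the class name `HarmonyMemory`, the encoding of a schedule as a list of processor indices (one per task), and the parameter `hmcr` (the usual harmony memory considering rate) are all assumptions made for illustration, and pitch adjustment is omitted.

```python
import random

class HarmonyMemory:
    """Keeps the best `size` solutions (lowest cost) reported by workers."""

    def __init__(self, size):
        self.size = size
        self.entries = []  # list of (cost, solution), sorted by cost

    def submit(self, solution, cost):
        """Record a worker's solution and drop the worst if over capacity."""
        self.entries.append((cost, list(solution)))
        self.entries.sort(key=lambda e: e[0])
        del self.entries[self.size:]

    def improvise(self, rng, n_values, hmcr=0.9):
        """Form a new harmony. Each component is copied from a randomly
        chosen stored solution with probability hmcr; otherwise it is
        drawn uniformly from range(n_values). Requires a non-empty memory."""
        length = len(self.entries[0][1])
        new = []
        for i in range(length):
            if rng.random() < hmcr:
                _, donor = rng.choice(self.entries)
                new.append(donor[i])
            else:
                new.append(rng.randrange(n_values))
        return new

# Hypothetical usage: three tasks, two processors, memory of size 2.
hm = HarmonyMemory(size=2)
hm.submit([0, 1, 0], cost=5.0)
hm.submit([1, 1, 0], cost=3.0)
hm.submit([0, 0, 0], cost=9.0)   # worst solution, evicted immediately
candidate = hm.improvise(random.Random(0), n_values=2, hmcr=1.0)
```

With `hmcr=1.0` every component of `candidate` comes from one of the two retained solutions, so tasks 1 and 2 keep the assignment both of them agree on.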

Parallel Method Using MPI for Solving Large Systems of Delay Differential Equations

IEEE Colloquium on Humanities, Science and Engineering Research (CHUSER)

Authors: Ishak, Fuziyah; Suleiman, Mohamed B. Affiliations: Univ Teknol MARA, Fac Comp & Math Sci, Shah Alam 40450, Selangor, Malaysia; Univ Putra Malaysia, Dept Math, Serdang 43400, Selangor, Malaysia

ISBN: (print) 9781467346177; 9781467346153

In this paper, we describe a parallel algorithm for solving large systems of first order delay differential equations. The algorithm is based on a variable stepsize variable order block method. The method produces two new approximations in a single integration step. The formulae derivation permits concurrent computation between two processors. The parallel algorithm is implemented by calling the Message Passing Interface (MPI) library. The performance of the sequential and parallel block method is compared with a sequential non-block method. Moreover, the performance of the parallel algorithm is assessed in terms of speedup and efficiency. It is shown from the numerical results that the overall performance of the block method is increased by parallelizing each point in a block.

Keywords: delay differential equations; parallel algorithm; block method; MPI

Application of Multi-core Parallel Computing in FPGA Placement

International Symposium on Instrumentation and Measurement, Sensor Network and Automation

Authors: Bohu Huang; Haibin Zhang. Affiliation: Inst. of Computing Theory & Technology, Xidian University, Xi'an, China

ISBN: (print) 9781479927173

As FPGA devices grow in size, the long run-time of placement is becoming a great challenge for the FPGA design flow. Simulated annealing is the best-known method applied to this problem because of its good quality of result (QoR), but its computation time is not satisfactory. In this paper, we propose a parallel placement algorithm named MPP-SA (Multi-core Parallel Placement algorithm based on Simulated Annealing). Our goal is to provide a fast placement algorithm with high QoR. MPP-SA has the same annealing schedule as traditional simulated annealing, but it moves blocks concurrently using multiple threads that run on different cores of the same processor. To ensure the correctness of the results, MPP-SA also uses synchronization and a lock mechanism, which introduces some overhead. However, experimental results show that this overhead does not seriously affect the performance of our algorithm, especially for large circuits. Compared with the placement algorithm of T_VPlace in VPR5.0, MPP-SA is able to decrease the run-time of five benchmark circuits of different sizes by an average of 32%-42% without losing QoR.

Keywords: FPGA; multi-core; parallel algorithm; simulated annealing; design aids

A Distributed Index for Efficient Parallel Top-k Keyword Search on Massive Graphs

12th ACM International Workshop on Web Information and Data Management (WIDM)

Authors: Zhong, Ming; Liu, Mengchi. Affiliations: Wuhan Univ, Comp Sch, State Key Lab Software Engn, Wuhan, Peoples R China; Carleton Univ, Sch Comp Sci, Ottawa, ON, Canada

ISBN: (print) 9781450317207

Recently, a variety of indexing techniques have been proposed for optimizing keyword search on graphs. However, graph indexing has very high space and time complexities, and thus these single-machine in-memory indices are usually not affordable for massive graphs. In this paper, we propose a novel distributed disk-based index, which organizes the local topology information in the graph to track and prune matched vertices that will not participate in the top-k answers to a specified query before search with heuristics. The distributed index can be constructed in a MapReduce manner. Moreover, a parallel search algorithm is also developed. It runs multiple asynchronous search instances that incrementally enumerate the current best local answers and then produces the global top-k answers from them. Lastly, we perform experiments on both synthetic and real graphs with various configurations. The results show that our approach can improve search efficiency on massive graphs significantly with affordable indexing overheads.

Keywords: Graph Keyword Search; Distributed Index; parallel algorithm
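The last step named in the abstract above, producing global top-k answers from the best local answers of several search instances, can be illustrated with a k-way merge. This is only a sketch of that final step, not the paper's asynchronous algorithm. It assumes each instance has already produced a list of `(cost, answer_id)` tuples sorted by ascending cost, with lower cost meaning a better answer; the function name `global_top_k` is hypothetical.

```python
import heapq
from itertools import islice

def global_top_k(local_answer_lists, k):
    """Merge per-instance answer lists, each sorted by ascending cost,
    and return the k lowest-cost answers overall."""
    # heapq.merge is lazy, so only about k items are pulled from the inputs.
    merged = heapq.merge(*local_answer_lists, key=lambda a: a[0])
    return list(islice(merged, k))

# Hypothetical local results from three search instances.
local_results = [
    [(1.0, "a"), (4.0, "d")],
    [(2.0, "b"), (3.0, "c")],
    [(2.5, "e")],
]
top3 = global_top_k(local_results, 3)
```

Because the merge is lazy, the same function also works if the inputs are generators that enumerate answers incrementally, as long as each one yields answers in ascending cost order.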

Alarm Fuzzy Association Rules Parallel Mining in Multi-domain Distributed Communication Network

14th IEEE International Conference on Communication Technology (ICCT)

Authors: Leng, Xiaojie; Li, Xingming. Affiliation: Univ Elect Sci & Technol China, Sch Commun & Informat Engn, Chengdu 611731, Peoples R China

ISBN: (print) 9781467321013

In a large-scale communication system, the network is divided into different management domains, each with its own Local Management Site. Because the network is distributed, a Global Management Site and several Local Management Sites together form a distributed architecture. This paper proposes a new algorithm called PFAARM (parallel fuzzy alarm association rules mining algorithm), which can be used for parallel mining of alarm fuzzy association rules in a multi-domain distributed communication network. Alarm correlation analysis can be executed in parallel in both the Global Management Site and the Local Management Sites. Fuzzy association rules can be obtained for both intra-domain and inter-domain alarms. Building on alarm correlation analysis within a single management domain, the introduction of inter-domain alarm fuzzy association rules, which are based on inter-domain communication relationships, gives another essential clue for fault location and is of great significance for locating faults quickly and efficiently. The simulation results illustrate the feasibility and efficiency of the algorithm.

Keywords: Fault Diagnosis; Fuzzy Logic; Fuzzy Association Rule; parallel algorithm; Multi-domain Distributed Network

A HYBRID GRANULARITY PARALLEL ALGORITHM FOR PRECISE INTEGRATION OF STRUCTURAL DYNAMIC RESPONSES

Acta Mechanica Solida Sinica, 2008, vol. 21, no. 1, pp. 28-33

Authors: Yuanyin Li; Xianlong Jin; Genguo Li. Affiliations: High Performance Computing Center, Shanghai Jiaotong University, Shanghai 200240, China; Shanghai Supercomputer Center, Shanghai 201203, China

In precise integration methods for solving structural dynamic responses, the time integration formula is composed of two parts: the multiplication of an exponential matrix with a vector, and the integration term. The second term can be solved by the series solution. Two hybrid granularity parallel algorithms are designed, in which the exponential matrix and the first term are computed by a fine-grained parallel algorithm and the second term is computed by a coarse-grained parallel algorithm. Numerical examples show that these two hybrid granularity parallel algorithms obtain higher speedup and parallel efficiency than two existing parallel algorithms.

Keywords: dynamic response; precise integration; hybrid granularity; parallel algorithm
