检索结果-内蒙古大学图书馆

22nd IEEE International Conference on parallel and Distributed Systems (ICPADS)

作者： Vi Ngoc-Nha Tran Phuong Hoai Ha UiT Arctic Univ Norway Dept Comp Sci Tromso Norway

ISBN: (纸本)9781509044573

Like time complexity models that have significantly contributed to the analysis and development of fast algorithms, energy complexity models for parallel algorithms are desired as crucial means to develop energy efficient algorithms for ubiquitous multicore platforms. Ideal energy complexity models should be validated on real multicore platforms and applicable to a wide range of parallel algorithms. However, existing energy complexity models for parallel algorithms are either theoretical without model validation or algorithm-specific without ability to analyze energy complexity for a widerange of parallel algorithms. This paper presents a new general validated energy complexity model for parallel (multithreaded) *** new model abstracts away possible multicore platforms by their static and dynamic energy of computational operations and data access, and derives the energy complexity of a given algorithm from its work, span and I/O complexity. The new model is validated by different sparse matrix vector multiplication (SpMV) algorithms and dense matrix multiplication (matmul) algorithms running on high performance computing (HPC) platforms (e.g., Intel Xeon and Xeon Phi). The new energy complexity model is able to characterize and compare the energy consumption of SpMV and matmul kernels according to three aspects: different algorithms, different input matrix types and different platforms. The prediction of the new model regarding which algorithm consumes more energy with different inputs on different platforms, is confirmed by the experimental results. In order to improve the usability and accuracy of the new model for a wide range of platforms, the platform parameters of ICE model are provided for eleven platforms including HPC, accelerator and embedded platforms.

关键词： power-aware and green computing parallel algorithms energy complexity energy models

来源：评论

学校读者我要写书评

暂无评论

A spark-based parallel simulation approach for repairable system 62nd

A spark-based parallel simulation approach for repairable sy...

引用

Annual Symposium on Reliability and Maintainability (RAMS)

作者： Yan Liu Yi Ren Linlin Liu Zhifeng Li Beihang University Beihang University Beijing China Beihang University Beijing CN

ISBN: (纸本)9781509002504

Fault-tree analysis is a useful analytic tool for the reliability and safety of complex system. However, fault tree is not suitable for repairable system. In this paper, we will propose a new method called TTF (time to failure) and TTM (time to maintenance) to analyze repairable system. Nevertheless, Monte Carlo simulation may be time consuming. In order to reduce simulation time, a parallel algorithm based on Spark will be used in this paper. Spark-MapReduce is the latest parallel computation framework. In situations where the amount of data is prohibitively large, we will propose a parallel algorithm for repairable system analysis to quickly get the simulation result. In this article, we propose a parallel algorithm to speed up through the experiment, we prove that the parallel algorithm has a superior performance on large scale models, and under Spark-MapReduce framework, researchers can concentrate on algorithm itself. It has significant benefits on reliability or availability assessment issues because it can free researchers, who are non-computer professional researchers, from parallelization and computational frame.

关键词： parallel algorithms Fault trees Logic gates Maintenance engineering Algorithm design and analysis Monte Carlo methods Computational modeling

来源：评论

学校读者我要写书评

暂无评论

Optimizing PLASMA Eigensolver on Large Shared Memory Systems

Optimizing PLASMA Eigensolver on Large Shared Memory Systems

引用

7th Workshop on Latest Advances in Scalable algorithms for Large-Scale Systems (ScalA)

作者： Cheng Liao Silicon Graph Int Corp 900 North McCarthy Blvd Milpitas CA 95035 USA

ISBN: (纸本)9781509052226

Performance of the PLASMA dense symmetric Eigensolver is optimized for large shared memory computer systems using multiple Householder domains for dense to band reduction and a communication reducing kernel for bulge chasing. The mr(3)-smp code by Petschow and Bientinesi is used for the tridiagonal eigensolution and the eigenvector back-transformations employ a 1D parallel decomposition. The input matrix, Householder vectors and scalars, are distributed among the CPU sockets with interleaved memory pages but the banded matrix, the eigenvectors, and temporary memory buffers are allocated and processed locally. Other considerations and optimization techniques also are presented. Numerical examples show the PLASMA eigensolver can out-perform ELPA and EIGENEXA significantly, for solving all the eigenpairs, if the problem size is sufficiently large, and the 2-stage eigensolution is generally better than its 1-stage counterpart on the latest x86_64 EP-4S CPUs with AVX2.

关键词： High performance computing parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

The parallel algorithm of fast multipole expansion method for electric field integral equation

The parallel algorithm of fast multipole expansion method fo...

引用

Progress in Electromagnetic Research Symposium (PIERS)

作者： Jiayu He Feiran Chen Xinglu Yu Taige Cheng Jinxin Xiao Jianshu Luo National University of Defense Technology Changsha Hunan CN

This paper achieves the parallel algorithm of fast multipole expansion method for electric field integral equation which is aimed to solve the electromagnetic scattering problems of some large metallic structures. We give the example of a spherical scattering structure to show the result of radar cross section. At first, this paper offers the discrete parallel algorithm boundary electric field integral equation under the frame of Galerkin Method, which is to expand the Green Function by multipole expansion and get the discrete linear equation system. Then, for the system of linear equations resulting from scattering the electric field integral equation, this paper proposes a parallel iterative algorithm based on conjugate gradient method. Finally, this paper will give a numerical example and reduce the complexity to O(N 1.5 ).

关键词： Integral equations parallel algorithms Electric fields Information systems Electromagnetic scattering

来源：评论

学校读者我要写书评

暂无评论

A Fast parallel Algorithm for Minimum-Cost Small Integral Flows

引用

ALGORITHMICA 2015年第2期72卷 607-619页

作者： Lingas, Andrzej Persson, Mia Lund Univ Dept Comp Sci S-22100 Lund Sweden Malmo Univ Dept Comp Sci S-20506 Malmo Sweden

We present a new approach to the minimum-cost integral flow problem for small values of the flow. It reduces the problem to the tests of simple multivariate polynomials over a finite field of characteristic two for non-identity with zero. In effect, we show that a minimum-cost flow of value k in a network with n vertices, a sink and a source, integral edge capacities and positive integral edge costs polynomially bounded in n can be found by a randomized PRAM, with errors of exponentially small probability in n, running in O(klog(kn)+log(2)(kn)) time and using 2 (k) (kn) (O(1)) processors. Thus, in particular, for the minimum-cost flow of value O(logn), we obtain an RNC2 algorithm, improving upon time complexity of earlier NC and RNC algorithms.

关键词： Maximum integral flow Minimum-cost flow Polynomial verification parallel algorithms Randomized algorithms Time complexity Processor complexity

来源：评论

学校读者我要写书评

暂无评论

Mit :: Lcs :: Tm :: Mit-Lcs-Tm-548

引用

2016年

Mit :: Lcs :: Tm :: Mit-Lcs-Tm-548 by published by

关键词： cilk cilk computation cilk program cilk runtime cilk scheduler closure computer science execution fully strict mit-lcs-tm-548 parallel parallel algorithms perfect linear processor ready pool runtime runtime system spawn speedup thread threads

来源：评论

学校读者我要写书评

暂无评论

A Practical parallel Algorithm for Diameter Approximation of Massive Weighted Graphs

A Practical Parallel Algorithm for Diameter Approximation of...

引用

International Symposium on parallel and Distributed Processing (IPDPS)

作者： Matteo Ceccarello Andrea Pietracaprina Geppino Pucci Eli Upfal Department of Information Engineering University of Padova Padova Italy Department of Computer Science Brown University Providence RI USA

ISBN: (纸本)9781509021413

We present a space and time efficient practical parallel algorithm for approximating the diameter of massive weighted undirected graphs on distributed platforms supporting a MapReduce-like abstraction. The core of the algorithm is a weighted graph decomposition strategy generating disjoint clusters of bounded weighted radius. Theoretically, our algorithm uses linear space and yields a polylogarithmic approximation guarantee, moreover, for important practical classes of graphs, it runs in a number of rounds asymptotically smaller than those required by the natural approximation provided by the state-of-the-art Δ-stepping SSSP algorithm, which is its only practical linear-space competitor in the aforementioned computational scenario. We complement ourtheoretical findings with an extensive experimental analysis on large benchmark graphs, which demonstrates that our algorithm attains substantial improvements on a number of key performance indicators with respect to the aforementioned competitor, while featuring a similar approximation ratio (a small constant less than 1.4, as opposed to the polylogarithmic theoretical bound).

关键词： Approximation algorithms Clustering algorithms Algorithm design and analysis Complexity theory Computational modeling parallel algorithms

来源：评论

学校读者我要写书评

暂无评论

Mit :: Lcs :: Tr :: Mit-Lcs-Tr-457

引用

2016年

Mit :: Lcs :: Tr :: Mit-Lcs-Tr-457 by published by

关键词： algorithm algorithm maximal algorithms cover cycle cycle cover cycles dual variables edge edges graph maximal maximal set minimum cycle mit-lcs-tr-457 parallel parallel algorithm parallel algorithms perfect matching reduced cost spanning tree

来源：评论

学校读者我要写书评

暂无评论

parallel top-k subgraph query in massive graphs: Computing from the perspective of single vertex

Parallel top-k subgraph query in massive graphs: Computing f...

引用

IEEE International Conference on Big Data

作者： Jianliang Gao Bo Song Ping Liu Weimao Ke Jianxin Wang Xiaohua Hu College of Computing & Informatics Drexel University Philadelphia USA School of Information Science and Engineering Central South University Changsha China Drexel University Philadelphia PA US

ISBN: (纸本)9781467390064

In the real world, many problems on massive graphs can be mapped to an underlying critical problem of discovering top-k subgraphs. For massive graphs, subgraph queries may have enormous number of matches, and so it is inefficient to compute all matches when only top-k matches are desired. Meanwhile, parallel algorithm is urgent for the scalability of massive graph computing. In this paper, we address the challenges of top-k subgraph query in massive graph. Firstly, we present a new graph matching notion: “approximate graph simulation”. With approximate graph simulation, top-k subgraph query can be customized by appointing a weighted query graph, which provides good flexibility for different application scenarios. Secondly, we propose a parallel top-k subgraph query algorithm at the level of vertex. With such algorithm, each vertex in massive graph obtains its matching state separately without requiring global graph information. In the algorithm, we also design a filter mechanism to speed up the the computation and a aggregation mechanism to obtain top-k vertices for query focus. Using real-life datasets, we experimentally verify that our approach of parallel top-k subgraph query are efficient.

关键词： Computational modeling Algorithm design and analysis parallel algorithms Approximation algorithms Synchronization Program processors

来源：评论

学校读者我要写书评

暂无评论

Algorithm 953: parallel Library Software for the Multishift QR Algorithm with Aggressive Early Deflation

引用

ACM TRANSACTIONS ON MATHEMATICAL SOFTWARE 2015年第4期41卷 1–23页

作者： Granat, Robert Kagstrom, Bo Kressner, Daniel Shao, Meiyue Umea Univ Dept Comp Sci SE-90187 Umea Sweden Umea Univ HPC2N SE-90187 Umea Sweden Ecole Polytech Fed Lausanne MATHICSE CH-1015 Lausanne Switzerland

Library software implementing a parallel small-bulge multishift QR algorithm with Aggressive Early Deflation (AED) targeting distributed memory high-performance computing systems is presented. Starting from recent developments of the parallel multishift QR algorithm [Granat et al., SIAM J. Sci. Comput. 32(4), 2010], we describe a number of algorithmic and implementation improvements. These include communication avoiding algorithms via data redistribution and a refined strategy for balancing between multishift QR sweeps and AED. Guidelines concerning several important tunable algorithmic parameters are provided. As a result of these improvements, a computational bottleneck within AED has been removed in the parallel multishift QR algorithm. A performance model is established to explain the scalability behavior of the new parallel multishift QR algorithm. Numerous computational experiments confirm that our new implementation significantly outperforms previous parallel implementations of the QR algorithm.

关键词： algorithms Performance Multishift QR algorithm aggressive early deflation parallel algorithms distributed memory architectures

来源：评论

学校读者我要写书评

暂无评论

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案：

请选择收藏分类：

通借通还

建议与咨询 留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

时间限定

文献类型

馆藏选择

核心期刊

语言

文献类型

帮助

文字说明：

检索规则说明：

检索范例：

分类表

所选分类

限定检索结果

文献类型

馆藏范围

日期分布

学科分类号

主题

机构

作者

语言

请选择保存的检索档案： 新增检索档案 确定 取消

请选择收藏分类： 新增自定义分类 确定 取消

通借通还

建议与咨询留下您的常用邮箱和电话号码，以便我们向您反馈解决方案和替代方法

请选择保存的检索档案：

请选择收藏分类：