Search results - Inner Mongolia University Library

Towards efficient large scale epidemiological simulations in EpiGraph

Parallel Computing, 2015, vol. 42, pp. 88-102

Authors: Martin, Gonzalo; Singh, David E.; Marinescu, Maria-Cristina; Carretero, Jesus

Affiliations: Univ Carlos III Madrid Dept Comp Sci Madrid 28911 Spain; Barcelona Supercomp Ctr Comp Applicat Sci & Engn Barcelona 08034 Spain

The work we present in this paper focuses on understanding the propagation of flu-like infectious outbreaks between geographically distant regions due to the movement of people outside their base location. Our approach incorporates geographic location and a transportation model into our existing region-based, closed-world EpiGraph simulator to model a more realistic movement of the virus between different geographic areas. This paper describes the MPI-based implementation of this simulator, including several optimization techniques such as a novel approach for mapping processes onto available processing elements based on the temporal distribution of process loads. We present an extensive evaluation of EpiGraph in terms of its ability to simulate large-scale scenarios, as well as from a performance perspective. (C) 2014 Elsevier B.V. All rights reserved.

Keywords: Simulation; Computational epidemiology; Parallel algorithms; MPI implementation; Resource allocation

A Technology of Full Seismic Field Simulation on High-performance Computing Systems

International Scientific-Technical Conference on Actual Problems of Electronics Instrument Engineering

Authors: Dmitry A. Karavaev; Alexander A. Yakimenko; Nina A. Bulavina

Affiliations: Institute of Computational Mathematics and Mathematical Geophysics of SB RAS Novosibirsk Russia; Novosibirsk Technical State University Russia

ISBN: (print) 9781509040704

The aim of this work is to develop a technology that represents a comprehensive approach to studying geophysical objects with complex subsurface geometry on the basis of numerical modeling of the seismic field from point sources. An important stage in successfully solving a dynamic problem of the theory of elasticity is to develop a model that represents the object under study in detail and to carry out a series of calculations of elastic wave propagation in inhomogeneous media. We present a software package for solving forward geophysical problems using grid methods. Particular attention is paid to the software interface, which allows the user to prepare geophysical models for theoretical experiments. The simulation software under development is designed for use on modern high-performance computing systems. This information and analytical set of programs can be used in the interpretation of experimental data and in the design and verification of 2D and 2.5D models when comparing experimental and theoretical results. Studying the structure of the Baikal rift zone is one of the geophysical tasks where 2D modeling is necessary. This work was partially supported by RFBR grants No. 16-07-01052, 15-31-20150, 15-07-06821, 14-05-00867, 14-07-00312, MES RK 1760/GF4.

Keywords: Technology simulation geophysics parallel algorithms

Partial-Duplicate Clustering and Visual Pattern Discovery on Web Scale Image Database

IEEE Transactions on Multimedia, 2015, vol. 17, no. 7, pp. 967-980

Authors: Li, Wei; Wang, Changhu; Zhang, Lei; Rui, Yong; Zhang, Bo

Affiliations: Tsinghua Univ Dept Comp Sci & Technol Tsinghua Natl Lab Informat Sci & Technol State Key Lab Intelligent Technol & Syst Beijing 100084 Peoples R China; Microsoft Res Beijing 100080 Peoples R China; Microsoft Res Redmond WA 98052 USA

In this paper, we study the problem of discovering visual patterns and partial-duplicate images, which is fundamental to visual concept representation and image parsing, but very challenging when the database is extremely large, such as billions of images indexed by a commercial search engine. Although extensive research with sophisticated algorithms has been conducted for either partial-duplicate clustering or visual pattern discovery, most of them cannot be easily extended to this scale, since both are clustering problems in nature and require pairwise comparisons. To tackle this computational challenge, we introduce a novel and highly parallelizable framework to discover partial-duplicate images and visual patterns in a unified way in distributed computing systems. We emphasize the nested property of local features, and propose the generalized nested feature (GNF) as a mid-level representation for regions and local patterns. Initial coarse clusters are then discovered by GNFs, upon which the n-gram GNF is defined to represent co-occurrent visual patterns. After that, efficient merging and refining algorithms are used to get the partial-duplicate clusters, and logical combinations of probabilistic GNF models are leveraged to represent the visual patterns of partially duplicate images. Extensive experiments show the parallelizable property and effectiveness of the algorithms on both partial-duplicate clustering and visual pattern discovery. With 2000 machines, it costs about eight and 400 minutes to process one million and 40 million images respectively, which is quite efficient compared to previous methods.

Keywords: Local features; parallel algorithms; partial-duplicate images; visual patterns

Parallelization of Bin Packing on Multicore Systems

International Conference on High Performance Computing

Authors: Sayan Ghosh; Assefaw H. Gebremedhin

Affiliations: School of Electrical Engineering and Computer Science Washington State University Pullman WA USA

ISBN: (print) 9781509054121

We study effective parallelization of approximation algorithms for the one-dimensional bin packing problem on a multicore platform. Bin packing is a classic combinatorial optimization problem that aims to pack a given sequence of items into a minimum number of equal-sized bins. The problem potentially serves as a model for a wide variety of applications. Examples include: packing data into chunks in a memory hierarchy in a given system to increase application performance, loading vehicles subject to weight limitations, and packing TV commercials into station breaks. Bin packing has long served as a proving ground for the analysis of approximation algorithms and played a crucial role in the development of much of the theory of approximation algorithms. Its parallelization, however, has received comparatively much less attention. In this work, we develop multiple parallel versions of an effective approximation algorithm (First Fit Decreasing) for the problem and investigate the trade-off between solution quality and execution time. We use OpenMP and Cilk Plus as mechanisms for achieving the parallelization. The new parallel algorithms obtain a speedup of more than 10× (on 32 cores) for moderate to large input sequences without sacrificing much on the quality of solution produced by the sequential algorithm - in particular, we see only about 3 to 30% increase in the number of bins compared to the sequential version. In turn, the solution obtained by the sequential First Fit Decreasing algorithm is provably almost optimal (the approximation ratio is less than 1.3).
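For readers unfamiliar with First Fit Decreasing, the following is a minimal sequential sketch in Python. It is not the authors' OpenMP or Cilk Plus code, and it does not attempt any parallelization; the function name `first_fit_decreasing` and the toy input are assumptions made for illustration.

```python
def first_fit_decreasing(items, capacity):
    """Pack items into bins of equal capacity using First Fit Decreasing.

    Items are sorted from largest to smallest, and each item goes into
    the first (lowest-index) bin that still has room for it. A new bin
    is opened only when no existing bin fits the item.
    """
    bins = []   # bins[i] is the list of items placed in bin i
    free = []   # free[i] is the remaining capacity of bin i
    for item in sorted(items, reverse=True):
        if item > capacity:
            raise ValueError("item larger than bin capacity")
        for i, space in enumerate(free):
            if item <= space:
                bins[i].append(item)
                free[i] -= item
                break
        else:
            # No existing bin had room, so open a new one.
            bins.append([item])
            free.append(capacity - item)
    return bins


# Toy input (hypothetical): six items, bins of capacity 10.
packed = first_fit_decreasing([4, 8, 1, 4, 2, 1], capacity=10)
print(packed)
```

The inner scan over `free` is the sequential dependency that makes the algorithm hard to parallelize: where an item lands depends on where every larger item was placed.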

Keywords: Approximation algorithms; Algorithm design and analysis; Parallel algorithms; Dynamic scheduling; Heuristic algorithms; Multicore processing; Upper bound

Parallel simulation for VLSI power grid

Author: Zhang, Le

Affiliation: Texas A&M University

Degree level: Ph.D.

Due to the increasing complexity of VLSI circuits, power grid simulation has become more and more time-consuming. Hence, there is a need for a fast and accurate power grid simulator. In order to perform power grid simulation in a timely manner, parallel algorithms have been developed to accelerate the simulation. In this dissertation, we present parallel algorithms and software for power grid simulation on CPU-GPU platforms. The power grid is divided into disjoint partitions. The partitions are enlarged using the Breadth First Search (BFS) method. In the partition enlarging process, a portion of the edges is ignored to make the matrix factorization light-weight. Solving the enlarged partitions using a direct solver serves as a preconditioner for the Preconditioned Conjugate Gradient (PCG) method that is used to solve the power grid. This work combines the advantages of direct solvers and iterative solvers to obtain an efficient hybrid parallel solver. Two-tier parallelism is harnessed using MPI for partitions and CUDA within each partition. The experiments conducted on supercomputing clusters demonstrate significant speed improvements over a state-of-the-art direct solver in both static and transient analysis.
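The idea of using per-partition direct solves as a preconditioner for PCG can be sketched in a few lines of serial Python. This is a simplified illustration and not the dissertation's solver: it uses disjoint partitions without the BFS enlargement step, dense lists instead of sparse matrices, and no MPI or CUDA. The helper names (`solve_dense`, `block_preconditioner`, `pcg`) and the small tridiagonal test matrix are assumptions.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, x):
    return [dot(row, x) for row in A]


def solve_dense(A, b):
    """Direct solve of A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def block_preconditioner(A, partitions):
    """Build z = M^{-1} r, where M keeps only A's diagonal block per partition."""
    def apply(r):
        z = [0.0] * len(r)
        for part in partitions:
            block = [[A[i][j] for j in part] for i in part]
            local = solve_dense(block, [r[i] for i in part])
            for i, value in zip(part, local):
                z[i] = value
        return z
    return apply


def pcg(A, b, precondition, tol=1e-10, max_iter=200):
    """Preconditioned Conjugate Gradient for a symmetric positive definite A."""
    x = [0.0] * len(b)
    r = list(b)
    z = precondition(r)
    p = list(z)
    rz = dot(r, z)
    for iteration in range(max_iter):
        if sqrt(dot(r, r)) < tol:
            return x, iteration
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = precondition(r)
        rz_next = dot(r, z)
        p = [zi + (rz_next / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_next
    return x, max_iter


# Hypothetical 6-node chain: diagonally dominant, hence positive definite.
n = 6
A = [[0.0] * n for _ in range(n)]
for i in range(n):
    A[i][i] = 2.1
    if i > 0:
        A[i][i - 1] = -1.0
    if i < n - 1:
        A[i][i + 1] = -1.0
b = [1.0] * n
x, iterations = pcg(A, b, block_preconditioner(A, [[0, 1, 2], [3, 4, 5]]))
```

Each partition's block solve is independent of the others, which is what makes this kind of preconditioner a natural fit for distributing partitions across processes.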

Keywords: VLSI Power Grid parallel algorithms Iterative Solver Enlarged Partitions Thesis

Parallel Algorithms for Solution of Air Pollution Inverse Problems

2nd Russia-Taiwan Symposium on Methods and Tools of Parallel Programming

Authors: Starchenko, Alexander; Panasenko, Elena

Affiliations: Tomsk State Univ Tomsk 634050 Russia

ISBN: (print) 9783642148217

Parallelization of Marchuk's method for the solution of inverse problems, based on adjoint equations and a dual representation of the contaminant concentration functional, is considered here. There are N individual adjoint equations solved independently at each time step. Such conditions of numerical investigation allow the application of high-performance computations. For this purpose the following ways of parallelization are used: geometrical decomposition, functional decomposition, and a combination of geometrical and functional decompositions.

Keywords: air pollution; inverse problems; parallel algorithms; functional decomposition; domain decomposition

DNA Sequence Splicing Algorithm Based on Spark

International Conference on Industrial Informatics, Computing Technology, Intelligent Technology, Industrial Information Integration (ICIICII)

Authors: Xu Pan; Xue-Liang Fu; Gai-Fang Dong; Hong-Hui Li

Affiliations: College of Computer and Information Engineering Inner Mongolia Agricultural University Hohhot China

ISBN: (print) 9781509035762

Bioinformatics is an interdisciplinary field concerned with biological information processing, and DNA sequence splicing is one of its research topics. At present, most parallel algorithms are based on the MapReduce operating environment, which involves a complex process of reading from and writing to hard disk that slows the algorithm down. In this paper, the memory-based Spark computation model is proposed to solve this problem. In addition, we use a new K-2 bit matching method. Experimental results show that the Spark-based running environment and the new method ensure the accuracy of the stitching results and make the algorithm more efficient.

Keywords: Algorithm design and analysis; Sparks; DNA; Splicing; Parallel algorithms; Computers; Hard disks

BROADCAST-ENABLED MASSIVE MULTICORE ARCHITECTURES: A WIRELESS RF APPROACH

IEEE Micro, 2015, vol. 35, no. 5, pp. 52-61

Authors: Abadal, Sergi; Sheinman, Benny; Katz, Oded; Markish, Ofer; Elad, Danny; Fournier, Yvan; Roca, Damian; Hanzich, Mauricio; Houzeaux, Guillaume; Nemirovsky, Mario; Alarcon, Eduard; Cabellos-Aparicio, Albert

Affiliations: Univ Politecn Cataluna NaNoNetworking Ctr Catalonia Spain; IBM Res Haifa MmWave Technol Grp Haifa Israel; EDF R&D London England; Barcelona Supercomp Ctr Barcelona Spain

Novel interconnect technologies offer solutions to on-chip communication scalability problems. This article outlines the prospects of wireless on-chip communication technologies pointing toward low-latency and energy-efficient broadcast even in large-scale chip multiprocessors. It also discusses the challenges and potential impact of adopting these technologies as key enablers of unconventional hardware architectures and algorithmic approaches to significantly improve the performance, energy efficiency, scalability, and programmability of many-core chips.

Keywords: broadcast; Complexity theory; Computer architecture; hardware architecture; many-core; Message passing; parallel algorithms; Programming; programming models; Scalability; System-on-chip; Wireless communication; wireless network-on-chip

A Efficient Algorithm for Molecular Dynamics Simulation on Hybrid CPU-GPU Computing Platforms

International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery

Authors: Dapu Li; Wei Ai; Yu Ye; Jie Liang

Affiliations: College of Information Science and Engineering Hunan University Changsha Hunan 410082 China

ISBN: (print) 9781509040940

In this article, an efficient parallel algorithm for a hybrid CPU-GPU platform is proposed to enable large-scale molecular dynamics (MD) simulations of the metal solidification process. The program implementing the parallel algorithm on the hybrid CPU-GPU platform shows better performance than the program based on previous algorithms running on the CPU cluster platform, and its total execution time is markedly lower. In particular, because of the modified load balancing method, the neighbor list update time is approximately zero. The parallel program based on the CUDA+OpenMP model shows a factor of 6 16-core calculation speedups compared to the parallel program based on the MPI+OpenMP model, and the optimal computational efficiency is achieved in the simulation system including 10,000,000 aluminum atoms. Finally, a comparison of the theoretical and experimental results shows good consistency between them, which verifies the correctness of the algorithm.

Keywords: Graphics processing units Computational modeling parallel algorithms Mathematical model Runtime Load modeling Heuristic algorithms parallel algorithms Load modeling Runtime Heuristic algorithms Computational modeling Mathematical Model efficient algorithm parallel programming Graphics Processing Unit Molecular Dynamics Simulation Platform Simulation Systems execution time

Network Topologies and Inevitable Contention

International Workshop on Communication Optimizations in HPC (COMHPC)

Authors: Grey Ballard; James Demmel; Andrew Gearhart; Benjamin Lipshitz; Yishai Oltchik; Oded Schwartz; Sivan Toledo

Affiliations: Wake Forest University; UC Berkeley; University of California Berkeley Berkeley CA US; Tel-Aviv University; The Hebrew University

Network topologies can have significant effect on the execution costs of parallel algorithms due to inter-processor communication. For particular combinations of computations and network topologies, costly network contention may inevitably become a bottleneck, even if algorithms are optimally designed so that each processor communicates as little as possible. We obtain novel contention lower bounds that are functions of the network and the computation graph parameters. For several combinations of fundamental computations and common network topologies, our new analysis improves upon previous per-processor lower bounds which only specify the number of words communicated by the busiest individual processor. We consider torus and mesh topologies, universal fat-trees, and hypercubes; algorithms covered include classical matrix multiplication and direct numerical linear algebra, fast matrix multiplication algorithms, programs that reference arrays, N-body computations, and the FFT. For example, we show that fast matrix multiplication algorithms (e.g., Strassen's) running on a 3D torus will suffer from contention bottlenecks. On the other hand, this network is likely sufficient for a classical matrix multiplication algorithm. Our new lower bounds are matched by existing algorithms only in very few cases, leaving many open problems for network and algorithmic design.

Keywords: Algorithm design and analysis; Network topology; Parallel algorithms; Linear algebra; Hypercubes; Conferences; Optimization
