This is an overview of the existing criteria for the failure of composite materials and of the results of applying some of them to simulate a low-speed impact on a composite material for the three-dimens...
ISBN:
(Print) 9781467390057
Graphlets represent small induced subgraphs and are becoming increasingly important for a variety of applications. Despite the importance of the local subgraph (graphlet) counting problem, existing work focuses mainly on counting graphlets globally over the entire graph. These global counts have been used for tasks such as graph classification as well as for understanding and summarizing the fundamental structural patterns in graphs. In contrast, this work proposes an accurate, efficient, and scalable parallel framework for the more challenging problem of counting graphlets locally for a given edge or set of edges. The local graphlet counts provide a topologically rigorous characterization of the local structure surrounding an edge. The aim of this work is to obtain the count of every graphlet of size k for each edge. The framework gives rise to efficient, parallel, and accurate unbiased estimation methods with provable error bounds, as well as exact algorithms for counting graphlets locally. Experiments demonstrate the effectiveness of the proposed exact and estimation methods on various datasets. In particular, the exact methods show strong scaling results (11-16x on 16 cores). Moreover, our estimation framework is accurate with error less than 5% on average.
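As a self-contained illustration of what a local (per-edge) graphlet count is, the sketch below counts the smallest connected graphlet beyond an edge itself, the triangle, for every edge of a tiny undirected graph via neighbor-set intersection. This is a toy sketch of the problem statement, not the paper's parallel framework; all names and the example graph are illustrative.

```python
from collections import defaultdict

def local_triangle_counts(edges):
    """For each undirected edge (u, v), count the triangles containing it,
    i.e. the number of common neighbors of u and v."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    # Per-edge local count = |N(u) & N(v)|
    return {(u, v): len(adj[u] & adj[v]) for u, v in edges}

# A 4-clique: every edge lies in exactly 2 triangles.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
counts = local_triangle_counts(edges)
print(counts[(0, 1)])  # 2
```

Exact frameworks like the one described generalize this idea to all graphlets of size k per edge; estimation variants sample neighborhoods instead of intersecting them exhaustively.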
ISBN:
(Print) 9781509032280
A parallel implementation of a surface reconstruction algorithm is presented. This algorithm uses the vector field surface representation and was adapted in a previous work by the authors to handle large scale environment reconstruction. Two parallel implementations with different memory requirements and processing speeds are described and compared. These parallel implementations increase the vector field computation speed by a factor of up to 31 times relative to a purely serial implementation. The method is demonstrated on different datasets captured on the sites of Hydro-Quebec using a variety of sensors: LiDAR, sonar and the WireScan, an underwater laser scanner designed at our laboratory.
ISBN:
(Print) 9783319322438; 9783319322421
Particle Swarm Optimization (PSO) is a heuristic technique that has been used to solve problems where many events occur simultaneously and small pieces of the problem can collaborate to reach a solution. Among its advantages are fast convergence, large exploration coverage, and adequate global optimization; however, to address the premature convergence problem, modifications to the basic model have been developed, such as Aging Leader and Challengers (ALC) PSO and Bio-inspired Aging (BAM) PSO. These algorithms being parallel in nature, some authors have attempted different approaches to apply PSO using MPI and GPU. Nevertheless, ALC-PSO and BAM-PSO have not been implemented in parallel. For this study, we develop PSO, ALC-PSO and BAM-PSO through MPI and GPU using the High Performance Computing Cluster (HPCC) Agave. The results suggest that ALC-PSO and BAM-PSO reduce premature convergence, improving global precision, whilst BAM-PSO achieves better optima at the expense of significantly increasing the algorithm's computational complexity.
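For readers unfamiliar with the base model, here is a minimal sketch of the standard global-best PSO update (inertia plus cognitive and social pulls). The constants and the sphere objective are illustrative choices; none of the ALC or BAM aging machinery, nor the MPI/GPU parallelization, is included.

```python
import random

def pso_minimize(f, dim, n_particles=20, iters=200, seed=1):
    """Basic global-best PSO on [-5, 5]^dim; constants are illustrative."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia, cognitive, and social weights
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

sphere = lambda x: sum(v * v for v in x)
best, best_val = pso_minimize(sphere, dim=2)
```

The inner per-particle loop is what makes the method naturally parallel: each particle's velocity and position update depends only on its own state plus the shared global best.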
ISBN:
(Print) 9781509028238
We introduce a new algorithm for an unbounded concurrent double-ended queue (deque). Like the bounded deque of Herlihy, Luchangco, and Moir on which it is based, the new algorithm is simple and obstruction free, has no pathological long-latency scenarios, avoids interference between operations at opposite ends, and requires no special hardware support beyond the usual compare-and-swap. To the best of our knowledge, no prior concurrent deque combines these properties with unbounded capacity, or provides consistently better performance across a wide range of concurrent workloads.
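To make the compare-and-swap retry discipline concrete, here is a heavily simplified, single-threaded sketch in the spirit of such array-based designs: slots hold a "right null" marker, a push locates the boundary and claims a slot with a versioned compare-and-swap, and a failed CAS simply retries. The real HLM algorithm atomically coordinates two boundary cells, supports both ends, and relies on hardware CAS rather than the lock-simulated one below; `TinyDeque` and `Cell` are illustrative names, not the paper's.

```python
import threading

class Cell:
    """One array slot with a version counter, so a CAS can detect interference."""
    def __init__(self, val):
        self.val, self.ver = val, 0
        self._lock = threading.Lock()  # stands in for a hardware CAS

    def cas(self, exp_val, exp_ver, new_val):
        with self._lock:
            if self.val == exp_val and self.ver == exp_ver:
                self.val, self.ver = new_val, self.ver + 1
                return True
            return False

RN = "RN"  # "right null" marker, as in the HLM bounded deque

class TinyDeque:
    def __init__(self, size=8):
        self.cells = [Cell(RN) for _ in range(size)]  # starts empty

    def right_push(self, v):
        while True:  # obstruction-free retry loop
            # Oracle: locate the leftmost RN cell (the right boundary).
            k = next(i for i, c in enumerate(self.cells) if c.val == RN)
            c = self.cells[k]
            if c.cas(RN, c.ver, v):  # claim the slot atomically
                return
            # CAS failed: a concurrent operation interfered; retry.

    def right_pop(self):
        while True:
            occupied = [i for i, c in enumerate(self.cells) if c.val != RN]
            if not occupied:
                return None  # deque is empty
            c = self.cells[occupied[-1]]
            v = c.val
            if c.cas(v, c.ver, RN):
                return v

d = TinyDeque()
d.right_push("a"); d.right_push("b")
print(d.right_pop(), d.right_pop())  # b a
```

The version counters are what make the retry loop obstruction-free rather than merely lock-free in spirit: an operation running alone will always see a stable version and succeed.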
ISBN:
(Print) 9783319556680; 9783319556697
In the present paper, an approach to solving global optimization problems using a nested optimization scheme is developed. The use of different algorithms at different nesting levels is the novel element. A complex serial algorithm (on CPU) is used at the upper level, and a simple parallel algorithm (on GPU) is used at the lower level. This computational scheme has been implemented in the ExaMin parallel solver. The results of computational experiments demonstrating the speedup when solving a series of test problems are presented.
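The nesting idea can be illustrated on a two-variable problem: an outer one-dimensional search fixes x, and for each x an inner brute-force scan over y (each evaluation independent, hence a stand-in for the GPU stage) solves the lower-level subproblem. This is a toy sketch of the scheme's structure, not the ExaMin algorithms; the function and grid sizes are hypothetical.

```python
def nested_minimize(f, x_range, y_range, outer_steps=101, inner_steps=101):
    """Outer serial scan over x; inner brute-force scan over y.
    The inner evaluations are independent (trivially parallelizable)."""
    (x0, x1), (y0, y1) = x_range, y_range
    best = None
    for i in range(outer_steps):  # serial outer level (CPU in the paper's scheme)
        x = x0 + (x1 - x0) * i / (outer_steps - 1)
        # Inner level: every y-evaluation is independent of the others.
        inner = min(
            (f(x, y0 + (y1 - y0) * j / (inner_steps - 1)),
             x,
             y0 + (y1 - y0) * j / (inner_steps - 1))
            for j in range(inner_steps)
        )
        if best is None or inner < best:
            best = inner
    return best  # (f(x*, y*), x*, y*)

f = lambda x, y: (x - 1.0) ** 2 + (y + 2.0) ** 2
val, x, y = nested_minimize(f, (-5, 5), (-5, 5))
print(round(val, 6), round(x, 1), round(y, 1))  # 0.0 1.0 -2.0
```

In the paper's setting the outer level is an adaptive global search rather than a grid scan, but the division of labor is the same: a sophisticated serial driver over few variables, a massively parallel sweep over the rest.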
ISBN:
(Print) 9780791857496
In recent years, significant research effort has been invested in the development of mesh-free methods for different types of continuum problems. Prominent amongst these methods are the element free Galerkin (EFG) method, RKPM, and the meshless local Petrov-Galerkin (MLPG) method. Most of these methods employ a set of nodes for discretization of the problem domain, and use a moving least squares (MLS) approximation to generate shape functions. Of these methods, the MLPG method is seen as a pure meshless method since it does not require any background mesh. The accuracy and flexibility of the MLPG method are well established for a variety of continuum problems. However, most of the applications have been limited to small-scale problems solvable on serial machines. Very few attempts have been made to apply it to large-scale problems, which typically involve many millions (or even billions) of nodes and would require the use of parallel algorithms based on domain decomposition. Such parallel techniques are well established in the context of mesh-based methods. Extension of these algorithms in conjunction with the MLPG method requires considerable further research. The objective of this paper is to spell out the challenges which need urgent attention to enable the application of meshless methods to large-scale problems. We specifically address the issue of the solution of large-scale linear problems, which necessarily requires the use of iterative solvers. We focus on the application of the BiCGSTAB method and an appropriate set of preconditioners for the solution of the MLPG system.
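To make the solver step concrete, below is a compact, dependency-free sketch of BiCGSTAB with a simple Jacobi (diagonal) preconditioner applied by row scaling. It is a textbook illustration on a tiny dense system, not the MLPG solver or the preconditioner set discussed in the paper.

```python
def bicgstab(A, b, tol=1e-10, max_iter=200):
    """Textbook BiCGSTAB for a small dense system (lists of lists)."""
    n = len(b)
    mv = lambda M, v: [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
    dot = lambda u, v: sum(a * c for a, c in zip(u, v))
    x = [0.0] * n
    r = [bi - ri for bi, ri in zip(b, mv(A, x))]
    r_hat = r[:]  # fixed shadow residual
    rho = alpha = omega = 1.0
    v = p = [0.0] * n
    for _ in range(max_iter):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        rho = rho_new
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        v = mv(A, p)
        alpha = rho / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if dot(s, s) ** 0.5 < tol:  # early convergence at the half step
            x = [xi + alpha * pi for xi, pi in zip(x, p)]
            break
        t = mv(A, s)
        omega = dot(t, s) / dot(t, t)
        x = [xi + alpha * pi + omega * si for xi, pi, si in zip(x, p, s)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if dot(r, r) ** 0.5 < tol:
            break
    return x

def jacobi_scale(A, b):
    """Left Jacobi preconditioning: scale each row by 1 / A[i][i]."""
    n = len(b)
    As = [[A[i][j] / A[i][i] for j in range(n)] for i in range(n)]
    bs = [b[i] / A[i][i] for i in range(n)]
    return As, bs

A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
b = [1.0, 2.0, 3.0]
x = bicgstab(*jacobi_scale(A, b))
residual = max(abs(sum(A[i][j] * x[j] for j in range(3)) - b[i]) for i in range(3))
print(residual < 1e-8)  # True
```

BiCGSTAB is chosen in such settings because MLS-based discretizations generally yield nonsymmetric system matrices, ruling out plain conjugate gradients.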
ISBN:
(Print) 9781467388153
Delaunay tessellations are fundamental data structures in computational geometry. They are important in data analysis, where they can represent the geometry of a point set or approximate its density. The algorithms for computing these tessellations at scale perform poorly when the input data is unbalanced. We investigate the use of k-d trees to evenly distribute points among processes and compare two strategies for picking split points between domain regions. Because the resulting point distributions no longer satisfy the assumptions of existing parallel Delaunay algorithms, we develop a new parallel algorithm that adapts to its input and prove its correctness. We evaluate the new algorithm using two late-stage cosmology datasets. The new running times are up to 50 times faster using the k-d tree compared with a regular grid decomposition. Moreover, on the unbalanced datasets, decomposing the domain into a k-d tree is up to five times faster than decomposing it into a regular grid.
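Two common strategies for picking a k-d tree split point are the coordinate median (which balances point counts per side) and the domain midpoint (which balances volumes); whether these are exactly the two strategies compared in the paper is not stated in the abstract. The one-level sketch below, on hypothetical clustered data, shows how they diverge on unbalanced input.

```python
def split_median(points, axis):
    """Split at the median coordinate: equal point counts per side."""
    vals = sorted(p[axis] for p in points)
    cut = vals[len(vals) // 2]
    left = [p for p in points if p[axis] < cut]
    right = [p for p in points if p[axis] >= cut]
    return cut, left, right

def split_midpoint(points, axis, lo, hi):
    """Split at the middle of the domain: equal volumes, not equal counts."""
    cut = (lo + hi) / 2.0
    left = [p for p in points if p[axis] < cut]
    right = [p for p in points if p[axis] >= cut]
    return cut, left, right

# Unbalanced data: a tight cluster near x = 0 plus one far-away point.
pts = [(0.01 * i, 0.0) for i in range(9)] + [(9.0, 0.0)]
_, l_med, r_med = split_median(pts, axis=0)
_, l_mid, r_mid = split_midpoint(pts, axis=0, lo=0.0, hi=10.0)
print(len(l_med), len(r_med))  # 5 5
print(len(l_mid), len(r_mid))  # 9 1
```

The median split keeps per-process work even regardless of clustering, which is exactly what regular grids fail to do on late-stage cosmology data.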
Due to the recent increase in the volume of data being generated, organizing this data has become one of the biggest problems in Computer Science. Among the different strategies proposed to deal with this task efficiently and effectively, we highlight those related to clustering, more specifically density-based clustering strategies, which stand out for their ability to define clusters of arbitrary shape and their robustness in the presence of noise, such as DBSCAN and OPTICS. However, these algorithms remain a computational challenge since they are distance-based proposals. In this work we present a new approach to make OPTICS feasible, based on a data indexing strategy. Despite the simplicity with which the data are indexed, using graphs, the structure allows exploring various parallelization opportunities, which we exploit using a graphics processing unit (GPU). Based on this structure, the complexity of OPTICS is reduced to O(E * log V) in the worst case, making it very fast. In our evaluation we show that our proposal can be over 200x faster than its sequential CPU version.
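The abstract does not spell out the indexing details, but the general idea of trading repeated range queries for a precomputed neighborhood graph can be sketched as follows: build a k-NN graph once (the expensive, GPU-friendly part), then read each point's OPTICS core distance directly off its sorted edge list. This is a toy sketch with an illustrative convention for counting neighbors, not the paper's algorithm.

```python
import math

def knn_graph(points, k):
    """Brute-force k-NN graph; in practice built once and reused (GPU-friendly)."""
    adj = {}
    for i, p in enumerate(points):
        nbrs = sorted((math.dist(p, q), j)
                      for j, q in enumerate(points) if j != i)[:k]
        adj[i] = nbrs  # sorted list of (distance, neighbor)
    return adj

def core_distances(adj, min_pts):
    """OPTICS core distance, here taken as the distance to the min_pts-th
    nearest *other* point (conventions on counting the point itself vary)."""
    return {i: nbrs[min_pts - 1][0] for i, nbrs in adj.items()}

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
adj = knn_graph(points, k=3)
cd = core_distances(adj, min_pts=2)
print(cd[0])       # 1.0: the corner point's 2nd-nearest neighbor
print(cd[4] > 10)  # True: the outlier has a large core distance
```

Once neighbor lookups are graph edges rather than full range scans, the reachability sweep touches each edge a bounded number of times, which is where the O(E * log V) bound comes from.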
ISBN:
(Print) 9781509021406
Many high-performance distributed memory applications rely on point-to-point messaging using the Message Passing Interface (MPI). Due to the latency of the network, and other costs, this communication can limit the scalability of an application when run on high node counts of distributed memory supercomputers. Communication costs are further increased on modern multi- and many-core architectures, when using more than one MPI process per node, as each process sends and receives messages independently, inducing multiple latencies and contention for resources. In this paper, we use shared memory constructs available in the MPI 3.0 standard to implement an aggregated communication method to minimize the number of inter-node messages to reduce these costs. We compare the performance of this Minimal Aggregated SHared Memory (MASHM) messaging to the standard point-to-point implementation on large-scale supercomputers, where we see that MASHM leads to enhanced strong scalability of a weighted Jacobi relaxation. For this application, we also see that the use of shared memory parallelism through MASHM and MPI 3.0 can be more efficient than using Open Multi-Processing (OpenMP). We then present a model for the communication costs of MASHM which shows that this method achieves its goal of reducing latency costs while also reducing bandwidth costs. Finally, we present MASHM as an open source library to facilitate the integration of this efficient communication method into existing distributed memory applications.
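The aggregation idea itself can be shown without MPI: messages from several ranks on one node that are destined for ranks on another node are packed into a single buffer per node pair, reducing the number of network messages and hence the number of latencies paid. The toy model below counts messages before and after aggregation; it is a sketch of the counting argument, not the MASHM API.

```python
from collections import defaultdict

def count_pairwise(messages, node_of):
    """Standard point-to-point: one network message per inter-node (src, dst) rank pair."""
    return sum(1 for src, dst, _ in messages if node_of[src] != node_of[dst])

def aggregate(messages, node_of):
    """Pack all messages sharing a (src_node, dst_node) pair into one buffer."""
    buffers = defaultdict(list)
    for src, dst, payload in messages:
        if node_of[src] != node_of[dst]:
            buffers[(node_of[src], node_of[dst])].append((src, dst, payload))
    return buffers  # one network send per key

# 4 ranks, 2 per node; every rank sends to every other rank.
node_of = {0: 0, 1: 0, 2: 1, 3: 1}
msgs = [(s, d, b"x") for s in range(4) for d in range(4) if s != d]
print(count_pairwise(msgs, node_of))  # 8 inter-node point-to-point sends
print(len(aggregate(msgs, node_of))) # 2 aggregated sends (one per node pair)
```

In MASHM the packing happens in MPI 3.0 shared-memory windows, so intra-node ranks fill the shared buffer directly and only one rank per node pair touches the network.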