Search Results - Inner Mongolia University Library

Improvement of Real-Time Hybrid Simulation Using Parallel Finite-Element Program

JOURNAL OF EARTHQUAKE ENGINEERING, 2020, Vol. 24, No. 10, pp. 1547-1565

Authors: Lu, Li-Qiao; Wang, Jin-Ting; Zhu, Fei. Tsinghua Univ, State Key Lab Hydrosci & Engn, Beijing 100084, Peoples R China; Changjiang Inst Survey Planning Design & Res, Wuhan, Peoples R China

This paper proposes a novel framework to efficiently calculate a large-scale finite element (FE) numerical substructure in real-time hybrid simulation (RTHS). The framework is composed of a non-real-time Windows computer and a real-time Target Computer. The Windows computer solves the FE numerical substructure by parallel computing in soft real time, while the real-time Target Computer generates displacement signals for the controller in real time. Based on the proposed framework, an RTHS with the numerical substructure simulated in a Windows environment is developed. It is demonstrated that the computational efficiency of the RTHS can be greatly improved by parallel programming.

Keywords: Real-Time Hybrid Simulation; Windows Calculation System; Real-Time Blockset; Soft Real-Time; Interpolation Algorithm; parallel programming

High-Level Stream and Data Parallelism in C++ for Multi-Cores

25th Brazilian Symposium on Programming Languages, SBLP 2021, held in conjunction with the Brazilian Conference on Software: Theory and Practice, CBSoft 2021

Authors: Loff, Junior; Hoffman, Renato B.; Griebler, Dalvan; Fernandes, Luiz G. PUCRS, Brazil

ISBN: (print) 9781450390620

Stream processing applications have seen increasing demand with the increased availability of sensors, IoT devices, and user data. Modern systems can generate millions of data items per day that must be processed in a timely manner. To deal with this demand, application programmers must consider parallelism to exploit the maximum performance of the underlying hardware resources. However, parallel programming is often difficult and error-prone, because programmers must deal with low-level system and architecture details. In this work, we introduce a new strategy for automatic data-parallel code generation in C++ targeting multi-core architectures. This strategy was integrated with an annotation-based parallel programming abstraction named SPar. We have increased SPar's expressiveness to support stream and data parallelism and their arbitrary composition. To that end, we added two new attributes to its language and improved the compiler's parallel code generation. We conducted a set of experiments on different stream and data-parallel applications to assess the efficiency of our solution. The results showed that the new SPar version obtained performance similar to that of handwritten parallelizations. Moreover, the new SPar version achieves up to 74.9x better performance than the original version as a result of this work. © 2021 ACM.

Keywords: parallel programming

The GR2 algorithm for subgraph isomorphism. A study from parallelism to quantum computing

29th International Conference on Software, Telecommunications and Computer Networks, SoftCOM 2021

Author: Gheorghica, Radu-Iulian. Department of Mathematics and Computer Science, Babeş-Bolyai University, Cluj-Napoca, Romania

ISBN: (print) 9789532901092

This paper presents the GR2 algorithm in the context of a study that encompasses elements of parallel programming and pruning techniques. Circuits with 5, 10 and 15 qubits were also executed on quantum computers. These circuits used classical and quantum gates along with oracles and Grover's algorithm. The original contribution consists in highlighting the importance of parallel programming in classical computing as well as in quantum computing. © SoftCOM 2021. All rights reserved.

Keywords: parallel programming

ACCTEST: Hybrid Testing Techniques for MPI-Based Programs

IEEE ACCESS, 2020, Vol. 8, pp. 91488-91500

Authors: Alghamdi, Abdullah S. Almalaise; Alghamdi, Ahmed Mohammed; Eassa, Fathy Elbouraey; Khemakhem, Maher Ali. King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Informat Syst, Jeddah 21589, Saudi Arabia; Univ Jeddah, Coll Comp Sci & Engn, Dept Software Engn, Jeddah 21493, Saudi Arabia; King Abdulaziz Univ, Fac Comp & Informat Technol, Dept Comp Sci, Jeddah 21589, Saudi Arabia

Recently, MPI has become widely used for parallelizing scientific applications, including in fields outside computer science. The MPI programming model supports parallelism in several programming languages, including C, C++, and Fortran. MPI also supports integration with some other programming models and has several implementations from different vendors, including open-source and commercial implementations. However, testing parallel programs is a difficult task, especially when the programming models involved exhibit different behaviours and types of error. In addition, the increased use of these programming models by non-computer-science specialists can cause several errors due to lack of programming experience, which needs to be considered when using any testing tools. We noticed that dynamic testing techniques have been used for testing the majority of MPI programs. Dynamic testing techniques detect errors by analyzing the source code during runtime, which causes overhead and affects the program's performance, especially when targeting massively parallel applications generating thousands or millions of threads. In this paper, we enhance ACCTEST so that it can test MPI-based programs and detect runtime errors occurring with different types of MPI communications. We decided to use hybrid testing techniques, combining static and dynamic testing, to gain the benefit of each and reduce the cost.

Keywords: Testing; Programming; Tools; System recovery; Static analysis; Runtime; Task analysis; MPI; MPI testing tool; hybrid testing techniques; parallel programming; ACC_TEST

Modelling Breast Adenocarcinomas In Situ with 3D Cellular Automaton: A Parallel Approach

IEEE LATIN AMERICA TRANSACTIONS, 2020, Vol. 18, No. 3, pp. 487-494

Authors: Tomeu, A.; Salguero, A.; Zaldivar, S.; Aparicio, J. Univ Cadiz, Comp Sci Dept, Ca 11519, Spain; Univ Cadiz, Pathol Dept, Ca 11519, Spain

Adenocarcinomas are solid tumors that begin in the duct architecture of the endocrine glands in the human body. They include some of the most frequent tumors (breast or prostate), with high morbidity and mortality and constantly growing treatment costs for public health systems. This work starts from a mathematical model for breast adenocarcinoma in situ (DCIS) that is known and validated in the literature, and aims to implement it with a 3D cellular automaton and parallel processing, to support a better understanding of the pathogenesis of the disease. We describe the biology of this class of tumors and the parallel implementation methodology used, which employs data parallelism, locks on access to data shared between tasks, and dynamic management of the simulated tissue domain. The results obtained by running the proposed parallel simulation are discussed in terms of their consistency with the histological reality of the real tumor, with the kinetics of Gompertz's function for tumor growth, and with the statistical distribution of tumor cells in a mammary duct with disease in situ, with reasonable times and speedups. The conclusions establish the achievement of the proposed objective, compare the approach developed with other similar ones already published, and establish our future work.

Keywords: Adenocarcinomas; Cellular Automaton; DCIS; Duct; Mutual Exclusion; Gland; parallel programming; Speedup

Performance Evaluation and Improvements of the PoCL Open-Source OpenCL Implementation on Intel CPUs

2021 International Workshop on OpenCL, IWOCL 2021

Authors: Baumann, Tobias; Noack, Matthias; Steinke, Thomas. Zuse Institute Berlin, Berlin, Germany

ISBN: (print) 9781450390330

The Portable Computing Language (PoCL) is a vendor-independent open-source OpenCL implementation that aims to support a variety of compute devices in a single platform. Evaluating PoCL against the Intel OpenCL implementation reveals significant performance drawbacks of PoCL on Intel CPUs, which run 92% of the TOP500 list. Using a selection of benchmarks, we identify and analyse performance issues in PoCL with a focus on scheduling and vectorisation. We propose a new CPU device driver based on Intel Threading Building Blocks (TBB), and evaluate LLVM with respect to automatic compiler vectorisation across work-items in PoCL. Using the TBB driver, it is possible to narrow the gap to Intel OpenCL and even outperform it by a factor of up to 1.3× in our proxy application benchmark with a manual vectorisation strategy. © 2021 Owner/Author.

Keywords: parallel programming

A parallel path-following phase unwrapping algorithm based on a top-down breadth-first search approach

OPTICS AND LASERS IN ENGINEERING, 2020, Vol. 124, Article 105827

Authors: Lopez Garcia, Lourdes; Garcia Arellano, Anmi; Cruz-Santos, William. CU UAEM Valle de Chalco, Hermenegildo Galeana 3, Valle De Chalco 56615, Estado De Mexic, Mexico; CONACYT ECOSUR Unidad Chetumal, Ave Centenario Km 5-5, Chetmal 77014, Quintana Roo, Mexico

Path-following methods for two-dimensional phase unwrapping, such as the Goldstein algorithm, are the most efficient and robust methods in remote sensing, digital phase shifting, and nuclear magnetic resonance imaging, among others. Several authors have attempted to sketch parallel versions of path-following methods. However, only the first stages of the algorithm, such as residue identification and branch-cut placement, have been improved using parallel architectures, and with limitations: phase maps must have a single continuous region and no regions isolated by the cuts. In this article, a systematic parallel Goldstein algorithm that can handle phase data with multiple regions and isolated regions is proposed. Our proposal can improve all three steps of the serial Goldstein algorithm: residue identification, branch cut, and integration. In particular, the integration step is formulated as a top-down breadth-first search problem on a graph, for which a parallel algorithm was developed. Synthetic and real phase maps were used to validate the performance and robustness of the proposed parallel algorithm on a multicore architecture. For simulated and real phase maps, we obtained speedups of 3.3 and 1.98, respectively, on a laptop computer with modest hardware resources.

Keywords: Phase unwrapping; Phase retrieval; Branch-cut method; parallel programming

Advanced synchronization techniques for task-based runtime systems

26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2021

Authors: Álvarez, David; Sala, Kevin; Maroñas, Marcos; Roca, Aleix; Beltran, Vincenç. Barcelona Supercomputing Center, Barcelona, Spain

ISBN: (print) 9781450382946

Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with small-granularity tasks remains a challenge, and bottlenecks can manifest in several runtime components. In this paper, we analyze the limiting factors in the scalability of a task-based runtime system and propose individual solutions for each of the challenges, including a wait-free dependency system and a novel scalable scheduler design based on delegation. We evaluate how the optimizations impact the overall performance of the runtime, both individually and in combination. We also compare the resulting runtime against state-of-the-art OpenMP implementations, showing equivalent or better performance, especially for fine-grained tasks. © 2021 ACM.

Keywords: parallel programming

Scalable computing in Java with PCJ Library. Improved collective operations

2021 International Symposium on Grids and Cloud, ISGC 2021

Authors: Nowicki, Marek; Górski, Lukasz; Bala, Piotr. Faculty of Mathematics and Computer Science, Nicolaus Copernicus University, Toruń, Poland; Interdisciplinary Centre for Mathematical and Computational Modeling, University of Warsaw, Poland

Machine learning and Big Data workloads are becoming as important as traditional HPC ones. AI and Big Data users tend to use new programming languages such as Python, Julia, or Java, while the HPC community is still dominated by C/C++ or Fortran. Hence, there is a need for new programming libraries and languages that will integrate different applications and allow them to run on large computer infrastructure. Since modern computers are multinode and multicore, parallel execution is an additional challenge here. For that purpose, we have developed the PCJ library, which introduces parallel programming capabilities to Java using the Partitioned Global Address Space model. It modifies neither the language nor the running environment (JVM). The PCJ library allows for easy development of parallel code and runs it on laptops, workstations, supercomputers, and the cloud. This paper presents an overview of the PCJ library and its usage in parallelizing selected workloads, including HPC, AI, and Big Data. The performance and scalability are presented. We present a recent addition to the PCJ library: collective operations. The collective operations significantly reduce the number of lines of code to write while ensuring good performance. © Copyright owned by the author(s) under the terms of the Creative Commons

Keywords: parallel programming

Real-time cortical simulation on neuromorphic hardware

PHILOSOPHICAL TRANSACTIONS OF THE ROYAL SOCIETY A-MATHEMATICAL PHYSICAL AND ENGINEERING SCIENCES, 2020, Vol. 378, No. 2164, Article 20190160

Authors: Rhodes, Oliver; Peres, Luca; Rowley, Andrew G. D.; Gait, Andrew; Plana, Luis A.; Brenninkmeijer, Christian; Furber, Steve B. Univ Manchester, Dept Comp Sci, Manchester, Lancs, England

Real-time simulation of a large-scale biologically representative spiking neural network is presented, through the use of a heterogeneous parallelization scheme and SpiNNaker neuromorphic hardware. A published cortical microcircuit model is used as a benchmark test case, representing approximately 1 mm² of early sensory cortex, containing 77k neurons and 0.3 billion synapses. This is the first hard real-time simulation of this model, with 10 s of biological simulation time executed in 10 s of wall-clock time. This surpasses best-published efforts on HPC neural simulators (3× slowdown) and GPUs running optimized spiking neural network (SNN) libraries (2× slowdown). Furthermore, the presented approach indicates that real-time processing can be maintained with increasing SNN size, breaking the communication barrier incurred by traditional computing machinery. Model results are compared to an established HPC simulator baseline to verify simulation correctness, comparing well across a range of statistical measures. Energy to solution and energy per synaptic event are also reported, demonstrating that the relatively low-tech SpiNNaker processors achieve a 10× reduction in energy relative to modern HPC systems, and comparable energy consumption to modern GPUs. Finally, system robustness is demonstrated through multiple 12 h simulations of the cortical microcircuit, each simulating 12 h of biological time, and demonstrating the potential of neuromorphic hardware as a neuroscience research tool for studying complex spiking neural networks over extended time periods. This article is part of the theme issue 'Harmonizing energy-autonomous computing and intelligence'.

Keywords: neuromorphic; SpiNNaker; cortical microcircuit; real time; low-power; parallel programming
