This paper proposes a novel framework to efficiently calculate a large-scale finite element (FE) numerical substructure in real-time hybrid simulation (RTHS). The framework is composed of a non-real-time Windows computer and a real-time Target Computer. The Windows computer solves the FE numerical substructure by parallel computing in soft real time, while the real-time Target Computer generates displacement signals for the controller in real time. Based on the proposed framework, an RTHS with the numerical substructure simulated in a Windows environment is developed. It is demonstrated that the computational efficiency of the RTHS can be greatly improved by parallel programming.
Stream processing applications have seen an increasing demand with the increased availability of sensors, IoT devices, and user data. Modern systems can generate millions of data items per day that require to be proce...
This paper presents the GR2 algorithm in the context of a study that encompassed elements of parallel programming and pruning techniques. Circuits with 5, 10 and 15 qubits were also executed on quantum ...
Recently, MPI has become widely used for parallelizing scientific applications, including in fields outside computer science. The MPI programming model supports parallelism in several programming languages, including C, C++, and Fortran. MPI also supports integration with some other programming models and has several implementations from different vendors, both open-source and commercial. However, testing parallel programs is a difficult task, especially because different programming models exhibit different behaviours and types of error. In addition, the increased use of these programming models by non-computer science specialists can cause errors due to lack of programming experience, which needs to be considered when using any testing tool. We noticed that dynamic testing techniques have been used for testing the majority of MPI programs. Dynamic testing techniques detect errors by analyzing the source code during runtime, which causes overheads and affects the program's performance, especially when targeting massively parallel applications generating thousands or millions of threads. In this paper, we enhance ACCTEST with the ability to test MPI-based programs and detect runtime errors occurring with different types of MPI communications. We use hybrid testing techniques, combining static and dynamic testing, to gain the benefits of each and reduce the cost.
Adenocarcinomas are solid tumors that begin in the duct architecture of the endocrine glands in the human body. They include some of the most frequent tumors (breast or prostate), with high morbidity and mortality, and with treatment costs in constant growth for public health systems. This work starts from a mathematical model for breast adenocarcinoma in situ (DCIS) that is known and validated in the literature, and implements it with a 3D cellular automaton and parallel processing, to support a better understanding of the pathogenesis of the disease. We describe the biology of this class of tumors and the parallel implementation methodology used, which employs data parallelism, locks on access to data shared between tasks, and dynamic management of the simulated tissue domain. The results obtained by running the proposed parallel simulation are discussed in terms of their consistency with the histological reality of the real tumor, with the kinetics of Gompertz's function for tumor growth, and with the statistical distribution of tumor cells in a mammary duct with disease in situ, with reasonable times and speedups. The conclusions establish the achievement of the proposed objective, compare the approach developed with other similar published ones, and outline our future work.
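For readers unfamiliar with the Gompertz kinetics mentioned in this abstract, the curve is commonly written as N(t) = K exp(ln(N0/K) exp(-a t)), where N0 is the initial cell count, K the carrying capacity and a the growth-deceleration rate. The following is a minimal sketch of that textbook form only; the function name and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math


def gompertz(t, n0, k, a):
    """Textbook Gompertz growth curve (not the paper's fitted model).

    t  : time (arbitrary units)
    n0 : initial population, 0 < n0 < k   (hypothetical value in the demo)
    k  : carrying capacity                (hypothetical value in the demo)
    a  : growth-deceleration rate, a > 0  (hypothetical value in the demo)
    """
    return k * math.exp(math.log(n0 / k) * math.exp(-a * t))


if __name__ == "__main__":
    # Illustrative parameters only: 10 initial cells, plateau at 1000 cells.
    for t in (0, 10, 20, 40, 80):
        print(t, round(gompertz(t, n0=10, k=1000, a=0.1), 2))
```

A simulated cell count that follows this shape grows quickly at first and then levels off toward K, which is the kind of consistency check the abstract describes.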
The Portable Computing Language (PoCL) is a vendor independent open-source OpenCL implementation that aims to support a variety of compute devices in a single platform. Evaluating PoCL versus the Intel OpenCL implemen...
Path-following methods for two-dimensional phase unwrapping, such as the Goldstein algorithm, are the most efficient and robust methods in remote sensing, digital phase shifting, and nuclear magnetic resonance imaging, among others. Several authors have attempted to sketch parallel versions of path-following methods. However, only the first stages of the algorithm, such as residue identification and branch-cut placement, have been improved using parallel architectures, and with limitations such as requiring phase maps that have a single continuous region and no regions isolated by the cuts. In this article, a systematic parallel Goldstein algorithm that can handle phase data with multiple regions and isolated regions is proposed. Our proposal can improve all three steps of the serial Goldstein algorithm: residue identification, branch-cut placement, and integration. In particular, the integration step is formulated as a top-down breadth-first search problem on a graph, for which a parallel algorithm was developed. Synthetic and real phase maps were used to validate the performance and robustness of the proposed parallel algorithm on a multicore architecture. For simulated and real phase maps, we obtained speedups of 3.3 and 1.98, respectively, on a laptop computer with modest hardware resources.
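The idea of treating integration as a breadth-first search can be illustrated with a small serial sketch. It assumes the branch cuts are already given as a boolean mask, treats pixels as graph nodes with 4-neighbour edges, and restarts the search from a new seed for each region isolated by the cuts. This is not the authors' parallel algorithm; the names `wrap` and `integrate_bfs` are hypothetical, and a parallel version would distribute the work within each BFS level.

```python
import math
from collections import deque


def wrap(d):
    """Wrap a phase difference into the interval [-pi, pi)."""
    return (d + math.pi) % (2 * math.pi) - math.pi


def integrate_bfs(wrapped, cut):
    """Unwrap a 2D phase map by BFS flood-fill that never crosses cut pixels.

    wrapped : list of rows of wrapped phase values
    cut     : same shape, True where a branch cut blocks the path
    Returns a map of unwrapped values; cut pixels stay None.
    """
    rows, cols = len(wrapped), len(wrapped[0])
    unwrapped = [[None] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if cut[r0][c0] or unwrapped[r0][c0] is not None:
                continue
            # New seed: start of a region not reached by earlier searches.
            unwrapped[r0][c0] = wrapped[r0][c0]
            frontier = deque([(r0, c0)])
            while frontier:
                r, c = frontier.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if cut[nr][nc] or unwrapped[nr][nc] is not None:
                        continue
                    # Add the wrapped difference to the parent's value.
                    step = wrap(wrapped[nr][nc] - wrapped[r][c])
                    unwrapped[nr][nc] = unwrapped[r][c] + step
                    frontier.append((nr, nc))
    return unwrapped
```

Because every region gets its own seed, each isolated region is unwrapped consistently within itself, but the offset between regions is not resolved by this sketch.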
Task-based programming models like OmpSs-2 and OpenMP provide a flexible data-flow execution model to exploit dynamic, irregular and nested parallelism. Providing an efficient implementation that scales well with smal...
Machine learning and Big Data workloads are becoming as important as traditional HPC ones. AI and Big Data users tend to use new programming languages such as Python, Julia, or Java, while the HPC community is still d...
Real-time simulation of a large-scale biologically representative spiking neural network is presented, through the use of a heterogeneous parallelization scheme and SpiNNaker neuromorphic hardware. A published cortical microcircuit model is used as a benchmark test case, representing approximately 1 mm² of early sensory cortex, containing 77 k neurons and 0.3 billion synapses. This is the first hard real-time simulation of this model, with 10 s of biological simulation time executed in 10 s wall-clock time. This surpasses best-published efforts on HPC neural simulators (3× slowdown) and GPUs running optimized spiking neural network (SNN) libraries (2× slowdown). Furthermore, the presented approach indicates that real-time processing can be maintained with increasing SNN size, breaking the communication barrier incurred by traditional computing machinery. Model results are compared to an established HPC simulator baseline to verify simulation correctness, comparing well across a range of statistical measures. Energy to solution and energy per synaptic event are also reported, demonstrating that the relatively low-tech SpiNNaker processors achieve a 10× reduction in energy relative to modern HPC systems, and comparable energy consumption to modern GPUs. Finally, system robustness is demonstrated through multiple 12 h simulations of the cortical microcircuit, each simulating 12 h of biological time, and demonstrating the potential of neuromorphic hardware as a neuroscience research tool for studying complex spiking neural networks over extended time periods. This article is part of the theme issue 'Harmonizing energy-autonomous computing and intelligence'.