Efforts to support the requirements of high-performance computing (HPC) applications in the context of cloud computing have motivated us to design HPC Shelf, a cloud computing services platform to build and deploy large...
Several developments in computing offer as many opportunities as challenges. We are witnessing the end of Moore's law, almost two decades after the death of Dennard scaling. Exascale compu...
Parallel programming is one of the most effective approaches to handling computationally expensive problems: it reduces computation time by getting the most out of the capacity of the processors and shared-memory o...
Automatic synthesis of efficient scientific parallel programs for supercomputers is, in general, a complex problem in system parallel programming. Therefore, various specialized synthesis algorithms and heuristics are of...
In earlier work, we defined a domain-specific language (DSL) that aims to provide an easy-to-use approach for programming multi-core and multi-GPU clusters. The DSL incorporates algorithmic skeletons, which are well-known patterns for parallel programming, such as map and reduce. Based on the chosen skeleton, a user-defined function can be applied to a data structure in parallel, with the main advantage that the user does not have to worry about implementation details. Previously, we had implemented only a generator for multi-core clusters; in this paper, we present and evaluate two prototypes of generators for multi-GPU clusters, which are based on OpenACC and CUDA. We have evaluated the approach with four benchmark applications. The results show that the generation approach leads to execution times that are on par with an alternative library implementation.
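To illustrate the skeleton idea described in this abstract, the following is a minimal Python sketch of a map and a reduce skeleton. It is not the authors' DSL or either of their generators: the names `map_skeleton` and `reduce_skeleton`, the `workers` parameter, and the use of a thread pool are all assumptions made for illustration. The point it shows is that the caller supplies only a user-defined function, while the distribution of work stays hidden inside the skeleton.

```python
import operator
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_skeleton(fn, data, workers=4):
    """Apply the user-defined function fn to every element, preserving order.

    Hypothetical helper: a thread pool stands in for the cluster back end.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, data))


def reduce_skeleton(op, data, identity, workers=4):
    """Combine all elements with op, one partial result per chunk.

    Assumes op is associative and identity is its neutral element;
    otherwise the chunked result may differ from a sequential fold.
    """
    items = list(data)
    if not items:
        return identity
    # Split the input into at most `workers` contiguous chunks.
    chunk = max(1, (len(items) + workers - 1) // workers)
    chunks = [items[i:i + chunk] for i in range(0, len(items), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda c: reduce(op, c, identity), chunks))
    # Fold the per-chunk partial results into the final value.
    return reduce(op, partials, identity)


squares = map_skeleton(lambda x: x * x, range(10))
total = reduce_skeleton(operator.add, squares, 0)
```

In CPython, threads do not speed up CPU-bound functions because of the global interpreter lock, so this sketch conveys only the programming model, not the performance a generated OpenACC or CUDA program would reach.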
This paper proposes a novel framework to efficiently calculate a large-scale finite element (FE) numerical substructure in real-time hybrid simulation (RTHS). The framework is composed of a non-real-time Windows computer and a real-time Target Computer. The Windows computer solves the FE numerical substructure by parallel computing in soft real time, while the real-time Target Computer generates displacement signals for the controller in real time. Based on the proposed framework, an RTHS with the numerical substructure simulated in a Windows environment is developed. It is demonstrated that the computational efficiency of the RTHS can be greatly improved by parallel programming.
Demand for stream processing applications has grown with the increased availability of sensors, IoT devices, and user data. Modern systems can generate millions of data items per day that need to be proce...
This paper presents the GR2 algorithm in the context of a study that encompassed elements of parallel programming and pruning techniques. In addition, circuits with 5, 10 and 15 qubits were executed on quantum ...
Recently, MPI has become widely used for parallelizing scientific applications, including in fields outside computer science. The MPI programming model supports parallelism in several programming languages, including C, C++, and Fortran. MPI can also be integrated with other programming models and has several implementations from different vendors, both open-source and commercial. However, testing parallel programs is a difficult task, especially because behaviours and error types differ from one programming model to another. In addition, the increased use of these programming models by specialists outside computer science can lead to errors caused by a lack of programming experience, which any testing tool needs to take into account. We noticed that dynamic testing techniques have been used for testing the majority of MPI programs. Dynamic testing techniques detect errors by analyzing the program during runtime, which causes overheads and affects the program's performance, especially for massively parallel applications that generate thousands or millions of threads. In this paper, we enhance ACCTEST so that it can test MPI-based programs and detect runtime errors occurring with different types of MPI communications. We use a hybrid testing approach that combines static and dynamic testing techniques to gain the benefits of each and reduce the cost.
Adenocarcinomas are solid tumors that begin in the duct architecture of the endocrine glands in the human body. They include some of the most frequent tumors (breast or prostate), with high morbidity and mortality and with treatment costs that are constantly growing for public health systems. This work starts from a mathematical model for breast adenocarcinoma in situ (DCIS) that is known and validated in the literature, and aims to implement it with a 3D cellular automaton and parallel processing, to support a better understanding of the pathogenesis of the disease. We describe the biology of this class of tumors and the parallel implementation methodology used, which employs data parallelism, locks on access to data shared between tasks, and dynamic management of the simulated tissue domain. The results obtained by running the proposed parallel simulation are discussed in terms of their consistency with the histological reality of the real tumor, with the kinetics of Gompertz's function for tumor growth, and with the statistical distribution of tumor cells in a mammary duct with disease in situ, with reasonable times and speedups. The conclusions establish that the proposed objective was achieved, compare the approach with similar published ones, and outline our future work.
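For readers unfamiliar with the Gompertz kinetics mentioned in this abstract, the sketch below evaluates one common parameterization of the Gompertz growth function, N(t) = K · exp(ln(N0 / K) · exp(−a · t)). The abstract does not state which form or which parameter values the authors used, so the function name `gompertz`, the parameter names, and the numbers in the usage lines are all assumptions chosen for illustration.

```python
import math


def gompertz(t, n0, k, a):
    """Gompertz growth curve (one common parameterization, assumed here).

    t  : time
    n0 : initial population, must satisfy 0 < n0
    k  : carrying capacity (asymptotic population), must satisfy k > 0
    a  : growth-rate constant, a > 0
    """
    return k * math.exp(math.log(n0 / k) * math.exp(-a * t))


# Hypothetical values: 10 initial cells, capacity of 1000 cells, rate 0.1.
start = gompertz(0.0, 10.0, 1000.0, 0.1)
later = gompertz(50.0, 10.0, 1000.0, 0.1)
limit = gompertz(1000.0, 10.0, 1000.0, 0.1)
```

With this form the curve starts at `n0`, grows monotonically when `n0 < k`, and saturates at `k`, which is the qualitative behaviour a simulated tumor cell count would be compared against.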