ISBN (digital): 9783319273402
ISBN (print): 9783319273402; 9783319273396
This paper presents results of applying set theory and relations to the modeling of complex distributed systems on a parallel computing platform. The advantages of using set theory are the possibility of formally examining local problems and the possibility of organizing individuals as elements of the considered classes, which are defined globally. To govern collective behavior, we propose three key relations and mappings that determine a taxonomic order on them. This can insulate us from the reductionism and single-cause thinking with which people have previously dealt with complexity. Using three examples, we show how to take advantage of new parallel programming tools to handle multiple inputs in parallel more effectively than by sequentially assigning single causes to outputs.
ISBN (print): 9781467385435
Current general-purpose processors are augmented with vector instructions that can process many elements of matrices and vectors in parallel. Transposing a matrix in place is a core kernel operation required by many scientific and engineering applications to shuttle data before, during, or after processing. This operation increases the traffic on the memory bus, and hence clever techniques such as blocking are required to enhance performance. In this paper, we present an enhanced version of a previously published algorithm for transposing a matrix on two-dimensional processor arrays. We restructured this algorithm to fit the one-dimensional vector register architecture added to general-purpose CPUs. We implemented the new vector algorithm using the Intel SSE4 vector instruction set and compared its performance with the standard sequential algorithm as well as with an existing implementation of Eklundh's algorithm. We also studied automatic compiler optimizations and their effect on the vectorization of the algorithm. The best of our implementations showed a maximum speedup of 1.6 over the sequential algorithm and almost equal performance to the implementation of Eklundh's algorithm.
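As an illustration of the blocking idea mentioned in this abstract, the following is a minimal Python sketch of a tiled in-place transpose of a square matrix. It is not the authors' vector algorithm and uses no SSE instructions; the function name `transpose_in_place_blocked` and the tile size `block` are assumptions made for this example only.

```python
def transpose_in_place_blocked(a, block=4):
    """Transpose the square matrix `a` (a list of lists) in place, tile by tile.

    Hypothetical helper for illustration; `block` is the tile edge length.
    """
    n = len(a)
    if any(len(row) != n for row in a):
        raise ValueError("matrix must be square")
    for bi in range(0, n, block):
        # Visit only tiles on or above the diagonal; each swap also
        # fills the mirrored tile below the diagonal.
        for bj in range(bi, n, block):
            i_end = min(bi + block, n)
            j_end = min(bj + block, n)
            for i in range(bi, i_end):
                # Inside a diagonal tile, swap only elements above the diagonal
                # so that no pair is swapped twice.
                j_start = i + 1 if bi == bj else bj
                for j in range(j_start, j_end):
                    a[i][j], a[j][i] = a[j][i], a[i][j]
    return a
```

Working on one tile and its mirror at a time keeps both access patterns within a small region of memory, which is the usual motivation for blocking on cache-based CPUs.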
ISBN (print): 9780769550947
Iterative methods are widely used to solve eigenproblems in scientific computation. We focus on multiple restarted Arnoldi methods (MRAM) [4], which manage iterative co-methods to accelerate their convergence. These methods are considered good candidates for emerging large-scale computational systems thanks to their asynchronous communication schema, their potential for load balancing, and their multi-level parallelism. Both the coarse-grain parallelism between co-methods and the fine-grain parallelism inside each co-method need to be flexibly mapped to large-scale distributed memory systems such as hierarchical supercomputers, the cloud, or P2P platforms. In this paper, we investigate this possibility for MRAMs by developing and executing them with a development and execution environment called FP2C (Framework for Post-Petascale Computing) [11]. FP2C is a user-friendly, hierarchical-system-oriented development and execution environment based on workflow and distributed parallel methodologies. Our first goal is to show the feasibility of the FP2C approach for realizing complex applications such as MRAMs. The next objective of the paper is to point out the adaptability of MRAMs to the new generation of supercomputers (Petascale and future Exascale). In addition, our experiments show that the collaboration of multiple iterative methods based on coarse-grain parallelism accelerates convergence, and that co-working processors within each iterative method based on fine-grain parallelism reduce the time per iteration.
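For readers unfamiliar with the building block behind restarted Arnoldi methods, the following is a minimal pure-Python sketch of a single Arnoldi factorization with modified Gram-Schmidt. It does not implement restarting, the MRAM co-method scheme, or anything from FP2C; the helper names `matvec`, `dot`, and `arnoldi` are assumptions made for this example.

```python
import math


def matvec(a, x):
    """Dense matrix-vector product for a matrix stored as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def arnoldi(a, v0, m):
    """Run up to m Arnoldi steps on matrix `a` starting from vector `v0`.

    Returns (V, H): V is a list of orthonormal Krylov basis vectors and
    H is an (m + 1) x m upper Hessenberg matrix such that, for each
    completed step j, A V[j] = sum_i H[i][j] V[i].
    """
    norm0 = math.sqrt(dot(v0, v0))
    V = [[x / norm0 for x in v0]]
    H = [[0.0] * m for _ in range(m + 1)]
    for j in range(m):
        w = matvec(a, V[j])
        for i in range(j + 1):
            # Modified Gram-Schmidt: remove the component along V[i].
            H[i][j] = dot(V[i], w)
            w = [wk - H[i][j] * vk for wk, vk in zip(w, V[i])]
        H[j + 1][j] = math.sqrt(dot(w, w))
        if H[j + 1][j] < 1e-12:
            # Breakdown: the Krylov subspace is invariant, so stop early.
            break
        V.append([wk / H[j + 1][j] for wk in w])
    return V, H
```

A restarted method would compute Ritz values from the leading m x m part of `H`, build a new starting vector from the corresponding Ritz vectors, and call `arnoldi` again; in the multiple-method setting described above, several such instances would additionally exchange intermediate results.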
Protection and measurement systems in electrical substations are required to have high availability. In an all-digital substation protection system, all the components (instrument transformers, processing units, merging units, intelligent electronic devices, communication network, and synchronization source) may affect the overall availability level. In this paper, a solution to enhance distributed PMU availability during wired network failures is presented. In the proposed scheme, the process bus has two parallel networks for the sampled values packets that carry measurement information: 1) the classic wired Ethernet link and 2) a wireless link (implemented with industrial-grade IEEE 802.11 devices). Time synchronization is carried out only through the wired Ethernet link, but the proposed solution is still able to compensate for temporary failures of one of the communication links. Experimental tests have been performed to verify the performance of the additional IEEE 802.11 link using different protocols and configurations. Communication parameters that can affect PMU performance, such as propagation latency, are characterized. It is shown that, if the measurement algorithm is suitably designed, then, depending on the wireless link quality, it is possible to comply, with a single output, with the M and P classes of the synchrophasor standard even during network restoration or, at least, to safeguard protection applications if higher latency occurs.
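One generic way a receiver can exploit two parallel links is to accept the first copy of each packet and discard the later duplicate. The sketch below illustrates that idea only; the abstract does not describe the authors' actual mechanism, and the function name `merge_redundant_streams`, the tuple layout, and the use of a per-packet sequence number are assumptions made for this example.

```python
def merge_redundant_streams(arrivals):
    """Merge packets arriving over two redundant links.

    `arrivals` is an iterable of (link, seq, value) tuples in arrival order,
    where `seq` is an assumed per-packet sequence number. The first copy of
    each sequence number is delivered; later copies are dropped.
    Returns a list of (seq, value, link) tuples in delivery order.
    """
    seen = set()
    delivered = []
    for link, seq, value in arrivals:
        if seq in seen:
            continue  # duplicate already received over the other link
        seen.add(seq)
        delivered.append((seq, value, link))
    return delivered
```

For example, if the wired link drops one packet, the copy arriving over the wireless link fills the gap:

```python
arrivals = [
    ("wired", 0, 1.0), ("wireless", 0, 1.0),
    ("wired", 1, 1.1), ("wireless", 1, 1.1),
    ("wireless", 2, 1.2),  # the wired copy of packet 2 was lost
    ("wired", 3, 1.3), ("wireless", 3, 1.3),
]
merged = merge_redundant_streams(arrivals)
```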
ISBN (print): 9783642141218
For a large class of scientific data analysis applications it is becoming important, due to the sheer size of datasets, to have the option to perform the analysis directly where the data are stored, rather than on remote computational clusters. A possible strategy is the use of virtual clusters, which guarantees a high degree of isolation from the underlying physical computational structure and a very compact initial description. Deploying, saving, and restoring HPC-dedicated virtual clusters introduces, however, a different class of requirements on the virtual machine management infrastructure, in particular regarding storage I/O, whose scalability limits are easily reached. Here we discuss an alternative approach based on a storage model that leverages the WORM (write once, read many) character of the data used by VM management to increase, in a scalable way, the aggregate data bandwidth available to virtual-cluster-level operations, and we provide preliminary results indicating that it is a viable solution.
ISBN (print): 9783030105495; 9783030105488
Current courses in parallel and distributed computing (PDC) often focus on programming models and techniques. However, PDC is embedded in a scientific workflow that involves more than programming skills. The workflow spans from mathematical modeling to programming, data interpretation, and performance analysis. The last task in particular is insufficiently covered in educational courses. Often scientists from different fields of knowledge, each with individual expertise, collaborate to perform these tasks. In this work, the general design and the implementation of an exercise within the course "Supercomputers and their programming" at Technische Universität Dresden, Faculty of Computer Science, is presented. In the exercise, the students pass through a complete workflow for scientific computing. The students gain or improve their knowledge about: (i) mathematical modeling of systems, (ii) transferring the mathematical model to a (parallel) program, (iii) visualization and interpretation of the experiment results, and (iv) performance analysis and improvements. The exercise aims precisely at bridging the gap between the individual tasks of a scientific workflow and at equipping students with broad knowledge.
ISBN (print): 3540669035
This paper presents the design and implementation methodology of the JCQ system, a Java-based Continual Query system for update monitoring over Web information sources. A continual query is a standing query that monitors updates of interest using distributed triggers and notifies users whenever the updates reach specified thresholds. In this paper we focus on the strategies and techniques developed in JCQ for scalable and efficient trigger firing and the execution model for flexible and robust change notification. We evaluate our approach through a performance study of the most recent release of the JCQ system and a comparison with related work.
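The notion of a standing query that notifies users once updates reach a threshold can be sketched as follows. This is a toy, single-process illustration in Python and not the JCQ system's API (JCQ itself is Java-based and uses distributed triggers); the class name `ContinualQuery`, its fields, and the absolute-change trigger condition are assumptions made for this example.

```python
class ContinualQuery:
    """Toy standing query: notify when a monitored value has changed by at
    least `threshold` since the last notification (hypothetical design)."""

    def __init__(self, name, threshold, notify):
        self.name = name
        self.threshold = threshold
        self.notify = notify          # callback(name, old_value, new_value)
        self.last_reported = None     # value at the last notification

    def on_update(self, value):
        """Feed one observed update; return True if the trigger fired."""
        if self.last_reported is None:
            # The first observation only establishes the baseline.
            self.last_reported = value
            return False
        if abs(value - self.last_reported) >= self.threshold:
            self.notify(self.name, self.last_reported, value)
            self.last_reported = value
            return True
        return False
```

A short usage example, in which small fluctuations are ignored and only accumulated changes of at least 5 are reported:

```python
events = []
query = ContinualQuery("price", 5, lambda *args: events.append(args))
for observed in [100, 102, 104, 106, 107, 120]:
    query.on_update(observed)
```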
ISBN (print): 9783642281440
The proceedings contain 77 papers. The topics discussed include: on aggressive early deflation in parallel variants of the QR algorithm; a model for efficient onboard actualization of an instrumental cyclogram for the Mars MetNet mission on a public cloud infrastructure; distributed Java programs initial mapping based on extremal optimization; global asynchronous parallel program control for multicore processors; streaming model computation of the FDTD problem; numerical investigation of the cumulant expansion for Fourier path integrals; simulated annealing with coarse graining and distributed computing; high performance computing techniques for scaling image analysis workflows; parallel computation of bivariate polynomial resultants on graphics processing units; an interval version of the Crank-Nicolson method - the first approach; and an interval finite difference method of Crank-Nicolson type for solving the one-dimensional heat conduction equation with mixed boundary conditions.
The advent of the internet and its impact on the healthcare domain made it possible to store, access and update medical records anywhere and anytime. The term 'Electronic Health Record (EHR)' refers to a digital format...
ISBN (print): 9783642281501
The proceedings contain 77 papers. The topics discussed include: on aggressive early deflation in parallel variants of the QR algorithm; a model for efficient onboard actualization of an instrumental cyclogram for the Mars MetNet mission on a public cloud infrastructure; distributed Java programs initial mapping based on extremal optimization; global asynchronous parallel program control for multicore processors; streaming model computation of the FDTD problem; numerical investigation of the cumulant expansion for Fourier path integrals; simulated annealing with coarse graining and distributed computing; high performance computing techniques for scaling image analysis workflows; parallel computation of bivariate polynomial resultants on graphics processing units; an interval version of the Crank-Nicolson method - the first approach; and an interval finite difference method of Crank-Nicolson type for solving the one-dimensional heat conduction equation with mixed boundary conditions.