ISBN: 0769501915 (print)
The paper introduces the concept of collective breakpoints and classifies the possible parallel breakpoints by comparing their mechanisms. Based on collective breakpoints, the macrostep-by-macrostep execution mode is defined. After introducing the concepts of the execution tree and meta-breakpoints, the systematic debugging of message-passing parallel programs is explained. The main features and distributed structure of DIWIDE, a macrostep debugger, are described. The integration of DIWIDE into the GRADE and WINPAR parallel programming environments is outlined. An algorithm is presented for automatically generating the collective breakpoints in the GRADE environment.
GRIDs are large-scale distributed computing infrastructures that enable the integrated and collaborative use of high-end computers, networks, databases, and scientific instruments owned and managed by multiple organiz...
ISBN: 0769512577; 0769512585 (print)
One of the challenges brought by large-scale scientific applications is how to avoid remote storage access by collectively using enough local storage resources to hold the huge amount of data generated by a simulation while providing high-performance I/O. DPFS, a Distributed Parallel File System, is designed and implemented to address this problem. DPFS collects locally distributed unused storage resources as a supplement to the internal storage of parallel computing systems to satisfy the storage capacity requirements of large-scale applications. In addition, like parallel file systems, DPFS provides striping mechanisms that divide a file into small pieces and distribute them across multiple storage devices for parallel data access. The unique feature of DPFS is that it provides three file levels, each corresponding to a file striping method. In addition to the traditional linear striping method, DPFS also provides a novel multidimensional striping method that can solve the performance problems of linear striping for many popular access patterns. Other issues such as load balancing and the user interface are also addressed in DPFS.
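The difference between linear and multidimensional striping can be illustrated with a minimal sketch. The abstract does not give DPFS's actual layout rules, so everything below is an assumption made for illustration: a row-major 8x8 array, a linear stripe unit of 8 elements, 2x4 tiles for the multidimensional case, round-robin placement, and all function names. The sketch counts how many stripe units a column-oriented read has to touch under each scheme.

```python
from math import ceil

def linear_unit(r, c, n_cols, unit):
    """Stripe unit holding element (r, c) when the row-major byte stream
    is cut into fixed-size units (linear striping)."""
    return (r * n_cols + c) // unit

def tiled_unit(r, c, n_cols, tile_rows, tile_cols):
    """Stripe unit holding element (r, c) when the array is cut into
    rectangular tiles (a hypothetical multidimensional striping)."""
    tiles_per_row = ceil(n_cols / tile_cols)
    return (r // tile_rows) * tiles_per_row + (c // tile_cols)

def device_of(unit_index, n_devices):
    """Round-robin placement of stripe units on storage devices."""
    return unit_index % n_devices

def units_touched(rows, cols, unit_fn):
    """Set of stripe units needed to read the sub-array rows x cols."""
    return {unit_fn(r, c) for r in rows for c in cols}

N_COLS = 8
rows, cols = range(8), range(4)  # read the left half of every row

linear = units_touched(rows, cols, lambda r, c: linear_unit(r, c, N_COLS, 8))
tiled = units_touched(rows, cols, lambda r, c: tiled_unit(r, c, N_COLS, 2, 4))

print("linear striping touches", len(linear), "units")
print("tiled striping touches", len(tiled), "units")
```

With these assumed sizes every stripe unit holds 8 elements, so the linear layout fetches twice as much data as the request needs, while the tiled layout fetches only the requested elements.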
ISBN: 9781509036820 (print)
The method of the efficient score statistic is used extensively to conduct inference for high-throughput genomic data due to its computational efficiency and ability to accommodate simple and complex phenotypes. Inference based on these statistics can readily incorporate a priori knowledge from a vast collection of bioinformatics databases to further refine the analyses. The sampling distribution of the efficient score statistic is typically approximated using asymptotics. As this may be inappropriate in the context of small study size, or uncommon or rare variants, resampling methods are often used to approximate the exact sampling distribution. We propose SparkScore, a set of distributed computational algorithms implemented in Apache Spark, to leverage the embarrassingly parallel nature of genomic resampling inference on the basis of the efficient score statistics. We illustrate the application of this computational approach to the analysis of data from genome-wide association studies (GWAS). This computational approach also harnesses the fault-tolerant features of Spark and can be readily extended to the analysis of DNA and RNA sequencing data, including expression quantitative trait loci (eQTL) and phenotype association studies.
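The resampling idea can be sketched in plain Python. This is not SparkScore's code: it assumes a quantitative phenotype, a linear model without covariates (for which the unnormalised score statistic of a variant is the sum of genotype times centred phenotype), and a simple permutation scheme. The function names and the toy data are hypothetical.

```python
import random

def score_statistic(genotypes, phenotypes):
    """Unnormalised score statistic U = sum_i g_i * (y_i - mean(y))."""
    mean_y = sum(phenotypes) / len(phenotypes)
    return sum(g * (y - mean_y) for g, y in zip(genotypes, phenotypes))

def permutation_p_value(genotypes, phenotypes, n_perm=999, seed=0):
    """Two-sided permutation p-value for one variant.

    Phenotypes are shuffled to break any genotype/phenotype link; the
    p-value is the share of permuted |U| at least as large as the
    observed |U|, with the usual +1 correction.
    """
    rng = random.Random(seed)
    observed = abs(score_statistic(genotypes, phenotypes))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(score_statistic(genotypes, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy data: the phenotype is perfectly separated by genotype.
genotypes = [0] * 10 + [2] * 10
phenotypes = [0.0] * 10 + [1.0] * 10
p_value = permutation_p_value(genotypes, phenotypes)
print("permutation p-value:", p_value)
```

Each call depends only on one variant's genotypes and the shared phenotype vector, which is what makes the workload embarrassingly parallel: the per-variant function could be mapped over partitions of variants on a cluster.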
ISBN: 9781424449682 (print)
A parallel factorization of a sparse multilevel representation of the EFIE impedance matrix is discussed. A nonblocking asynchronous communication protocol is used to transfer data between processors. Numerical examples demonstrate the performance and accuracy of the factorization on a distributed memory parallel cluster.
ISBN: 9780769546759 (print)
Distributed software transactional memory (D-STM) is an emerging, alternative concurrency control model for distributed systems that promises to alleviate the difficulties of lock-based distributed synchronization, e.g., distributed deadlocks, livelocks, and lock convoying. We consider Herlihy and Sun's dataflow D-STM model, where objects are migrated to invoking transactions, and the closed nesting model of managing inner (distributed) transactions. We present a transactional scheduler, called the reactive transactional scheduler (RTS), to boost the throughput of closed-nested transactions. RTS determines whether a conflicting parent transaction must be aborted or enqueued according to the level of contention. If a transaction is enqueued, its nested inner transactions do not have to retrieve objects again, resulting in reduced communication delays. Our implementation of RTS in the HyFlow D-STM framework and experimental evaluations reveal that RTS improves throughput over D-STM without RTS by as much as 88%.
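The abort-or-enqueue decision can be sketched as follows. The abstract does not say how RTS measures contention or in which direction the decision goes, so this sketch makes two assumptions: contention on an object is approximated by the number of transactions already waiting for it, and a parent transaction is enqueued while that number is below a threshold and aborted otherwise. The class, its fields, and the threshold are hypothetical and are not part of HyFlow.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class ObjectScheduler:
    """Per-object scheduler deciding the fate of conflicting parents."""
    threshold: int                      # assumed contention limit
    waiting: deque = field(default_factory=deque)

    def on_conflict(self, tx_id):
        """Return 'enqueue' or 'abort' for a conflicting parent transaction."""
        if len(self.waiting) < self.threshold:
            # Low contention: keep the transaction (and the objects its
            # inner transactions already hold) and wait for the object.
            self.waiting.append(tx_id)
            return "enqueue"
        # High contention: waiting would take too long, so abort.
        return "abort"

    def on_release(self):
        """Hand the object to the longest-waiting transaction, if any."""
        return self.waiting.popleft() if self.waiting else None

scheduler = ObjectScheduler(threshold=2)
decisions = [scheduler.on_conflict(tx) for tx in ("t1", "t2", "t3")]
print(decisions)
```

The benefit described in the abstract comes from the enqueue branch: an enqueued parent keeps its state, so its nested transactions need not fetch their objects again when the parent resumes.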
A large-scale simulation of polymer chains in a good solvent is performed. Implementation techniques for efficient parallel execution, optimization, and load balancing are discussed for this practical application. Finally, a simple performance model is proposed.
ISBN: 9781538667057 (print)
In this paper, the idea of operating parallel inverters to mimic the dynamic stability of a synchronous generator (SG) is investigated for the case where the inertia and damping constants differ. These inverters are virtual synchronous machines (VSMs) because they replicate the inertial dynamics inherent to SGs. Instead of using a conventional phase-locked loop (PLL) to synchronize distributed generation (DG) to the grid frequency, the swing equation inherent to SG dynamics is implemented. Parallel VSM-controlled inverters can behave differently depending on the constants of their individual swing equations. This can cause the phase angles to differ beyond IEEE synchronization limits. The proposed algorithm is implemented to correct the phase angle and return parallel VSMs to acceptable operating ranges. Simulations are performed in the PSCAD-EMTDC simulation environment.
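How different swing-equation constants lead to different phase angles can be shown with a small numerical sketch. It does not reproduce the paper's controller or its correction algorithm. It assumes the textbook swing equation J·dω/dt = P_ref − P_e − D·ω with P_e = P_max·sin(δ) for a single machine against a stiff grid, uses normalised quantities, and integrates with a semi-implicit Euler step. All parameter values and names are hypothetical.

```python
import math

def simulate_vsm(inertia, damping, t_end,
                 p_ref=0.5, p_max=1.0, dt=1e-3):
    """Integrate the swing equation and return the angle at t_end.

    delta: phase angle relative to the grid (rad)
    dw:    frequency deviation from the grid frequency (rad/s)
    """
    delta, dw = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        p_e = p_max * math.sin(delta)            # electrical power output
        accel = (p_ref - p_e - damping * dw) / inertia
        dw += accel * dt                         # update speed first,
        delta += dw * dt                         # then angle (semi-implicit)
    return delta

# Two parallel VSMs with different inertia and damping constants.
angle_a = simulate_vsm(inertia=0.05, damping=0.5, t_end=0.5)
angle_b = simulate_vsm(inertia=0.5, damping=0.2, t_end=0.5)
print("angle difference during the transient (rad):", angle_a - angle_b)

# Both settle at the same operating point, delta = asin(p_ref / p_max).
settled_a = simulate_vsm(inertia=0.05, damping=0.5, t_end=10.0)
print("settled angle (rad):", settled_a)
```

In this toy setup the two machines share the same steady-state angle, but their transients differ, and the gap between the angles during that transient is the kind of deviation a correction scheme would have to bound.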
ISBN: 9781538637906 (print)
Model checking is a powerful technique for verifying and analyzing complex systems in many application fields. The analysis of complex and concurrent systems often requires large computation resources, which represents a real challenge. Even with simple configurations, the well-known state explosion problem arises, as the generated state space of such systems grows exponentially with the number of system components. Numerous methods and techniques have been developed to overcome this problem, including parallel and distributed-memory processing. In this paper, we aim to improve the performance of the so-called Symbolic Observation Graph (SOG) construction by using parallelization techniques. A SOG is a hybrid structure in which the transitions of a system are divided into observed and unobserved ones. The nodes of this graph are defined as sets of states linked by unobserved transitions (and encoded symbolically with a BDD), while edges are labeled with observed transitions only (and are explicitly represented). We propose two parallel algorithms to build the SOG. The first algorithm is dedicated to shared-memory architectures and is based on distributing the SOG construction over several threads using a dynamic load balancing scheme. The second algorithm is proposed for distributed-memory architectures and distributes the SOG construction over processes using a static load balancing scheme. These two algorithms are implemented, and their performance is studied and compared to each other and to the sequential construction of the SOG.
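The SOG definition above can be made concrete with a small sequential sketch. It is an illustration only, not the paper's algorithm: plain Python sets stand in for the BDD encoding of nodes, the transition system is a hypothetical four-state example, and no parallelism is shown. Each node is the closure of a state set under unobserved transitions, and each edge carries one observed label.

```python
from collections import deque

def closure(states, transitions, observed):
    """All states reachable from `states` via unobserved transitions."""
    seen = set(states)
    stack = list(states)
    while stack:
        s = stack.pop()
        for label, target in transitions.get(s, ()):
            if label not in observed and target not in seen:
                seen.add(target)
                stack.append(target)
    return frozenset(seen)

def build_sog(initial, transitions, observed):
    """Return (root, nodes, edges) of the observation graph."""
    root = closure({initial}, transitions, observed)
    nodes, edges = {root}, set()
    work = deque([root])                 # nodes still to be expanded
    while work:
        node = work.popleft()
        by_label = {}                    # observed label -> target states
        for s in node:
            for label, target in transitions.get(s, ()):
                if label in observed:
                    by_label.setdefault(label, set()).add(target)
        for label, targets in by_label.items():
            succ = closure(targets, transitions, observed)
            edges.add((node, label, succ))
            if succ not in nodes:
                nodes.add(succ)
                work.append(succ)
    return root, nodes, edges

# Hypothetical system: "tau" is unobserved, "a" and "b" are observed.
transitions = {
    0: [("tau", 1)],
    1: [("a", 2)],
    2: [("tau", 0), ("b", 3)],
    3: [],
}
root, nodes, edges = build_sog(0, transitions, observed={"a", "b"})
print(len(nodes), "nodes,", len(edges), "edges")
```

The work queue of unexpanded nodes is the natural unit of distribution: a parallel variant could hand its entries to threads or processes, which is where a load balancing scheme would come into play.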
ISBN: 0769517609 (print)
Normality, a consistency criterion stronger than sequentiality and equivalent to linearizability in the case of unary operations, has the main advantage that it avoids the use of a "global real-time ordering". This work presents the first algorithm that implements normality without using strong communication primitives (i.e., atomic broadcast or global clock synchronization). Moreover, our implementation allows dynamic changes of the system configuration, handles replication, and addresses the general case of multi-object operations. Despite the use of terms such as client and server, our algorithm is entirely based on a peer-to-peer approach.