Dependability analysis of a large network is NP-hard due to the state space explosion. A hierarchical Boolean algebraic method was recently introduced to efficiently evaluate static terminal reliability and task-based reliability by dividing the problem into smaller, more manageable pieces. We extend this method to evaluate time-dependent reliability and availability, collectively referred to as "dependability", and to approximate MTTF.
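To make the state space explosion concrete, the following is a minimal sketch of the naive baseline, not the hierarchical Boolean algebraic method the abstract describes. It computes static two-terminal reliability by enumerating every up/down combination of links, which costs 2^n evaluations for n links. The function names, the undirected-graph model, and the assumption that links fail independently with known probabilities are all illustrative choices, not taken from the paper.

```python
from itertools import product


def connected(nodes, up_edges, source, target):
    """Return True if target is reachable from source using only working links."""
    adjacency = {n: [] for n in nodes}
    for u, v in up_edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = {source}, [source]
    while stack:
        node = stack.pop()
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return target in seen


def terminal_reliability(nodes, edges, source, target):
    """Brute-force two-terminal reliability.

    edges maps an undirected link (u, v) to the probability that it is up.
    Links are assumed to fail independently (an assumption of this sketch).
    """
    links = list(edges)
    total = 0.0
    # Enumerate all 2**len(links) network states: this is the explosion.
    for states in product([True, False], repeat=len(links)):
        probability = 1.0
        for link, is_up in zip(links, states):
            probability *= edges[link] if is_up else 1.0 - edges[link]
        working = [link for link, is_up in zip(links, states) if is_up]
        if connected(nodes, working, source, target):
            total += probability
    return total


# Hypothetical three-node example: a direct link plus a two-hop detour.
triangle = {("s", "t"): 0.9, ("s", "a"): 0.9, ("a", "t"): 0.9}
reliability = terminal_reliability(["s", "a", "t"], triangle, "s", "t")
```

Each added link doubles the number of states, which is why dividing the network into smaller pieces is attractive.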
Data partitioning and mapping is one of the most important steps in writing a parallel program, especially a data parallel one. Recently, Fortran D and, subsequently, High Performance Fortran (HPF) have been proposed to allow users to specify data distributions and alignments for the arrays in their programs. The paper presents the design of the data partitioning module of a Fortran 90D compiler that processes the alignment and distribution directives.
Doacross loops are generally used to exploit the parallelism in loops with cross-iteration dependences. On shared memory machines, Doacross execution usually achieves useful speedup. This is not the case with distribu...
The paper presents a fault-tolerant manager for distributed applications. This manager provides efficient recovery from host failures on networks of workstations. Independent checkpointing is used to automatically recover application processes affected by host failures. Domino effects are avoided by means of message logging and file version management. STAR provides efficient software failure detection by structuring hosts in a logical ring. Performance measurements in a real environment show the benefits and the limits of our system.
This paper examines the parallel processing of exclusion join in a shared-nothing multiprocessor environment. First, a parallel hash-based exclusion join algorithm is presented. Unlike the case of equijoin, this algorithm does not work correctly in the presence of nulls in the join attributes. One solution is to restrict the hash-on attributes to non-nullable fields. However, this can lead to the well-known data skew problem. If the number of tuples containing null values in their join attributes is small, an alternative is to replicate those tuples to all processors. Otherwise, we can consider a range partitioning algorithm where those tuples are only sent to a small subset of the processors. The hash-based algorithm usually outperforms the range partitioning algorithm except when the number of tuples containing null values in their join attributes is large or when the data is highly skewed.
In PRAM emulations, universal hashing is a well-known method for distributing the address space among memory modules. However, if the memory access patterns of an application often result in high module congestion, it is necessary to rehash by choosing another hash function and redistributing data on the fly. For the case of linear hash functions h(x) = ax mod m, we present an algorithm to rehash an address space of size m on a p processor PRAM emulation in time O(m/p + log p). The algorithm requires O(log m) words of local storage per processor.
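As an illustration of what rehashing under a linear hash function involves, here is a minimal sequential sketch. It is not the parallel O(m/p + log p) algorithm from the paper: it simply moves every cell from its slot under the old multiplier to its slot under the new one. The function names are hypothetical, and the sketch assumes gcd(a, m) = 1 so that h(x) = ax mod m is a bijection on the address space.

```python
from math import gcd


def linear_hash(a, m):
    """Return h(x) = a*x mod m; requires gcd(a, m) == 1 so h is a bijection."""
    if gcd(a, m) != 1:
        raise ValueError("multiplier must be coprime with m")
    return lambda x: (a * x) % m


def rehash(memory, a_old, a_new, m):
    """Move each cell from slot h_old(x) to slot h_new(x).

    memory[h_old(x)] is assumed to hold the value of logical address x.
    This loop is sequential; a parallel version would split the slots
    among p processors.
    """
    h_new = linear_hash(a_new, m)
    linear_hash(a_old, m)              # validates the old multiplier
    a_old_inverse = pow(a_old, -1, m)  # modular inverse (Python 3.8+)
    new_memory = [None] * m
    for slot, value in enumerate(memory):
        logical_address = (a_old_inverse * slot) % m
        new_memory[h_new(logical_address)] = value
    return new_memory


# Hypothetical example: 16 addresses, multiplier changed from 3 to 5.
m = 16
old_layout = [None] * m
for x in range(m):
    old_layout[(3 * x) % m] = f"value-{x}"
new_layout = rehash(old_layout, a_old=3, a_new=5, m=m)
```

Because both hash functions are bijections, every slot is written exactly once and no value is lost during redistribution.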
Many distributed applications require knowledge of the causality relation induced by the computation. Reconstructing this relation is a useful tool for such applications, but a vector of size S (where S is the number of processes) must be attached to each event to achieve this reconstruction. This induces a large overhead in secondary memory. After defining special events of the computation (a kind of checkpoint), we propose two algorithms that discard data unnecessary for reconstructing the causal relationship. The first algorithm acts on the fly, while the second acts during reconstruction.
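The per-event vector mentioned above can be illustrated with a short sketch. It assumes the vector is a standard vector timestamp with one counter per process, which the abstract does not state explicitly, and it shows only how causality is read off two stored vectors. It does not implement either of the paper's data-discarding algorithms, and the function names are hypothetical.

```python
def happened_before(u, v):
    """True if the event stamped u causally precedes the event stamped v.

    u and v are equal-length sequences with one counter per process.
    u precedes v when u <= v componentwise and u differs from v.
    """
    if len(u) != len(v):
        raise ValueError("timestamps must cover the same set of processes")
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def concurrent(u, v):
    """True if neither event causally precedes the other."""
    return tuple(u) != tuple(v) and not happened_before(u, v) and not happened_before(v, u)


# Hypothetical timestamps for S = 3 processes.
send_event = (1, 0, 0)      # first event on process 0
receive_event = (1, 1, 0)   # process 1 after receiving from process 0
local_event = (0, 0, 1)     # unrelated event on process 2

ordered = happened_before(send_event, receive_event)
unrelated = concurrent(send_event, local_event)
```

Storing S counters for every logged event is the overhead the abstract refers to; it grows with both the number of processes and the number of events.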
This paper presents experimental evidence that the ability to exploit a small degree of control parallelism can provide a significant improvement in performance for SIMD machines. This evidence has been obtained through analysis of SIMD instruction traces gathered from a MasPar MP-1 system. Potential speedups of up to 2.5 times conventional SIMD were found. This paper also discusses details of such an architecture, called superscalar SIMD, and of the trace analysis process. We also consider some synchronization and communication issues that may arise.
The Parallel State Processor (PSP) is intended as a design concept for the basic engine in large-scale multiprocessors. Rather than switching between threads, PSP maintains a basic processor state which is itself a parallel conjunction of fetch and decode along multiple threads, as well as the synchronization which occurs when operands are passed between them. This view gives rise to the possibility of an intelligent parallel state, i.e., one which dynamically maximizes the lifetime of the threads it comprises by having them provide one another with the arguments they require over time. We present experimental data verifying that such behavior can be observed in actual code.
Multithreading is often seen as a solution to the problem of large memory latencies that occur when remote data is needed for local computation. This paper quantifies the costs and benefits of software multithreading on a distributed memory multiprocessor. We describe the design of a machine-independent software multithreading system as part of a runtime system for a high-level parallel programming language, and present a quantitative analysis of the costs of our multithreading system, as well as its performance on the nCUBE/2 multiprocessor. We show that, in the presence of a sufficient number of remote references to cover the initial costs, our multithreading system provides speedup factors of between 1.27 and 1.65.