ISBN: 1892512459 (print)
Multicomputer systems connected by SCSI buses have been studied and shown to be a cost-efficient, high-speed architecture. To build highly scalable multicomputers based on this design, we have to take different network topologies into consideration. Large-scale SCSI networks with linear and mesh structures have been proposed and their performance evaluated. In this paper, we study message routing algorithms for circular, cylindrical, torus, and hypercube SCSI networks.
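The abstract does not describe the routing algorithms themselves. As a point of reference only, the sketch below shows textbook dimension-order (e-cube) routing on a hypercube, where node addresses differ from their neighbours in exactly one bit. The function name `hypercube_route` and its parameters are hypothetical, and nothing here models SCSI bus behaviour or the paper's actual method.

```python
def hypercube_route(src: int, dst: int, dims: int) -> list[int]:
    """Return the nodes visited when routing from src to dst.

    Textbook dimension-order routing (an assumption, not the paper's
    algorithm): correct the differing address bits one dimension at a
    time, starting from the lowest dimension.
    """
    size = 1 << dims
    if not (0 <= src < size and 0 <= dst < size):
        raise ValueError("node address outside the hypercube")

    path = [src]
    node = src
    for d in range(dims):
        bit = 1 << d
        if (node ^ dst) & bit:   # addresses differ in dimension d
            node ^= bit          # hop to the neighbour across dimension d
            path.append(node)
    return path


if __name__ == "__main__":
    # Route from node 000 to node 101 in a 3-dimensional hypercube.
    print(hypercube_route(0b000, 0b101, 3))
```

The number of hops equals the number of bit positions in which the source and destination addresses differ.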
ISBN: 1892512459 (print)
We investigate the problem of scheduling dags with parallel tasks on binarily partitionable systems by analyzing $\alpha_n(\mathrm{LL})$, the average-case performance ratio of the level-by-level (LL) scheduling algorithm, where $n$ is the number of tasks. It is shown that, for arbitrary probability distributions of task parallelisms and task execution times, the asymptotic average-case performance ratio of algorithm LL in scheduling parallel computations with wide dags is $\alpha_\infty(\mathrm{LL}) = 1$. In particular, for exponential distributions of task execution times, the average-case performance ratio of LL in scheduling iterative computations, complete trees, partitioning algorithms, and diamond dags is $\alpha_n(\mathrm{LL}) = 1 + O(\log n / n)$, $1 + O((\log n)^2 / n)$, $1 + O((\log n)^2 / n)$, and $1 + O((\log n)^2 / \sqrt{n})$, respectively.
ISBN: 1932415262 (print)
Current systems for managing workload on clusters of workstations, particularly those available for Linux-based (Beowulf) clusters, are typically based on traditional process-based, coarse-grained parallel and distributed programming. The DESPOT project is building a sophisticated thread-level resource-monitoring system for computational, storage, and network resources [2, 3]. In this paper we present an evaluation of several scheduling algorithms within DESPOT, our architecture for low-overhead, fine-grained resource-monitoring tools for per-process network and other resource usage. We present experimental results showing performance with a genetic algorithm, the MOSIX default scheduler, and a range of parameters for the multi-faceted DESPOT algorithm. We also give several examples where our scheduling algorithm outperforms the MOSIX scheduler.
ISBN: 9780769539393 (print)
Transactional Memory (TM) is a novel, promising approach for simplifying parallel programming and increasing its acceptance and diffusion. Until now, almost all research on TM has focused on shared-memory architectures, while very limited effort has been dedicated to TM on distributed-memory architectures. In this paper, we propose an extension of the transactional engine DSTM2, originally designed for hardware shared-memory systems, so that transactional applications can run on the nodes of a computer cluster. The resulting framework provides a software distributed shared memory with transactional consistency, whereby threads running on the nodes of a cluster can access a shared memory with atomicity and isolation. The physical private memory of each node thus contributes to a global address space accessible through programming statements with transactional semantics. The proposed extension is also useful for experimentally evaluating different techniques to be employed in a distributed implementation of TM.
ISBN: 1932415610 (print)
In this paper, we propose a new approach to designing a simple and efficient non-blocking synchronous checkpointing algorithm for distributed systems. In general, such algorithms require all processes to take checkpoints, even though some of them may not be necessary. In the present work, if a process has sent one or more messages since its last checkpoint but none of them has yet been received, the process does not take a checkpoint. This reduces the number of checkpoints to be taken. The approach is particularly advantageous in mobile computing environments, where both non-blocking checkpointing and a reduced number of checkpoints help make efficient use of limited resources.
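The skip rule stated in the abstract can be sketched as a simple predicate. This is a minimal illustration under stated assumptions: the `ProcessState` record, its fields, and the function `may_skip_checkpoint` are hypothetical names, and how a process learns that a message has been received is not described in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessState:
    """Hypothetical bookkeeping kept by a process since its last checkpoint."""
    sent: set = field(default_factory=set)        # ids of messages sent
    delivered: set = field(default_factory=set)   # ids known to have been received


def may_skip_checkpoint(state: ProcessState) -> bool:
    """Apply the rule from the abstract.

    The process skips the checkpoint only if it has sent at least one
    message since its last checkpoint and none of those messages has
    been received yet. Every other case is outside what the abstract
    states, so this sketch conservatively takes the checkpoint.
    """
    has_sent = bool(state.sent)
    any_received = bool(state.sent & state.delivered)
    return has_sent and not any_received


if __name__ == "__main__":
    in_flight = ProcessState(sent={1, 2})
    print(may_skip_checkpoint(in_flight))   # messages sent, none received
```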
ISBN: 1892512416 (print)
We discuss analytic procedures for evaluating the availability of parallel computer systems comprised of P processors with N tasks subject to failures and repairs. In addition, we argue, via analytic and numeric examples, that not incorporating the task-stream into the model is an inadequate approach for evaluating system performance.
ISBN: 1892512416 (print)
In this paper we discuss a matrix-analytic approach for assessing the performability of computing systems. This is realized by constructing an analytic model of parallel systems that execute N tasks on P processors subject to failures and repairs. Failure and repair times are assumed to be exponentially distributed; service times, however, can be non-exponentially distributed. Using this model, we show how the mean job execution time can be computed and provide numerical examples when N P.
ISBN: 1932415262 (print)
The Virtual Stress Testing Machine (VSTM) is a client-server software environment with a distributed memory system. VSTM encompasses a stress testing machine, a server component that initiates and controls all processes, a computational finite element component, and a number of visualization components. The goal of this design is to provide a virtual stress analysis environment in which a specimen undergoes a series of stress applications. The finite element component calculates the displacement (deformation) of the specimen under stress. The visualization clients read the results of the finite element analysis and construct an image of the deformed specimen, color-coded to indicate varying levels of stress. A multitude of applications have been envisioned for this system, including studies in medicine.
ISBN: 1892512459 (print)
Parallel file systems provide high-performance disk access by transparently striping data across multiple disks and I/O nodes. Like peer-to-peer systems (e.g., Freenet, OceanStore, Chord/CFS, PAST), parallel file systems for clusters run on networked computers whose nodes are not guaranteed to be always available, because of node or network failures. Unlike peer-to-peer systems, however, cluster file systems such as PVFS do not handle these failures very well. In this work, we explore how cluster file systems can use certain peer-to-peer techniques that tolerate failing nodes and thus allow for high data availability.
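Striping can be illustrated with a round-robin mapping from a file offset to an I/O node and an offset within that node's local data. This is a generic sketch, not the layout code of PVFS or any other system; the function `locate` and the parameters `stripe_size` and `num_nodes` are hypothetical.

```python
def locate(offset: int, stripe_size: int, num_nodes: int) -> tuple[int, int]:
    """Map a byte offset in a striped file to (node index, local offset).

    Assumes plain round-robin striping: stripe 0 goes to node 0,
    stripe 1 to node 1, and so on, wrapping around after the last node.
    """
    if offset < 0 or stripe_size <= 0 or num_nodes <= 0:
        raise ValueError("invalid striping parameters")

    stripe_index = offset // stripe_size        # which stripe holds the byte
    node = stripe_index % num_nodes             # node storing that stripe
    stripes_before = stripe_index // num_nodes  # earlier stripes on the same node
    local_offset = stripes_before * stripe_size + offset % stripe_size
    return node, local_offset


if __name__ == "__main__":
    # With 64-byte stripes over 4 nodes, byte 266 falls in stripe 4,
    # which wraps around to node 0.
    print(locate(266, 64, 4))
```

Under this mapping the loss of a single node makes every file with a stripe on that node partly unreadable, which is the availability problem the abstract addresses.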
ISBN: 1892512459 (print)
The sum-product computation and similar vector operations are known to be fundamental calculations in many fields related to signal processing, so computing them quickly is important. One way to achieve high performance for vector operations is parallel computing on a multiprocessor. In our laboratory, we have developed the Loop Structured Computer (LSC), a data-flow multiprocessor system. The data-flow model provides opportunities for parallel and pipelined execution at the level of individual instructions, since execution is constrained only by the data dependences among the instructions. In practice, however, the parallelism is limited by the architecture of the target machine. In this work, we measure the computation performance of the vector operation by simulation while varying several factors arising from the architecture of the LSC. As a result, we find an efficient combination of factors for implementing the vector operation on the LSC.