In this paper, we consider the problem of scheduling comparisons of motifs against biological databanks. We experimentally show that this problem lies in the divisible load framework with negligible communication cost...
详细信息
ISBN:
(纸本)0769523129
In this paper, we consider the problem of scheduling comparisons of motifs against biological databanks. We experimentally show that this problem lies in the divisible load framework with negligible communication costs. In this framework, we propose a polynomial-time algorithm to optimally solve the maximum weighted flow off-line scheduling problem on unrelated machines. We also show how to optimally solve the maximum weighted flow off-line scheduling problem with preemption on unrelated machines.
For a Peer-to-Peer (P2P) system which contains huge amount of data, an efficient content search technique is definitely necessary. In this paper, we present a novel category overlay search infrastructure which can sit...
详细信息
The Chandy-Lamport checkpointing algorithm is widely used in fault tolerant implementations of MPI. However, it assumes the FIFO property of message passing, which is not guaranteed by the MPI standard at the applicat...
详细信息
ISBN:
(纸本)0769523129
The Chandy-Lamport checkpointing algorithm is widely used in fault tolerant implementations of MPI. However, it assumes the FIFO property of message passing, which is not guaranteed by the MPI standard at the application level. Therefore, this algorithm cannot serve as a basis for an implementation-independent fault tolerant MPI. In this paper, we present a variant of the Chandy-Lamport algorithm that does not rely on the FIFO property. This algorithm can be implemented on top of MPI and, hence, used for development of a supplement software component enabling the fault tolerance of any MPI implementation compliant with the MPI standard. We prove the correctness of the algorithm and analyze its performance. Experimental results demonstrating the efficiency of the algorithm are also presented.
Dynamic Reconfigurable Systems (DRS) offer a very interesting alternative for embedded digital systems design. Tasks scheduling within a reconfigurable environment allows the development of systems with better executi...
详细信息
ISBN:
(纸本)0769523129
Dynamic Reconfigurable Systems (DRS) offer a very interesting alternative for embedded digital systems design. Tasks scheduling within a reconfigurable environment allows the development of systems with better execution performance, chip area economy and lower power consumption. This paper describes an algorithm for design of dynamically reconfigurable systems where tasks scheduling have as prime objective the overall application performance speedup. The methodology includes the generation of an embedded controller supporting the scheduling process in a target architecture.
作者:
Melab, N.Mezmaz, M.Talbi, E.-G.LIFL
CNRS UMR 8022 Université des Sciences et Technologies de Lille 59655 - Villeneuve d'Ascq Cedex France
Solving large size and time-intensive combinatorial optimization problems with parallel hybrid multi-objective evolutionary algorithms (MO-EAs) requires a large amount of computational resources. Peer-to-Peer (P2P) co...
详细信息
ISBN:
(纸本)0769523129
Solving large size and time-intensive combinatorial optimization problems with parallel hybrid multi-objective evolutionary algorithms (MO-EAs) requires a large amount of computational resources. Peer-to-Peer (P2P) computing is recently revealed as a powerful way to harness these resources and efficiently deal with such problems. In this paper, we focus on the parallel hybrid multi-objective island model for P2P systems. We address its design, implementation, and fault-tolerant deployment in a P2P context. The proposed model have been experimented on the Bi-criterion Permutation Flow-Shop Problem (BPFSP) on a network of 120 heterogeneous PCs. The preliminary results demonstrate the effectiveness of this model and its capabilities to fully exploit the hybridization.
This paper presents a set of distributed-shared-memory protocols that provide fault tolerance on broadcast-based and switch-based architectures with no decrease in performance. These augmented DSM protocols combine th...
详细信息
ISBN:
(纸本)0769523129
This paper presents a set of distributed-shared-memory protocols that provide fault tolerance on broadcast-based and switch-based architectures with no decrease in performance. These augmented DSM protocols combine the data duplication required by fault tolerance with the data duplication that naturally results in distributed-shared-memory implementations. The recovery memory at each backup node is continuously maintained consistent and is accessible by all processes executing at the backup node. Simulation results show that the additional data duplication necessary to create fault-tolerant DSM causes no reduction in system performance during normal operation and eliminates most of the overhead at checkpoint creation. Data blocks which are duplicated to maintain the recovery memory are also utilized by the DSM protocol, reducing network traffic, and increasing the processor utilization significantly. We use simulation and multiprocessor address trace files to compare the performance of a broadcast architecture called the SOME-Bus to the performance of two representative switch architectures.
This paper studies distributed scheduling of parallel I/O data transfers on systems that provide data replication. In our previous work, we proposed a centralized algorithm for solving this problem in systems where da...
详细信息
ISBN:
(纸本)0769523129
This paper studies distributed scheduling of parallel I/O data transfers on systems that provide data replication. In our previous work, we proposed a centralized algorithm for solving this problem in systems where data transfer information is centrally available. This algorithm finds the optimal scheduling by constructing augmenting paths in the data transfer bipartite graph, requiring O(nmlog n + n2log3/2 n) time, with n nodes and m edges in the bipartite graph. In this paper, we investigate this scheduling problem in distributed systems where data transfer information may not be centrally available. We propose a distributed scheduling algorithm, Highest Degree Lowest Workload First (HDLWF), which approximates the augmenting path algorithm in distributed environments. HDLWF is based on a distributed, two-step scheme that determines appropriate execution order of data requests through a small number of rounds of bidding between clients and I/O servers. Our experimental results indicate that HDLWF yields schedules close to the centralized optimal solution, and in some cases within 3% of the optimal solution.
Grayscale electron-beam lithography is one of the techniques used in transferring circuit patterns with multi-level structures onto substrates. The proximity effect caused by electron scattering imposes a severe limit...
详细信息
ISBN:
(纸本)0769523129
Grayscale electron-beam lithography is one of the techniques used in transferring circuit patterns with multi-level structures onto substrates. The proximity effect caused by electron scattering imposes a severe limitation on the ultimate spatial resolution attainable by E-beam lithography. Therefore, proximity effect correction is essential particularly for fine-feature, high-density circuit patterns. One difficulty is that it is extremely time-consuming due to the intensive computation required in the correction procedure and a large size of circuit data to be processed. Hence, it is an ideal candidate for distributed computing where the otherwise-unused CPU cycles of a number of computers on a network (cluster) can be efficiently utilized. In this paper, optimization of the distributed grayscale proximity effect correction procedure previously developed is addressed for minimizing the correction time. Also, a different circuit partitioning approach (localized partitioning) is considered.
The goal of this paper is to introduce a new approach to the building of efficient distributed linear system solvers. The starting point of the results of this paper lies in the fact that the parallelization of direct...
详细信息
ISBN:
(纸本)0769523129
The goal of this paper is to introduce a new approach to the building of efficient distributed linear system solvers. The starting point of the results of this paper lies in the fact that the parallelization of direct algorithms requires frequent synchronizations in order to obtain the solution for a linear problem. In a grid computing environment, communication times are significant and the bandwidth is variable, therefore frequent synchronizations slow down performances. Thus it is desirable to reduce the number of synchronizations in a parallel direct algorithm. Inspired from multisplitting techniques, the method we present consists in solving several linear problems obtained by splitting the original one. Each linear system is solved independently on a cluster by using the direct method. This paper uses the theoretical results of [6] in order to build coarse grained algorithms designed for solving linear systems in the grid computing context.
GRAPE (Graph processing Environment) is an industrial distributed computer vision system currently in use in Orbotech's Automated Optical Inspection (AOI) machines. These machines are designed for the automatic de...
详细信息
ISBN:
(纸本)0769523129
GRAPE (Graph processing Environment) is an industrial distributed computer vision system currently in use in Orbotech's Automated Optical Inspection (AOI) machines. These machines are designed for the automatic detection of defects in Flat Panel Displays (FPD), Printed Circuit Boards (PCB) and Ball Grid Arrays (BGA). The GRAPE system is designed to be easy to use for algorithm and systems engineers with little or no special training in parallel or distributed systems. Algorithms are written in standard C++ and joined together in a visual dataflow graph. The user then partitions the graph into "contexts" which are used by the system to automatically parallelize the computation. The underlying execution model of GRAPE is based on a large-grained dynamic data-flow paradigm. In contrast to traditional dataflow engines GRAPE algorithms can hold "state" over multiple executions while also making use of data parallelism. This is useful for computer vision applications, which typically need to assemble and process data collected over many execution cycles. In this paper we present an overview of the GRAPE system with its context oriented parallelism and synchronization.
暂无评论