We present a general framework for approximation schemes on parallel processor scheduling. We propose epsilon-approximation algorithms for scheduling on identical, uniform and unrelated machines when the number of pro...
详细信息
ISBN:
(纸本)0818676833
We present a general framework for approximation schemes on parallel processor scheduling. We propose epsilon-approximation algorithms for scheduling on identical, uniform and unrelated machines when the number of processors is fixed. For each of the three problems considered we perform grouping on job processing times in order to produce a transformed scheduling instance where the number of distinct task types is bounded. We optimally solve the corresponding mixed integer program and we prove that the optimal makespans for the initial and the transformed problems can differ at most by a factor of 1 + epsilon. The complexity of all epsilon-approximation algorithms is O(n), where n is the number of jobs to be scheduled.
We investigate the efficient implementation of algorithms with a two-level parallelism on distributed memory machines. We consider parallel specifications consisting of an upper level of multi-processor tasks each of ...
详细信息
ISBN:
(纸本)0818676833
We investigate the efficient implementation of algorithms with a two-level parallelism on distributed memory machines. We consider parallel specifications consisting of an upper level of multi-processor tasks each of which having an internal structure of uni-processor tasks. To achieve an optimal parallel execution time, the parallel execution of such a program requires an optimal scheduling of the multi-processor tasks and an appropriate treatment of uniprocessor tasks. In particular;we consider an important class of parallel programs that are generated within a specific parallel programming model designing group-SPMD programs for scientific computing. We show how the costs of data redistributions between M-tasks can be taken into consideration and how the special structure of the resulting program can be exploited by using a simple approximation algorithm with a provable good performance.
This paper presents on-line perturbation tracking and intrusion removal techniques which are designed to accommodate delays which occur due to monitoring activities. These accommodations eliminate the effect of monito...
详细信息
ISBN:
(纸本)0818676833
This paper presents on-line perturbation tracking and intrusion removal techniques which are designed to accommodate delays which occur due to monitoring activities. These accommodations eliminate the effect of monitoring intrusion on the execution behavior and the scheduling of the monitored computation. By maintaining an adjusted time view, the intrusion removal system preserves the execution order of processes and the message selection decisions that would have been made in an unmonitored execution.
When data are uniformly distributed, parallel join algorithms scale up well. However, scalability is curtailed by data skew-nonuniform distribution of data between processors. Investigation of this problem has been ha...
详细信息
The author proposes a family of adaptive protocols for managing replicated databases. Adaptive protocols use information about the system configuration to determine how operations are executed. Special system transact...
This paper presents a parallel algorithm running in time O(log m log* m(log log m + log(n/m))) time on an EREW PRAM with O(m/(log m log* m)) processors for the problem of selection in an m x n matric with sorted rows ...
详细信息
ISBN:
(纸本)0818676833
This paper presents a parallel algorithm running in time O(log m log* m(log log m + log(n/m))) time on an EREW PRAM with O(m/(log m log* m)) processors for the problem of selection in an m x n matric with sorted rows and columns, m less than or equal to n. Our algorithm generalizes the result of Sarnath and He [13] for selection in a sorted matrix of equal dimensions, and thus answers the opera question they posted. The algorithm is work-optimal when n greater than or equal to m log m, and near optimal within O(log log m) factor otherwise, We show that our algorithm can be generalized to solve the selection problem on or set of sorted matrices of arbitrary dimensions.
Improvements in the processing speed of multiprocessors are outpacing improvements in the speed of disk hardware. parallel disk I/O subsystems have been proposed as one way to close the gap between processor and disk ...
详细信息
A mixed-mode parallel machine's processing elements (PEs) are capable of operating in and switching between the SIMD and MIMD modes of parallelism. The paper analyzes various mappings of image correlation algorith...
详细信息
SRAM (static random access memory)-based pipelined algorithmic solutions have become competitive alternatives to TCAMs (ternary content addressable memories) for high-throughput IP lookup. Multiple pipelines can be ut...
详细信息
SRAM (static random access memory)-based pipelined algorithmic solutions have become competitive alternatives to TCAMs (ternary content addressable memories) for high-throughput IP lookup. Multiple pipelines can be utilized in parallel to improve the throughput further. However, several challenges must be addressed to make such solutions feasible. First, the memory distribution over different pipelines, as well as across different stages of each pipeline, must be balanced. Second, the traffic among these pipelines should be balanced. third, the intra-flow packet order (i.e. the sequence) must be preserved. In this paper, we propose a parallel SRAM-based multi-pipeline architecture for IP lookup. A two-level mapping scheme is developed to balance the memory requirement among the pipelines as well as across the stages in each pipeline. To balance the traffic, we propose an early caching scheme to exploit the data locality inherent in the architecture. Our technique uses neither a large reorder buffer nor complex reorder logic. Instead, a flow-aware queuing scheme exploiting the flow information is used to maintain the intra-flow sequence. Extensive simulation using real-life traffic traces shows that the proposed architecture with 8 pipelines can achieve a throughput of up to 10 billion packets per second, i.e. 3.2 Tbps for minimum size (40 bytes) packets, while preserving intra-flow packet order. (c) 2009 Elsevier Inc. All rights reserved.
暂无评论