Multiprocessor scheduling in a shared multiprogramming environment can be structured as two-level scheduling, where a kernel-level job scheduler allots processors to jobs and a user-level thread scheduler schedules th...
详细信息
Multiprocessor scheduling in a shared multiprogramming environment can be structured as two-level scheduling, where a kernel-level job scheduler allots processors to jobs and a user-level thread scheduler schedules the work of a job on its allotted processors. We present a randomized work-stealing thread scheduler for fork-join multithreaded jobs that provides continual parallelism feedback to the job scheduler in the form of requests for processors. Our A-STEAL algorithm is appropriate for large parallel servers where many jobs share a common multiprocessor resource and in which the number of processors available to a particular job may vary during the job's execution. Assuming that the job scheduler never allots a job more processors than requested by the job's thread scheduler, A-STEAL guarantees that the job completes in near-optimal time while utilizing at least a constant fraction of the allotted processors. We model the job scheduler as the thread scheduler's adversary, challenging the thread scheduler to be robust to the operating environment as well as to the job scheduler's administrative policies. For example, the job scheduler might make a large number of processors available exactly when the job has little use for them. To analyze the performance of our adaptive thread scheduler under this stringent adversarial assumption, we introduce a new technique called trim analysis, which allows us to prove that our thread scheduler performs poorly on no more than a small number of time steps, exhibiting near-optimal behavior on the vast majority. More precisely, suppose that a job has work T-1 and span T-infinity. On a machine with P processors, A-STEAL completes the job in an expected duration of O(T-1/(P) over tilde + T-infinity + L lg P) time steps, where L is the length of a scheduling quantum, and (P) over tilde denotes the O(T-infinity + L lg P)-trimmed availability. This quantity is the average of the processor availability over all time steps except the
The intersection of sorted arrays problem has applications in search engines such as Google. Previous work has proposed and compared deterministic algorithms for this problem, in an adaptive analysis based on the enco...
详细信息
The intersection of sorted arrays problem has applications in search engines such as Google. Previous work has proposed and compared deterministic algorithms for this problem, in an adaptive analysis based on the encoding size of a certificate of the result (cost analysis). We define the alternation analysis, based on the nondeterministic complexity of an instance. In this analysis we prove that there is a deterministic algorithm asymptotically performing as well as any randomized algorithm in the comparison model. We define the redundancy analysis, based on a measure of the internal redundancy of the instance. In this analysis we prove that any algorithm optimal in the redundancy analysis is optimal in the alternation analysis, but that there is a randomized algorithm which performs strictly better than any deterministic algorithm in the comparison model. Finally, we describe how these results can be extended beyond the comparison model.
Clustering analysis is one of the important problems in the fields of data mining and machine learning. There are many different clustering methods. Among them, k-means clustering is one of the most popular schemes ow...
详细信息
Clustering analysis is one of the important problems in the fields of data mining and machine learning. There are many different clustering methods. Among them, k-means clustering is one of the most popular schemes owing to its simple and practicality. This paper investigates the approximate algorithm for the A-means clustering by means of selecting the k initial points from the input point set. An expected 2-approximation algorithm is presented in this paper. Meanwhile, an efficient algorithm for selecting the initial points is also proposed. At last some experimental results are given to test the valid of these algorithms.
The sampling problem for input-queued (IQ) randomized scheduling algorithms is *** observe that if the current scheduling decision is a maximum weighted matching (MWM),the MWM for the next slot mostly falls in those m...
详细信息
The sampling problem for input-queued (IQ) randomized scheduling algorithms is *** observe that if the current scheduling decision is a maximum weighted matching (MWM),the MWM for the next slot mostly falls in those matchings whose weight is closed to the current *** this heuristic,a novel randomized algorithm for IQ scheduling,named genetic algorithm-like scheduling algorithm (GALSA),is *** strategy is used for choosing sampling points in *** works with only O(N) samples which means that GALSA has lower complexity than the famous randomized scheduling algorithm,*** results show that the delay performance of GALSA is quite competitive with respect to that of APSARA.
In a slowly time-varying fading broadcast channel, a proposed randomized scheduler achieves multi-user diversity gain while reducing the amount of feedback. The scheduler requests feedback of signal-to-noise ratios (S...
详细信息
ISBN:
(纸本)1424407281
In a slowly time-varying fading broadcast channel, a proposed randomized scheduler achieves multi-user diversity gain while reducing the amount of feedback. The scheduler requests feedback of signal-to-noise ratios (SNR) from a random subset of users in conjunction with the previously scheduled user, and then selects the user with the largest SNR. With temporal correlation, this scheduler achieves near optimal sum-rate even with feedback from a small subset of users, which considerably reduces the amount of feedback.
We present a new algorithm to compute motorcycle graphs. It runs in O(n root n log n) time when n is the number of motorcycles. We give a new characterization of the straight skeleton of a nondegenerate polygon. For a...
详细信息
We present a new algorithm to compute motorcycle graphs. It runs in O(n root n log n) time when n is the number of motorcycles. We give a new characterization of the straight skeleton of a nondegenerate polygon. For a polygon with n vertices and h holes, we show that it yields a randomized algorithm that reduces the straight skeleton computation to a motorcycle graph computation in expected O(n root n + 1 log(2) n) time. Combining these results, we can compute the straight skeleton of a nondegenerate polygon with h holes and with n vertices, among which r are reflex vertices, in O(n root n + 1 log(2) n + r root r log r) expected time. In particular, we can compute the straight skeleton of a nondegenerate polygon with n vertices in O(n root n log(2) n) expected time.
We present a randomized polynomial-time approximation algorithm for the fixed linear crossing number problem (FLCNP). In this problem, the vertices of a graph are placed in a fixed order along a horizontal "node ...
详细信息
We present a randomized polynomial-time approximation algorithm for the fixed linear crossing number problem (FLCNP). In this problem, the vertices of a graph are placed in a fixed order along a horizontal "node line" in the plane, each edge is drawn as an arc in one of the two half-planes (pages), and the objective is to minimize the number of edge crossings. FLCNP is NP-hard, and no previous polynomial-time approximation algorithms are known. We show that the problem can be generalized to k pages and transformed to the maximum k-cut problem which admits a randomized polynomial-time approximation. For the 2-page case, our approach leads to a randomized polynomial time 0.878 + 0.122 rho approximation algorithm for FLCNP, where rho is the ratio of the number of conflicting pairs (pairs of edges that cross if drawn in the same page) to the crossing number. We further investigate this performance ratio on the random graph family G(n), 1/2, where each edge of the complete graph K,1 occurs with probability 1/2. We show that a longstanding conjecture for the crossing number of K-n implies that with probability at least 1 - 4e(-lambda 2) the expected performance bound of the algorithm on a random graph from G(n). 1/2 is 1.366 + O(lambda/n). A series of experiments is performed to compare the algorithm against two other leading heuristics on a set of test graphs. The results indicate that the randomized algorithm yields near-optimal solutions and outperforms the other heuristics overall. (C) 2007 Elsevier B.V. All rights reserved.
The maximum weight matching algorithm is a high-performance scheduling algorithm for cross-bar switches. It is known that it performs optimally under heavy loads. However, its centralized nature and high computational...
详细信息
The maximum weight matching algorithm is a high-performance scheduling algorithm for cross-bar switches. It is known that it performs optimally under heavy loads. However, its centralized nature and high computational complexity limit the algorithm's applicability. This paper presents a randomized algorithm for distributed switch scheduling that is capable of delivering high throughput. (c) 2007 Elsevier B.V. All rights reserved.
In data streaming applications, data arrives at rapid rates and in high volume, thus making it essential to process each stream update very efficiently in terms of both time and space. A data stream is a sequence of d...
详细信息
In data streaming applications, data arrives at rapid rates and in high volume, thus making it essential to process each stream update very efficiently in terms of both time and space. A data stream is a sequence of data records that must be processed continuously in an online fashion using sub-linear space and sub-linear processing time. We consider the problem of tracking the number of distinct items over data streams that allow insertion and deletion operations. We present two algorithms that improve on the space and time complexity of existing algorithms. (c) 2007 Elsevier B. V. All rights reserved.
The importance of the sensitivity of an algorithm to the output size of a problem is well-known especially if the upper bound on the output size is known to be not too large. In this paper we focus on the problem of d...
详细信息
The importance of the sensitivity of an algorithm to the output size of a problem is well-known especially if the upper bound on the output size is known to be not too large. In this paper we focus on the problem of designing very fast parallel algorithms for constructing the upper envelope of straight-line segments that achieve the O(n log H) work-bound for input size n and output size H. When the output size is small, our algorithms run faster than the algorithms whose running times are sensitive only to the input size. Since the upper bound on the output size of the upper envelop problem is known to be small (n alpha(n)), where a(n) is a slowly growing inverse-Ackerman's function, the algorithms are no worse in cost than the previous algorithms in the worst case of the output size. Our algorithms are designed for the arbitrary CRCW PRAM model. We first describe an O (log n center dot (log H + log log n)) time deterministic algorithm for the problem, that achieves O (n log H) work bound for H = Omega(log n). We then present a fast randomized algorithm that runs in expected time O (log H center dot log log n) with high probability and does O (n log H) work. For log H = Omega(log log n), we can achieve the running time of O (log H) while simultaneously keeping the work optimal. We also present a fast randomized algorithm that runs in O (log n / log k) time with nk processors, k > log(Omega(1)) n. The algorithms do not assume any prior input distribution and the running times hold with high probability. (C) 2007 Elsevier Inc. All rights reserved.
暂无评论