Genome rearrangements are events that affect large portions of a genome. When using the rearrangement distance to compare two genomes, one wants to find a minimum cost sequence of rearrangements that transforms one into another. Since we represent genomes as permutations, we can reduce this problem to the problem of sorting a permutation with a minimum cost sequence of rearrangements. In the traditional approach, we consider that all rearrangements are equally likely to occur and we set a unitary cost for all rearrangements. However, there are two variations of the problem motivated by the observation that rearrangements involving large segments of a genome rarely occur. The first variation adds a restriction to the rearrangement's length. The second variation uses a cost function based on the rearrangement's length. In this work, we present approximation algorithms for five problems combining both variations, that is, problems with a length-limit restriction and a cost function based on the rearrangement's length.
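To make the permutation view concrete, here is a minimal Python sketch of applying length-limited reversals to a permutation under a length-based cost. All names, the choice of reversals as the rearrangement type, and the linear cost function cost(ℓ) = ℓ are illustrative assumptions, not the paper's specific model:

```python
def reversal(perm, i, j):
    """Return a copy of perm with the segment perm[i..j] reversed (inclusive)."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]

def apply_reversals(perm, ops, limit):
    """Apply a sequence of reversals (i, j), rejecting any whose segment
    length exceeds the limit, and charge a cost equal to the segment length."""
    total_cost = 0
    for i, j in ops:
        if j - i + 1 > limit:
            raise ValueError("reversal longer than the length limit")
        perm = reversal(perm, i, j)
        total_cost += j - i + 1
    return perm, total_cost
```

For example, `apply_reversals([3, 2, 1, 4], [(0, 2)], limit=3)` sorts the permutation with a single reversal of cost 3, while the same operation is rejected under `limit=2`.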
ISBN:
(print) 9781450369763
We study a natural model of coordinated social ad campaigns over a social network, based on models of Datta et al. and Aslay et al. Multiple advertisers are willing to pay the host - up to a known budget - per user exposure, whether that exposure is sponsored or organic (i.e., shared by a friend). Campaigns are seeded with sponsored ads to some users, but no network user may be exposed to too many sponsored ads. As a result, while ad campaigns proceed independently over the network, they need to be carefully coordinated with respect to their seed sets. We study the objective of maximizing the network's total ad revenue. Our main result is to show that under a broad class of social influence models, the problem can be reduced to maximizing a submodular function subject to two matroid constraints; it can therefore be approximated within a factor of essentially 1/2 in polynomial time. When there is no bound on the individual seed set sizes of advertisers, the constraints correspond only to a single matroid, and the guarantee can be improved to 1 - 1/e; in that case, a factor 1/2 is achieved by a practical greedy algorithm. The 1 - 1/e approximation algorithm for the matroid-constrained problem is far from practical; however, we show that specifically under the Independent Cascade model, LP rounding and Reverse Reachability techniques can be combined to obtain a 1 - 1/e approximation algorithm that scales to several tens of thousands of nodes. Our theoretical results are complemented by experiments evaluating the extent to which the coordination of multiple ad campaigns inhibits the revenue obtained from each individual campaign, as a function of the similarity of the influence networks and the strength of ties in the network. Our experiments suggest that as networks for different advertisers become less similar, the harmful effect of competition decreases. With respect to tie strengths, we show that the most harm is done in an intermediate range.
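The "practical greedy algorithm" mentioned for the single-matroid case can be illustrated on the special case of a coverage objective under a cardinality (uniform-matroid) constraint. The sketch below is illustrative only - the names and the toy instance are assumptions, not the paper's implementation:

```python
def greedy_coverage(sets, k):
    """Pick up to k sets greedily by largest marginal coverage.  For a
    monotone submodular objective under a cardinality constraint, this
    greedy rule achieves the classic (1 - 1/e) guarantee."""
    covered, chosen = set(), []
    remaining = set(range(len(sets)))
    for _ in range(k):
        if not remaining:
            break
        best = max(remaining, key=lambda s: len(sets[s] - covered))
        if not sets[best] - covered:
            break  # no set adds new coverage
        chosen.append(best)
        covered |= sets[best]
        remaining.remove(best)
    return chosen, covered
```

On the instance `[{1, 2, 3}, {3, 4}, {4, 5, 6, 7}]` with `k = 2`, the greedy rule first takes the largest set and then the one with the biggest marginal gain, covering all seven elements.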
ISBN:
(print) 9781450367059
Two-stage stochastic optimization is a widely used framework for modeling uncertainty, where we have a probability distribution over possible realizations of the data, called scenarios, and decisions are taken in two stages: we make first-stage decisions knowing only the underlying distribution and before a scenario is realized, and may take additional second-stage recourse actions after a scenario is realized. The goal is typically to minimize the total expected cost. A common criticism levied at this model is that the underlying probability distribution is itself often imprecise! To address this, an approach that is quite versatile and has gained popularity in the stochastic-optimization literature is the distributionally robust 2-stage model: given a collection D of probability distributions, our goal now is to minimize the maximum expected total cost with respect to a distribution in D. We provide a framework for designing approximation algorithms in such settings when the collection D is a ball around a central distribution and the central distribution is accessed only via a sampling black box. We first show that one can utilize the sample average approximation (SAA) method (solve the distributionally robust problem with an empirical estimate of the central distribution) to reduce the problem to the case where the central distribution has polynomial-size support. Complementing this, we show how to approximately solve a fractional relaxation of the SAA (i.e., polynomial-scenario central-distribution) problem. Unlike in 2-stage stochastic- or robust-optimization, this turns out to be quite challenging. We utilize the ellipsoid method in conjunction with several new ideas to show that this problem can be approximately solved provided that we have an (approximation) algorithm for a certain max-min problem that is akin to, and generalizes, the k-maxmin problem (find the worst-case scenario consisting of at most k elements) encountered in 2-stage robust optimization. We
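As a toy illustration of the SAA step - sample the central distribution through the black box, then optimize against the empirical estimate - here is a hedged sketch. The newsvendor-style cost function and all names are assumptions chosen for illustration, not the paper's model:

```python
import random

def saa_minimize(sample, candidates, n_samples, cost, seed=0):
    """Sample-average approximation: draw scenarios from a black-box
    sampler and return the first-stage decision minimizing the
    empirical expected cost over the sampled scenarios."""
    rng = random.Random(seed)
    scenarios = [sample(rng) for _ in range(n_samples)]

    def empirical_cost(x):
        return sum(cost(x, s) for s in scenarios) / len(scenarios)

    return min(candidates, key=empirical_cost)

# Illustrative instance: order q units up front at cost 2 each, then
# pay 5 per unit of unmet demand once the demand scenario is revealed.
def demand(rng):
    return rng.randint(1, 10)

def cost(q, d):
    return 2 * q + 5 * max(d - q, 0)

best_q = saa_minimize(demand, range(0, 11), 500, cost)
```

With enough samples, `best_q` concentrates around the minimizer of the true expected cost; in the robust setting described above, the inner expectation would additionally be maximized over the ball of distributions.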
This dissertation aims to consider different problems in the area of stochastic optimization, where we are provided with more information about the instantiation of the stochastic parameters over time. With uncertainty being an inseparable part of every industry, several applications can be modeled as discussed. In this dissertation we focus on three main areas of applications: 1) ranking problems, which can be helpful for modeling product ranking, designing recommender systems, etc., 2) routing problems, which can cover applications in delivery, transportation and networking, and 3) classification problems with possible applications in medical diagnosis and chemical identification. We consider three types of solutions for these problems based on how we want to deal with the observed information: static, adaptive and a priori solutions. In Chapter II, we study two general stochastic submodular optimization problems that we call Adaptive Submodular Ranking and Adaptive Submodular Routing. In the ranking version, we want to provide an adaptive sequence of weighted elements to cover a random submodular function with minimum expected cost. In the routing version, we want to provide an adaptive path of vertices to cover a random scenario with minimum expected length. We provide (poly)logarithmic approximation algorithms for these problems that (nearly) match or improve the best-known results for various special cases. We also implemented different variations of the ranking algorithm and observed that it outperforms other practical algorithms on real-world and synthetic data sets. In Chapter III, we consider the Optimal Decision Tree problem: an identification task that is widely used in active learning. We study this problem in the presence of noise, where we want to perform a sequence of tests with possible noisy outcomes to identify a random hypothesis. We give different static (non-adaptive) and adaptive algorithms for this task with almost logarithmic approximation ratios.
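A much-simplified, single-scenario flavor of the ranking idea - order weighted elements greedily by marginal coverage per unit cost - can be sketched as follows. This is purely illustrative: the dissertation's algorithms handle random submodular functions and adaptivity, which this sketch does not:

```python
def greedy_ranking(universe, elements, costs):
    """Greedy ranking for a single known coverage scenario: repeatedly
    pick the element with the largest marginal coverage per unit cost
    until the universe is covered (or no element helps)."""
    order, covered = [], set()
    remaining = set(range(len(elements)))
    while covered != universe and remaining:
        best = max(remaining,
                   key=lambda e: len(elements[e] - covered) / costs[e])
        if not elements[best] - covered:
            break  # nothing left can add coverage
        order.append(best)
        covered |= elements[best]
        remaining.remove(best)
    return order, covered
```

On `universe = {1, 2, 3, 4}` with `elements = [{1, 2}, {3}, {3, 4}]` and unit costs, the greedy order covers the universe with two of the three elements.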
A fundamental problem in scheduling is makespan minimization on unrelated parallel machines (R||Cmax). Let there be a set J of jobs and a set M of parallel machines, where every job Jj ∈ J has processing time or length pi,j ∈ ℝ+ on machine Mi ∈ M. The goal in R||Cmax is to schedule the jobs non-preemptively on the machines so as to minimize the length of the schedule, the makespan. A ρ-approximation algorithm produces in polynomial time a feasible solution such that its objective value is within a multiplicative factor ρ of the optimum, where ρ is called its approximation ratio. The best-known approximation algorithms for R||Cmax have approximation ratio 2, but there is no ρ-approximation algorithm with ρ < 3/2 for R||Cmax unless P=NP. A longstanding open problem in approximation algorithms is to reconcile this hardness gap. We take a two-pronged approach to learn more about the hardness gap of R||Cmax: (1) find approximation algorithms for special cases of R||Cmax whose approximation ratios are tight (unless P=NP); (2) identify special cases of R||Cmax that have the same 3/2-hardness bound of R||Cmax, but where the approximation barrier of 2 can be broken. This thesis is divided into four parts. The first two parts investigate a special case of R||Cmax called the graph balancing problem when every job has one of two lengths and the machines may have one of two speeds. First, we present 3/2-approximation algorithms for the graph balancing problem with one speed and two job lengths. In the second part of this thesis we give an approximation algorithm for the graph balancing problem with two speeds and two job lengths with approximation ratio (√65+7)/8 ≈ 1.88278. In the third part of the thesis we present approximation algorithms and hardness of approximation results for two problems called R||Cmax with simple job-intersection structure and R||Cmax with bounded job assignments. We conclude this thesis by presenting algorithmic and computational comple
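For concreteness, here is a simple greedy baseline for R||Cmax in Python: assign each job to the machine that minimizes the resulting load. This is an illustrative heuristic only - it does not attain the LP-based factor-2 guarantee discussed above, and the instance is made up:

```python
def greedy_schedule(p):
    """p[i][j] = processing time of job j on machine i.  Assign each
    job to the machine minimizing the resulting load; return the
    assignment (machine index per job) and the makespan."""
    m, n = len(p), len(p[0])
    load = [0.0] * m       # current total length scheduled on each machine
    assign = [0] * n
    for j in range(n):
        i = min(range(m), key=lambda i: load[i] + p[i][j])
        assign[j] = i
        load[i] += p[i][j]
    return assign, max(load)
```

On `p = [[2, 1, 4], [3, 1, 1]]` (two machines, three jobs), the greedy rule puts job 0 on machine 0 and jobs 1 and 2 on machine 1, for a makespan of 2.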
We improve the running times of O(1)-approximation algorithms for the set cover problem in geometric settings, specifically, covering points by disks in the plane, or covering points by halfspaces in three dimensions....
Two-stage stochastic optimization is a widely used framework for modeling uncertainty, where we have a probability distribution over possible realizations of the data, called scenarios, and decisions are taken in two stages: we take first-stage actions knowing only the underlying distribution and before a scenario is realized, and may take additional second-stage recourse actions after a scenario is realized. The goal is typically to minimize the total expected cost. A common criticism levied at this model is that the underlying probability distribution is itself often imprecise. To address this, an approach that is quite versatile and has gained popularity in the stochastic-optimization literature is the two-stage distributionally robust stochastic model: given a collection D of probability distributions, our goal now is to minimize the maximum expected total cost with respect to a distribution in D. There has been almost no prior work however on developing approximation algorithms for distributionally robust problems where the underlying scenario collection is discrete, as is the case with discrete-optimization problems. We provide frameworks for designing approximation algorithms in such settings when the collection D is a ball around a central distribution, defined relative to two notions of distance between probability distributions: Wasserstein metrics (which include the L_1 metric) and the L_infinity metric. Our frameworks yield efficient algorithms even in settings with an exponential number of scenarios, where the central distribution may only be accessed via a sampling oracle. For distributionally robust optimization under a Wasserstein ball, we first show that one can utilize the sample average approximation (SAA) method (solve the distributionally robust problem with an empirical estimate of the central distribution) to reduce the problem to the case where the central distribution has a polynomial-size support, and is represented explicitly. This follows
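The two notions of distance are easy to state when both distributions are represented explicitly over a finite scenario set. Below is a hedged sketch (all names are illustrative) of the L_1 and L_infinity distances between discrete distributions and the corresponding ball-membership test; note that for general ground metrics the Wasserstein distance requires solving a transportation problem, which this sketch does not do:

```python
def l1_distance(p, q):
    """L_1 distance between discrete distributions given as dicts
    mapping scenario -> probability."""
    support = set(p) | set(q)
    return sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

def linf_distance(p, q):
    """L_infinity distance: largest per-scenario probability gap."""
    support = set(p) | set(q)
    return max(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

def in_ball(q, center, radius, dist=l1_distance):
    """Is q inside the ball of the given radius around the central
    distribution, under the given distance?"""
    return dist(q, center) <= radius
```

For example, with `p = {'a': 0.5, 'b': 0.5}` and `q = {'a': 0.8, 'b': 0.2}`, the L_1 distance is 0.6, so q lies in the radius-0.7 ball around p but not in the radius-0.5 ball.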
Facility location is a prominent optimization problem that has inspired a large quantity of both theoretical and practical studies in combinatorial optimization. Although the problem has been investigated under variou...
We study a generalized version of the load balancing problem on unrelated machines with cost constraints: Given a set of m machines (of certain types) and a set of n jobs, each job j processed on machine i requires pi...
The girth is one of the most basic graph parameters, and its computation has been studied for many decades. Under widely believed fine-grained assumptions, computing the girth exactly is known to require mn^(1−o(1)) time...
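The classical exact baseline this bound refers to - run a BFS from every vertex, for O(mn) total time on unweighted undirected graphs - can be sketched as follows (illustrative code, not from the paper):

```python
from collections import deque

def girth(adj):
    """Exact girth of a simple undirected graph by BFS from every
    vertex, O(n * m) time.  adj is a list of neighbor lists; returns
    float('inf') for acyclic graphs."""
    n = len(adj)
    best = float('inf')
    for s in range(n):
        dist = [-1] * n
        parent = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if dist[w] == -1:
                    dist[w] = dist[u] + 1
                    parent[w] = u
                    queue.append(w)
                elif w != parent[u]:
                    # Non-tree edge closes a walk through s of length
                    # dist[u] + dist[w] + 1, which contains a cycle no
                    # longer than that; taking the minimum over all
                    # start vertices yields the exact girth.
                    best = min(best, dist[u] + dist[w] + 1)
    return best
```

Every candidate value is the length of a closed walk and hence at least the girth, while a BFS started from any vertex on a shortest cycle produces the girth itself, so the minimum over all starts is exact.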