Connected facility location combines cost-efficient facility placement and the requirement to connect the facilities among each other. Such problems arise, e.g., in telecommunication applications where networks consis...
详细信息
We consider the problem of identifying motifs that abstracts the task of finding short conserved sites in genomic DNA. The planted (l, d)- motif problem, PMP, is the mathematical abstraction of this problem, which con...
详细信息
We consider the problem of identifying motifs that abstracts the task of finding short conserved sites in genomic DNA. The planted (l, d)- motif problem, PMP, is the mathematical abstraction of this problem, which consists of finding a substring of length l that occurs in each si in a set of input sequences S = {s1, s2,..., st} with at most d substitutions. Our propose algorithm combines the voting algorithm and pattern matching algorithm to find exact motifs. The combined algorithm is achieved by running the voting algorithm on t' sequences, t' < t. After that we use the pattern matching on the output of the voting algorithm and the reminder sequences, t - t'. Two values of t' are calculated. The first value of t' makes the running time of our proposed algorithm less than the running time of voting algorithm. The second value of t' makes the running time of our proposed algorithm is minimal. We show that our proposed algorithm is faster than the voting algorithm by testing both algorithms on simulated data from (9, d <= 2) to (19, (d <= 7). Finally, we test the performance of the combined algorithm on realistic biological data.
Motif stem search (MSS) is a recent motif search problem to search motifs on large-alphabet inputs. A motif stem is an l-length string with some wildcards. The goal of the MSS problem is to find a set of stems that re...
详细信息
ISBN:
(纸本)9781479913091;9781479913107
Motif stem search (MSS) is a recent motif search problem to search motifs on large-alphabet inputs. A motif stem is an l-length string with some wildcards. The goal of the MSS problem is to find a set of stems that represents a superset of all (l, d) motifs present in the input sequences. The three main contributions of this paper are as follows: (1) We build motif stem representation more precisely by using regular expressions. (2) We give a new method for generating all possible motif stems. (3) We propose an efficient algorithm, called StemFinder, for solving the MSS problem. Compared with the previous algorithms, StemFinder runs much faster and first solves the (17, 8), (19, 9) and (21, 10) challenging instances on protein sequences;moreover, StemFinder reports fewer stems representing a smaller superset of all (l, d) motifs.
Phylogenetic networks provide a way to describe and visualize evolutionary histories that have undergone so-called reticulate evolutionary events such as recombination, hybridization or horizontal gene transfer. The l...
详细信息
The task of finding the Minimum Routing Cost Spanning Tree (MRCST) can be found in many network design problems. In general cases, MRCST problem is a NP-hard problem. Till now, several algorithms for solving the probl...
详细信息
ISBN:
(纸本)9781479934003
The task of finding the Minimum Routing Cost Spanning Tree (MRCST) can be found in many network design problems. In general cases, MRCST problem is a NP-hard problem. Till now, several algorithms for solving the problem are proposed and their performance were evaluated on different data sets. This paper presents a short survey of well known MRCST algorithms and gives a report on an extensive experimentation based on benchmark instances collected from the literature. In addition, the paper is also the first pioneering research that experiment on graphs with up to 1000 vertices.
We develop a novel framework, the implicit hitting set approach, for solving a class of combinatorial optimization problems. The explicit hitting set problem is as follows: given a set U and a family S of subsets of U...
详细信息
We develop a novel framework, the implicit hitting set approach, for solving a class of combinatorial optimization problems. The explicit hitting set problem is as follows: given a set U and a family S of subsets of U, find a minimum-cardinality set that intersects (hits) every set in S. In the implicit hitting set problem (IHSP), the family of subsets S is not explicitly listed (its size is, generally, exponential in terms of the size of U);instead, it is given via a polynomial-time oracle that verifies if a given set H is a hitting set or returns a set in S that is not hit by H. Many NP-hard problems can be straightforwardly formulated as implicit hitting set problems. We show that the implicit hitting set approach is valuable in developing exact and heuristic algorithms for solving this class of combinatorial optimization problems. Specifically, we provide a generic algorithmic strategy, which combines efficient heuristics and exact methods, to solve any IHSP. Given an instance of an IHSP, the proposed algorithmic strategy gives a sequence of feasible solutions and lower bounds on the optimal solution value and ultimately yields an optimal solution. We specialize this algorithmic strategy to solve the multigenome alignment problem and present computational results that illustrate the effectiveness of the implicit hitting set approach.
Background: Phylogeny estimation from aligned haplotype sequences has attracted more and more attention in the recent years due to its importance in analysis of many fine-scale genetic data. Its application fields ran...
详细信息
Background: Phylogeny estimation from aligned haplotype sequences has attracted more and more attention in the recent years due to its importance in analysis of many fine-scale genetic data. Its application fields range from medical research, to drug discovery, to epidemiology, to population dynamics. The literature on molecular phylogenetics proposes a number of criteria for selecting a phylogeny from among plausible alternatives. Usually, such criteria can be expressed by means of objective functions, and the phylogenies that optimize them are referred to as optimal. One of the most important estimation criteria is the parsimony which states that the optimal phylogeny T* for a set H of n haplotype sequences over a common set of variable loci is the one that satisfies the following requirements: (i) it has the shortest length and (ii) it is such that, for each pair of distinct haplotypes hi, hj is an element of H, the sum of the edge weights belonging to the path from hi to hj in T* is not smaller than the observed number of changes between hi and hj. Finding the most parsimonious phylogeny for H involves solving an optimization problem, called the Most Parsimonious Phylogeny Estimation Problem (MPPEP), which isNP-hard in many of its versions. Results: In this article we investigate a recent version of the MPPEP that arises when input data consist of single nucleotide polymorphism haplotypes extracted from a population of individuals on a common genomic region. Specifically, we explore the prospects for improving on the implicit enumeration strategy of implicit enumeration strategy used in previous work using a novel problem formulation and a series of strengthening valid inequalities and preliminary symmetry breaking constraints to more precisely bound the solution space and accelerate implicit enumeration of possible optimal phylogenies. We present the basic formulation and then introduce a series of provable valid constraints to reduce the solution space. We the
In this paper, we deal with the problem of finding quasi-independent sets in graphs. This problem is formally defined in three versions, which are shown to be polynomially equivalent. The one that looks most general, ...
详细信息
In this paper, we deal with the problem of finding quasi-independent sets in graphs. This problem is formally defined in three versions, which are shown to be polynomially equivalent. The one that looks most general, namely, f -MAX QUASIINDEPENDENT SET, consists of, given a graph and a non-decreasing function f, finding a maximum size subset Q of the vertices of the graph, such that the number of edges in the induced subgraph is less than or equal to f (|Q|). For this problem, we show an exact solution method that runs within time O*(2(d-27/23/d+1 n)) on graphs of average degree bounded by d. For the most specifically defined gamma-MAX QUASI-INDEPENDENT SET and k-MAX QUASI-INDEPENDENT SET problems, several results on complexity and approximation are shown, and greedy algorithms are proposed, analyzed and tested.
We focus on some single machine scheduling problems for which a set of new jobs have to be scheduled after a schedule of old jobs has been set. Each new and old job belongs to a family and changing the production from...
详细信息
We focus on some single machine scheduling problems for which a set of new jobs have to be scheduled after a schedule of old jobs has been set. Each new and old job belongs to a family and changing the production from one family to another requires a setup. The initial schedule of old jobs is assumed to minimize the sum of setup times. The new jobs can be either scheduled after the old jobs or inserted within the existing schedule, which results in a disruption cost that has to be minimized together with the sum of setup times of the overall schedule. In this paper we tackle several simple setup time configurations yielding different scheduling problems for which we propose optimal polynomial time algorithms or provide NP-hardness proofs. In the former case we consider the problem of enumerating the set of strict Pareto optima for the sum of setup times and disruption cost criteria. (C) 2012 Elsevier B.V. All rights reserved.
In this paper we merge recent developments on exact algorithms for finding an ordering of vertices of a given graph that minimizes bandwidth (the BANDWIDTH problem) and for finding an embedding of a given graph into a...
详细信息
In this paper we merge recent developments on exact algorithms for finding an ordering of vertices of a given graph that minimizes bandwidth (the BANDWIDTH problem) and for finding an embedding of a given graph into a line that minimizes distortion (the DISTORTION problem). For both problems we develop algorithms that work in O(9.363(n)) time and polynomial space. For BANDWIDTH, this improves O*(10(n)) algorithm by Feige and Kilian from 2000, for DISTORTION this is the first polynomial space exact algorithm that works in O(c(n)) time we are aware of. As a byproduct, we enhance the O(5(n+o(n)))-time and O*(2(n))-space algorithm for DISTORTION by Fomin et al. to an algorithm working in O(4.383(n))-time and space. (c) 2011 Elsevier B.V. All rights reserved.
暂无评论