ISBN: 9781479927616 (print)
Given DNA sequence fragments from a pair of chromosomes, the goal of the haplotype assembly problem is to reconstruct the two haplotypes of the underlying chromosomes. Many heuristic and exact algorithms have been introduced for the problem, and they aim to reconstruct a pair of haplotypes that is optimal or near-optimal. However, for given input fragment data the optimal solution may not be unique, and these algorithms can only choose one at random or return the first one they find. This paper proposes a parameterized enumeration algorithm for the Minimum Single Nucleotide Polymorphism (SNP) Removal model of the problem. Extensive experiments show that the algorithm can effectively provide multiple optimal solutions to biologists for further analysis.
Background: Given the sequenced fragments from a pair of chromosomes, the goal of the haplotype assembly problem is to reconstruct the two haplotypes for the chromosomes. Many heuristic algorithms and parameterized algorithms have been introduced for the problem. So far, for a given input instance, the algorithms all aim to construct a pair of haplotypes that is optimal or near-optimal. When there are many optimal solutions, they can only choose one randomly or the one they find first. Therefore, enumerating multiple optimal solutions to the problem is of practical importance. Methods: Based on the characteristics of real DNA sequence fragment data, we propose a parameterized dynamic programming algorithm to enumerate multiple optimal solutions to the Minimum SNPs Removal (MSR) model of the haplotype assembly problem when the fragments contain no missing data. Since some sequencing technologies tend to produce more errors at some specific DNA sequence sites than at others, MSR assumes that there are just a few error-prone sites and tries to build the real pair of haplotypes by removing a minimum number of SNPs. We construct initial optimal solutions for the sub-dataset consisting of the data at the first SNP site; we then consider the data containing the first two SNP sites, and so on. On reaching the last SNP site, we obtain multiple optimal solutions to the MSR model. Results: The algorithm can find at most k optimal solutions to the MSR model in time O(n k k1 k2 + m log m + m k1) and space O(mn + kn^2), where n is the number of SNPs, m is the number of fragments, k1 is the maximum number of SNP sites that a fragment covers, and k2 is the maximum number of fragments that cover a SNP site. Our experimental results show that for a general input dataset, the minimum SNP site set whose removal makes building a pair of haplotypes feasible is not unique, and that the different haplotype pairs reconstructed by removing different SNP sets usually show different haplotype reconstruction a...
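To make the MSR model concrete, the following is a minimal brute-force sketch, not the parameterized dynamic programming algorithm of the abstract. It assumes fragments are equal-length strings over `0`, `1` and `-` (an uncovered site), and that a fragment set is feasible when its conflict graph is bipartite. The names `is_feasible` and `enumerate_optimal_removals` are hypothetical.

```python
from itertools import combinations


def is_feasible(fragments, kept_sites):
    """True if the fragments, restricted to kept_sites, split into two conflict-free groups."""
    rows = [tuple(f[j] for j in kept_sites) for f in fragments]
    n = len(rows)

    def conflict(a, b):
        # Two fragments conflict if they differ at a site that both cover.
        return any(x != y and x != '-' and y != '-' for x, y in zip(a, b))

    # Feasibility is checked by 2-colouring the conflict graph.
    color = [None] * n
    for start in range(n):
        if color[start] is not None:
            continue
        color[start] = 0
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if v != u and conflict(rows[u], rows[v]):
                    if color[v] is None:
                        color[v] = 1 - color[u]
                        stack.append(v)
                    elif color[v] == color[u]:
                        return False
    return True


def enumerate_optimal_removals(fragments):
    """Return every minimum-size set of SNP sites whose removal makes the data feasible."""
    n_sites = len(fragments[0])
    for size in range(n_sites + 1):
        found = [
            removed
            for removed in combinations(range(n_sites), size)
            if is_feasible(fragments, [j for j in range(n_sites) if j not in removed])
        ]
        if found:
            return found
    return []


# Three fragments over two SNP sites: removing either site alone suffices.
print(enumerate_optimal_removals(["00", "11", "01"]))  # [(0,), (1,)]
```

The search is exponential in the number of SNP sites, so it only illustrates why an instance can have several optimal solutions.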
In this paper, we study the NP-complete colorful variant of the classical Matching problem, namely, the Rainbow Matching problem. Given an edge-colored graph G and a positive integer k, this problem asks whether there exists a matching of size at least k such that all the edges in the matching have distinct colors. We first develop a deterministic algorithm that solves Rainbow Matching on paths in time > and polynomial space. This algorithm is based on a curious combination of the method of bounded search trees and a divide-and-conquer-like approach, where the branching process is guided by the maintenance of an auxiliary bipartite graph where one side captures divided-and-conquered pieces of the path. Our second result is a randomized algorithm that solves Rainbow Matching on general graphs in time O(2^k) and polynomial space. Here, we show how a result by Björklund et al. (J Comput Syst Sci 87:119-139, 2017) can be invoked as a black box, wrapped by a probability-based analysis tailored to our problem. We also complement our two main results by designing kernels for Rainbow Matching on general and bounded-degree graphs.
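The problem statement can be illustrated with a plain backtracking check. This is a sketch for small instances only and is unrelated to the algorithms of the abstract. The edge format `(u, v, color)` and the name `has_rainbow_matching` are assumptions made for the example.

```python
def has_rainbow_matching(edges, k):
    """edges: list of (u, v, color). True if k edges are pairwise disjoint with distinct colours."""

    def extend(start, used_vertices, used_colors, size):
        if size == k:
            return True
        for i in range(start, len(edges)):
            u, v, c = edges[i]
            # Skip edges that reuse a matched vertex or an already used colour.
            if u in used_vertices or v in used_vertices or c in used_colors:
                continue
            if extend(i + 1, used_vertices | {u, v}, used_colors | {c}, size + 1):
                return True
        return False

    return extend(0, frozenset(), frozenset(), 0)


# On the path 1-2-3-4 the only matching of size 2 uses the two end edges.
same_colour_ends = [(1, 2, "red"), (2, 3, "blue"), (3, 4, "red")]
distinct_ends = [(1, 2, "red"), (2, 3, "blue"), (3, 4, "green")]
print(has_rainbow_matching(same_colour_ends, 2))  # False
print(has_rainbow_matching(distinct_ends, 2))     # True
```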
Motivated by the need for succinct representations of all "small" transversals (or hitting sets) of a hypergraph of fixed rank, we study the complexity of computing such a representation. Next, the existence of a minimal hitting set of at least a given size arises as a subproblem. We give one algorithm for hypergraphs of any fixed rank, and we largely improve an earlier algorithm (by H. Fernau, 2005, [10]) for the rank-2 case, i.e., for computing a minimal vertex cover of at least a given size in a graph. We were led to these questions by combinatorial aspects of the protein inference problem in shotgun proteomics. (C) 2010 Elsevier B.V. All rights reserved.
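For the rank-2 case, the subproblem asks for a minimal vertex cover of at least a given size. The sketch below finds a largest minimal vertex cover by exhaustive search; it is a hypothetical illustration of the definition, not either algorithm from the abstract, and the function names are assumptions.

```python
from itertools import combinations


def is_vertex_cover(edges, cover):
    """True if every edge has at least one endpoint in cover."""
    return all(u in cover or v in cover for u, v in edges)


def is_minimal_cover(edges, cover):
    """True if cover is a vertex cover and no single vertex can be dropped from it."""
    return is_vertex_cover(edges, cover) and all(
        not is_vertex_cover(edges, cover - {x}) for x in cover
    )


def largest_minimal_vertex_cover(vertices, edges):
    """Try candidate sizes from large to small and return the first minimal cover found."""
    for size in range(len(vertices), -1, -1):
        for candidate in combinations(vertices, size):
            if is_minimal_cover(edges, set(candidate)):
                return set(candidate)
    return None


# A star with centre 0 has two minimal covers: {0} and the set of leaves.
star = [(0, 1), (0, 2), (0, 3)]
print(largest_minimal_vertex_cover([0, 1, 2, 3], star))  # {1, 2, 3}
```

A minimal cover of size at least s exists exactly when the returned set has at least s vertices.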
Module Motif is a pattern matching problem that was introduced in the context of biological networks. Informally, given a multiset of colors P and a graph H in which each node is associated with a set of colors, it asks if P occurs in a module of H (i.e., in a set of nodes that have the same neighborhood outside the set). We present three parameterized algorithms for this problem, which both measure similarity between matched colors and handle deletions and insertions of colors to P. Moreover, we observe that the running times of two of them might be essentially tight, and prove that the problem is unlikely to admit a polynomial kernel. (C) 2016 Elsevier Inc. All rights reserved.
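The module condition in the definition above can be checked directly. The sketch assumes an undirected graph stored as a dictionary from each node to its set of neighbours. It covers only the module test, not the color matching, and `is_module` is a hypothetical name.

```python
def is_module(adj, nodes):
    """True if all nodes in the set have the same neighbourhood outside the set."""
    nodes = set(nodes)
    outside = [adj[v] - nodes for v in nodes]
    return all(neighbourhood == outside[0] for neighbourhood in outside)


# Node 3 is adjacent to 1, 2 and 4; the other nodes are adjacent only to 3.
adj = {1: {3}, 2: {3}, 3: {1, 2, 4}, 4: {3}}
print(is_module(adj, {1, 2}))  # True: both see exactly {3} outside the set
print(is_module(adj, {1, 3}))  # False: 1 sees nothing outside, 3 sees {2, 4}
```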
We study a broad class of graph partitioning problems. Each problem is defined by two constants, α1 and α2. The input is a graph G, an integer k and a number p, and the objective is to find a subset U of vertices of size k such that α1·m1 + α2·m2 is at most (or at least) p, where m1 and m2 are the cardinalities of the edge sets having both endpoints, and exactly one endpoint, in U, respectively. This class of fixed-cardinality graph partitioning problems (FGPPs) encompasses Max (k, n - k)-Cut, Min k-Vertex Cover, k-Densest Subgraph, and k-Sparsest Subgraph. Our main result is a 4^(k + o(k)) Δ^k · n^O(1) time algorithm for any problem in this class, where Δ ≥ 1 is the maximum degree in the input graph. This resolves an open question posed by Bonnet et al. (Proc. International Symposium on Parameterized and Exact Computation, 2013). We obtain faster algorithms for certain subclasses of FGPPs, parameterized by p, or by (k + p). In particular, we give a 4^(p + o(p)) · n^O(1) time algorithm for Max (k, n - k)-Cut, thus improving significantly the best known p^p · n^O(1) time algorithm by Bonnet et al.
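The objective can be evaluated for a fixed subset in a few lines. The sketch below pairs that evaluation with an exhaustive search over all size-k subsets, which is a stand-in for illustration only and not the algorithm of the abstract. The names `fgpp_value` and `best_subset` are hypothetical; the example uses α1 = 0, α2 = 1 for Max (k, n - k)-Cut and α1 = 1, α2 = 0 for k-Densest Subgraph.

```python
from itertools import combinations


def fgpp_value(edges, subset, alpha1, alpha2):
    """alpha1 * m1 + alpha2 * m2 for the given vertex subset."""
    subset = set(subset)
    m1 = sum(1 for u, v in edges if u in subset and v in subset)      # both endpoints inside
    m2 = sum(1 for u, v in edges if (u in subset) != (v in subset))   # exactly one endpoint inside
    return alpha1 * m1 + alpha2 * m2


def best_subset(vertices, edges, k, alpha1, alpha2, maximize=True):
    """Exhaustively pick a size-k subset optimising the objective (first optimum in order)."""
    pick = max if maximize else min
    return pick(
        combinations(vertices, k),
        key=lambda subset: fgpp_value(edges, subset, alpha1, alpha2),
    )


path = [(1, 2), (2, 3), (3, 4)]
print(fgpp_value(path, {1, 3}, 0, 1))          # 3: all three edges are cut
print(best_subset([1, 2, 3, 4], path, 2, 0, 1))  # (1, 3)
```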
In the Min-Sum 2-Clustering problem, we are given a graph and a parameter k, and the goal is to determine if there exists a 2-partition of the vertex set such that the total conflict number is at most k, where the conflict number of a vertex is the number of its non-neighbors in the same cluster and neighbors in the different cluster. The problem is equivalent to 2-Cluster Editing and 2-Correlation Clustering with an additional multiplicative factor two in the cost function. In this paper we show an algorithm for Min-Sum 2-Clustering with time complexity O(n · 2.619^(r/(1-4r/n)) + n^3), where n is the number of vertices and r = k/n. Particularly, the time complexity is O*(2.619^(k/n)) for k ∈ o(n^2) and polynomial for k ∈ O(n log n), which implies that the problem can be solved in subexponential time for k ∈ o(n^2). We also design a parameterized algorithm for a variant in which the cost is the sum of the squared conflict-numbers. For k ∈ o(n^3), the algorithm runs in subexponential O(n^3 · 5.171^θ) time, where .
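The conflict number defined above is simple to compute for a given 2-partition. The sketch assumes an undirected graph as a dictionary of neighbour sets and describes the partition by one of its two clusters; `conflict_numbers` is a hypothetical name.

```python
def conflict_numbers(adj, cluster_a):
    """Map each vertex to (non-neighbours in its own cluster) + (neighbours in the other cluster)."""
    cluster_a = set(cluster_a)
    cluster_b = set(adj) - cluster_a
    result = {}
    for v in adj:
        same = cluster_a if v in cluster_a else cluster_b
        non_neighbours_same = len(same - adj[v] - {v})
        neighbours_other = len(adj[v] - same)
        result[v] = non_neighbours_same + neighbours_other
    return result


# Path 1-2-3 split into {1, 2} and {3}: only the edge 2-3 crosses the partition.
adj = {1: {2}, 2: {1, 3}, 3: {2}}
numbers = conflict_numbers(adj, {1, 2})
print(numbers)                 # {1: 0, 2: 1, 3: 1}
print(sum(numbers.values()))   # 2, twice the single edited edge
```

The total counts every edited edge from both endpoints, which is the factor of two relative to 2-Cluster Editing mentioned in the abstract.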
For a given graph and an integer t, the Min-Max 2-Clustering problem asks if there exists a modification of a given graph into two maximal disjoint cliques by inserting or deleting edges such that the number of the editing edges incident to each vertex is at most t. It has been shown that the problem can be solved in polynomial time for , where n is the number of vertices. In this paper, we design parameterized algorithms for different ranges of t. Let . We show that the problem is polynomial-time solvable when roughly . When , we design a randomized and a deterministic algorithm with sub-exponential time parameterized complexity, i.e., the problem is in SUBEPT. We also show that the problem can be solved in time for and in time for , where .
ISBN: 9783642403132; 9783642403125 (print)
Module Motif is a pattern matching problem that was introduced in the context of biological networks. Informally, given a multiset of colors P and a graph H whose nodes have sets of colors, it asks if P occurs in a module of H (i.e. in a set of nodes that have the same neighborhood outside the set). We present three parameterized algorithms for this problem that measure similarity between matched colors and handle deletions and insertions of colors to P. We observe that the running time of two of them might be essentially tight and prove that the problem is unlikely to admit a polynomial kernel.
ISBN: 9783031754081; 9783031754098 (print)
For a digraph G and ordering of G, the degreewidth of the ordering is the maximum number of backward arcs incident to any vertex of G. The degreewidth Δ(G) of G is defined as the minimum degreewidth of an ordering of G. A digraph G is semi-complete if every pair of vertices is connected by at least one arc, oriented if every pair of vertices is connected by at most one arc, and a tournament if every pair of vertices is connected by exactly one arc. We give a fixed-parameter tractable (FPT) algorithm, with running time Δ(G)^O(Δ(G)) · n + O(n^2), to compute the degreewidth of semi-complete digraphs. We then show that both the FEEDBACK ARC SET and CUTWIDTH problems on semi-complete digraphs admit algorithms with running time Δ(G)^O(Δ(G)) · n + O(n^2). Our algorithms resolve in the affirmative two open problems of Davot et al. [WG 2023], who asked whether there exists an FPT algorithm to compute the degreewidth of a tournament, and whether FEEDBACK ARC SET on tournaments admits an FPT algorithm when parameterized by the degreewidth of the input digraph. Additionally, extending an argument of Davot et al. [WG 2023], we show that sorting by in-degree is a 3-approximation algorithm for degreewidth on semi-complete digraphs. Finally, we prove that it is NP-hard to determine whether a given oriented digraph has degreewidth at most 2.
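The definition of degreewidth can be sketched directly. The code assumes an arc (u, v) is backward when u appears after v in the ordering, and it minimises over all orderings by brute force, so it is a definitional illustration rather than the FPT algorithm of the abstract. Function names are hypothetical.

```python
from itertools import permutations


def degreewidth_of_ordering(arcs, order):
    """Maximum number of backward arcs incident to any vertex under the given ordering."""
    position = {v: i for i, v in enumerate(order)}
    backward = {v: 0 for v in order}
    for u, v in arcs:
        if position[u] > position[v]:  # arc goes from a later vertex to an earlier one
            backward[u] += 1
            backward[v] += 1
    return max(backward.values(), default=0)


def degreewidth(vertices, arcs):
    """Minimum degreewidth over all orderings (exponential time, small inputs only)."""
    return min(degreewidth_of_ordering(arcs, order) for order in permutations(vertices))


cyclic = [(1, 2), (2, 3), (3, 1)]       # directed triangle, a tournament
transitive = [(1, 2), (1, 3), (2, 3)]   # acyclic tournament
print(degreewidth([1, 2, 3], cyclic))      # 1
print(degreewidth([1, 2, 3], transitive))  # 0
```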