Background: Reconstructing the genome of a species from short fragments is one of the oldest bioinformatics problems. Metagenomic assembly is a variant of the problem that asks to reconstruct the circular genomes of all bacterial species present in a sequencing sample. This problem can be naturally formulated as finding a collection of circular walks of a directed graph G that together cover all nodes, or edges, of G. Approach: We address this problem with the "safe and complete" framework of Tomescu and Medvedev (Research in Computational Molecular Biology, 20th Annual Conference, RECOMB, 9649: 152-163, 2016). An algorithm is called safe if it returns only those walks (also called safe) that appear as a subwalk in all metagenomic assembly solutions for G. A safe algorithm is called complete if it returns all safe walks of G. Results: We give graph-theoretic characterizations of the safe walks of G, and a safe and complete algorithm finding all safe walks of G. In the node-covering case, our algorithm runs in time O(m² + n³), and in the edge-covering case it runs in time O(m²n), where n and m denote the number of nodes and edges, respectively, of G. This algorithm constitutes the first theoretical tight upper bound on what can be safely assembled from metagenomic reads using this problem formulation.
Given a graph G = (V, ε), we want to find the vertex sets of the components of an unknown subgraph F = (V, E) of G such that E ⊆ ε. We learn about F by sending an oracle a query set S ⊆ V, and the oracle tells us the vertices connected to S in F. The objective is to use the minimum number of queries to partition the vertex set V into the components of F. In electronic circuit design, the problem is also known as structural diagnosis of wiring networks.
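The query model can be illustrated with a naive baseline. The sketch below assumes that the oracle returns every vertex lying in the same component of F as some vertex of S; the abstract does not spell this out, so that reading is an assumption. The names `find_components` and `make_oracle` are hypothetical, and the strategy (one singleton query per component) is only a baseline for comparison, not the query strategy of the paper.

```python
from typing import Callable, Hashable, Iterable, List, Set, Tuple

Vertex = Hashable
Oracle = Callable[[Set[Vertex]], Set[Vertex]]


def make_oracle(vertices: Iterable[Vertex],
                hidden_edges: Iterable[Tuple[Vertex, Vertex]]) -> Oracle:
    """Simulate the oracle for a hidden subgraph F (assumed semantics:
    return all vertices in the components of F that intersect S)."""
    adj = {v: set() for v in vertices}
    for a, b in hidden_edges:
        adj[a].add(b)
        adj[b].add(a)

    def oracle(query: Set[Vertex]) -> Set[Vertex]:
        seen = set(query)
        stack = list(query)
        while stack:  # depth-first search from every queried vertex
            u = stack.pop()
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    return oracle


def find_components(vertices: Iterable[Vertex],
                    oracle: Oracle) -> Tuple[List[Set[Vertex]], int]:
    """Baseline: query one still-unassigned vertex at a time.
    Uses exactly one query per component of F."""
    assigned: Set[Vertex] = set()
    components: List[Set[Vertex]] = []
    queries = 0
    for v in vertices:
        if v in assigned:
            continue
        reached = set(oracle({v})) | {v}
        queries += 1
        components.append(reached)
        assigned |= reached
    return components, queries


if __name__ == "__main__":
    V = [1, 2, 3, 4, 5, 6]
    oracle = make_oracle(V, [(1, 2), (2, 3), (5, 6)])
    print(find_components(V, oracle))
```

Note that this baseline never looks at the edges of G; a strategy aiming at fewer queries would presumably have to exploit them.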
As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a sentence-word two-layer graph algorithm combined with keyword density to generate multi-document summaries, known as graph & Keywordp. Traditional graph methods for multi-document summarization only consider the influence of sentences and words across all documents rather than within individual documents. Therefore, we construct multiple word graphs and extract the right keywords in each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because the importance of words differs between documents, we propose to use keyword density so that the summaries provide rich content while using a small number of words. The experimental results show that the graph & Keywordp method outperforms state-of-the-art systems when tested on the DUC2004 data set. Key words: multi-document, graph algorithm, keyword density, graph & Keywordp, DUC2004
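The abstract does not define keyword density. As a rough illustration only, the sketch below assumes it is the fraction of a sentence's tokens that are keywords, and picks sentences greedily under a word budget. The functions `keyword_density` and `select_sentences`, the tokenization, and the greedy selection are all assumptions made for this example; they are not the graph & Keywordp method itself, which also relies on sentence and word graphs.

```python
import re
from typing import List, Set


def keyword_density(sentence: str, keywords: Set[str]) -> float:
    """Assumed definition: keyword tokens divided by all tokens."""
    tokens = re.findall(r"[a-z0-9]+", sentence.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in keywords)
    return hits / len(tokens)


def select_sentences(sentences: List[str], keywords: Set[str],
                     word_budget: int) -> List[str]:
    """Greedily take the densest sentences that still fit the budget."""
    ranked = sorted(sentences,
                    key=lambda s: keyword_density(s, keywords),
                    reverse=True)
    chosen: List[str] = []
    used = 0
    for s in ranked:
        n_words = len(s.split())
        if used + n_words <= word_budget:
            chosen.append(s)
            used += n_words
    return chosen


if __name__ == "__main__":
    docs = [
        "Graph methods rank sentences.",
        "The weather was pleasant on that particular day.",
        "Keyword density matters.",
    ]
    kws = {"graph", "sentences", "keyword", "density"}
    print(select_sentences(docs, kws, word_budget=7))
```

Favoring dense sentences is one simple way to pack more keyword content into a fixed number of words, which is the motivation the abstract gives for using keyword density.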
We prove that maximum weight branchings in directed graphs can be approximated in time O(m) up to a factor of 1 − ε, where ε > 0 is an arbitrary constant. © 2008 Elsevier B.V. All rights reserved.
Let G = (V, E) be a directed graph with a distinguished source vertex s. The single-source path expression problem is to find, for each vertex v, a regular expression P(s, v) which represents the set of all paths in G...
Let G = (V, E) be a simple undirected graph with a set V of vertices and a set E of edges. Each vertex v ∈ V has a demand d(v) ∈ Z₊ and a cost c(v) ∈ R₊, where Z₊ and R₊ denote the set of nonnegative integers and the set of nonnegative reals, respectively. The source location problem with vertex-connectivity requirements in a given graph G asks to find a set S of vertices minimizing Σ_{v ∈ S} c(v) such that there are at least d(v) pairwise vertex-disjoint paths from S to v for each vertex v ∈ V − S. It is known that the problem is not approximable within a ratio of O(ln Σ_{v ∈ V} d(v)), unless NP has an O(N^(log log N))-time deterministic algorithm. It is also known that the problem is NP-hard even if every vertex has a uniform cost and d* = 4 holds, where d* = max{d(v) | v ∈ V}. In this paper, we consider the problem in the case where every vertex has uniform cost. We propose a simple greedy algorithm that provides a max{d*, 2d* − 6}-approximate solution to the problem in O(min{d*, √|V|} d* |V|²) time, and we also show that there exists an instance for which it provides no better than a (d* − 1)-approximate solution. In particular, in the case of d* ≤ 4, we give a tight analysis showing that it achieves an approximation ratio of 3. We also show the APX-hardness of the problem even when restricted to d* ≤ 4. © 2009 Elsevier B.V. All rights reserved.
Let G be a directed graph such that for each vertex v in G, the successors of v are ordered. Let C be any equivalence relation on the vertices of G. The congruence closure C* of C is the finest equivalence relation con...
This paper presents an algorithm for constructing a cactus representation for all minimum cuts in an undirected network. Our algorithm runs in O(nm + n² log n + γm log n) time, where n and m are the numbers of vertices and edges, respectively, and γ is the number of cycles in a cactus representation. This is one of the best deterministic time complexities for computing a cactus representation.
Dinic has shown that the classic maximum flow problem on a graph of n vertices and m edges can be reduced to a sequence of at most n − 1 so-called ‘blocking flow’ problems on acyclic graphs. For dense graphs, the best time bound known for the blocking flow problem is O(n²). Karzanov devised the first O(n²)-time blocking flow algorithm, which unfortunately is rather complicated. Later, Malhotra, Kumar and Maheshwari devised another O(n²)-time algorithm, which is conceptually very simple but has some other drawbacks. In this paper we propose a simplification of Karzanov's algorithm that is easier to implement than Malhotra, Kumar and Maheshwari's method.
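To make the reduction to blocking flows concrete, here is a minimal sketch of Dinic's scheme in Python. Each phase builds a level graph by breadth-first search and then finds a blocking flow in it. The blocking-flow step uses the common textbook depth-first search with per-vertex edge pointers, which takes O(nm) per phase; it is neither Karzanov's O(n²) algorithm nor the simplification proposed in the paper. The class name `Dinic` and its methods are choices made for this example.

```python
from collections import deque


class Dinic:
    def __init__(self, n: int):
        self.n = n
        self.graph = [[] for _ in range(n)]  # edge indices per vertex
        self.to = []    # head vertex of each edge
        self.cap = []   # residual capacity of each edge

    def add_edge(self, u: int, v: int, c: int) -> None:
        # Edges are stored in pairs: index e is forward, e ^ 1 is reverse.
        self.graph[u].append(len(self.to))
        self.to.append(v)
        self.cap.append(c)
        self.graph[v].append(len(self.to))
        self.to.append(u)
        self.cap.append(0)

    def _bfs(self, s: int, t: int) -> bool:
        """Build the level graph; return True if t is reachable."""
        self.level = [-1] * self.n
        self.level[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for e in self.graph[u]:
                v = self.to[e]
                if self.cap[e] > 0 and self.level[v] < 0:
                    self.level[v] = self.level[u] + 1
                    queue.append(v)
        return self.level[t] >= 0

    def _dfs(self, u: int, t: int, pushed):
        """Push flow along one path of the level graph."""
        if u == t:
            return pushed
        while self.it[u] < len(self.graph[u]):
            e = self.graph[u][self.it[u]]
            v = self.to[e]
            if self.cap[e] > 0 and self.level[v] == self.level[u] + 1:
                d = self._dfs(v, t, min(pushed, self.cap[e]))
                if d > 0:
                    self.cap[e] -= d
                    self.cap[e ^ 1] += d
                    return d
            self.it[u] += 1  # edge is exhausted for this phase
        return 0

    def max_flow(self, s: int, t: int) -> int:
        flow = 0
        while self._bfs(s, t):           # one phase per level graph
            self.it = [0] * self.n
            while True:                  # augment until the flow is blocking
                pushed = self._dfs(s, t, float("inf"))
                if pushed == 0:
                    break
                flow += pushed
        return flow


if __name__ == "__main__":
    g = Dinic(4)
    for u, v, c in [(0, 1, 3), (0, 2, 2), (1, 2, 1), (1, 3, 2), (2, 3, 3)]:
        g.add_edge(u, v, c)
    print(g.max_flow(0, 3))
```

The inner `while True` loop is the part that an O(n²) blocking-flow method would replace; the outer phase loop stays the same.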