Among matching techniques for observational studies, full matching is in principle the best, in the sense that its alignment of comparable treated and control subjects is as good as that of any alternate method, and p...
详细信息
Among matching techniques for observational studies, full matching is in principle the best, in the sense that its alignment of comparable treated and control subjects is as good as that of any alternate method, and potentially much better. This article evaluates the practical performance of full matching for the first time, modifying it in order to minimize variance as well as bias and then using it to compare coached and uncoached takers of the SAT. In this new version, with restrictions on the ratio of treated subjects to controls within matched sets, full matching makes use of many more observations than does pair matching, but achieves far closer matches than does matching with k greater than or equal to 2 controls. Prior to matching, the coached and uncoached groups are separated on the propensity score by 1.1 SDs. Full matching reduces this separation to 1% or 2% of an SD. In older literature comparing matching and regression, Cochran expressed doubts that ani method of adjustment could substantially reduce observed bias of this magnitude. To accommodate missing data, regression-based analyses by ETS researchers rejected a subset of the available sample that differed significantly from the subsample they analyzed. Full matching on the propensity score handles the same problem simply and without rejecting observations. In addition, it eases the detection and handling of nonconstancy of treatment effects, which the regression-based analyses had obscured, and it makes fuller use of covariate information. It estimates a somewhat larger effect of coaching on the math score than did ETS's methods.
In this paper, we study the generalized capacitated tree-routing problem (GCTR), which was introduced to unify the several known multicast problems in networks with edge/demand capacities. Let G = (V, E) be a connecte...
详细信息
In this paper, we study the generalized capacitated tree-routing problem (GCTR), which was introduced to unify the several known multicast problems in networks with edge/demand capacities. Let G = (V, E) be a connected underlying graph with a bulk edge capacity lambda > 0 and an edge weight w(e) >= 0, e is an element of E;we are allowed to construct a network on G by installing any edge capacity k(e)lambda with an integer ke >= 0 for each edge e is an element of E, where the resulting network costs Sigma e is an element of (E) k(e)w(e). Given a sink s is an element of V, a set M subset of V of terminals with a demand q(V) >= 0, V is an element of M and a demand capacity. > 0, we wish to construct the minimum cost network so that all the demands can be sent to s along a suitable collection T = {T1, T2,..., T p} of trees rooted at s, where the total demand collected by each tree Ti is bounded from above by., and the flow amount f (e) of T that goes through each edge e is bounded from above by the edge capacity ke lambda. In this paper, f (e) is defined as Sigma Ti is an element of T: e is an element of Ti-[ (alpha +) (beta qTi (e)]) for prescribed constants alpha, beta >= 0, where qTi (e) denotes the total demand that passes through the edge e along Ti. The term a means a fixed amount used to establish the routing Ti by separating the inside of Ti from the outside while the term alpha qTi (e) means the net capacity proportional to the demand qTi (e). The objective of GCTR is to construct a minimum cost network that admits a collection T of trees to send all demand to sink. It was left open to show whether GCTR with lambda< alpha+ beta k. is approximable by a constant factor or not. In this paper, we present a 13.037-approximation algorithm to GCTR for this case. (C) 2009 Elsevier B. V. All rights reserved.
We propose a polynomial algorithm computing a minimum plain-text representation of k-mer sets, as well as an efficient near-minimum greedy heuristic. When compressing read sets of large model organisms or bacterial pa...
详细信息
We propose a polynomial algorithm computing a minimum plain-text representation of k-mer sets, as well as an efficient near-minimum greedy heuristic. When compressing read sets of large model organisms or bacterial pangenomes, with only a minor runtime increase, we shrink the representation by up to 59% over unitigs and 26% over previous work. Additionally, the number of strings is decreased by up to 97% over unitigs and 90% over previous work. Finally, a small representation has advantages in downstream applications, as it speeds up SSHash-Lite queries by up to 4.26x over unitigs and 2.10x over previous work.
Protein structure prediction, when construed as a fold recognition problem, is one of the most important applications of similarity search in bioinformatics. A new protein-fold recognition method is reported which com...
详细信息
Protein structure prediction, when construed as a fold recognition problem, is one of the most important applications of similarity search in bioinformatics. A new protein-fold recognition method is reported which combines a single-source K diverse shortest path (SSKDSP) algorithm with Enrichment of Network Topological Similarity (ENTS) algorithm to search a graphic feature space generated using sequence similarity and structural similarity metrics. A modified, more efficient SSKDSP algorithm is developed to improve the performance of graph searching. The new implementation of the SSKDSP algorithm empirically requires 82% less memory and 61% less time than the current implementation, allowing for the analysis of larger, denser graphs. Furthermore, the statistical significance of fold ranking generated from SSKDSP is assessed using ENTS. The reported ENTS-SSKDSP algorithm outperforms original ENTS that uses random walk with restart for the graph search as well as other state-of-the-art protein structure prediction algorithms HHSearch and Sparks-X, as evaluated by a benchmark of 600 query proteins. The reported methods may easily be extended to other similarity search problems in bioinformatics and chemoinformatics. The SSKDSP software is available at . Proteins 2016;84:467-472. (c) 2016 Wiley Periodicals, Inc.
In this paper, we provide an algorithm for traversing geometric graphs which visits all vertices and reports every vertex and edge exactly once. To achieve this, we combine a given geometric graph G with the integer l...
详细信息
In this paper, we provide an algorithm for traversing geometric graphs which visits all vertices and reports every vertex and edge exactly once. To achieve this, we combine a given geometric graph G with the integer lattice, seen as a graph, in such a way that the resulting hypothetical graph can be traversed using a known algorithm which is based on face routing. To overcome the problem with hypothetical vertices and edges, we develop an algorithm for visiting any k-th neighborhood of a vertex in a graph straight-line drawn in the plane using O(k log k) memory. The memory needed to complete the traversal of a geometric graph then turns out to depend on the maximum graph distance among pairs of distinct vertices of G whose Euclidean distance is greater than one and less than 2 root 2.
Given a graph G and a demand function p: V(G) -> N, a proper n-[p]coloring is a mapping f : V(G) -> 2([1.....n]) such that vertical bar f (v)vertical bar >= p(v) for every vertex v epsilon V(G) and f(v) boole...
详细信息
Given a graph G and a demand function p: V(G) -> N, a proper n-[p]coloring is a mapping f : V(G) -> 2([1.....n]) such that vertical bar f (v)vertical bar >= p(v) for every vertex v epsilon V(G) and f(v) boolean AND f(u) = emptyset for any two adjacent vertices u and v. The least integer n for which a proper n-[p]coloring exists, chi(p)(G), is called multichrornatic number of G. Finding multichromatic number of induced subgraphs of the triangular lattice (called hexagonal graphs) has applications in cellular networks. Weighted clique number of a graph G, omega(p)(G), is the maximum weight of a clique in G, where the weight of a clique is the total demand of its vertices. McDiarmid and Reed (2000) [8] conjectured that chi(p)(G) <= (9/8)omega(p)(G) + C for triangle-free hexagonal graphs, where C is some absolute constant. In this article, we provide an algorithm to find a 7-[3]coloring of triangle-free hexagonal graphs (that is, when p(v) = 3 for all v epsilon V (G)), which implies that chi(p)(G) <= (7/6)omega(p)(G) + C. Our result constitutes a shorter alternative to the inductive proof of Havet (2001) [5] and improves the short proof of Sudeep and Vishwanathan (2005) [17], who proved the existence of a 14-[6]coloring. (It has to be noted, however, that our proof makes use of the 4-color theorem.) All steps of our algorithm take time linear in vertical bar V(G)vertical bar, except for the 4-coloring of an auxiliary planar graph. The new techniques may shed some light on the conjecture of McDiarmid and Reed (2000) [8]. (C) 2011 Elsevier B.V. All rights reserved.
The Clar number of a (hydro)carbon molecule, introduced by Clar (The aromatic sextet, 1972), is the maximum number of mutually disjoint resonant hexagons in the molecule. Calculating the Clar number can be formulated ...
详细信息
The Clar number of a (hydro)carbon molecule, introduced by Clar (The aromatic sextet, 1972), is the maximum number of mutually disjoint resonant hexagons in the molecule. Calculating the Clar number can be formulated as an optimization problem on 2-connected plane graphs. Namely, it is the maximum number of mutually disjoint even faces a perfect matching can simultaneously alternate on. It was proved by Abeledo and Atkinson (Linear Algebra Appl 420(2):441-448, 2007) that the Clar number can be computed in polynomial time if the plane graph has even faces only. We prove that calculating the Clar number in general 2-connected plane graphs is -hard. We also prove -hardness of the maximum independent set problem for 2-connected plane graphs with odd faces only, which may be of independent interest. Finally, we give an exact algorithm that determines the Clar number of a given 2-connected plane graph. The algorithm has a polynomial running time if the length of the shortest odd join in the planar dual graph is fixed, which gives an efficient algorithm for some fullerene classes, such as carbon nanotubes.
The minimum spanning tree problem consists of determining for a connected and undirected graph a spanning tree of minimal total edge cost. The sensitivity analysis problem involves a determination of the extent to wh...
详细信息
The minimum spanning tree problem consists of determining for a connected and undirected graph a spanning tree of minimal total edge cost. The sensitivity analysis problem involves a determination of the extent to which each edge cost can be perturbed individually without upsetting the minimality of the subgraph. Sensitivity analysis also can be applied to the shortest path tree problem. This problem requires computing for a given directed graph and a given vertex a spanning tree rooted at the vertex and containing a minimum-cost path from the vertex to every other vertex. algorithms are presented for performing sensitivity analysis of minimum spanning trees and shortest path trees. The algorithms involve the use of an auxiliary graph called a transmuter. Figures.
Steiner connected dominating set (SCDS) is a generalization of the famous connected dominating set problem, where only a specified set of required vertices has to be dominated by a connected dominating set, and know...
详细信息
Steiner connected dominating set (SCDS) is a generalization of the famous connected dominating set problem, where only a specified set of required vertices has to be dominated by a connected dominating set, and known to be NP- hard. This paper firstly modifies the SCDS algorithm of Guha and Khuller and achieves a worst case approximation ratio of (2 + 1/(m - 1))H(min(△, k)) +O(1), which outperforms the previous best result (c + 1)H(min(△, k)) + O(1) in the case of m ≥ 1 +1/(c - 1), where c is the best approximation ratio for Steiner tree, A is the maximum degree of the graph, k is the cardinality of the set of required vertices, m is an optional integer satisfying 0 ≤ m ≤ min(△, k) and H is the harmonic function. This paper also proposes another approximation algorithm which is based on a greedy approach. The second algorithm can establish a worst case approximation ratio of 2 ln(min(△, k)) + O(1), which can also be improved to 2 lnk if the optimal solution is greater than c·e^2c+1/2(c+1).
暂无评论