Matching is one of the simplest approaches for estimating causal effects from observational data. Matching techniques compare the observed outcomes across pairs of individuals with similar covariate values but different treatment statuses in order to estimate causal effects. However, traditional matching techniques are unreliable given high-dimensional covariates due to the infamous curse of dimensionality. To overcome this challenge, we propose a simple, fast, yet highly effective approach to matching using Random Hyperplane Tessellations (RHPT). First, we prove that the RHPT representation is an approximate balancing score - thus maintaining the strong ignorability assumption - and provide empirical evidence for this claim. Second, we report results of extensive experiments showing that matching using RHPT outperforms traditional matching techniques and is competitive with state-of-the-art deep learning methods for causal effect estimation. In addition, RHPT avoids the need for computationally expensive training of deep neural networks.
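The matching idea above can be illustrated with a minimal sketch. The abstract does not specify how the hyperplanes are drawn or which matching rule is used, so everything below is an assumption made for illustration: Gaussian normals and offsets, one bit per hyperplane, and 1-nearest-neighbour matching in Hamming distance. The names `rhpt_codes`, `hamming`, and `matched_att` are hypothetical.

```python
import random


def rhpt_codes(X, num_planes, seed=0):
    """Map each covariate vector to a tuple of 0/1 bits, one per random hyperplane.

    Assumption: each hyperplane has a standard Gaussian normal vector and a
    standard Gaussian offset; the paper may use a different distribution.
    """
    rng = random.Random(seed)
    dim = len(X[0])
    planes = [([rng.gauss(0, 1) for _ in range(dim)], rng.gauss(0, 1))
              for _ in range(num_planes)]
    return [
        tuple(int(sum(w * x for w, x in zip(normal, row)) + offset >= 0)
              for normal, offset in planes)
        for row in X
    ]


def hamming(a, b):
    """Number of hyperplanes that separate the two coded points."""
    return sum(u != v for u, v in zip(a, b))


def matched_att(X, treated, y, num_planes=16, seed=0):
    """Average effect on the treated via nearest-neighbour matching on codes."""
    codes = rhpt_codes(X, num_planes, seed)
    controls = [i for i, t in enumerate(treated) if not t]
    diffs = []
    for i, t in enumerate(treated):
        if t:
            # Closest control unit in Hamming distance between codes.
            j = min(controls, key=lambda c: hamming(codes[i], codes[c]))
            diffs.append(y[i] - y[j])
    return sum(diffs) / len(diffs)
```

Two units with identical covariates always receive identical codes, so an exact covariate twin is at Hamming distance zero.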
Computational geometry algorithms branch on the signs of predicates. Prior predicate evaluation techniques are slow on degenerate (zero sign) predicates, especially on predicates on algebraic numbers. Degeneracy is co...
ISBN (print): 9798331516758; 9798331516741
Linear-probing hash tables have been classically believed to support insertions in time Θ(x^2), where 1 - 1/x is the load factor of the hash table. Recent work by Bender, Kuszmaul, and Kuszmaul (FOCS'21), however, has added a new twist to this story: in some versions of linear probing, if the maximum load factor is at most 1 - 1/x, then the amortized expected time per insertion will never exceed x polylog x (even in workloads that operate continuously at a load factor of 1 - 1/x). Determining the exact asymptotic value for the amortized insertion time remains open. In this paper, we settle the amortized complexity with matching upper and lower bounds of Θ(x log^{1.5} x). Along the way, we also obtain tight bounds for the so-called path surplus problem, a problem in combinatorial geometry that has been shown to be closely related to linear probing. We also show how to extend Bender et al.'s bounds to say something not just about ordered linear probing (the version they study) but also about classical linear probing, in the form that is most widely implemented in practice.
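For readers unfamiliar with the data structure, here is a minimal sketch of insertion under classical linear probing, returning the number of slots examined. It is not the ordered variant analysed by Bender et al., and it does not reproduce any bound from the paper. The function name `lp_insert` and the use of `key % n` as a stand-in hash for integer keys are assumptions made for illustration.

```python
def lp_insert(table, key):
    """Insert an integer key with classical linear probing.

    Returns the number of slots probed. `key % len(table)` is a stand-in
    hash function (an assumption, not part of the paper).
    """
    n = len(table)
    pos = key % n
    probes = 1
    while table[pos] is not None:
        if table[pos] == key:
            return probes  # key already present
        if probes == n:
            raise RuntimeError("table is full")
        pos = (pos + 1) % n  # step to the next slot, wrapping around
        probes += 1
    table[pos] = key
    return probes
```

With a table of 8 slots, inserting 0, 8, and 16 in that order probes 1, 2, and 3 slots respectively, since all three keys hash to slot 0. A table at load factor 1 - 1/x has a 1/x fraction of its slots free.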
ISBN (print): 9798350386066; 9798350386059
In the well-known Minimum Linear Arrangement problem (MinLA), the goal is to arrange the nodes of an undirected graph into a permutation so that the total stretch of the edges is minimized. This paper studies an online variant of MinLA where the graph is not given at the beginning, but rather revealed piece-by-piece. The algorithm starts in a fixed initial permutation, and after a piece of the graph is revealed, the algorithm must update its current permutation to be a MinLA of the subgraph revealed so far. The objective is to minimize the total number of swaps of adjacent nodes as the algorithm updates the permutation. The main result of this paper is an online randomized algorithm that solves the online MinLA problem for the restricted cases where the graph is either a collection of cliques or a collection of lines. We show that the algorithm is 8 ln n-competitive, where n is the number of nodes of the graph. We complement this result by constructing a lower bound of Omega(ln n) for competitiveness of any online algorithm, concluding that our randomized algorithm is asymptotically optimal.
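The two quantities in this problem statement can be made concrete with a short sketch: the total stretch of an arrangement, and the minimum number of adjacent swaps needed to turn one permutation into another (which equals the number of pairs the two permutations order differently). The function names are hypothetical, and the sketch does not implement the paper's online algorithm.

```python
def total_stretch(perm, edges):
    """Sum over edges of the distance between endpoint positions in perm."""
    pos = {v: i for i, v in enumerate(perm)}
    return sum(abs(pos[u] - pos[v]) for u, v in edges)


def adjacent_swap_distance(src, dst):
    """Minimum number of adjacent swaps turning src into dst.

    This is the number of inversions of src when its nodes are relabelled
    by their positions in dst (quadratic time, for clarity).
    """
    pos = {v: i for i, v in enumerate(dst)}
    seq = [pos[v] for v in src]
    return sum(1
               for i in range(len(seq))
               for j in range(i + 1, len(seq))
               if seq[i] > seq[j])
```

For example, with the single edge (a, c), the arrangement a, b, c has stretch 2 while a, c, b has stretch 1, and moving from the first to the second costs one adjacent swap.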
ISBN (print): 9798400704901
Finding dense subnetworks, with density based on edges or more complex structures, such as subgraphs or k-cliques, is a fundamental algorithmic problem with many applications. While the problem has been studied extensively in static networks, much remains to be explored for temporal networks. In this work we introduce the novel problem of identifying the temporal motif densest subnetwork, i.e., the densest subnetwork with respect to temporal motifs, which are high-order patterns characterizing temporal networks. Identifying temporal motifs is an extremely challenging task, and thus, efficient methods are required. To address this challenge, we design two novel randomized approximation algorithms with rigorous probabilistic guarantees that provide high-quality solutions. We perform extensive experiments showing that our methods outperform baselines. Furthermore, our algorithms scale on networks with up to billions of temporal edges, while baselines cannot handle such large networks. We use our techniques to analyze a financial network and show that our formulation reveals important network structures, such as bursty temporal events and communities of users with similar interests.
Backoff algorithms are used in many distributed systems where multiple devices contend for a shared resource. For the classic balls-into-bins problem, the number of singletons (those bins with a single ball) is important to the analysis of several backoff algorithms; however, existing analyses employ advanced probabilistic tools. Here, we show that standard Chernoff bounds can be used instead, and the simplicity of this approach is illustrated by re-analyzing some well-known backoff algorithms. (C) 2022 Elsevier B.V. All rights reserved.
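The singleton count is easy to simulate. The sketch below throws balls into bins uniformly at random and counts bins holding exactly one ball, alongside the standard expectation n(1 - 1/m)^(n-1) for n balls and m bins. It illustrates the quantity only, not the paper's Chernoff-bound argument; the function names are hypothetical.

```python
import random
from collections import Counter


def count_singletons(num_balls, num_bins, rng):
    """Throw balls uniformly at random; count bins with exactly one ball."""
    loads = Counter(rng.randrange(num_bins) for _ in range(num_balls))
    return sum(1 for load in loads.values() if load == 1)


def expected_singletons(num_balls, num_bins):
    """Each ball is alone in its bin with probability (1 - 1/m)^(n-1)."""
    return num_balls * (1 - 1 / num_bins) ** (num_balls - 1)
```

In a backoff setting, bins play the role of time slots and a singleton corresponds to a slot in which exactly one device transmits.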
In this article, we consider a network of processors aiming at cooperatively solving mixed-integer convex programs subject to uncertainty. Each node only knows a common cost function and its local uncertain constraint set. We propose a randomized, distributed algorithm working under asynchronous, unreliable, and directed communication. The algorithm is based on a local computation and communication paradigm. At each communication round, nodes perform two updates: 1) a verification step in which they check, in a randomized fashion, the robust feasibility of a candidate optimal point, and 2) an optimization step in which they exchange their candidate basis (the minimal set of constraints defining a solution) with neighbors and locally solve an optimization problem. As a main result, we show that processors can stop the algorithm after a finite number of communication rounds (either because verification has been successful for a sufficient number of rounds or because a given threshold has been reached) so that candidate optimal solutions are consensual. The common solution is proven to be, with high confidence, feasible and, hence, optimal for the entire set of uncertainty except a subset having an arbitrarily small probability measure. We show the effectiveness of the proposed distributed algorithm using two examples: a random, uncertain mixed-integer linear program and a distributed localization in wireless sensor networks. The distributed algorithm is implemented on a multicore platform in which the nodes communicate asynchronously.
ISBN (digital): 9783030954703
ISBN (print): 9783030954703; 9783030954697
Standard tools to update approximations to a matrix A (for example, Quasi-Newton Hessian approximations in optimization) incorporate computationally expensive one-sided samples AV. This article develops randomized algorithms to efficiently approximate A by iteratively incorporating cheaper two-sided samples U^T AV. Theoretical convergence rates are proved and realized in numerical experiments. A heuristic accelerated variant is developed and shown to be competitive with existing methods based on one-sided samples.
ISBN (print): 9798331516758; 9798331516741
We study comparison sorting in the evolving data model, introduced by Anagnostopoulos, Kumar, Mahdian and Upfal (2011), where the true total order changes while the sorting algorithm is processing the input. More precisely, each comparison operation of the algorithm is followed by a sequence of evolution steps, where an evolution step perturbs the rank of a random item by a "small" random value. The goal is to maintain an ordering that remains close to the true order over time. Previous works have analyzed adaptations of classic sorting algorithms, assuming that an evolution step changes the rank of an item by just one, and that a fixed constant number b of evolution steps take place between two comparisons. In fact, the only previous result achieving optimal linear total deviation, by Besa Vial, Devanny, Eppstein, Goodrich and Johnson (2018a), applies just for b = 1. We analyze a very simple sorting algorithm suggested by Mahdian (2014), which samples a random pair of adjacent items in each step and swaps them if they are out of order. We show that the algorithm achieves and maintains, with high probability, optimal total deviation, O(n), and optimal maximum deviation, O(log n), under very general model settings. Namely, the perturbation introduced by each evolution step is sampled from a general distribution of bounded moment generating function, and we just require that the average number of evolution steps between two sorting steps be bounded by an (arbitrary) constant, where the average is over a linear number of steps. The key ingredients of our proof are a novel potential function argument that inserts "gaps" in the list of items, and a general analysis framework which separates the analysis of sorting from that of the evolution steps, and is applicable to a variety of settings for which previous approaches do not apply. Our results settle conjectures and open problems in the three aforementioned works, and provide theoretical support that simple quadratic al
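The sorting step described in this abstract (sample a random pair of adjacent items and swap them if they are out of order) can be sketched in a few lines. The sketch deliberately omits the evolution steps, so it shows only the static behaviour of the sorting rule. It also assumes the items are the distinct ranks 0..n-1 so that deviation can be measured directly. The function names are hypothetical.

```python
import random


def random_adjacent_step(items, rng):
    """Pick a random adjacent pair; swap it if out of order.

    Returns True if a swap happened.
    """
    if len(items) < 2:
        return False
    i = rng.randrange(len(items) - 1)
    if items[i] > items[i + 1]:
        items[i], items[i + 1] = items[i + 1], items[i]
        return True
    return False


def total_deviation(items):
    """Sum of |position - rank|, assuming items are the ranks 0..n-1."""
    return sum(abs(pos - rank) for pos, rank in enumerate(items))
```

Each successful swap removes exactly one inversion, so without evolution steps the list eventually becomes sorted and the total deviation reaches zero.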
ISBN (print): 9783031710322; 9783031710339
We study the classical, randomized RANKING algorithm, which is known to be (1 - 1/e)-competitive in expectation for the Online Bipartite Matching Problem. We give a tail inequality bound (Theorem 1), namely that RANKING is (1 - 1/e - alpha)-competitive with probability at least 1 - e^(-2 alpha^2 n), where n is the size of the maximum matching in the instance. Building on this, we show similar concentration results for several generalizations of the Online Bipartite Matching Problem, including the Fully Online Matching Problem and the Online Vertex-Weighted Bipartite Matching Problem.
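As a reminder of how RANKING operates, the sketch below draws one uniformly random ranking of the offline vertices up front and then matches each arriving online vertex to its unmatched neighbour of lowest rank. The input representation (a list of offline ids and a list of `(online_vertex, neighbours)` pairs in arrival order) and the function name `ranking` are assumptions chosen for illustration.

```python
import random


def ranking(offline, arrivals, rng):
    """Run RANKING once and return a dict {offline_vertex: online_vertex}.

    offline:  list of offline vertex ids
    arrivals: list of (online_vertex, neighbours) in arrival order
    """
    order = list(offline)
    rng.shuffle(order)  # one random permutation, fixed before any arrival
    rank = {v: r for r, v in enumerate(order)}
    matched = {}
    for u, neighbours in arrivals:
        free = [v for v in neighbours if v not in matched]
        if free:
            # Match u to its unmatched neighbour of lowest rank.
            matched[min(free, key=rank.__getitem__)] = u
    return matched
```

On the instance with offline vertices a and b, where online vertex 1 is adjacent to both and online vertex 2 only to a, the output has size 2 when b is ranked ahead of a and size 1 otherwise.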