We propose a new optimization-based method for learning causal structures from observational data, a process known as causal discovery. Our method takes as input observational data over a set of variables and returns ...
详细信息
We propose a new optimization-based method for learning causal structures from observational data, a process known as causal discovery. Our method takes as input observational data over a set of variables and returns a graph in which causal relations are specified by directed edges. We consider a highly general search space that accommodates latent confounders and feedback cycles, which few extant methods do. We formulate the discovery problem as an integer program, and propose a solution technique that exploits the conditional independence structure in the data to identify promising edges for inclusion in the output graph. In the large-sample limit, our method recovers a graph that is (Markov) equivalent to the true data-generating graph. Computationally, our method is competitive with the state-ofthe-art, and can solve in minutes instances that are intractable for alternative causal discovery methods. We leverage our method to develop a procedure for investigating the validity of an instrumental variable and demonstrate it on the influential quarter-of-birth and proximity-tocollege instruments for estimating the returns to education. In particular, our procedure complements existing instrument tests by revealing the precise causal pathways that undermine instrument validity, highlighting the unique merits of the graphical perspective on causality.
In most Australian cities, container ports are located close to the city, with transportation to and from the port facilitated by trucks. Recently, with a view to reducing container-truck induced city congestion and p...
详细信息
In most Australian cities, container ports are located close to the city, with transportation to and from the port facilitated by trucks. Recently, with a view to reducing container-truck induced city congestion and pollution, state and federal governments have begun championing a modal switch to short-haul rail for these transportation tasks. In this paper, we describe a metropolitan container transportation problem arising from this context that seeks to effectively leverage both modes of transport from a least-cost perspective. We propose a mathematical programming formulation and develop a new modified Benders decomposition method for the problem. We show that the simultaneous Magnanti-Wong method finds Pareto-optimal cuts by solving an augmented version of the subproblem that exploits subproblem dual-degeneracy without destroying its underlying structure. Computational results demonstrate the effectiveness of this routine over the performance of commercial solver implementations of the mathematical programming formulation.
The application of decision diagrams in combinatorial optimization has proliferated in the last decade. In recent years, authors have begun to investigate how to use not one, but a set of diagrams, to model constraint...
详细信息
The application of decision diagrams in combinatorial optimization has proliferated in the last decade. In recent years, authors have begun to investigate how to use not one, but a set of diagrams, to model constraints and objective function terms. Optimizing over a collection of decision diagrams, the problem we refer to as the consistent path problem (CPP) can be addressed by associating a network-flow model with each decision diagram, jointly linked through channeling constraints. A direct application of integerprogramming to the ensuing model has already been shown to result in algorithms that provide orders-of-magnitude performance gains over classical methods. Lacking, however, is a careful study of dedicated solution methods designed to solve the CPP. This paper provides a detailed study of the CPP, including a discussion on complexity results and a complete polyhedral analysis. We propose a cut-generation algorithm, which, under a structured ordering property, finds a cut, if one exists, through an application of the classical maximum flow problem, albeit in an exponentially sized network. We use this procedure to fuel a cutting-plane algorithm that is applied to unconstrained binary cubic optimization and a variant of the market split problem, resulting in an algorithm that compares favorably with CPLEX, using standard integerprogramming formulations for both problems.
A graph is chordal if every cycle with at least four edges contains a chord-that is, an edge connecting two nonconsecutive vertices of the cycle. Several classical applications in sparse linear systems, database manag...
详细信息
A graph is chordal if every cycle with at least four edges contains a chord-that is, an edge connecting two nonconsecutive vertices of the cycle. Several classical applications in sparse linear systems, database management, computer vision, and semidefinite programming can be reduced to finding the minimum number of edges to add to a graph so that it becomes chordal, known as the minimum chordal completion problem (MCCP). We propose a new formulation for the MCCP that does not rely on finding perfect elimination orderings of the graph, as has been considered in previous work. We introduce several families of facet-defining inequalities for cycle subgraphs and investigate the underlying separation problems, showing that some key inequalities are NP-hard to separate. We also identify conditions through which facets and inequalities associated with the polytope of a certain graph can be adapted in order to become facet defining for some of its subgraphs or supergraphs. Numerical studies combining heuristic separation methods and lazy-constraint generation indicate that our approach substantially outperforms existing methods for the MCCP.
Cancer is a leading cause of death worldwide, and advanced cancer is often treated with combinations of multiple chemotherapy drugs. In this work, we develop models to predict the outcomes of clinical trials testing c...
详细信息
Cancer is a leading cause of death worldwide, and advanced cancer is often treated with combinations of multiple chemotherapy drugs. In this work, we develop models to predict the outcomes of clinical trials testing combination chemotherapy regimens before they are run and to select the combination chemotherapy regimens to be tested in new phase II and phase III clinical trials, with the primary objective of improving the quality of regimens tested in phase III trials compared to current practice. We built a database of 414 clinical trials for advanced gastric cancer and use it to build statistical models that attain an out-of-sample R-2 of 0.56 when predicting a trial's median overall survival (OS) and an out-of-sample area under the curve (AUC) of 0.83 when predicting if a trial has unacceptably high toxicity. We propose models that use machine learning and optimization to suggest regimens to be tested in phase II and phase III trials. Though it is inherently challenging to evaluate the performance of such models without actually running clinical trials, we use two techniques to obtain estimates for the quality of regimens selected by our models compared with those actually tested in current clinical practice. Both techniques indicate that the models might improve the efficacy of the regimens selected for testing in phase III clinical trials without changing toxicity outcomes. This evaluation of the proposed models suggests that they merit further testing in a clinical trial setting.
Replication is a widely-used technique in information retrieval and database systems for providing fault tolerance and reducing parallelization and processing costs. Combinatorial models based on hypergraph partitioni...
详细信息
Replication is a widely-used technique in information retrieval and database systems for providing fault tolerance and reducing parallelization and processing costs. Combinatorial models based on hypergraph partitioning are proposed for various problems arising in information retrieval and database systems. We consider the possibility of using vertex replication to improve the quality of hypergraph partitioning. In this study, we focus on the constrained min-cut replication (CMCR) problem, where we are initially given a maximum replication capacity and a K-way hypergraph partition with an initial imbalance ratio. The objective in the CMCR problem is finding the optimal vertex replication sets for each part of the given partition such that the initial cut size of the partition is minimized, where the initial imbalance is either preserved or reduced under the given replication capacity constraint. In this study, we present a complexity analysis of the CMCR problem and propose a model based on a unique blend of coarsening and integer linear programming (ILP) schemes. This coarsening algorithm is derived from a novel utilization of the Dulmage-Mendelsohn decomposition. Experiments show that the ILP formulation coupled with the Dulmage-Mendelsohn decomposition-based coarsening provides high quality results in practical execution times for reducing the cut size of a given K-way hypergraph partition.
Motivated by addressing probabilistic 0-1 programs we study the conic quadratic knapsack polytope with generalized upper bound (GUB) constraints. In particular, we investigate separating and extending GUB cover inequa...
详细信息
Motivated by addressing probabilistic 0-1 programs we study the conic quadratic knapsack polytope with generalized upper bound (GUB) constraints. In particular, we investigate separating and extending GUB cover inequalities. We show that, unlike in the linear case, determining whether a cover can be extended with a single variable NP-hard. We describe and compare a number of exact and heuristic separation and extension algorithms which make use of the structure of the constraints. Computational experiments are performed for comparing the proposed separation and extension algorithms. These experiments show that a judicious application of the extended GUB cover cuts can reduce the solution time of conic quadratic 0-1 programs with GUB constraints substantially.
Most studies in airline operations planning research are focused on the optimization problems that deal with a daily flight schedule, which is considered to be the same for every day in the week. While the weekly sche...
详细信息
Most studies in airline operations planning research are focused on the optimization problems that deal with a daily flight schedule, which is considered to be the same for every day in the week. While the weekly schedule is more realistic and practical, it increases the complexity of the optimization problems drastically. In this paper, we present a novel weekly rotation-tour network representation for the weekly aircraft maintenance routing problem (WAMRP). Based on this representation, we propose a new network-based mixed-integer linear programming (LP) formulation for the WAMRP;namely, weekly rotation-tour network model (WRTNM). The main advantage of this formulation is that the size of WRTNM only increases linearly with the size of the weekly schedule, and it provides a very tight LP relaxation. In addition, because of the tight LP relaxation, we develop a diving heuristic to solve WRTNM efficiently and effectively. To assess the performance of WRTNM, we tested the WRTNM using eight real-life test cases. The computational results show that the proposed model is very compact and scalable, and is able to find the optimal solutions to the schedule with 5,700 flights and 330 aircraft, approximately the size of the world's largest airlines fleet, Within five minutes. We also propose an integrated model to solve the WAMRP with the weekly fleet assignment problem simultaneously We tested the integrated model on nine self-constructed test cases. The computational results show that the integrated model generates near-optimal solutions to the schedules with 1,700 flights, 8 fleets with 110 aircraft, and approximately a medium-sized airline, in a reasonable time.
We describe a multistage, stochastic, mixed-integerprogramming model for planning capacity expansion of production facilities. A scenario tree represents uncertainty in the model;a general mixed-integer program defin...
详细信息
We describe a multistage, stochastic, mixed-integerprogramming model for planning capacity expansion of production facilities. A scenario tree represents uncertainty in the model;a general mixed-integer program defines the operational submodel at each scenario-tree node, and capacity-expansion decisions link the stages. We apply "variable splitting" to two model variants, and solve those variants using Dantzig-Wolfe decomposition. The Dantzig-Wolfe master problem can have a much stronger linear programming relaxation than is possible without variable splitting, over 700% stronger in one case. The master problem solves easily and tends to yield integer solutions, obviating the need for a full branch-and-price solution procedure. For each scenario-tree node, the decomposition defines a subproblem that may be viewed as a single-period, deterministic, capacity-planning problem. An effective solution procedure results as long as the subproblems solve efficiently, and the procedure incorporates a good "duals stabilization method." We present computational results for a model to plan the capacity expansion of an electricity distribution network in New Zealand, given uncertain future demand. The largest problem we solve to optimality has six stages and 243 scenarios, and corresponds to a deterministic equivalent with a quarter of a million binary variables.
We develop a robust optimization framework for dynamic empty repositioning problems modeled using time-space networks. In such problems, uncertainty arises primarily from forecasts of future supplies and demands for a...
详细信息
We develop a robust optimization framework for dynamic empty repositioning problems modeled using time-space networks. In such problems, uncertainty arises primarily from forecasts of future supplies and demands for assets at different time epochs. The proposed approach models such uncertainty using intervals about nominal forecast values and a limit on the systemwide scaled deviation from the nominal forecast values. A robust repositioning plan is defined as one in which the typical flow balance constraints and flow bounds are satisfied for the nominal forecast values, and the plan is recoverable under a limited set of recovery actions. A plan is recoverable when feasibility can be reestablished for any outcome in a defined uncertainty set. We develop necessary and sufficient conditions for flows to be robust under this definition for three types of allowable recovery actions. When recovery actions allow only flow changes on inventory arcs, we show that the resulting problem is polynomially solvable. When recovery actions allow limited reactive repositioning flows, we develop feasibility conditions that are independent of the size of the uncertainty set. A computational study establishes the practical viability of the proposed framework.
暂无评论