The minimum sum-of-squares clustering (MSSC), or k-means type clustering, is traditionally considered an unsupervised learning task. In recent years, the use of background knowledge to improve cluster quality and promote interpretability of the clustering process has become a hot research topic at the intersection of mathematical optimization and machine learning. The problem of taking advantage of background information in data clustering is called semi-supervised or constrained clustering. In this paper, we present a branch-and-cut algorithm for semi-supervised MSSC, where background knowledge is incorporated as pairwise must-link and cannot-link constraints. For the lower bound, we solve the semidefinite programming relaxation of the MSSC discrete optimization model and use a cutting-plane procedure to strengthen the bound. For the upper bound, we use an adaptation of the k-means algorithm to the constrained case, based on integer programming tools. For the first time, the proposed global optimization algorithm efficiently solves real-world instances with up to 800 data points, with different combinations of must-link and cannot-link constraints and a generic number of features. This problem size is about four times larger than that of the instances solved by state-of-the-art exact algorithms.
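To make the objects in this abstract concrete, the following minimal sketch evaluates the MSSC objective for a given assignment and checks pairwise must-link and cannot-link constraints. It is not the branch-and-cut algorithm described above; the function names `mssc_objective` and `satisfies_constraints` and the pair-list encoding of the constraints are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def mssc_objective(points: Sequence[Sequence[float]], labels: Sequence[int], k: int) -> float:
    """Sum of squared Euclidean distances from each point to its cluster centroid."""
    dim = len(points[0])
    sums: List[List[float]] = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point, cluster in zip(points, labels):
        counts[cluster] += 1
        for j in range(dim):
            sums[cluster][j] += point[j]
    # Centroid of each non-empty cluster (empty clusters contribute nothing).
    centroids = [
        [s / counts[c] for s in sums[c]] if counts[c] else [0.0] * dim
        for c in range(k)
    ]
    return sum(
        (point[j] - centroids[cluster][j]) ** 2
        for point, cluster in zip(points, labels)
        for j in range(dim)
    )


def satisfies_constraints(labels: Sequence[int], must_link: Sequence[Pair], cannot_link: Sequence[Pair]) -> bool:
    """True if every must-link pair shares a cluster and every cannot-link pair does not."""
    same = all(labels[i] == labels[j] for i, j in must_link)
    different = all(labels[i] != labels[j] for i, j in cannot_link)
    return same and different


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    assignment = [0, 0, 1, 1]
    print(mssc_objective(pts, assignment, k=2))
    print(satisfies_constraints(assignment, must_link=[(0, 1)], cannot_link=[(0, 2)]))
```

An exact method has to find, among all assignments that pass `satisfies_constraints`, one that minimizes `mssc_objective`; the sketch only scores a candidate.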
We consider the collaborative use of amplify-and-forward relays to form a beamforming system and provide physical layer security for a wireless machine-to-machine (M2M) communication network. We investigate two objectives: (i) maximizing the achievable secrecy rate subject to the relay power constraint and (ii) minimizing the relay transmit power under a secrecy rate constraint. For the first objective, we propose a secrecy rate maximization (SRM) beamforming scheme. The secrecy rate maximization problem can be cast as a two-level optimization problem, which we solve using semidefinite relaxation (SDR) techniques. To reduce the complexity of the SRM beamforming scheme, we propose a virtual eavesdropper-based SRM (VE-SRM) beamforming scheme, in which a single hypothetical virtual eavesdropper stands in for all eavesdroppers and the secrecy rate is maximized with respect to it. For the second objective, we design a relay power minimization (RPM) beamforming scheme, in which an iterative algorithm combining the SDR technique and a gradient-based method is devised by studying the convexity of the RPM problem. By relaxing the constraints of the RPM beamforming scheme, we propose a virtual eavesdropper-based RPM (VE-RPM) beamforming scheme, which reduces the multivariate RPM problem to a single-variable problem and thus admits an analytical solution. Our proposed beamforming designs work well even if the number of eavesdroppers is larger than the number of relays, whereas existing schemes, for example the null-space beamforming schemes, cannot work under this condition. Simulation results demonstrate the significant performance improvements of the SRM and RPM beamforming schemes. They also show that the virtual eavesdropper approaches significantly reduce the complexity with acceptable performance degradation.
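As a rough illustration of the quantity being optimized, the sketch below computes a secrecy rate from received SNRs using the common textbook definition: the destination rate minus the best eavesdropper rate, floored at zero. This definition, the linear-scale SNR inputs, and the name `secrecy_rate` are assumptions; the abstract does not state the exact expression the paper uses (for example, any pre-log factor for two-hop relaying is omitted here).

```python
import math
from typing import Iterable


def secrecy_rate(snr_destination: float, snr_eavesdroppers: Iterable[float]) -> float:
    """Secrecy rate in bit/s/Hz under the assumed textbook definition.

    SNRs are linear (not dB). The strongest eavesdropper determines the leak.
    """
    rate_destination = math.log2(1.0 + snr_destination)
    rate_leak = max(math.log2(1.0 + snr) for snr in snr_eavesdroppers)
    # A negative difference means no positive secrecy rate is achievable.
    return max(0.0, rate_destination - rate_leak)


if __name__ == "__main__":
    print(secrecy_rate(15.0, [3.0, 1.0]))
```

Under this definition only the strongest eavesdropper matters for a fixed beamformer, which gives some intuition for replacing the full set of eavesdroppers with a single virtual one.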
Inverse Optimal Control (IOC) is a powerful framework for learning a behavior from observations of experts. The framework aims to identify the underlying cost function with respect to which the observed trajectories (the experts' behavior) are optimal. In this work, we consider the case of identifying the cost and the feedback law from observed trajectories generated by an "average cost per stage" linear quadratic regulator. We show that identifying the cost is in general an ill-posed problem, and give necessary and sufficient conditions for non-identifiability. Moreover, despite the fact that the problem is in general ill-posed, we construct an estimator for the cost function and show that the control gain corresponding to this estimator is a statistically consistent estimator of the true underlying control gain. The constructed estimator is based on convex optimization, and hence the proved statistical consistency is also observed in practice. We illustrate the latter by applying the method to a simulation example from rehabilitation robotics.
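One simple source of the ill-posedness mentioned above can be seen in a scalar example: scaling the state and input weights by the same positive constant leaves the optimal feedback gain unchanged, so the cost cannot be recovered uniquely from the gain. The sketch below assumes a scalar system x⁺ = a·x + b·u with stage cost q·x² + r·u² and computes the gain by iterating the standard discrete-time Riccati recursion. The name `lqr_gain` and the scalar setting are illustrative assumptions, and this is not the estimator constructed in the paper.

```python
def lqr_gain(a: float, b: float, q: float, r: float, iterations: int = 500) -> float:
    """Feedback gain K (control law u = -K x) for a scalar discrete-time LQR.

    Iterates the Riccati recursion p <- q + a^2 p - (a b p)^2 / (r + b^2 p)
    from p = q; assumes q > 0, r > 0 and b != 0 so that the iteration converges.
    """
    p = q
    for _ in range(iterations):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    k_original = lqr_gain(a=1.0, b=1.0, q=1.0, r=1.0)
    k_scaled = lqr_gain(a=1.0, b=1.0, q=3.0, r=3.0)
    # Both cost functions produce the same controller.
    print(k_original, k_scaled)
```

The paper's conditions for non-identifiability are more general than this scaling argument; the example only shows that observing the gain alone cannot pin down (q, r).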
Optimal experiment design for system identification involves determining an optimal input that is used to perturb the system so that the resulting input-output data is maximally informative. Plant-friendly identification requires that constraints on input move sizes, output sizes or variance, and experiment time be respected. The solution to the optimal input design problem depends on the unknown parameters to be estimated, which are often approximated by an initial estimate. Use of the estimate is likely to result in a loss of performance or violation of the constraints. An alternative is to formulate a robust optimization problem with uncertain parameters. The contribution of this work is to use the uncertainty sets originating from a prior identification exercise to solve a robust plant-friendly input design problem. The methodology is derived for a general class of systems and illustrated using numerical simulations. Simulations validate the expectation that the constraints are more likely to be satisfied using the robust design than using a nominal design based on uncertain parameters.
In this paper we consider the problem of modelling observed data using a class of multivariate models with unknown-but-bounded (ubb) noise and uncertainty. Standard ARX models with additive and multiplicative bounded noise belong to the considered class, as well as the deterministic counterpart of ARCH models extensively used in econometrics. We outline a method to fit these models based on historical data, and discuss the issues of set-valued forecasting.
Since 1984 there has been a concentrated effort to develop efficient interior-point methods for linear programming (LP). In the last few years researchers have begun to appreciate a very important property of these interior-point methods (beyond their efficiency for LP): they extend gracefully to nonlinear convex optimization problems. New interior-point algorithms for problem classes such as semidefinite programming (SDP) or second-order cone programming (SOCP) are now approaching the extreme efficiency of modern linear programming codes. In this paper we discuss three examples of areas of control where our ability to efficiently solve nonlinear convex optimization problems opens up new applications. In the first example we show how SOCP can be used to solve robust open-loop optimal control problems. In the second example, we show how SOCP can be used to simultaneously design the set-point and feedback gains for a controller, and compare this method with the more standard approach. Our final application concerns analysis and synthesis via linear matrix inequalities and SDP.
Construction of phylogenetic trees from observations is a fundamental challenge in both evolutionary biology and evolutionary linguistics. Here we approach the problem from a new perspective by adopting algebraic invariants associated with topological structures of phylogenetic trees. Our key development is based on machine learning to optimize the power of phylogenetic invariants for the construction of phylogenetic tree quartets, the building blocks of general evolutionary trees. Phylogenetic invariants are polynomials in the joint probabilities which vanish under a model of evolution on a phylogenetic tree. We give algorithms for selecting a good set of invariants and for learning a metric on this set of invariants which optimally distinguishes the different models. Our learning algorithms involve linear and semidefinite programming on data simulated over a wide range of parameters. We provide extensive tests of the learned metrics on simulated data from phylogenetic trees with four leaves under the Jukes-Cantor and Kimura 3-parameter models of DNA evolution. Our method greatly improves on other uses of invariants and is competitive with or better than the popular neighbor-joining method. In particular, we obtain metrics trained on trees with short internal branches (the Felsenstein region of parameter space) which perform much better than neighbor joining on this region. These results exhibit potential advantages of applying the new methodology to evolutionary linguistics.
Consider an unknown smooth function $f : [0, 1]^d \to \mathbb{R}$, and assume we are given $n$ noisy mod 1 samples of $f$, i.e., $y_i = (f(x_i) + \eta_i) \bmod 1$, for $x_i \in [0, 1]^d$, where $\eta_i$ denotes the noise. Given the samples $(x_i, y_i)_{i=1}^{n}$, our goal is to recover smooth, robust estimates of the clean samples $f(x_i) \bmod 1$. We formulate a natural approach for solving this problem, which works with angular embeddings of the noisy mod 1 samples over the unit circle, inspired by the angular synchronization framework. This amounts to solving a smoothness-regularized least-squares problem, a quadratically constrained quadratic program (QCQP), where the variables are constrained to lie on the unit circle. Our proposed approach is based on solving its relaxation, which is a trust-region sub-problem and hence efficiently solvable. We provide theoretical guarantees demonstrating its robustness to noise for adversarial, as well as random Gaussian and Bernoulli, noise models. To the best of our knowledge, these are the first such theoretical results for this problem. We demonstrate the robustness and efficiency of our proposed approach via extensive numerical simulations on synthetic data, along with a simple least-squares based solution for the unwrapping stage that recovers the original samples of $f$ (up to a global shift). It is shown to perform well at high levels of noise when taking as input the denoised modulo 1 samples. We also consider two other approaches for denoising the modulo 1 samples that leverage tools from Riemannian optimization on manifolds, including a Burer-Monteiro approach for a semidefinite programming relaxation of our formulation. For the two-dimensional version of the problem, which has applications in synthetic aperture radar interferometry (InSAR), we are able to solve instances of real-world data with a million sample points in under 10 seconds on a personal laptop.
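The role of the angular embedding can be illustrated with a small sketch: each mod 1 value y is mapped to the point exp(2πi·y) on the unit circle, where values just below 1 and just above 0 are close together, and the mod 1 value is read back from the angle. The helper names (`embed`, `unembed`, `circular_mean`, `circular_distance`) are assumptions for illustration, and plain averaging on the circle is only a stand-in here, not the QCQP relaxation the abstract describes.

```python
import cmath
import math
from typing import Iterable


def embed(y: float) -> complex:
    """Map a mod 1 value to the unit circle."""
    return cmath.exp(2j * math.pi * y)


def unembed(z: complex) -> float:
    """Read a mod 1 value in [0, 1) back from the angle of a nonzero complex number."""
    return (cmath.phase(z) / (2.0 * math.pi)) % 1.0


def circular_mean(values: Iterable[float]) -> float:
    """Average mod 1 values by averaging their embeddings (stand-in denoiser)."""
    return unembed(sum(embed(y) for y in values))


def circular_distance(a: float, b: float) -> float:
    """Wrap-around distance between two mod 1 values."""
    return min((a - b) % 1.0, (b - a) % 1.0)


if __name__ == "__main__":
    samples = [0.98, 0.02]
    # The arithmetic mean lands at 0.5, far from both samples on the circle.
    print(sum(samples) / len(samples))
    # The circular mean stays next to the wrap-around point.
    print(circular_distance(circular_mean(samples), 0.0))
```

Working on the circle avoids the artificial jump at the wrap-around point, which is why a smoothness penalty is more naturally imposed on the embedded samples than on the raw mod 1 values.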
The last few years have witnessed an increasing interest in the problem of control synthesis for nonlinear systems. A recently derived stability criterion for nonlinear systems, which has a remarkable convexity property, and the development of numerical methods for verification of positivity allow the computation, via semidefinite programming, of stabilizing controllers for systems with polynomial or rational vector fields. Using the theory of semialgebraic sets, these computational tools are extended in this paper to the case of polynomial or rational systems with uncertain parameters.
In time-of-arrival (TOA) localization systems, errors caused by non-line-of-sight (NLOS) signal propagation can significantly degrade the location accuracy. Existing works on NLOS error mitigation commonly assume that the NLOS error statistics or the TOA measurement noise variances are known. Such information is generally unavailable in practice. The goal of this paper is to develop an NLOS error mitigation scheme that does not require such information. The core of the proposed algorithm is a constrained least-squares optimization, which is converted into a semidefinite programming (SDP) problem that can be easily solved using the CVX toolbox. This scheme is then extended to cooperative source localization. Its performance is better than that of existing schemes in most scenarios, as validated via extensive simulations.
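For context, the sketch below shows the plain, unconstrained least-squares baseline for 2-D TOA localization under line-of-sight conditions: the range equations ‖x − a_i‖² = d_i² are linearized by subtracting the first one, and the resulting 2×2 normal equations are solved directly. This is not the constrained SDP scheme of the paper; the function name `locate_toa`, the 2-D setting, and the assumption that TOAs have already been converted to ranges are all illustrative.

```python
import math
from typing import Sequence, Tuple

Point2D = Tuple[float, float]


def locate_toa(anchors: Sequence[Point2D], ranges: Sequence[float]) -> Point2D:
    """Linearized least-squares position estimate from anchor positions and ranges.

    Subtracting the first range equation from the i-th gives the linear row
    2 (a_i - a_1)^T x = |a_i|^2 - |a_1|^2 - d_i^2 + d_1^2.
    Needs at least three anchors that are not collinear.
    """
    (x1, y1), d1 = anchors[0], ranges[0]
    s11 = s12 = s22 = t1 = t2 = 0.0  # entries of A^T A and A^T b
    for (xi, yi), di in zip(anchors[1:], ranges[1:]):
        ax, ay = 2.0 * (xi - x1), 2.0 * (yi - y1)
        bi = (xi * xi + yi * yi) - (x1 * x1 + y1 * y1) - di * di + d1 * d1
        s11 += ax * ax
        s12 += ax * ay
        s22 += ay * ay
        t1 += ax * bi
        t2 += ay * bi
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchor geometry is degenerate (collinear anchors)")
    # Solve the 2x2 normal equations by Cramer's rule.
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)


if __name__ == "__main__":
    anchor_positions = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
    source = (3.0, 4.0)
    measured = [math.dist(source, a) for a in anchor_positions]
    print(locate_toa(anchor_positions, measured))
```

NLOS propagation lengthens the measured ranges, so this baseline becomes biased when such errors are present; that bias is what the constrained formulation in the abstract aims to mitigate.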