With the abundance of data, machine learning applications have attracted increasing attention in the last decade. An attractive approach to robustifying the statistical analysis is to preprocess the data through clustering. This paper develops a low-complexity Riemannian optimization framework for solving optimization problems on the set of positive semidefinite stochastic matrices. The low-complexity feature of the proposed algorithms stems from factorizing the optimization variable as X = YY^T and deriving conditions on the number of columns of Y under which the factorization yields a satisfactory solution. The paper further investigates the embedded and quotient geometries of the resulting Riemannian manifolds. In particular, the paper explicitly derives the tangent space, Riemannian gradients and Hessians, and a retraction operator, allowing the design of efficient first- and second-order optimization methods for the graph-based clustering applications of interest. The numerical results reveal that the resulting algorithms present a clear complexity advantage compared with state-of-the-art Euclidean and Riemannian approaches for graph clustering applications.
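The complexity advantage comes from updating the n x p factor Y rather than the full n x n matrix X. A minimal sketch of the factorization idea, assuming a generic smooth cost and ignoring the stochasticity constraint and the retraction derived in the paper (the function names and step sizes below are ours, for illustration only):

```python
import numpy as np

def factored_gradient_step(Y, grad_X, step):
    """One gradient step on the factor Y for f(X) with X = Y @ Y.T.

    By the chain rule, the Euclidean gradient in Y is
    grad_Y f(YY^T) = 2 * grad_X f(YY^T) @ Y, so all iterates stay
    n x p instead of n x n -- the source of the complexity advantage.
    """
    G = grad_X(Y @ Y.T)            # n x n Euclidean gradient in X
    return Y - step * 2.0 * G @ Y  # n x p update

# Toy usage: minimize ||YY^T - A||_F^2 for a rank-p PSD target A,
# so the error can be driven toward zero.
rng = np.random.default_rng(0)
n, p = 30, 3
B = rng.standard_normal((n, p))
A = B @ B.T
grad = lambda X: 2.0 * (X - A)     # gradient of ||X - A||_F^2 in X

Y = rng.standard_normal((n, p))
for _ in range(3000):
    Y = factored_gradient_step(Y, grad, step=2e-4)
print(np.linalg.norm(Y @ Y.T - A))  # approximation error decreases
```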
Demand forecasting plays an important role in the deployment of mobile clinic services to vulnerable communities such as school zones and census tracts, as it can help the service provider maximize its coverage under limited resources. In this paper, we consider the problem of predicting vaccination delinquency in schools and census tracts. This problem is rather challenging because delinquency is only observed in schools, for which very limited information is available; while rich demographic and economic information is available for census tracts, no observations of delinquency have been made at the census tract level. To address this challenge, we first develop a hierarchical approach to forecast the demand for vaccinations in schools and census tracts. In the first stage of the hierarchical approach, we solve a linear optimization model to compute an association matrix that aligns common features in both census tracts and school zones. Then we use the estimated association to develop a forecasting model to predict the vaccination delinquency in both schools and census tracts. A non-convex quadratic optimization (QO) model is also proposed to find the association matrix and the forecasting model simultaneously. We also introduce an alternating update scheme for the non-convex QO and establish the convergence of the algorithm. Moreover, the two association matrices generated from the proposed approaches can be used to impute the information in the school zone data, which further allows us to apply existing forecasting models to predict the demand in school zones based on the imputed data. A case study from the Houston Independent School District (HISD) and its associated communities is reported to demonstrate the efficacy of the new models and techniques.
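A minimal sketch of an alternating update of this kind, using block gradient steps on a regularized least-squares surrogate of the non-convex QO. The variable names (association matrix W, tract features Z, forecasting coefficients beta) and the omission of any structural constraints on W (e.g., nonnegativity) are our assumptions, not the paper's exact formulation:

```python
import numpy as np

def alternating_qo(y, Z, lam=0.1, step=1e-3, iters=2000, seed=0):
    """Alternate block-gradient updates on the non-convex objective
        ||y - W Z beta||^2 + lam (||W||_F^2 + ||beta||^2),
    where W (schools x tracts) plays the role of an association matrix
    and beta a linear forecasting model on tract features Z."""
    rng = np.random.default_rng(seed)
    s, = y.shape
    m, d = Z.shape
    W = rng.standard_normal((s, m)) / np.sqrt(m)
    beta = rng.standard_normal(d) / np.sqrt(d)
    for _ in range(iters):
        v = Z @ beta                                  # tract-level scores
        r = W @ v - y                                 # school-level residual
        W -= step * (2 * np.outer(r, v) + 2 * lam * W)          # W-block
        r = W @ (Z @ beta) - y
        beta -= step * (2 * Z.T @ (W.T @ r) + 2 * lam * beta)   # beta-block
    return W, beta

# Toy usage with synthetic school observations y and tract features Z.
rng = np.random.default_rng(1)
Z = rng.standard_normal((40, 6))                   # 40 tracts, 6 features
W_true = np.abs(rng.standard_normal((25, 40))) / 40
y = W_true @ (Z @ np.ones(6))                      # 25 schools
W, beta = alternating_qo(y, Z)
print(np.linalg.norm(W @ (Z @ beta) - y))          # residual after fitting
```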
The sparse prior has been widely adopted to establish data models for numerous applications. In this context, most of these models are based on one of three foundational paradigms: the conventional sparse representation, the ...
First-order optimization algorithms play a major role in large-scale machine learning. A new class of methods, called adaptive algorithms, was recently introduced to iteratively adjust the learning rate for each coordinate. Despite great practical success in deep learning, their behavior and performance on more general loss functions are not well understood. In this paper, we derive a non-autonomous system of differential equations, which is the continuous-time limit of adaptive optimization methods. We study the convergence of its trajectories and give conditions under which the differential system, underlying all adaptive algorithms, is suitable for optimization. We discuss convergence to a critical point in the non-convex case and give conditions for the dynamics to avoid saddle points and local maxima. For convex loss functions, we introduce a suitable Lyapunov functional which allows us to study the rate of convergence. Several other properties of both the continuous and discrete systems are briefly discussed. The differential system studied in the paper is general enough to encompass many other classical algorithms (such as Heavy Ball and Nesterov's accelerated method) and allows us to recover several known results for these algorithms.
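One way to see such a continuous-time limit concretely is to integrate an Adam-like non-autonomous system by forward Euler. The specific system below, with constant averaging rates a and b, is a plausible instance chosen for illustration; the exact system analyzed in the paper may differ, e.g., in its time-dependent coefficients:

```python
import numpy as np

def adaptive_ode_trajectory(grad, x0, T=20.0, h=1e-3, a=10.0, b=10.0, eps=1e-8):
    """Forward-Euler integration of an Adam-like dynamical system

        m' = a (grad f(x) - m),   v' = b (grad f(x)^2 - v),
        x' = -m / sqrt(v + eps),

    where m and v are running averages of the gradient and its square,
    mirroring the coordinate-wise learning-rate adjustment of adaptive
    methods in continuous time."""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for _ in range(int(T / h)):
        g = grad(x)
        m += h * a * (g - m)          # first-moment dynamics
        v += h * b * (g * g - v)      # second-moment dynamics
        x -= h * m / np.sqrt(v + eps)  # adaptively scaled descent
    return x

# Toy check on a convex quadratic: the trajectory approaches the minimizer.
f_grad = lambda x: 2.0 * x
print(adaptive_ode_trajectory(f_grad, x0=[3.0, -2.0]))
```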
We consider several line-search-based gradient methods for stochastic optimization: a gradient method and an accelerated gradient method for convex optimization, and a gradient method for non-convex optimization. The methods simultaneously adapt to the unknown Lipschitz constant of the gradient and to the variance of the stochastic approximation of the gradient. The focus of this paper is to numerically compare such methods with state-of-the-art adaptive methods that are based on a different idea of using the norm of the stochastic gradient to define the stepsize, e.g., AdaGrad and Adam. Copyright (C) 2020 The Authors.
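A generic sketch of the line-search idea, assuming exact function values for simplicity (a slack term delta would absorb the stochastic error the paper adapts to). This is a standard adaptive-L backtracking scheme, not necessarily the paper's exact method:

```python
import numpy as np

def adaptive_L_gradient(f, grad, x0, L0=1.0, iters=500, delta=0.0):
    """Gradient descent with a backtracking estimate of the unknown
    Lipschitz constant L: accept the step x - g/L when the standard
    descent inequality holds (up to slack delta), otherwise double L;
    after an accepted step, optimistically halve L for the next one."""
    x, L = np.asarray(x0, float), L0
    for _ in range(iters):
        g = grad(x)
        while True:
            y = x - g / L
            if f(y) <= f(x) - np.dot(g, g) / (2 * L) + delta:
                break          # sufficient decrease holds; accept
            L *= 2.0           # L estimate too small; backtrack
        x, L = y, L / 2.0
    return x

# Toy usage on a quadratic whose curvature is "unknown" to the method.
A = np.diag([1.0, 10.0, 100.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
print(adaptive_L_gradient(f, grad, x0=np.ones(3)))
```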
Near-isometric orthogonal embeddings into lower dimensions are a fundamental tool in data science and machine learning. In this paper, we present the construction of such embeddings that minimize the maximum distortion for a given set of points. We formulate the problem as a nonconvex constrained optimization problem. We first construct a primal relaxation and then use the theory of Lagrange duality to create a dual relaxation. We also suggest a polynomial-time algorithm based on the theory of convex optimization to provably solve the dual relaxation. We provide a theoretical upper bound on the approximation guarantees for our algorithm, which depends only on the spectral properties of the dataset. We experimentally demonstrate the superiority of our algorithm compared to baselines in terms of scalability and the ability to achieve lower distortion.
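To make the objective concrete, the following sketch evaluates the maximum distortion of an orthonormal embedding over a set of difference vectors, using PCA as a baseline embedding. The normalization convention and the use of consecutive differences are our assumptions; the paper optimizes this quantity rather than merely evaluating it:

```python
import numpy as np

def max_distortion(V, P):
    """Maximum distortion of an orthonormal embedding P (k x d, rows
    orthonormal) over a set of difference vectors V:
        max_i | ||P v_i / ||v_i|| ||^2 - 1 |,
    i.e., the worst-case squared-length change over unit directions."""
    U = V / np.linalg.norm(V, axis=1, keepdims=True)
    return np.abs(np.sum((U @ P.T) ** 2, axis=1) - 1.0).max()

# Baseline: top principal directions as the orthonormal embedding.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 20)) @ np.diag(np.linspace(1, 3, 20))
V = X[1:] - X[:-1]                        # a subset of pairwise differences
_, _, Vt = np.linalg.svd(X - X.mean(0), full_matrices=False)
P = Vt[:5]                                # top-5 principal directions
print(max_distortion(V, P))
```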
We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network. In this context, it is of much interest to automatically select a subgraph with few connected components; by exploiting prior knowledge, one can indeed improve the prediction performance or obtain results that are easier to interpret. Regularization or penalty functions for selecting features in graphs have recently been proposed, but they raise new algorithmic challenges. For example, they typically require solving a combinatorially hard selection problem among all connected subgraphs. In this paper, we propose computationally feasible strategies to select a sparse and well-connected subset of features sitting on a directed acyclic graph (DAG). We introduce structured sparsity penalties over paths on a DAG called "path coding" penalties. Unlike existing regularization functions that model long-range interactions between features in a graph, path coding penalties are tractable. The penalties and their proximal operators involve path selection problems, which we efficiently solve by leveraging network flow optimization. We experimentally show on synthetic, image, and genomic data that our approach is scalable and leads to more connected subgraphs than other regularization functions for graphs.
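The path selection subproblems inside such proximal operators ultimately rest on shortest-path computations over the DAG. A minimal sketch of that kernel, with hypothetical node and edge costs, leaving out the flow decomposition the paper builds on top of it:

```python
def cheapest_path(n, edges, node_cost):
    """Minimum-cost source-to-sink path in a DAG by dynamic programming
    over a topological order (nodes assumed numbered 0..n-1 topologically,
    source 0, sink n-1). edges maps u -> list of (v, edge_cost)."""
    best = [float("inf")] * n
    parent = [-1] * n
    best[0] = node_cost[0]
    for u in range(n):                      # topological sweep
        for v, w in edges.get(u, []):
            c = best[u] + w + node_cost[v]  # relax edge (u, v)
            if c < best[v]:
                best[v], parent[v] = c, u
    path, v = [], n - 1                     # reconstruct the optimal path
    while v != -1:
        path.append(v)
        v = parent[v]
    return best[-1], path[::-1]

# Toy DAG: 0 -> {1, 2} -> 3, with node and edge costs.
edges = {0: [(1, 0.5), (2, 0.2)], 1: [(3, 0.1)], 2: [(3, 0.4)]}
print(cheapest_path(4, edges, node_cost=[0.0, 1.0, 0.3, 0.0]))
```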
Motivated by the need for fast computations in wireless sensor networks, the new F-Lipschitz optimization theory is introduced for a novel class of optimization problems. These problems are defined by simple qualifying properties specified in terms of an increasing objective function and contractive constraints. It is shown that feasible F-Lipschitz problems always have a unique optimal solution that satisfies the constraints at equality. The solution is obtained quickly by asynchronous algorithms with certified convergence. F-Lipschitz optimization can be applied to both centralized and distributed optimization. Compared to traditional Lagrangian methods, which often converge linearly, the convergence of centralized F-Lipschitz methods is at least superlinear. Distributed F-Lipschitz algorithms converge fast, as opposed to traditional Lagrangian decomposition and parallelization methods, which generally converge slowly and at the price of many message passings. In both cases, the computational complexity is much lower than that of traditional Lagrangian methods. Examples of applying the new optimization method are given for distributed estimation and radio power control in wireless sensor networks. The drawback of F-Lipschitz optimization is that it might be difficult to check the qualifying properties. For more general optimization problems, it is suggested that it is convenient to have conditions ensuring that the solution satisfies the constraints at equality.
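Because the optimum satisfies the contractive constraints at equality, it is a fixed point x* = F(x*), so a simple fixed-point iteration converges geometrically. A minimal synchronous sketch (the paper's asynchronous variant lets each node update its own coordinate with outdated neighbor values; the toy constraint map below is illustrative, not from the paper):

```python
import numpy as np

def f_lipschitz_solve(F, x0, tol=1e-10, max_iter=1000):
    """Fixed-point iteration x <- F(x) for an F-Lipschitz problem whose
    optimum satisfies the constraints at equality; contractiveness of F
    certifies geometric convergence to the unique solution."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = F(x)
        if np.max(np.abs(x_new - x)) < tol:  # sup-norm stopping test
            return x_new
        x = x_new
    return x

# Toy contractive constraint map, loosely in the spirit of a radio
# power control update: x_i = 0.3 * mean(x) + c_i.
c = np.array([1.0, 0.5, 0.2])
F = lambda x: 0.3 * np.mean(x) * np.ones_like(x) + c
print(f_lipschitz_solve(F, x0=np.zeros(3)))
```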