We investigate the online version of Principal Component Analysis (PCA), where in each trial t the learning algorithm chooses a k-dimensional subspace and, upon receiving the next instance vector x_t, suffers the "compression loss", which is the squared Euclidean distance between this instance and its projection onto the chosen subspace. When viewed in the right parameterization, this compression loss is linear, i.e. it can be rewritten as tr(W_t x_t x_t^⊤), where W_t is the parameter of the algorithm and the outer product x_t x_t^⊤ (with ||x_t|| ≤ 1) is the instance matrix. In this paper we generalize PCA to arbitrary positive definite instance matrices X_t with the linear loss tr(W_t X_t). We evaluate online algorithms in terms of their worst-case regret, which is a bound on the additional total loss of the online algorithm on all instance matrices over the compression loss of the best k-dimensional subspace (chosen in hindsight). We focus on two popular online algorithms for generalized PCA: the Gradient Descent (GD) and Matrix Exponentiated Gradient (MEG) algorithms. We show that if the regret is expressed as a function of the number of trials, then both algorithms are optimal to within a constant factor on worst-case sequences of positive definite instance matrices with trace norm at most one (which subsumes the original PCA problem with outer products). This is surprising because MEG is believed to be suboptimal in this case. We also show that when regret bounds are considered as a function of a loss budget, MEG remains optimal and strictly outperforms GD when the instance matrices are trace norm bounded. Next, we consider online PCA when the adversary is allowed to present the algorithm with positive semidefinite instance matrices whose largest eigenvalue is bounded (rather than their trace, which is the sum of their eigenvalues). Again we show that MEG is optimal and strictly better than GD in this setting.
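To make the linear reparameterization concrete, here is a minimal numerical sketch (illustrative only, not code from the paper) that checks the identity ||x − Px||² = tr(W x x^⊤) with W = I − P, where P is the orthogonal projection onto the chosen k-dimensional subspace; the dimensions, the random subspace, and all variable names are assumptions made for the example.

```python
import numpy as np

# Illustrative check (not from the paper): the compression loss ||x - Px||^2 of an
# instance x onto a k-dimensional subspace with projection matrix P equals the
# linear loss tr(W x x^T) with W = I - P. All quantities below are made up.
rng = np.random.default_rng(0)
n, k = 5, 2

U, _ = np.linalg.qr(rng.standard_normal((n, k)))  # orthonormal basis of a random k-dim subspace
P = U @ U.T                                       # rank-k orthogonal projection onto the subspace
W = np.eye(n) - P                                 # complementary rank-(n-k) projection (the parameter)

x = rng.standard_normal(n)
x /= np.linalg.norm(x)                            # instance with ||x|| <= 1, as in the abstract

compression_loss = np.linalg.norm(x - P @ x) ** 2
linear_loss = np.trace(W @ np.outer(x, x))
print(compression_loss, linear_loss)              # agree up to floating-point error
assert np.isclose(compression_loss, linear_loss)
```

The GD and MEG algorithms discussed in the abstract maintain and update such a parameter matrix W_t over the trials; those updates, and the sampling of a concrete subspace from W_t, are outside the scope of this sketch.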
We consider the following type of online variance minimization problem: in every trial t our algorithms get a covariance matrix C_t and try to select a parameter vector w_{t-1} such that the total variance over a sequence of trials, Σ_{t=1}^T w_{t-1}^⊤ C_t w_{t-1}, is not much larger than the total variance of the best parameter vector u chosen in hindsight. Two parameter spaces in R^n are considered: the probability simplex and the unit sphere. The first space is associated with the problem of minimizing risk in stock portfolios, and the second space leads to an online calculation of the eigenvector with minimum eigenvalue of the total covariance matrix Σ_{t=1}^T C_t. For the first parameter space we apply the exponentiated gradient algorithm, which is motivated by a relative entropy regularization. In the second case, the algorithm has to maintain uncertainty information over all unit directions u. For this purpose, directions are represented as dyads u u^⊤ and the uncertainty over all directions as a mixture of dyads, which is a density matrix. The motivating divergence for density matrices is the quantum version of the relative entropy, and the resulting algorithm is a special case of the matrix exponentiated gradient algorithm. In each of the two cases we prove bounds on the additional total variance incurred by the online algorithm over the best offline parameter.
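As a minimal sketch of the first case (the probability simplex), the following code applies a standard exponentiated gradient update: each weight is multiplied by exp(−η · gradient), with gradient 2 C_t w of the variance, and the weight vector is renormalized. The learning rate, the synthetic covariance stream, and all names below are illustrative assumptions rather than the paper's actual setup.

```python
import numpy as np

# Minimal sketch (assumed setup, not the paper's code): exponentiated gradient (EG)
# on the probability simplex for online variance minimization.

def eg_update(w, C, eta=0.1):
    """One EG step: multiplicative update with the gradient of w^T C w, then renormalize."""
    grad = 2.0 * C @ w                  # gradient of the variance w^T C w at the current w
    w_new = w * np.exp(-eta * grad)     # multiplicative (relative-entropy-regularized) step
    return w_new / w_new.sum()          # renormalize back onto the probability simplex

rng = np.random.default_rng(0)
n, T = 4, 100
w = np.full(n, 1.0 / n)                 # w_0: uniform distribution over the n coordinates
online_variance = 0.0
for t in range(1, T + 1):
    A = rng.standard_normal((n, n))
    C_t = A @ A.T / n                   # a positive semidefinite covariance instance for trial t
    online_variance += w @ C_t @ w      # loss of the w chosen before C_t was revealed
    w = eg_update(w, C_t)               # update after seeing C_t
print("total online variance:", online_variance)
```

In the density-matrix case the weight vector is replaced by a mixture of dyads W, and the multiplicative step becomes the matrix exponentiated gradient update W ← exp(log W − η C_t) normalized to unit trace; that case is omitted from this sketch.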