The scalability of statistical estimators is of increasing importance in modern applications. One approach to implementing scalable algorithms is to compress data into a low dimensional latent space using dimension reduction methods. In this paper, we develop an approach for dimension reduction that exploits the assumption of low rank structure in high dimensional data to gain both computational and statistical advantages. We adapt recent randomized low-rank approximation algorithms to provide an efficient solution to principal component analysis (PCA), and we use this efficient solver to improve estimation in large-scale linear mixed models (LMM) for association mapping in statistical genomics. A key observation in this paper is that randomization serves a dual role, improving both computational and statistical performance by implicitly regularizing the covariance matrix estimate of the random effect in an LMM. These statistical and computational advantages are highlighted in our experiments on simulated data and large-scale genomic studies.
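To make the idea of randomized low-rank approximation for PCA concrete, here is a minimal NumPy sketch of a generic randomized range finder followed by a small SVD. It is an illustration under stated assumptions, not the solver developed in the paper: the function name `randomized_pca` and the parameters `oversample` and `n_power_iter` are hypothetical choices made for this example.

```python
import numpy as np


def randomized_pca(X, k, oversample=10, n_power_iter=2, seed=0):
    """Approximate the top-k principal components of X (n samples x p features).

    Hypothetical helper for illustration: a generic randomized range finder
    with power iterations, not the paper's exact algorithm.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                # center each feature
    n, p = Xc.shape
    ell = min(k + oversample, n, p)        # sketch size, slightly larger than k

    # Project onto a random Gaussian test matrix to capture the dominant range.
    Omega = rng.standard_normal((p, ell))
    Q, _ = np.linalg.qr(Xc @ Omega)

    # Power iterations sharpen the subspace when the spectrum decays slowly.
    for _ in range(n_power_iter):
        Z, _ = np.linalg.qr(Xc.T @ Q)
        Q, _ = np.linalg.qr(Xc @ Z)

    # Exact SVD of the small (ell x p) projected matrix.
    B = Q.T @ Xc
    _, s, Vt = np.linalg.svd(B, full_matrices=False)

    components = Vt[:k]                     # rows are principal directions
    explained_variance = s[:k] ** 2 / (n - 1)
    return components, explained_variance
```

The expensive full SVD is replaced by a few matrix products and an SVD of a matrix with only `ell` rows, which is where the computational saving comes from when the data are close to low rank.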
Latent factor models are the canonical statistical tool for exploratory analyses of low-dimensional linear structure for a matrix of p features across n samples. We develop a structured Bayesian group factor analysis model that extends the factor model to multiple coupled observation matrices; in the case of two observations, this reduces to a Bayesian model of canonical correlation analysis. Here, we carefully define a structured Bayesian prior that encourages both element-wise and column-wise shrinkage and leads to desirable behavior on high-dimensional data. In particular, our model puts a structured prior on the joint factor loading matrix, regularizing at three levels, which enables element-wise sparsity and unsupervised recovery of latent factors corresponding to structured variance across arbitrary subsets of the observations. In addition, our structured prior allows for both dense and sparse latent factors so that covariation among either all features or only a subset of features can be recovered. We use fast parameter-expanded expectation-maximization for parameter estimation in this model. We validate our method on simulated data with substantial structure. We show results of our method applied to three high-dimensional data sets, comparing results against a number of state-of-the-art approaches. These results illustrate useful properties of our model, including i) recovering sparse signal in the presence of dense effects; ii) the ability to scale naturally to large numbers of observations; iii) flexible observation- and factor-specific regularization to recover factors with a wide variety of sparsity levels and percentage of variance explained; and iv) tractable inference that scales to modern genomic and text data sizes.
Optimizing multivariate performance measures is an important task in machine learning. Joachims (2005) introduced a Support Vector Method whose underlying optimization problem is commonly solved by cutting plane methods (CPMs) such as SVM-Perf and BMRM. It can be shown that CPMs converge to an ε-accurate solution in O(1/(λε)) iterations, where λ is the trade-off parameter between the regularizer and the loss function. Motivated by the impressive convergence rate of CPMs on a number of practical problems, it was conjectured that these rates can be further improved. We disprove this conjecture in this paper by constructing counterexamples. However, surprisingly, we further discover that these problems are not inherently hard, and we develop a novel smoothing strategy, which, in conjunction with Nesterov's accelerated gradient method, can find an ε-accurate solution in O*(min{1/ε, 1/√(λε)}) iterations. Computationally, our smoothing technique is also particularly advantageous for optimizing multivariate performance scores such as precision/recall break-even point and ROCArea; the cost per iteration remains the same as that of CPMs. Empirical evaluation on some of the largest publicly available data sets shows that our method converges significantly faster than CPMs without sacrificing generalization ability.
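The combination of smoothing and Nesterov's accelerated gradient method can be sketched on a much simpler problem. The example below swaps in a Huber-style smoothed binary hinge loss as a stand-in; it is not the multivariate loss or the smoothing construction from the paper. The function names and the parameters `lam` (the regularization trade-off λ) and `mu` (the smoothing width) are assumptions made for illustration.

```python
import numpy as np


def objective(w, X, y, lam, mu):
    """L2-regularized smoothed hinge loss (stand-in for a non-smooth loss)."""
    z = 1.0 - y * (X @ w)
    # Quadratic near the kink, linear beyond it, zero when the margin is met.
    loss = np.where(z <= 0, 0.0, np.where(z < mu, z ** 2 / (2 * mu), z - mu / 2))
    return 0.5 * lam * (w @ w) + loss.mean()


def gradient(w, X, y, lam, mu):
    z = 1.0 - y * (X @ w)
    dz = np.clip(z / mu, 0.0, 1.0)         # derivative of the smoothed hinge
    return lam * w - X.T @ (dz * y) / len(y)


def nesterov_agd(X, y, lam=0.1, mu=0.1, n_iter=500):
    """Accelerated gradient descent for a smooth, strongly convex objective."""
    n, p = X.shape
    # Upper bound on the Lipschitz constant of the gradient.
    L = lam + np.linalg.norm(X, 2) ** 2 / (n * mu)
    # Constant momentum for a lam-strongly convex, L-smooth objective.
    beta = (np.sqrt(L) - np.sqrt(lam)) / (np.sqrt(L) + np.sqrt(lam))

    w = np.zeros(p)
    v = w.copy()
    for _ in range(n_iter):
        w_next = v - gradient(v, X, y, lam, mu) / L   # gradient step at look-ahead point
        v = w_next + beta * (w_next - w)              # momentum extrapolation
        w = w_next
    return w
```

A smaller `mu` tracks the original hinge loss more closely but increases the Lipschitz constant `L`, which shrinks the step size; this trade-off between approximation quality and smoothness is the general mechanism behind smoothing-based rates.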
A special track on directions in artificial intelligence at a Microsoft Research Faculty Summit included a panel discussion on key challenges and opportunities ahead in AI theory and practice. This article captures the conversation among eight leading researchers.
Structured attributes have domains (value sets) that are partially ordered sets, typically *** attributes allow knowledge discovery programs to incorporate background knowledge about hierarchical relationships among a...
Constructive induction divides the problem of learning an inductive hypothesis into two intertwined searches: one for the "best" representation space, and two for the "best" hypothesis in that spac...