The Hammersley-Clifford (H-C) theorem relates the factorization properties of a probability distribution to the clique structure of an undirected graph. If a density factorizes according to the clique structure of an undirected graph, the theorem guarantees that the distribution satisfies the Markov property, and vice versa. We show how to generalize the H-C theorem to different notions of decomposability and the corresponding generalized Markov properties. Finally, we discuss how our technique might be used to arrive at other generalizations of the H-C theorem, inducing a graph semantics adapted to the modeling problem.
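For reference, the classical statement being generalized can be written as follows; this is the standard formulation for strictly positive densities, and the paper's generalized notions of decomposability are not reproduced here:

% Hammersley-Clifford: for a strictly positive density p on X_1, ..., X_n
% and an undirected graph G = (V, E) with clique set \mathcal{C}(G), the
% Markov property with respect to G is equivalent to clique factorization.
\[
  p(x) \;=\; \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \psi_C(x_C)
  \quad\Longleftrightarrow\quad
  X_i \perp\!\!\!\perp X_j \mid X_{V \setminus \{i, j\}}
  \;\;\text{for all } (i, j) \notin E .
\]
% For positive densities the pairwise, local, and global Markov properties
% coincide, so any of them may be used on the right-hand side.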
ISBN (print): 9780262195683
Semi-supervised learning algorithms have been successfully applied in many applications with scarce labeled data by utilizing the unlabeled data. One important category is graph-based semi-supervised learning algorithms, whose performance depends considerably on the quality of the graph, i.e., on its hyperparameters. In this paper, we deal with the less explored problem of learning the graph itself. We propose a graph learning method for the harmonic energy minimization method; this is done by minimizing the leave-one-out prediction error on labeled data points. We use a gradient-based method and design an efficient algorithm which significantly accelerates the calculation of the gradient by applying the matrix inversion lemma and using careful pre-computation. Experimental results show that the graph learning method is effective in improving the performance of the classification algorithm.
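As an illustration of the recipe in this abstract, here is a minimal sketch: it learns a single RBF bandwidth sigma for the graph used by harmonic energy minimization by descending the leave-one-out error on the labeled points. The function names and the single-hyperparameter setup are illustrative assumptions, and a finite-difference gradient stands in for the paper's analytic gradient with its matrix-inversion-lemma speedups.

import numpy as np

def rbf_weights(X, sigma):
    """Dense RBF affinities W_ij = exp(-||x_i - x_j||^2 / sigma^2), zero diagonal."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-sq / sigma ** 2)
    np.fill_diagonal(W, 0.0)
    return W

def harmonic_solution(W, lab, y_lab):
    """Harmonic energy minimization: f_u = (L_uu)^{-1} W_ul y_l with L = D - W."""
    n = W.shape[0]
    unl = np.setdiff1d(np.arange(n), lab)
    L = np.diag(W.sum(1)) - W
    f_u = np.linalg.solve(L[np.ix_(unl, unl)], W[np.ix_(unl, lab)] @ y_lab)
    f = np.zeros(n)
    f[lab], f[unl] = y_lab, f_u
    return f

def loo_error(X, y, lab, sigma):
    """Leave-one-out squared error of harmonic predictions on the labeled points."""
    W = rbf_weights(X, sigma)
    err = 0.0
    for k, i in enumerate(lab):
        rest = np.delete(lab, k)          # hold out labeled point i
        err += (harmonic_solution(W, rest, y[rest])[i] - y[i]) ** 2
    return err / len(lab)

def learn_sigma(X, y, lab, sigma=1.0, lr=0.5, steps=40, eps=1e-4):
    """Gradient descent on the LOO error; central finite differences replace
    the paper's analytic gradient."""
    for _ in range(steps):
        g = (loo_error(X, y, lab, sigma + eps)
             - loo_error(X, y, lab, sigma - eps)) / (2 * eps)
        sigma = max(sigma - lr * g, 1e-3)
    return sigma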
ISBN (print): 9781595937933
We introduce a framework for filtering features that employs the Hilbert-Schmidt Independence Criterion (HSIC) as a measure of dependence between the features and the labels. The key idea is that good features should maximise such dependence. Feature selection for various supervised learning problems (including classification and regression) is unified under this framework, and the solutions can be approximated using a backward-elimination algorithm. We demonstrate the usefulness of our method on both artificial and real world datasets.
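A minimal sketch of the backward-elimination filter described here, assuming an RBF kernel on the features and a delta kernel on class labels; the biased HSIC estimator is standard, but the helper names are assumptions and this is not necessarily the paper's exact procedure.

import numpy as np

def hsic(K, L):
    """Biased empirical HSIC: tr(K H L H) / (n-1)^2, with H the centering matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

def rbf_kernel(X, sigma=1.0):
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def backward_elimination(X, y, n_keep):
    """Repeatedly drop the feature whose removal leaves HSIC(features, labels)
    highest, until n_keep features remain."""
    L = np.equal.outer(y, y).astype(float)   # delta kernel on class labels
    active = list(range(X.shape[1]))
    while len(active) > n_keep:
        scores = []
        for j in active:
            rest = [f for f in active if f != j]
            scores.append(hsic(rbf_kernel(X[:, rest]), L))
        del active[int(np.argmax(scores))]
    return active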
ISBN (print): 9781595937933
We propose a family of clustering algorithms based on the maximization of dependence between the input variables and their cluster labels, as expressed by the Hilbert-Schmidt Independence Criterion (HSIC). Under this framework, we unify the geometric, spectral, and statistical dependence views of clustering, and subsume many existing algorithms as special cases (e.g. k-means and spectral clustering). Distinctive to our framework is that kernels can also be applied on the labels, which can endow them with particular structures. We also obtain a perturbation bound on the change in k-means clustering.
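To make the objective concrete, here is a naive sketch under the same assumptions as above (biased HSIC, delta kernel on cluster labels): coordinate ascent over label assignments. This is an illustrative stand-in, not the paper's optimization scheme; with a linear kernel K = X X^T it behaves like k-means, illustrating the subsumption claimed in the abstract.

import numpy as np

def hsic_objective(K, labels):
    """tr(K H L H) with a delta kernel L on the cluster labels (unnormalized)."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    L = np.equal.outer(labels, labels).astype(float)
    return np.trace(K @ H @ L @ H)

def dependence_maximization_clustering(K, n_clusters, n_iter=20, seed=0):
    """Move each point to the cluster that most increases the HSIC objective;
    stop when a full sweep changes nothing."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(n_clusters, size=K.shape[0])
    for _ in range(n_iter):
        changed = False
        for i in range(len(labels)):
            old = labels[i]
            vals = []
            for c in range(n_clusters):
                labels[i] = c
                vals.append(hsic_objective(K, labels))
            labels[i] = int(np.argmax(vals))
            changed |= labels[i] != old
        if not changed:
            break
    return labels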
The "rich-club phenomenon" in complex networks is characterized when nodes of higher degree are more interconnected than nodes with lower degree. The presence of this phenomenon may indicate several interest...
The "rich-club phenomenon" in complex networks is characterized when nodes of higher degree are more interconnected than nodes with lower degree. The presence of this phenomenon may indicate several interesting high-level network properties, such as tolerance to hub failures. Here, the authors investigate the existence of this phenomenon across the hierarchies of several real-world networks. Their simulations reveal that the presence or absence of this phenomenon in a network does not imply its presence or absence in the network's successive hierarchies, and that this behavior is even nonmonotonic in some cases. (C) 2007 American Institute of Physics.
ISBN (print): 9812564632
We present a kernel-based approach to the classification of time series of gene expression profiles. Our method takes into account the dynamic evolution over time as well as the temporal characteristics of the data. More specifically, we model the evolution of the gene expression profiles as a Linear Time Invariant (LTI) dynamical system and estimate its model parameters. A kernel on dynamical systems is then used to classify these time series. We successfully test our approach on a published dataset to predict response to drug therapy in Multiple Sclerosis patients. For pharmacogenomics, our method offers great potential for advanced computational tools in disease diagnosis, and disease and drug therapy outcome prognosis.
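A toy sketch of the two stages, assuming trajectories of equal dimension: least-squares estimation of the LTI transition matrix, followed by an exponentially discounted trace kernel as a crude stand-in for the dynamical-system kernels used in such work. The names and the discount parameter are illustrative assumptions.

import numpy as np

def fit_lti(X):
    """Least-squares estimate of A in x_{t+1} = A x_t, for a (T, d) trajectory."""
    X0, X1 = X[:-1].T, X[1:].T          # each d x (T - 1)
    return X1 @ np.linalg.pinv(X0)

def trace_kernel(A1, A2, lam=0.9, T=50):
    """Toy kernel on dynamical systems: sum_t lam^t tr((A1^t)^T A2^t).
    Assumes both systems share the same state dimension (e.g. the same genes)."""
    k = 0.0
    P1, P2 = np.eye(A1.shape[0]), np.eye(A2.shape[0])
    for t in range(1, T + 1):
        P1, P2 = P1 @ A1, P2 @ A2
        k += lam ** t * np.trace(P1.T @ P2)
    return k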
ISBN (print): 1595933832
If appropriately used, prior knowledge can significantly improve the predictive accuracy of learning algorithms or reduce the amount of training data needed. In this paper we introduce a simple method to incorporate prior knowledge in support vector machines by modifying the hypothesis space rather than the optimization problem. The optimization problem is amenable to solution by the constrained concave-convex procedure, which finds a local optimum. The paper discusses different kinds of prior knowledge and demonstrates the applicability of the approach in some characteristic experiments.
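The abstract does not spell out the construction, so the following is not the paper's method; as a much simpler stand-in for "encoding prior knowledge in the hypothesis space", it appends a prior-knowledge score as an extra feature and lets a linear SVM weight it. The prior function and data here are purely hypothetical.

import numpy as np
from sklearn.svm import LinearSVC

def augment_with_prior(X, prior_fn):
    """Append a prior score p(x) as an extra coordinate, so the learned
    hyperplane has the form f(x) = <w, x> + beta * p(x) + b."""
    p = np.apply_along_axis(prior_fn, 1, X).reshape(-1, 1)
    return np.hstack([X, p])

# Hypothetical prior belief: the sum of the first two features is predictive.
prior = lambda x: x[0] + x[1]
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = (X[:, 0] + X[:, 1] + 0.3 * rng.standard_normal(200) > 0).astype(int)
clf = LinearSVC(dual=False).fit(augment_with_prior(X, prior), y)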
ISBN (print): 354045375X
In contrast to the standard inductive inference setting of predictive machine learning, in real-world learning problems the test instances are often already available at training time. Transductive inference tries to improve the predictive accuracy of learning algorithms by making use of the information contained in these test instances. Although this description of transductive inference applies to predictive learning problems in general, most transductive approaches consider the case of classification only. In this paper we introduce a transductive variant of Gaussian process regression with automatic model selection, based on approximate moment matching between training and test data. Empirical results show the feasibility and competitiveness of this approach.
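For context, a minimal sketch of the inductive Gaussian process regression baseline this work builds on (RBF kernel, fixed hyperparameters); the transductive variant and its approximate moment matching between training and test predictions are not reproduced here.

import numpy as np

def rbf(A, B, sigma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * sigma ** 2))

def gp_predict(Xtr, ytr, Xte, sigma=1.0, noise=0.1):
    """Standard GP regression posterior mean and variance at the test inputs."""
    K = rbf(Xtr, Xtr, sigma) + noise ** 2 * np.eye(len(Xtr))
    Ks = rbf(Xte, Xtr, sigma)
    mean = Ks @ np.linalg.solve(K, ytr)
    var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T)) + noise ** 2
    return mean, var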
The quality measures used in information retrieval are particularly difficult to optimize directly, since they depend on the model scores only through the sorted order of the documents returned for a given query. Thus, the derivatives of the cost with respect to the model parameters are either zero or undefined. In this paper, we propose a class of simple, flexible algorithms, called LambdaRank, which avoids these difficulties by working with implicit cost functions. We describe LambdaRank using neural network models, although the idea applies to any differentiable function class. We give necessary and sufficient conditions for the resulting implicit cost function to be convex, and we show that the general method has a simple mechanical interpretation. We demonstrate significantly improved accuracy, over a state-of-the-art ranking algorithm, on several datasets. We also show that LambdaRank provides a method for significantly speeding up the training phase of that ranking algorithm. Although this paper is directed towards ranking, the proposed method can be extended to any non-smooth, multivariate cost function.
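A minimal sketch of the lambda-gradient idea for a single query, in the |delta NDCG|-weighted form popularized in later write-ups of LambdaRank; the paper itself defines the lambdas more generally through implicit cost functions, so treat the exact weighting below as an assumption.

import numpy as np

def lambda_gradients(scores, rel):
    """For each pair (i, j) with rel[i] > rel[j], add a RankNet-style pairwise
    force scaled by the |NDCG change| from swapping the two documents."""
    n = len(scores)
    order = np.argsort(-scores)            # current ranking by model score
    ranks = np.empty(n, dtype=int)
    ranks[order] = np.arange(n)
    gains = 2.0 ** rel - 1.0
    discounts = 1.0 / np.log2(ranks + 2.0)
    ideal = (np.sort(gains)[::-1] / np.log2(np.arange(n) + 2.0)).sum()
    lam = np.zeros(n)
    if ideal == 0:
        return lam                         # no relevant documents for this query
    for i in range(n):
        for j in range(n):
            if rel[i] > rel[j]:
                delta = abs((gains[i] - gains[j])
                            * (discounts[i] - discounts[j])) / ideal
                rho = 1.0 / (1.0 + np.exp(scores[i] - scores[j]))
                lam[i] += delta * rho      # push the more relevant doc up
                lam[j] -= delta * rho      # and the less relevant doc down
    return lam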