ISBN: (print) 9781467322164
Recent results have shown empirically that, given several related tasks with different data distributions and an algorithm that can exploit both task-specific and cross-task knowledge, the clustering performance on each task can be significantly enhanced. This kind of unsupervised learning method is called multi-task clustering. We tackle the multi-task clustering problem via a 3-factor nonnegative matrix factorization. The objective of our approach consists of two parts: (1) Within-task co-clustering: co-cluster the data of each task individually in the input space. (2) Cross-task regularization: learn and refine the relations between the feature spaces of different tasks. We show that our approach has a sound information-theoretic background, and the experimental evaluation shows that it outperforms many state-of-the-art single-task and multi-task clustering methods.
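As a rough illustration of the within-task part only, a 3-factor nonnegative factorization X ≈ F S Gᵀ can be sketched with plain multiplicative updates. This is a generic, unconstrained tri-factorization under assumed names (`tri_nmf`, `k_rows`, `k_cols`); it omits the cross-task regularization and is not claimed to be the objective used in the paper.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(u * v for u, v in zip(row, col)) for col in cols_b] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(x, f, s, g):
    """Squared Frobenius norm of X - F S G^T."""
    approx = matmul(matmul(f, s), transpose(g))
    return sum((xv - av) ** 2 for xr, ar in zip(x, approx) for xv, av in zip(xr, ar))


def mult_update(m, numer, denom, eps=1e-9):
    """Element-wise multiplicative update; nonnegative entries stay nonnegative."""
    return [[mv * nv / (dv + eps) for mv, nv, dv in zip(mr, nr, dr)]
            for mr, nr, dr in zip(m, numer, denom)]


def tri_nmf(x, k_rows, k_cols, iters=200, seed=0):
    """Factor X (n x d) as F (n x k_rows), S (k_rows x k_cols), G (d x k_cols)."""
    rng = random.Random(seed)

    def rand(r, c):
        return [[rng.uniform(0.1, 1.0) for _ in range(c)] for _ in range(r)]

    n, d = len(x), len(x[0])
    f, s, g = rand(n, k_rows), rand(k_rows, k_cols), rand(d, k_cols)
    for _ in range(iters):
        gs_t = matmul(g, transpose(s))  # G S^T
        f = mult_update(f, matmul(x, gs_t),
                        matmul(f, matmul(transpose(gs_t), gs_t)))
        fs = matmul(f, s)  # F S
        g = mult_update(g, matmul(transpose(x), fs),
                        matmul(g, matmul(transpose(fs), fs)))
        ft = transpose(f)
        s = mult_update(s, matmul(matmul(ft, x), g),
                        matmul(matmul(matmul(ft, f), s), matmul(transpose(g), g)))
    return f, s, g


# Toy data with two row blocks and two column blocks (made up for illustration).
x = [[5.0, 5.0, 0.0, 0.0],
     [5.0, 5.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 3.0],
     [0.0, 0.0, 3.0, 3.0]]
f, s, g = tri_nmf(x, k_rows=2, k_cols=2)
```

In this sketch, the largest entry in each row of `f` can be read as a row-cluster assignment and the largest entry in each row of `g` as a feature-cluster assignment, which is what makes the factorization a co-clustering.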
Identifying a subject's simple judging states from fMRI data is the basis for studying complex logical relationships and has great theoretical significance. In this paper, we study judging states from fMRI data in...
ISBN: (print) 9781450308427
Ensuring reliability as the electrical grid morphs into the "smart grid" will require innovations in how we assess the state of the grid, for the purpose of proactive rather than reactive maintenance; in the future, we will not only react to failures, but also try to anticipate and avoid them using predictive modeling (machine learning and data mining) techniques. To help meet this challenge, we present the Neutral Online Visualization-aided Autonomic evaluation framework (NOVA) for evaluating machine learning and data mining algorithms for preventive maintenance on the electrical grid. NOVA has three stages, provided through a unified user interface: evaluation of input data quality, evaluation of machine learning and data mining results, and evaluation of the reliability improvement of the power grid. A prototype version of NOVA has been deployed for the power grid in New York City, and it is able to evaluate machine learning and data mining systems effectively and efficiently. Copyright 2011 ACM.
ISBN: (print) 9783642244704; 9783642244711
Similarity functions are widely used in many machine learning and pattern recognition tasks. We consider here a recent framework for binary classification, proposed by Balcan et al., which makes it possible to learn in a potentially non-geometrical space based on good similarity functions. This framework generalizes the notion of kernels used in support vector machines, in the sense that it allows one to use similarity functions that need not be positive semi-definite or symmetric. The similarities are then used to define an explicit projection space where a linear classifier with good generalization properties can be learned. In this paper, we study experimentally the usefulness of similarity-based projection spaces for transfer learning. More precisely, we consider the problem of domain adaptation, where the distributions generating the learning data and the test data are somewhat different. We consider the case where no information on the test labels is available. We show that a simple renormalization of a good similarity function that takes the test data into account allows us to learn classifiers that perform better on the target distribution for difficult adaptation problems. Moreover, this normalization always helps to improve the model when we try to regularize the similarity-based projection space in order to move the two distributions closer together. We provide experiments on a toy problem and on a real image annotation task.
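The explicit projection space mentioned above can be pictured as follows: each point is mapped to its vector of similarities to a set of landmark points, and a linear classifier then operates on that vector. The sketch below is only an illustration; the similarity function, the landmarks, and the weights are hypothetical choices, and the test-data renormalization studied in the paper is not reproduced.

```python
import math


def similarity(x, landmark):
    """Toy similarity in (0, 1]; a hypothetical choice, not taken from the paper."""
    return 1.0 / (1.0 + math.dist(x, landmark))


def project(x, landmarks):
    """Map x to its vector of similarities to each landmark."""
    return [similarity(x, lm) for lm in landmarks]


def linear_score(phi, weights, bias=0.0):
    """Score of a linear classifier in the projection space."""
    return sum(w * p for w, p in zip(weights, phi)) + bias


landmarks = [(0.0, 0.0), (3.0, 4.0)]   # assumed landmark points
phi = project((0.0, 0.0), landmarks)   # similarities to each landmark
score = linear_score(phi, weights=[1.0, -1.0])
```

Because the classifier only sees the similarity vector, the similarity function itself is free to be asymmetric or not positive semi-definite, which is the point of the framework.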
ISBN: (print) 9783642244704; 9783642244711
In kernel-based machine learning algorithms, we can learn a combination of different kernel functions, instead of using a single fixed kernel function, in order to obtain a similarity measure that better matches the underlying problem. This approach is called multiple kernel learning (MKL). In this paper, we formulate a nonlinear MKL variant and apply it to nuclei classification in tissue microarray images of renal cell carcinoma (RCC). The proposed variant is tested on several feature representations extracted from the automatically segmented nuclei. We compare our results with single-kernel support vector machines trained on each feature representation separately and with three linear MKL algorithms from the literature. We demonstrate that, by combining information from different feature representations nonlinearly, our variant obtains more accurate classifiers for RCC detection than the competing algorithms.
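To make the difference between linear and nonlinear kernel combination concrete, the sketch below combines two Gram matrices first as a weighted sum and then as an element-wise product. The weights are fixed by hand here, whereas MKL would learn them, and the element-wise product is just one simple nonlinear combination chosen for illustration; it is not claimed to be the variant formulated in the paper.

```python
import math


def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def gram(kernel, data):
    """Gram matrix of a kernel over a list of points."""
    return [[kernel(x, y) for y in data] for x in data]


def weighted_sum(grams, weights):
    """Linear combination: K = sum_m eta_m * K_m."""
    n = len(grams[0])
    return [[sum(w * k[i][j] for w, k in zip(weights, grams)) for j in range(n)]
            for i in range(n)]


def elementwise_product(grams):
    """A simple nonlinear combination: entry-wise product of the Gram matrices."""
    n = len(grams[0])
    return [[math.prod(k[i][j] for k in grams) for j in range(n)]
            for i in range(n)]


data = [(1.0, 0.0), (0.0, 1.0)]  # toy points, made up for illustration
grams = [gram(linear_kernel, data), gram(rbf_kernel, data)]
k_linear = weighted_sum(grams, weights=[0.5, 0.5])  # hand-picked weights
k_nonlinear = elementwise_product(grams)
```

Both combinations yield valid kernels when the inputs are positive semi-definite and the weights are nonnegative, so either matrix could be passed to a kernel classifier that accepts precomputed Gram matrices.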
ISBN: (print) 9783642244704
Multiple-instance learning (MIL) deals with learning under ambiguity, in which the patterns to be classified are described by bags of instances. There has been growing interest in the design and use of MIL algorithms because MIL provides a natural framework for solving a wide variety of pattern recognition problems. In this paper, we address MIL from a view that transforms the problem into a standard supervised learning problem via instance selection. The novelty of the proposed approach lies in its selection strategy for identifying the most representative examples in the positive and negative training bags, which is based on an effective pairwise clustering algorithm referred to as dominant sets. Experimental results on both standard benchmark data sets and multi-class image classification problems show that the proposed approach is not only highly competitive with state-of-the-art MIL algorithms but also very robust to outliers and noise.
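Dominant sets are commonly extracted by running replicator dynamics on a pairwise affinity matrix with a zero diagonal; the indices that keep non-negligible weight form the cluster. The sketch below shows only that extraction step on a made-up affinity matrix. The function names and the threshold are assumptions, and the way the paper turns dominant sets into an instance-selection rule for bags is not reproduced.

```python
def replicator_dynamics(affinity, iters=100):
    """Iterate x_i <- x_i * (A x)_i / (x^T A x), starting from the uniform vector."""
    n = len(affinity)
    x = [1.0 / n] * n
    for _ in range(iters):
        ax = [sum(a * xj for a, xj in zip(row, x)) for row in affinity]
        total = sum(xi * axi for xi, axi in zip(x, ax))
        if total == 0.0:
            break  # no affinity left to redistribute
        x = [xi * axi / total for xi, axi in zip(x, ax)]
    return x


def dominant_set(affinity, threshold=1e-4):
    """Indices whose weight stays above the (assumed) threshold."""
    x = replicator_dynamics(affinity)
    return [i for i, xi in enumerate(x) if xi > threshold]


# Three mutually similar items and one weakly related outlier (toy values).
affinity = [[0.0, 1.0, 1.0, 0.1],
            [1.0, 0.0, 1.0, 0.1],
            [1.0, 1.0, 0.0, 0.1],
            [0.1, 0.1, 0.1, 0.0]]
weights = replicator_dynamics(affinity)
members = dominant_set(affinity)
```

On this toy matrix the weight of the outlier shrinks at every step while the three coherent items share the remaining mass equally, which reflects the robustness to outliers mentioned in the abstract.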
ISBN: (print) 9783642244704; 9783642244711
This paper presents the derivation of a clustering model for paired comparison data. Similarities for non-Euclidean, ordinal data are handled in the model such that it is capable of performing an integrated analysis on real-world data with different patterns of missing values. Rank-based pairwise comparison matrices with missing entries can be described and compared by means of a probabilistic mixture model defined on the symmetric group. Our EM method offers two advantages over the models for pairwise comparison rank data available in the literature: (i) it identifies groups in the pairwise choices based on similarity; (ii) it provides the ability to analyze a data set that is heterogeneous with respect to the structural properties of individual data samples. Furthermore, we devise an active learning strategy for selecting paired comparisons that are highly informative for extracting the underlying ranking of the objects. The model can be employed to predict pairwise choice probabilities for individuals and can therefore be used for preference modeling.
ISBN: (print) 9783642244704
The proceedings contain 23 papers. The topics discussed include: on the usefulness of similarity based projection spaces for transfer learning; metric anomaly detection via asymmetric risk minimization; one shot similarity metric learning for action recognition; on a non-monotonicity effect of similarity measures; section-wise similarities for clustering and outlier detection of subjective sequential data; hybrid generative-discriminative nucleus classification of renal cell carcinoma; multi-task regularization of generative similarity models; a generative dyadic aspect model for evidence accumulation clustering; an information theoretic approach to learning generative graph prototypes; impact of the initialization in tree-based fast similarity search techniques; and multiple-instance learning with instance selection via dominant sets.
ISBN: (print) 9781457709661
In pattern recognition, principal component analysis (PCA) is one of the best-known feature extraction methods for dimensionality reduction of high-dimensional datasets. Furthermore, Simple-PCA (SPCA), a faster version of PCA, has been applied effectively through iterative learning. However, when the input data are distributed in a complex way, SPCA might not be efficient because it is learned without the class information of the dataset. Thus, SPCA cannot be said to be optimal for classification. In this paper, we propose a new learning algorithm that is learned with the class information of the dataset. Eigenvectors spanning the eigenspace of the dataset are obtained by calculating the variation of the data belonging to each class. We show the derivation of the proposed algorithm and present experiments on UCI datasets that compare SPCA with the proposed algorithm.
Pattern recognition has become an attractive research-oriented field of computer vision and machine learning over the last few decades. Neural pattern recognition techniques are also being exercised for pattern rec...