ISBN (print): 0262201526
The Minimax Probability Machine Classification (MPMC) framework [Lanckriet et al., 2002] builds classifiers by minimizing the maximum probability of misclassification, and gives direct estimates of the probabilistic accuracy bound Ω. The only assumption MPMC makes is that good estimates of the means and covariance matrices of the classes exist. However, as with Support Vector Machines, MPMC is computationally expensive and requires extensive cross-validation experiments to choose kernels and kernel parameters that give good performance. In this paper we address the computational cost of MPMC by proposing an algorithm that constructs nonlinear sparse MPMC (SMPMC) models by incrementally adding basis functions (i.e., kernels) one at a time, greedily selecting the next one that maximizes the accuracy bound Ω. SMPMC automatically chooses both kernel parameters and feature weights without using computationally expensive cross-validation. The SMPMC algorithm therefore simultaneously addresses the problems of kernel selection and feature selection (i.e., feature weighting), based solely on maximizing the accuracy bound Ω. Experimental results indicate that we can obtain reliable bounds Ω, as well as test-set accuracies that are comparable to state-of-the-art classification algorithms.
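As a hedged illustration of the greedy selection described above, the sketch below scores each candidate RBF basis function by the worst-case accuracy bound Ω = κ²/(1+κ²), with κ = |μ₊ − μ₋|/(σ₊ + σ₋), computed on the candidate's one-dimensional feature alone. The choice of centers, the width grid, and the per-basis (rather than whole-model) scoring are simplifying assumptions for illustration, not the authors' SMPMC algorithm.

```python
import numpy as np

def mpm_bound(f_pos, f_neg):
    """Worst-case accuracy bound Omega = kappa^2 / (1 + kappa^2) for a
    1-D feature, with kappa = |mu+ - mu-| / (sigma+ + sigma-)."""
    kappa = abs(f_pos.mean() - f_neg.mean()) / (f_pos.std() + f_neg.std() + 1e-12)
    return kappa ** 2 / (1.0 + kappa ** 2)

def greedy_basis_selection(X_pos, X_neg, widths=(0.5, 1.0, 2.0), n_basis=3):
    """Greedily add RBF basis functions (center, width), each round keeping
    the candidate whose 1-D feature maximizes the bound Omega."""
    centers = np.vstack([X_pos, X_neg])
    chosen, bounds = [], []
    for _ in range(n_basis):
        best_omega, best_cw = -1.0, None
        for c in centers:
            for w in widths:
                # Skip basis functions that are already in the model.
                if any(np.array_equal(c, c2) and w == w2 for c2, w2 in chosen):
                    continue
                fp = np.exp(-((X_pos - c) ** 2).sum(axis=1) / (2 * w ** 2))
                fn = np.exp(-((X_neg - c) ** 2).sum(axis=1) / (2 * w ** 2))
                omega = mpm_bound(fp, fn)
                if omega > best_omega:
                    best_omega, best_cw = omega, (c, w)
        chosen.append(best_cw)
        bounds.append(best_omega)
    return chosen, bounds
```

On well-separated classes the selected bases already yield a bound close to 1, which mirrors the role Ω plays as a model-selection criterion in the paper.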
ISBN (print): 0262201526
We present a novel method for approximate inference in Bayesian models and regularized risk functionals. It is based on the propagation of the mean and variance derived from the Laplace approximation of conditional probabilities in factorizing distributions, much akin to Minka's Expectation Propagation. In the jointly normal case it coincides with the latter and with belief propagation, whereas in the general case it provides an optimization strategy containing Support Vector chunking, the Bayes Committee Machine, and Gaussian Process chunking as special cases.
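For readers unfamiliar with the Laplace step whose mean and variance are being propagated, here is a minimal one-dimensional sketch: Newton's method locates the mode of a log-density, and the negative inverse curvature at the mode serves as the variance. The Gaussian-prior/logistic-likelihood example is an illustrative assumption, not the paper's factor decomposition.

```python
import numpy as np

def laplace_mean_var(log_p_grad, log_p_hess, x0=0.0, iters=50):
    """Laplace approximation of a 1-D density: Newton iterations find the
    mode; the variance is -1 / (second derivative of the log-density)."""
    x = x0
    for _ in range(iters):
        x = x - log_p_grad(x) / log_p_hess(x)
    return x, -1.0 / log_p_hess(x)

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

# Example: standard normal prior N(0, 1) times a logistic likelihood
# sigmoid(y * x) with observed label y = +1.
y = 1.0
grad = lambda x: -x + y * (1.0 - sigmoid(y * x))                   # d/dx log p(x)
hess = lambda x: -1.0 - sigmoid(y * x) * (1.0 - sigmoid(y * x))    # d2/dx2 log p(x)

mean, var = laplace_mean_var(grad, hess)
```

Propagating such (mean, variance) pairs across factors is the basic mechanism the abstract refers to; Expectation Propagation differs in matching moments rather than curvature at the mode.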
ISBN (print): 0780384032
Fuzzy Extension Matrix induction is a fuzzy-rule extraction technique that can be used to handle ambiguous classification problems related to human thought and perception. The entire process of building a heuristic algorithm based on the Fuzzy Extension Matrix depends on three specified parameters that strongly affect the computational effort and the rule-extraction accuracy. Since the values of these three parameters are usually set from human experience or practical requirements, it is very difficult to determine their optimal values. This paper makes an initial attempt to give some guidelines on how to choose these parameters automatically, by analyzing the relationship between the parameter values and the number of rules generated.
ISBN (print): 0780384032
Decision trees and extension matrices are two methodologies for (fuzzy) rule generation. This paper gives an initial study comparing the two methodologies. Their computational complexity and the quality of the rules they generate are analyzed. The experimental results show that the heuristic algorithm based on the extension matrix generates fewer rules than the decision tree algorithm. Moreover, regarding testing accuracy (i.e., the generalization capability on unknown cases), experiments also show that the extension matrix method outperforms the decision tree method.
This paper brings together two strands of machine learning of increasing importance: kernel methods and highly structured data. We propose a general method for constructing a kernel that follows the syntactic structure of the data, as defined by its type signature in a higher-order logic. Our main theoretical result is the positive definiteness of any kernel thus defined. We report encouraging experimental results on a range of real-world data sets. By converting our kernel to a distance pseudo-metric for 1-nearest neighbour, we were able to improve the best accuracy from the literature on the Diterpene data set by more than 10%.
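A minimal sketch of a kernel driven by data structure, assuming Python tuples stand in for product types and lists for collections (an illustrative encoding, not the paper's higher-order-logic type signatures): tuples take the product of component kernels, lists take the sum over all element pairs (a convolution kernel), and atoms fall back to simple defaults. Closure of positive definite kernels under products and sums is what underlies the positive-definiteness result.

```python
def k_atom(a, b):
    # Default kernels on atomic values: product kernel for numbers,
    # matching (delta) kernel for symbols.
    if isinstance(a, (int, float)) and isinstance(b, (int, float)):
        return float(a * b)
    return 1.0 if a == b else 0.0

def k_struct(s, t):
    """Kernel following the syntactic structure of the data:
    tuples  -> product of component kernels,
    lists   -> sum over all element pairs,
    atoms   -> k_atom."""
    if isinstance(s, tuple) and isinstance(t, tuple) and len(s) == len(t):
        out = 1.0
        for si, ti in zip(s, t):
            out *= k_struct(si, ti)
        return out
    if isinstance(s, list) and isinstance(t, list):
        return sum(k_struct(si, ti) for si in s for ti in t)
    return k_atom(s, t)
```

For example, k_struct(('a', [1.0, 2.0]), ('a', [3.0])) multiplies the symbol match (1.0) by the pairwise sum 1·3 + 2·3, giving 9.0; the recursion mirrors the nesting of the data rather than a fixed feature map.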
ISBN (print): 0780384032
It is important to study the relationship between pruning algorithms and the selection of parameters in fuzzy decision tree generation for controlling the tree size. This paper selects a pruning algorithm and a method of fuzzy decision tree generation, and experimentally shows this relationship on several existing databases. It aims to give some guidelines for how to select an appropriate parameter value in fuzzy decision tree generation. When a suitable parameter value is selected, pruning in fuzzy decision tree generation appears to be unnecessary.
ISBN (print): 1581138385
Many real-world classification tasks involve the prediction of multiple, inter-dependent class labels. A prototypical case of this sort deals with prediction of a sequence of labels for a sequence of observations. Such problems arise naturally in the context of annotating and segmenting observation sequences. This paper generalizes Gaussian Process classification to predict multiple labels by taking dependencies between neighboring labels into account. Our approach is motivated by the desire to retain rigorous probabilistic semantics, while overcoming limitations of parametric methods like Conditional Random Fields, which exhibit conceptual and computational difficulties in high-dimensional input spaces. Experiments on named entity recognition and pitch accent prediction tasks demonstrate the competitiveness of our approach.
ISBN (print): 0780384032
In this paper, a novel rough set-based case-based reasoning (CBR) approach is proposed to tackle the task of text categorization (TC). Initial work on integrating both feature and document reduction/selection in TC using rough sets and CBR properties is presented. Rough set theory is incorporated to reduce the number of feature terms by generating reducts. In addition, the two CBR concepts of case coverage and case reachability are used to select representative documents. The main contribution of this paper is that both the number of features and the number of documents are reduced with minimal loss of useful information. Experiments are conducted on the Reuters21578 text datasets. The experimental results show that, although the numbers of feature terms and documents are greatly reduced, the problem-solving quality in terms of classification accuracy is preserved.
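A hedged sketch of the reduct-generation step mentioned above: the dependency degree γ_B(D) measures how well an attribute subset B determines the decision, and backward elimination drops attributes whose removal leaves γ unchanged. This greedy variant illustrates the idea only; it is not the paper's reduction algorithm, and greedy elimination generally yields a super-reduct rather than a minimal reduct.

```python
from collections import defaultdict

def dependency(rows, labels, attrs):
    """gamma_B(D): fraction of objects whose B-indiscernibility class is
    consistent, i.e. all members of the class share one decision label."""
    classes = defaultdict(list)
    for row, y in zip(rows, labels):
        classes[tuple(row[a] for a in attrs)].append(y)
    consistent = sum(len(ys) for ys in classes.values() if len(set(ys)) == 1)
    return consistent / len(rows)

def greedy_reduct(rows, labels):
    """Backward elimination: drop any attribute whose removal leaves the
    dependency degree unchanged; the surviving attributes form the result."""
    attrs = list(range(len(rows[0])))
    full = dependency(rows, labels, attrs)
    for a in list(attrs):
        trial = [b for b in attrs if b != a]
        if trial and dependency(rows, labels, trial) == full:
            attrs = trial
    return attrs
```

In the TC setting of the paper, rows would be documents described by (discretized) term features and labels their categories; the reduct keeps only terms needed to discriminate the categories.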
ISBN (print): 1581138385
Motivated by the interest in relational reinforcement learning, we introduce a novel relational Bellman update operator called REBEL. It employs a constraint logic programming language to compactly represent Markov decision processes over relational domains. Using REBEL, a novel value iteration algorithm is developed in which abstraction (over states and actions) plays a major role. This framework provides new insights into relational reinforcement learning. Convergence results as well as experiments are presented.
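For contrast with the relational operator, the ground (propositional) Bellman backup that REBEL lifts to abstract states and actions can be sketched as ordinary value iteration on an explicitly enumerated MDP. The toy transition tables below are illustrative assumptions, not an example from the paper.

```python
def value_iteration(n_states, actions, P, R, gamma=0.9, tol=1e-8):
    """Ground Bellman backup V(s) <- max_a [R(s,a) + gamma * sum_s' P(s'|s,a) V(s')],
    iterated to convergence. P[s][a] is a list of (next_state, prob) pairs."""
    V = [0.0] * n_states
    while True:
        V_new = [
            max(R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                for a in actions)
            for s in range(n_states)
        ]
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new
        V = V_new

# Toy 2-state MDP: action 1 moves from state 0 to the absorbing state 1,
# which pays reward 1 per step.
P = [
    {0: [(0, 1.0)], 1: [(1, 1.0)]},
    {0: [(1, 1.0)], 1: [(1, 1.0)]},
]
R = [{0: 0.0, 1: 0.0}, {0: 1.0, 1: 1.0}]
V = value_iteration(2, [0, 1], P, R, gamma=0.9)
```

REBEL's contribution is performing this same backup once per abstract state (a logical description covering many ground states) instead of once per enumerated state.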
ISBN (print): 0780384032
This paper provides a brief roadmap of the development of neural network sensitivity analysis from the 1960s to the present. The two main streams of sensitivity measures, partial derivative and stochastic sensitivity measures, are compared. The partial derivative sensitivity measure (PD-SM) finds the rate of change of the network output with respect to parameter changes, while the stochastic sensitivity measure (ST-SM) finds the magnitude of the output perturbations between the original training samples and the perturbed samples in a statistical sense. Their computational complexities are compared, and how to evaluate multiple parameters of the neural network, with or without correlation, is also explored. In addition, their differences in applications to supervised pattern classification problems are discussed. The evaluations are based on three major applications of sensitivity analysis in supervised pattern classification: feature selection, sample selection, and neural network generalization assessment. The ST-SM and PD-SM of the RBFNN are used for the investigations.
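The two streams can be sketched side by side on a tiny RBF network. For simplicity the sensitivities below are taken with respect to the inputs (both measures also apply to weights, as the survey discusses): PD-SM as a finite-difference derivative of the output, ST-SM as the Monte Carlo mean squared output perturbation under random input noise. The network, its parameters, and the noise level are illustrative assumptions.

```python
import numpy as np

def rbf_net(x, centers, weights, width=1.0):
    """Tiny RBF network: y = sum_j w_j * exp(-||x - c_j||^2 / (2*width^2))."""
    d2 = ((x - centers) ** 2).sum(axis=1)
    return float(weights @ np.exp(-d2 / (2 * width ** 2)))

def pd_sensitivity(x, centers, weights, eps=1e-5):
    """PD-SM: rate of change of the output per input dimension
    (central finite differences stand in for the analytic derivative)."""
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (rbf_net(x + e, centers, weights)
                - rbf_net(x - e, centers, weights)) / (2 * eps)
    return g

def st_sensitivity(x, centers, weights, sigma=0.1, n=2000, seed=0):
    """ST-SM: expected squared output perturbation under Gaussian input
    perturbations, estimated by Monte Carlo."""
    rng = np.random.default_rng(seed)
    y0 = rbf_net(x, centers, weights)
    ys = [rbf_net(x + rng.normal(0.0, sigma, size=x.shape), centers, weights)
          for _ in range(n)]
    return float(np.mean([(y - y0) ** 2 for y in ys]))
```

The contrast in cost is visible even here: PD-SM needs two evaluations per parameter, while ST-SM needs many samples but captures the aggregate effect of correlated, finite perturbations in one statistic.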