The consistency problem associated with a concept class C is to determine, given two sets A and B of examples, whether there exists a concept c in C such that each x in A is a positive example of c and each y in B is a negative example of c.
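A concrete illustration of the consistency problem, using monotone conjunctions over Boolean attributes as the concept class C (an illustrative choice; the abstract does not fix a particular class). The sketch exploits the fact that the most specific monotone conjunction accepting every example in A is the only candidate that needs to be checked against B.

    def consistent_monotone_conjunction(A, B):
        """Return the index set of a monotone conjunction labelling every
        x in A positive and every y in B negative, or None if none exists."""
        n = len(A[0]) if A else (len(B[0]) if B else 0)
        # Most specific consistent hypothesis: every variable that is 1 in all positives.
        candidate = {i for i in range(n) if all(x[i] == 1 for x in A)}
        # Any consistent monotone conjunction uses a subset of `candidate` and
        # therefore accepts at least as much, so testing `candidate` suffices.
        for y in B:
            if all(y[i] == 1 for i in candidate):
                return None   # this negative example cannot be rejected
        return candidate

    # Usage: positive examples A and negative examples B over 4 attributes.
    A = [(1, 1, 0, 1), (1, 1, 1, 1)]
    B = [(1, 0, 0, 1), (0, 1, 1, 0)]
    print(consistent_monotone_conjunction(A, B))   # {0, 1, 3}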
Introduces the articles in the special issue of the 'Journal of Machine Learning Research' on computational learning theory, including 'Tracking a Small Set of Experts by Mixing Past Posteriors' and 'Using Confidence Bounds for Exploitation-Exploration Tradeoffs.'
This paper presents some of the important models (frameworks) in computational learning theory, such as learning in the limit, learning from queries and entailment, and probably approximately correct learning.
ISBN: (print) 9781424428236
A classical approach in multi-class pattern classification is the following: estimate the probability distributions that generated the observations for each label class, and then label new instances by applying the Bayes classifier to the estimated distributions. That approach provides more useful information than just a class label; it also provides estimates of the conditional distribution of class labels in situations where there is class overlap. We would like to know whether it is harder to build accurate classifiers via this approach than by techniques that may process all data with distinct labels together. In this paper we make that question precise by considering it in the context of PAC learnability. We propose two restrictions on the PAC learning framework that are intended to correspond to the above approach, and consider their relationship with standard PAC learning. Our main restriction of interest leads to some interesting algorithms showing that it is not stronger (more restrictive) than various other well-known restrictions on PAC learning. An alternative, slightly milder restriction turns out to be almost equivalent to unrestricted PAC learning.
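A minimal sketch of the classical approach described above, assuming a diagonal Gaussian as the per-class distribution estimate (an illustrative choice, not taken from the paper): each class-conditional density is fitted only from the examples carrying that label, and new instances are labeled by the Bayes classifier applied to the estimates.

    import numpy as np

    def fit_per_class(X, y):
        """Estimate prior, per-feature mean and variance for each class,
        using only the examples carrying that class label."""
        model = {}
        for c in np.unique(y):
            Xc = X[y == c]
            model[c] = (len(Xc) / len(X),           # class prior
                        Xc.mean(axis=0),            # per-feature mean
                        Xc.var(axis=0) + 1e-9)      # per-feature variance
        return model

    def bayes_classify(model, x):
        """Pick the class maximizing log prior + log likelihood under the
        estimated class-conditional Gaussian."""
        def score(c):
            prior, mu, var = model[c]
            loglik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
            return np.log(prior) + loglik
        return max(model, key=score)

    # Usage with two overlapping one-dimensional classes.
    rng = np.random.default_rng(0)
    X = np.concatenate([rng.normal(0, 1, (100, 1)), rng.normal(1.5, 1, (100, 1))])
    y = np.array([0] * 100 + [1] * 100)
    model = fit_per_class(X, y)
    print(bayes_classify(model, np.array([0.2])))

Because the per-class densities are kept, the same model can also report estimated conditional class probabilities at a query point, which is the extra information the abstract refers to under class overlap.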
We present a new supervised learning procedure for ensemble machines, in which the outputs of predictors trained on different distributions are combined by a dynamic classifier combination model. This procedure may be viewed either as a version of the mixture of experts (Jacobs, Jordan, Nowlan, & Hinton, 1991) applied to classification, or as a variant of the boosting algorithm (Schapire, 1990). As a variant of the mixture of experts, it can be made appropriate for general classification and regression problems by initializing the partition of the data set across the different experts in a boosting-like manner. Viewed as a variant of the boosting algorithm, its main gain is the use of a dynamic combination model for the outputs of the networks. Results are demonstrated on a synthetic example and on a digit recognition task from the NIST database, and compared with classical ensemble approaches.
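A hedged sketch of the dynamic combination idea described above, in the mixture-of-experts spirit: expert predictions are averaged with input-dependent gating weights produced by a gating model. The toy experts and the hand-set linear gate below are stand-ins for the trained networks in the paper.

    import numpy as np

    def softmax(z):
        z = z - z.max()
        e = np.exp(z)
        return e / e.sum()

    def combine(experts, gate_weights, x):
        """experts: callables mapping x to a class-probability vector.
        gate_weights: (n_experts x dim(x)) matrix defining a linear gating net.
        Returns the gate-weighted mixture of the expert outputs for input x."""
        g = softmax(gate_weights @ x)                 # input-dependent weights
        outputs = np.stack([e(x) for e in experts])   # one row per expert
        return g @ outputs                            # combined prediction

    # Usage: two toy experts, each more reliable on a different half of the input space.
    expert_a = lambda x: np.array([0.9, 0.1]) if x[0] < 0 else np.array([0.6, 0.4])
    expert_b = lambda x: np.array([0.3, 0.7]) if x[0] < 0 else np.array([0.1, 0.9])
    gate = np.array([[-5.0], [5.0]])                  # trusts expert_a for x<0, expert_b for x>0
    print(combine([expert_a, expert_b], gate, np.array([1.0])))   # close to expert_b's output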
In computational learning theory, continuous efforts are made to formulate models of machine learning that are more realistic than previously available models. Two of the most popular recently proposed models, Valiant's PAC learning model and Angluin's query learning model, can be thought of as refinements of preceding models such as Gold's classic paradigm of identification in the limit, in which the question of how fast learning can take place is emphasized. A considerable number of results have been obtained within these two frameworks, resolving the learnability questions of many important classes of functions and languages. These two particular learning models are by no means comprehensive, however, and many important aspects of learning are not directly addressed in them. Aiming towards more realistic theories of learning, many new models and extensions of existing learning models that attempt to formalize such aspects have been developed recently. In this paper, we review some of these new extensions and models in computational learning theory, concentrating in particular on those proposed and studied by researchers at the Theory NEC Laboratory of RWCP and by their colleagues at other institutions.
We establish some general results concerning PAC learning. We find a characterization of the property that any consistent algorithm is PAC, and show that the shrinking width property is equivalent to PUAC learnability. By counterexample, PAC and PUAC learning are shown to be different concepts. We find conditions ensuring that any nearly consistent algorithm is PAC or PUAC, respectively. The VC dimension of recurrent neural networks and folding networks is infinite; for restricted inputs, however, bounds exist, and these bounds are transferred to folding networks. We find conditions on the probability of the input space ensuring polynomial learnability: the probability of sequences or trees has to converge to zero sufficiently fast with increasing length or height. Finally, we give an example of a concept class that requires exponentially growing sample sizes for accurate generalization.
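For orientation, the sketch below shows how a sample-size requirement of the kind discussed above is evaluated numerically, assuming the textbook PAC bound for a finite hypothesis class, m >= (ln|H| + ln(1/delta)) / epsilon. This is the standard bound, not one of the specific bounds derived in the paper for recurrent or folding networks.

    import math

    def pac_sample_size(hypothesis_count, eps, delta):
        """Number of examples sufficient for a consistent learner over a finite
        class to be eps-accurate with probability at least 1 - delta."""
        return math.ceil((math.log(hypothesis_count) + math.log(1 / delta)) / eps)

    # Usage: monotone conjunctions over n = 20 Boolean attributes, |H| = 2**20.
    print(pac_sample_size(2 ** 20, eps=0.05, delta=0.01))   # about 370 examples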
Authors:
Ono, H; Yagiura, M; Ibaraki, T
Kyushu Univ, Grad Sch Informat Sci & Elect Engn, Dept Comp Sci & Commun Engn, Fukuoka 812-8581, Japan
Kyoto Univ, Grad Sch Informat, Dept Appl Math & Phys, Kyoto 606-8501, Japan
Logical analysis of data (LAD) is one of the methodologies for extracting knowledge in the form of a Boolean function f from a given pair of data sets (T, F) on an attribute set S of size n, in which T (resp., F) ⊆ {0,1}^n denotes a set of positive (resp., negative) examples for the phenomenon under consideration. In this paper, we consider the case in which the extracted knowledge f has a decomposable structure: f(x) = g(x[S0], h(x[S1])) for some S0, S1 ⊆ S and Boolean functions g and h, where x[I] denotes the projection of vector x on I. In order to detect meaningful decomposable structures, however, the sizes |T| and |F| must be sufficiently large. Based on probabilistic analysis, we provide an index for this indispensable number of examples to detect decomposability; we claim that there exist many deceptive decomposable structures of (T, F) if |T|·|F| < 2^(n-1). Computational results on synthetically generated data sets and real-world data sets show that the above index gives a good lower bound on the indispensable data size. (C) 2004 Elsevier B.V. All rights reserved.
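A small sketch around the quantities in this abstract, under the assumption that the index compares |T|·|F| with 2^(n-1) as stated above. It evaluates the index for a given data pair and brute-forces, for one fixed partition (S0, S1), whether some Boolean function h makes f(x) = g(x[S0], h(x[S1])) consistent with (T, F); the brute force enumerates h only on the S1-projections occurring in the data, so it is practical only for tiny examples.

    from itertools import product

    def index_ok(T, F, n):
        """True when |T|*|F| >= 2^(n-1), i.e. the data size reaches the index."""
        return len(T) * len(F) >= 2 ** (n - 1)

    def decomposable(T, F, S0, S1):
        """Is there a Boolean h on x[S1] (and some g) such that
        f(x) = g(x[S0], h(x[S1])) is consistent with (T, F)?"""
        proj = lambda x, S: tuple(x[i] for i in S)
        keys = sorted({proj(x, S1) for x in T + F})       # distinct S1-projections
        for values in product((0, 1), repeat=len(keys)):  # every candidate h
            h = dict(zip(keys, values))
            reduced_pos = {(proj(x, S0), h[proj(x, S1)]) for x in T}
            reduced_neg = {(proj(x, S0), h[proj(x, S1)]) for x in F}
            if not (reduced_pos & reduced_neg):           # some g can separate them
                return True
        return False

    # Usage: n = 4 attributes, S0 = {0, 1}, S1 = {2, 3}.
    T = [(1, 0, 1, 1), (0, 1, 0, 0)]
    F = [(1, 0, 0, 0), (0, 1, 1, 1)]
    print(index_ok(T, F, 4), decomposable(T, F, [0, 1], [2, 3]))   # False True

On this tiny data the index is not met, yet a consistent decomposition is still found; this is exactly the kind of possibly deceptive structure the paper's bound warns about when |T|·|F| < 2^(n-1).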