ISBN (print): 3540340211
We survey the fastest known algorithms for learning various expressive classes of Boolean functions in the Probably Approximately Correct (PAC) learning model.
We provide new results for noise-tolerant and sample-efficient learning algorithms under s-concave distributions. The new class of s-concave distributions is a broad and natural generalization of log-concavity, and includes many important additional distributions, e.g., the Pareto distribution and t-distribution. This class has been studied in the context of efficient sampling, integration, and optimization, but much remains unknown about the geometry of this class of distributions and their applications in the context of learning. The challenge is that unlike the commonly used distributions in learning (uniform or more generally log-concave distributions), this broader class is not closed under the marginalization operator and many such distributions are fat-tailed. In this work, we introduce new convex geometry tools to study the properties of s-concave distributions and use these properties to provide bounds on quantities of interest to learning including the probability of disagreement between two halfspaces, disagreement outside a band, and the disagreement coefficient. We use these results to significantly generalize prior results for margin-based active learning, disagreement-based active learning, and passive learning of intersections of halfspaces. Our analysis of geometric properties of s-concave distributions might be of independent interest to optimization more broadly.
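As a rough illustration of one quantity studied here, the sketch below estimates the probability of disagreement between two halfspaces under a fat-tailed product distribution by Monte Carlo. The Pareto-style sampler, the rotation angle, and all parameter values are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def disagreement_probability(w1, w2, sample, n=100_000, seed=0):
    """Monte Carlo estimate of Pr[sign(w1.x) != sign(w2.x)] under `sample`."""
    rng = np.random.default_rng(seed)
    X = sample(rng, n)                       # n x d matrix of draws
    return np.mean(np.sign(X @ w1) != np.sign(X @ w2))

def pareto_like(rng, n, d=2, alpha=2.5):
    """A fat-tailed product distribution: symmetrized Pareto marginals."""
    signs = rng.choice([-1.0, 1.0], size=(n, d))
    return signs * rng.pareto(alpha, size=(n, d))

w1 = np.array([1.0, 0.0])
w2 = np.array([np.cos(0.1), np.sin(0.1)])    # halfspace normal rotated ~0.1 rad
print(disagreement_probability(w1, w2, pareto_like))
```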
ISBN (print): 9781479945528
In this paper we study Temporal Difference (TD) learning with linear value function approximation. The classic TD algorithm is known to be unstable with linear function approximation and off-policy learning. Recently developed Gradient TD (GTD) algorithms have addressed this problem successfully. Despite their desirable properties of good scalability and convergence to correct solutions, they inherit the potential weakness of slow convergence, as they are stochastic gradient descent algorithms. Accelerated stochastic gradient descent algorithms have been developed to speed up convergence while keeping computational complexity low. In this work, we develop an accelerated stochastic gradient descent method for minimizing the Mean Squared Projected Bellman Error (MSPBE) and derive a bound for the Lipschitz constant of the gradient of the MSPBE, which plays a critical role in our proposed accelerated GTD algorithms. Our comprehensive numerical experiments demonstrate promising performance in solving the policy evaluation problem in comparison to the GTD algorithm family. In particular, accelerated TDC surpasses state-of-the-art algorithms.
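For orientation, the sketch below implements the plain (unaccelerated) GTD2 update, one member of the GTD family that minimizes the MSPBE on two timescales. The synthetic feature stream, rewards, and step sizes are assumptions for illustration; this is not the accelerated method proposed in the paper.

```python
import numpy as np

def gtd2_step(theta, w, phi, phi_next, reward, gamma, alpha, beta):
    """One GTD2 update: two-timescale stochastic gradient on the MSPBE."""
    delta = reward + gamma * (phi_next @ theta) - phi @ theta  # TD error
    theta = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    w = w + beta * (delta - phi @ w) * phi                     # auxiliary weights
    return theta, w

# Toy usage on random features (a stand-in for a real policy-evaluation stream).
rng = np.random.default_rng(0)
d = 4
theta, w = np.zeros(d), np.zeros(d)
for _ in range(1000):
    phi, phi_next = rng.standard_normal(d), rng.standard_normal(d)
    reward = phi.sum()                       # arbitrary synthetic reward
    theta, w = gtd2_step(theta, w, phi, phi_next, reward,
                         gamma=0.9, alpha=0.01, beta=0.05)
```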
ISBN (print): 9781424429271
In this paper, we provide the first comprehensive comparison of methods for part-of-speech tagging and chunking for Hindi. We present an analysis of the application of three major learning algorithms (viz. Maximum Entropy Models [2] [9], Conditional Random Fields [12], and Support Vector Machines [8]) to part-of-speech tagging and chunking for the Hindi language using datasets of different sizes. The use of language-independent features makes this analysis more general and allows important conclusions to be drawn for similar South and South East Asian languages. The results show that CRFs outperform SVMs and MaxEnt in terms of accuracy. We achieve an accuracy of 92.26% for part-of-speech tagging and 93.57% for chunking using the Conditional Random Fields algorithm. The corpus we used had 138,177 annotated instances for training. We report results for the three learning algorithms under varying conditions (clustering, BIEO vs. BIES notation, multiclass methods for SVMs, etc.) and present an extensive analysis of the whole process. These results will give future researchers insight into how to shape their research, keeping in mind the comparative performance of the major algorithms on datasets of various sizes and under various conditions.
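As a minimal sketch of CRF-based tagging in this spirit, the example below trains a CRF with the sklearn-crfsuite library on a toy sentence using simple language-independent features (affixes and context words). The feature set, toy data, and hyperparameters are assumptions, not the paper's actual setup.

```python
import sklearn_crfsuite  # pip install sklearn-crfsuite

def token_features(sent, i):
    """Language-independent features: the token plus affix and context cues."""
    word = sent[i]
    return {
        "word": word,
        "prefix3": word[:3],
        "suffix3": word[-3:],
        "is_digit": word.isdigit(),
        "prev": sent[i - 1] if i > 0 else "<BOS>",
        "next": sent[i + 1] if i < len(sent) - 1 else "<EOS>",
    }

# Toy training data; a real setup would use an annotated Hindi corpus.
sents = [["मैं", "घर", "जाता", "हूँ"]]
tags  = [["PRP", "NN", "VM", "VAUX"]]

X = [[token_features(s, i) for i in range(len(s))] for s in sents]
crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit(X, tags)
print(crf.predict(X))
```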
ISBN (print): 3540407200
A general class of no-regret learning algorithms, called Phi-no-regret learning algorithms, is defined, which spans the spectrum from no-internal-regret learning to no-external-regret learning, and beyond. Phi describes the set of strategies to which the play of a learning algorithm is compared: a learning algorithm satisfies Phi-no-regret iff no regret is experienced for playing as the algorithm prescribes, rather than playing according to any of the transformations of the algorithm's play prescribed by elements of Phi. Analogously, a class of game-theoretic equilibria, called Phi-equilibria, is defined, and it is shown that the empirical distribution of play of Phi-no-regret algorithms converges to the set of Phi-equilibria. Perhaps surprisingly, the strongest form of no-regret algorithm in this class is the no-internal-regret algorithm. Thus, the tightest game-theoretic solution concept to which Phi-no-regret algorithms (provably) converge is correlated equilibrium. In particular, Nash equilibrium is not a necessary outcome of learning via any Phi-no-regret learning algorithm.
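To make the no-external-regret end of this spectrum concrete, the sketch below implements the classical Hedge (multiplicative-weights) update, a standard no-external-regret learner, and measures its external regret on random losses. The learning rate and loss stream are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def hedge(loss_matrix, eta=0.1):
    """Multiplicative weights (Hedge): a classical no-external-regret learner.

    loss_matrix[t, i] is the loss of action i at round t, in [0, 1].
    Returns the sequence of mixed strategies played.
    """
    T, n = loss_matrix.shape
    weights = np.ones(n)
    plays = []
    for t in range(T):
        p = weights / weights.sum()               # current mixed strategy
        plays.append(p)
        weights *= np.exp(-eta * loss_matrix[t])  # penalize lossy actions
    return np.array(plays)

rng = np.random.default_rng(0)
losses = rng.random((500, 3))
p = hedge(losses)
# External regret: cumulative expected loss minus the best fixed action's loss.
regret = (p * losses).sum() - losses.sum(axis=0).min()
print(f"external regret after 500 rounds: {regret:.2f}")
```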
Decision tree learning algorithms such as CART are generally based on heuristics that greedily maximize the purity gain. Though these algorithms are practically successful, theoretical properties such as consistency are far from clear. In this paper, we discover that the most serious obstacle hindering consistency analysis for decision tree learning algorithms lies in the fact that the worst-case purity gain, i.e., the core heuristic for tree splitting, can be zero. Based on this observation, we present a new algorithm, named Grid Classification And Regression Tree (GridCART), with a provable consistency rate O(n^{-1/(d+2)}), which is the first consistency rate proved for heuristic tree learning algorithms.
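The purity-gain heuristic at issue can be stated in a few lines. The sketch below computes the Gini purity gain of a candidate split and returns zero for degenerate splits, the case the paper identifies as the obstacle; it shows the generic CART heuristic, not GridCART itself.

```python
import numpy as np

def gini(labels):
    """Gini impurity: 1 - sum_k p_k^2."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def purity_gain(labels, left_mask):
    """Impurity reduction from splitting `labels` by the boolean `left_mask`."""
    left, right = labels[left_mask], labels[~left_mask]
    n, nl, nr = len(labels), len(left), len(right)
    if nl == 0 or nr == 0:
        return 0.0   # degenerate split: zero gain (the problematic worst case)
    return gini(labels) - (nl / n) * gini(left) - (nr / n) * gini(right)

y = np.array([0, 0, 1, 1, 1, 0])
x = np.array([0.1, 0.2, 0.8, 0.9, 0.7, 0.3])
print(purity_gain(y, x < 0.5))   # gain of the threshold split at 0.5
```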
ISBN (print): 9783642161100
The availability of mobile devices without a keypad, such as Apple's iPad and iPhone, grows continuously, and with it the demand for sophisticated input methods. In this paper we present classifiers for on-line handwriting recognition based on SVM and kNN algorithms and provide a comparison of the different classifiers using the freely available handwriting corpus UjiPenchars2. We further investigate how their performance can be improved by parallelization and how these improvements can be utilized on a mobile device.
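A minimal scikit-learn sketch of such an SVM-vs-kNN comparison is shown below. Synthetic vectors stand in for features extracted from the UjiPenchars2 trajectories, and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for stroke features; a real pipeline would load and
# featurize the UjiPenchars2 trajectories instead.
rng = np.random.default_rng(0)
X = rng.standard_normal((600, 16))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, clf in [("kNN", KNeighborsClassifier(n_neighbors=5)),
                  ("SVM", SVC(kernel="rbf", C=1.0))]:
    clf.fit(X_tr, y_tr)
    print(name, "accuracy:", clf.score(X_te, y_te))
```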
ISBN (print): 9780780397958
Locally Linear Model Tree (LOLIMOT) and Piecewise Linear Network (PLN) learning algorithms are two approaches to local linear modeling that use different algorithms in each part of the training phase. PLN learning depends more heavily on the training data than LOLIMOT and needs a rich training data set. PLN learning requires no division test, which makes it much faster than LOLIMOT, but it may create adjacent neurons that lead to singularity in the regression matrix. In LOLIMOT, because of the regular splitting of the input space, this problem does not occur and the algorithm always reaches an acceptable output error, but it needs a large number of neurons. Therefore, the PILIMOT learning algorithm is introduced as a modified combination of these two main locally linear approaches. The new method inherits a suitable error and neuron count from both algorithms and leads to an efficient network applicable to identifying all functions. Simulation results show the advantage and behavior of the new method.
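Both families share the same prediction structure: a sum of local linear models weighted by normalized Gaussian validity functions. The sketch below shows that prediction step for a hand-set one-dimensional model; the centers, widths, and local coefficients are illustrative assumptions, not output of either training algorithm.

```python
import numpy as np

def lolimot_predict(x, centers, widths, coefs, intercepts):
    """Locally linear model output: sum_i Phi_i(x) * (a_i * x + b_i),
    where Phi_i are normalized Gaussian validity functions."""
    x = np.atleast_1d(x)[:, None]                        # shape (n, 1)
    phi = np.exp(-0.5 * ((x - centers) / widths) ** 2)   # (n, m) memberships
    phi /= phi.sum(axis=1, keepdims=True)                # normalize per input
    local = coefs * x + intercepts                       # each local linear model
    return (phi * local).sum(axis=1)

# Two hand-set local models roughly approximating |x| on [-1, 1].
centers    = np.array([-0.5, 0.5])
widths     = np.array([0.4, 0.4])
coefs      = np.array([-1.0, 1.0])
intercepts = np.array([0.0, 0.0])
print(lolimot_predict(np.linspace(-1, 1, 5), centers, widths, coefs, intercepts))
```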
The design of efficient differentially private (DP) learning algorithms with dimension-independent learning guarantees has been one of the central challenges in the field of privacy-preserving machine learning. Existing algorithms either suffer from weak generalization guarantees, restrictive model assumptions, or quite large computational cost. In non-private learning, dimension-independent generalization guarantees based on the notion of confidence margin were shown to be the most informative and useful learning guarantees. This motivates a systematic study of DP learning algorithms with confidence-margin generalization guarantees. Recent work has started exploring this direction in the context of linear and kernel-based classification as well as certain classes of neural networks (NNs). Despite several positive results, a number of fundamental questions remain open. We identify two major open problems related to DP margin-based learning algorithms. The first concerns the design of algorithms with a more favorable computational cost. The second pertains to achieving margin guarantees for NNs under DP with no explicit dependence on the network size.
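One standard building block behind many DP learning algorithms is the noisy clipped-gradient step (DP-SGD style); the sketch below shows it for logistic regression. The clipping norm and noise scale are placeholder assumptions and are not calibrated to any formal (epsilon, delta) guarantee, which would require a privacy accountant.

```python
import numpy as np

def dp_sgd_step(w, X, y, lr=0.1, clip=1.0, sigma=1.0, rng=None):
    """One DP-SGD-style step for logistic loss: clip each per-example
    gradient to norm `clip`, then add Gaussian noise. `sigma` is a
    placeholder; a real deployment calibrates it to a privacy budget."""
    rng = rng or np.random.default_rng()
    margins = y * (X @ w)                                  # labels y in {-1, +1}
    per_ex = (-y / (1 + np.exp(margins)))[:, None] * X     # per-example grads
    norms = np.linalg.norm(per_ex, axis=1, keepdims=True)
    per_ex *= np.minimum(1.0, clip / np.maximum(norms, 1e-12))   # clip
    noise = rng.normal(0.0, sigma * clip, size=w.shape)
    grad = (per_ex.sum(axis=0) + noise) / len(X)
    return w - lr * grad

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X[:, 0] + 0.1 * rng.standard_normal(200))
w = np.zeros(5)
for _ in range(100):
    w = dp_sgd_step(w, X, y, rng=rng)
```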
ISBN (print): 9781424412099
Incremental learning (IL) plays a key role in many real-world applications where data arrives over time. It is mainly concerned with learning models in an ever-changing environment. In this paper, we review some of the incremental learning algorithms and evaluate them within the same experimental settings in order to provide as objective a comparative study as possible. These algorithms include fuzzy ARTMAP, nearest generalized exemplar, growing neural gas, generalized fuzzy min-max neural network, and IL based on function decomposition (ILFD).
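As a minimal illustration of the incremental setting (using a different model family than those reviewed above), the scikit-learn sketch below updates a linear classifier on batches as they arrive via partial_fit. The synthetic stream and hyperparameters are assumptions.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
clf = SGDClassifier(random_state=0)

for step in range(10):
    X = rng.standard_normal((100, 8))        # a new batch "arrives"
    y = (X[:, 0] > 0).astype(int)
    if step == 0:
        clf.partial_fit(X, y, classes=np.array([0, 1]))  # declare labels once
    else:
        clf.partial_fit(X, y)                # update without full retraining

X_test = rng.standard_normal((50, 8))
print("accuracy:", clf.score(X_test, (X_test[:, 0] > 0).astype(int)))
```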