Non-linear models recently receive a lot of attention as people are starting to discover the power of statistical and embedding features. However, tree-based models are seldom studied in the context of structured lear...
详细信息
ISBN:
(纸本)9781941643723
Non-linear models recently receive a lot of attention as people are starting to discover the power of statistical and embedding features. However, tree-based models are seldom studied in the context of structured learning despite their recent success on various classification and ranking tasks. In this paper, we propose S-MART, a tree-based structured learning framework based on multiple additive regression trees. S-MART is especially suitable for handling tasks with dense features, and can be used to learn many different structures under various loss functions. We apply S-MART to the task of tweet entity linking - a core component of tweet information extraction, which aims to identify and link name mentions to entities in a knowledge base. A novel inference algorithm is proposed to handle the special structure of the task. The experimental results show that S-MART significantly outperforms state-of-the-art tweet entity linking systems.
Prior use of machine learning in genre classification used a list of labels as classification categories. However, genre classes are often organised into hierarchies, e.g., covering the subgenres of fiction. In this p...
详细信息
ISBN:
(纸本)9781932432664
Prior use of machine learning in genre classification used a list of labels as classification categories. However, genre classes are often organised into hierarchies, e.g., covering the subgenres of fiction. In this paper we present a method of using the hierarchy of labels to improve the classification accuracy. As a testbed for this approach we use the Brown Corpus as well as a range of other corpora, including the BNC, HGC and Syracuse. The results are not encouraging: apart from the Brown corpus, the improvements of our structural classifier over the flat one are not statistically significant. We discuss the relation between structural learning performance and the visual and distributional balance of the label hierarchy, suggesting that only balanced hierarchies might profit from structural learning.
Successful application of multi-view co-training algorithms relies on the ability to factor the available features into views that are compatible and uncorrelated. This can potentially preclude their use on problems s...
详细信息
ISBN:
(纸本)1932432132
Successful application of multi-view co-training algorithms relies on the ability to factor the available features into views that are compatible and uncorrelated. This can potentially preclude their use on problems such as coreference resolution that lack an obvious feature split. To bootstrap coreference classifiers, we propose and evaluate a single-view weakly supervised algorithm that relies on two different learning algorithms in lieu of the two different views required by co-training. In addition, we investigate a method for ranking unlabeled instances to be fed back into the bootstrapping loop as labeled data, aiming to alleviate the problem of performance deterioration that is commonly observed in the course of bootstrapping.
Gradient descent training of sigmoidal feed-forward neural networks on binary mappings often gets stuck with someout puts totally wrong. This is because a sum-squared-error cost function leads to weight updates that d...
详细信息
Gradient descent training of sigmoidal feed-forward neural networks on binary mappings often gets stuck with someout puts totally wrong. This is because a sum-squared-error cost function leads to weight updates that depend on the derivative of the output sigmoid which goes to zero as the output approaches maximal error. Although it is easy to understand the cause, the best remedy is not so obvious. Common solutions involve modifying the training data, deviating from true gradient descent, or changing the cost function. In general, finding the best learning procedures for particular classes of problem is difficult because each usually depends on a number of interacting parameters that need to be set to optimal values for a fair comparison. In this paper I shall use simulated evolution to optimise all the relevant parameters, and come to a clear conclusion concerning the most efficient approach for learning binary mappings. (C) 2003 Elsevier Science Ltd. All rights reserved.
In our media-driven world the perception of companies and institutions in the media is of major importance. The creation of press reviews analyzing the media response to company-related events is a complex and time-co...
详细信息
ISBN:
(纸本)9781945626036
In our media-driven world the perception of companies and institutions in the media is of major importance. The creation of press reviews analyzing the media response to company-related events is a complex and time-consuming task. In this demo we present a system that combines advanced text mining and machine learning approaches in an extensible press review system. The system collects documents from heterogeneous sources and enriches the documents applying different mining, filtering, classification, and aggregation algorithms. We present a system tailored to the needs of the press department of a major German University. We explain how the different components have been trained and evaluated. The system enables us demonstrating the live analyzes of news and social media streams as well as the strengths of advanced text mining algorithms for creating a comprehensive media analysis.
learning general representations of text is a fundamental problem for many natural language understanding (NLU) tasks. Previously, researchers have proposed to use language model pre-training and multi-task learning t...
详细信息
ISBN:
(纸本)9781950737901
learning general representations of text is a fundamental problem for many natural language understanding (NLU) tasks. Previously, researchers have proposed to use language model pre-training and multi-task learning to learn robust representations. However, these methods can achieve sub-optimal performance in low-resource scenarios. Inspired by the recent success of optimization-based meta-learning algorithms, in this paper, we explore the model-agnostic meta-learning algorithm (MAML) and its variants for low-resource NLU tasks. We validate our methods on the GLUE benchmark and show that our proposed models can outperform several strong baselines. We further empirically demonstrate that the learned representations can be adapted to new tasks efficiently and effectively.
Machine learning(ML) makes machines independent and self-learning component. Researchers applying machine learning algorithms to solve various real word problems in various domains. Nowadays agriculture affects by var...
详细信息
learning algorithms were implemented for the elimination of strong hiss found in old records and of impulse noise affecting transmitted audio signals. The rough-set method was tested with regard to the automatic setti...
详细信息
learning algorithms were implemented for the elimination of strong hiss found in old records and of impulse noise affecting transmitted audio signals. The rough-set method was tested with regard to the automatic setting of the cutoff threshold in the spectral filtration of noisy audio. Applied methods, results of musical signal processing, and conclusions are presented.
We have designed several new lazy learning algorithms for learning problems with many binary features and classes. This particular type of learning task can be found in many machine learning applications but is of spe...
详细信息
ISBN:
(纸本)3540649921
We have designed several new lazy learning algorithms for learning problems with many binary features and classes. This particular type of learning task can be found in many machine learning applications but is of special importance for machine learning of natural language. Besides pure instance-based learning we also consider prototype-based learning, which has the big advantage of a large reduction of the required memory and processing time for classification. As an application for our learning algorithms we have chosen natural language database interfaces. In our interface architecture the machine learning module replaces an elaborate semantic analysis component. The learning task is to select the correct command class based on semantic features extracted from the user input. We use an existing German natural language interface to a production planning and control system as a case study for our evaluation and compare the results achieved by the different lazy learning algorithms.
Generative adversarial imitation learning (GAIL) is a popular inverse reinforcement learning approach for jointly optimizing policy and reward from expert trajectories. A primary question about GAIL is whether applyin...
详细信息
Generative adversarial imitation learning (GAIL) is a popular inverse reinforcement learning approach for jointly optimizing policy and reward from expert trajectories. A primary question about GAIL is whether applying a certain policy gradient algorithm to GAIL attains a global minimizer (i.e., yields the expert policy), for which existing understanding is very limited. Such global convergence has been shown only for the linear (or linear-type) MDP and linear (or linearizable) reward. In this paper, we study GAIL under general MDP and for nonlinear reward function classes (as long as the objective function is strongly concave with respect to the reward parameter). We characterize the global convergence with a sublinear rate for a broad range of commonly used policy gradient algorithms, all of which are implemented in an alternating manner with stochastic gradient ascent for reward update, including projected policy gradient (PPG)-GAIL, Frank-Wolfe policy gradient (FWPG)-GAIL, trust region policy optimization (TRPO)-GAIL and natural policy gradient (NPG)-GAIL. This is the first systematic theoretical study of GAIL for global convergence.
暂无评论