The workshop program of the Twenty-First National Conference on Artificial Intelligence was held July 16-17, 2006, in Boston, Massachusetts. The program was chaired by Joyce Chai and Keith Decker. The titles of the 17 ...
ISBN: (print) 3540290737
In classification problems, machine learning algorithms often assume that (dis)similar inputs lead to (dis)similar outputs. Two questions then naturally arise: what does it mean for two inputs to be similar, and how can this be used in a learning algorithm? In support vector machines, similarity between input examples is implicitly expressed by a kernel function that calculates inner products in the feature space. For numerical input examples the concept of an inner product is easy to define; for discrete structures such as sequences of symbolic data, however, it is less obvious. This article describes an approach to SVM learning for symbolic data that can serve as an alternative to the bag-of-words approach under certain circumstances. The bag-of-words approach first transforms symbolic data into vectors of numerical data, which are then used as arguments for one of the standard kernel functions. In contrast, we propose kernels that operate on the symbolic data directly.
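The contrast between the two routes can be illustrated with a small sketch. It is not the kernel proposed in the article, which the abstract does not specify. As a stand-in it uses the well-known k-spectrum (k-gram) kernel, and the function names and toy sentences are assumptions made for illustration only.

```python
from collections import Counter

def bag_of_words_kernel(a, b):
    """Linear kernel on symbol-count vectors (the bag-of-words route)."""
    ca, cb = Counter(a), Counter(b)
    # Inner product of the two count vectors; symbol order is ignored.
    return sum(ca[s] * cb[s] for s in ca)

def spectrum_kernel(a, b, k=2):
    """Illustrative stand-in: inner product of k-gram counts,
    computed from the symbol sequences themselves."""
    def kgrams(seq):
        return Counter(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))
    ca, cb = kgrams(a), kgrams(b)
    return sum(ca[g] * cb[g] for g in ca)

# Two toy sequences with identical symbols in a different order.
x = ["the", "dog", "bit", "the", "man"]
y = ["the", "man", "bit", "the", "dog"]

print(bag_of_words_kernel(x, x), bag_of_words_kernel(x, y))  # 7 7
print(spectrum_kernel(x, x), spectrum_kernel(x, y))          # 4 3
```

The bag-of-words kernel cannot tell the two sequences apart, whereas a kernel that looks at the sequence itself assigns them a lower similarity than a sequence has with itself.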
ISBN: (print) 3540290737
This paper presents an SVM-based learning system for information extraction (IE). One distinctive feature of our system is the use of a variant of the SVM, the SVM with uneven margins, which is particularly helpful for small training datasets. In addition, our approach requires fewer SVM classifiers to be trained than other recent SVM-based systems. The paper also compares our approach to several state-of-the-art systems (including rule learning and statistical learning algorithms) on three IE benchmark datasets: CoNLL-2003, CMU seminars, and the software jobs corpus. The experimental results show that our system outperforms a recent SVM-based system on CoNLL-2003, achieves the highest score on eight out of 17 categories on the Jobs corpus, and is second best on the remaining nine.
ISBN: (print) 3540290737
We discuss two kernel-based learning methods, namely Regularization Networks (RN) and Radial Basis Function (RBF) networks. RNs are derived from regularization theory; they have been studied thoroughly from a function approximation point of view and possess a sound theoretical background. RBF networks represent a model of artificial neural networks with both neuro-physiological and mathematical motivation. In addition, they may be treated as a generalized form of Regularization Networks. We demonstrate the performance of both approaches in experiments on both benchmark and real-life learning tasks. We claim that RN and RBF networks are comparable in terms of generalization error but differ in model complexity. The RN approach usually leads to solutions with a higher number of base units; thus, RBF networks can be used as a 'cheaper' alternative. This makes it possible to use RBF networks in modeling tasks with large amounts of data, such as time series prediction or semantic web classification.
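The difference in model complexity can be made concrete with a minimal sketch. It assumes the textbook form of a regularization network, with one Gaussian unit per training point and coefficients from (K + gamma I) c = y, and an RBF network with a smaller, hand-picked set of centers fitted by least squares. The kernel width, `gamma`, the centers, and the toy sine data are all illustrative assumptions, not the authors' experimental setup.

```python
import math

def gaussian(u, v, width=1.0):
    """Gaussian basis function on scalar inputs."""
    return math.exp(-((u - v) ** 2) / (2 * width ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def fit_rn(xs, ys, gamma=0.1):
    """Regularization network: one unit per training point."""
    n = len(xs)
    K = [[gaussian(xs[i], xs[j]) + (gamma if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return list(xs), solve(K, ys)

def fit_rbf(xs, ys, centers, ridge=1e-8):
    """RBF network: fewer units; weights from the normal equations.
    The tiny ridge term is an assumption added for numerical stability."""
    Phi = [[gaussian(x, c) for c in centers] for x in xs]
    h, n = len(centers), len(xs)
    A = [[sum(Phi[i][p] * Phi[i][q] for i in range(n))
          + (ridge if p == q else 0.0) for q in range(h)] for p in range(h)]
    b = [sum(Phi[i][p] * ys[i] for i in range(n)) for p in range(h)]
    return list(centers), solve(A, b)

def predict(model, x):
    """Evaluate a fitted model (centers, weights) at x."""
    centers, weights = model
    return sum(w * gaussian(x, c) for c, w in zip(centers, weights))

# Toy regression task: 13 samples of sin(x) on [0, 6].
xs = [i * 0.5 for i in range(13)]
ys = [math.sin(x) for x in xs]
rn = fit_rn(xs, ys)
rbf = fit_rbf(xs, ys, centers=[0.0, 1.5, 3.0, 4.5, 6.0])
print(len(rn[0]), "RN units;", len(rbf[0]), "RBF units")
```

Here the regularization network carries 13 base units, one per sample, while the RBF network uses 5, which is the sense in which the latter is the cheaper model.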
ISBN: (print) 3540290737
Currently the best algorithms for transcription factor binding site prediction are severely limited in accuracy. There is good reason to believe that predictions from these different classes of algorithms could be used in conjunction to improve the quality of predictions. In this paper, we apply single layer networks, rule sets and support vector machines to predictions from 12 key algorithms. Furthermore, we use a 'window' of consecutive results in the input vector in order to contextualise the neighbouring results. Moreover, we improve the classification result with the aid of under- and over-sampling techniques. We find that support vector machines outperform each of the original individual algorithms and the other classifiers employed in this work with both types of input, in that they maintain a better tradeoff between recall and precision.
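The windowing and over-sampling steps might look roughly like the sketch below. The abstract does not give the window size, the padding rule, or the sampling scheme, so zero-padding at the sequence ends, the `half_width` parameter, and random duplication of the minority class are all assumptions. The toy data uses 3 base algorithms rather than the 12 used in the paper.

```python
import random

def windowed_inputs(predictions, half_width=1):
    """Concatenate each position's predictions with its neighbours'."""
    n_algos = len(predictions[0])
    pad = [0] * n_algos  # assumption: zero-pad beyond the sequence ends
    rows = []
    for i in range(len(predictions)):
        row = []
        for j in range(i - half_width, i + half_width + 1):
            row.extend(predictions[j] if 0 <= j < len(predictions) else pad)
        rows.append(row)
    return rows

def oversample_minority(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until the classes balance."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    if not minority:
        return list(rows), list(labels)  # nothing to duplicate
    extra = [rng.choice(minority)
             for _ in range(len(majority) - len(minority))]
    keep = list(range(len(rows))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]

# Toy data: 5 sequence positions, 3 base algorithms, 1 true binding site.
predictions = [[1, 0, 0], [1, 1, 0], [0, 1, 1], [0, 0, 0], [0, 0, 1]]
labels = [0, 1, 0, 0, 0]

X = windowed_inputs(predictions, half_width=1)
X_bal, y_bal = oversample_minority(X, labels)
print(len(X[0]), sum(y_bal), len(y_bal))  # 9 4 8
```

The balanced rows and labels would then be passed to whichever classifier is being compared.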
Survival analysis is a branch of statistics concerned with the time elapsing before "failure", with diverse applications in medical statistics and the analysis of the reliability of electrical or mechanical ...
ISBN: (print) 3540679464
The proceedings contain 92 papers. The special focus in this conference is on Pierre Devijver Lecture and Hybrid and Combined methods. The topics include: some notes on twenty one (21) nearest prototype classifiers; adaptive graphical pattern recognition beyond connectionist-based approaches; current trends in grammatical inference; classifier's complexity control while training multilayer perceptrons; a framework for classifier fusion; image pattern recognition based on examples; a hybrid system for the recognition of hand-written characters; improving statistical measures of feature subsets by conventional and evolutionary approaches; selection of classifiers based on multiple classifier behaviour; the adaptive subspace map for image description and image database retrieval; adaptative automatic target recognition with SVM boosting for outlier detection; a multiresolution causal colour texture model; writer identification; syntactic pattern recognition by error correcting analysis on tree automata; offline recognition of syntax-constrained cursive handwritten text; structural classification for retrospective conversion of documents; segmentation of date field on bank cheques; grammars and discourse theory to describe and recognize mechanical assemblies; computation of the N best parse trees for weighted and stochastic context-free grammars; partitional vs hierarchical clustering using a minimum grammar complexity approach; encoding nondeterministic finite-state tree automata in sigmoid recursive neural networks; a structural matching algorithm using generalized deterministic annealing; alignment and correspondence using singular value decomposition; fast candidate elimination using machine learning techniques; and efficient alignment and correspondence using edit distance.
Predictive models have been widely used long before the development of the new field that we call data mining. Expanding application demand for data mining of ever increasing data warehouses, and the need for understa...