ISBN: 3540001883 (print)
The paper gives an overview of an interdisciplinary research project whose goal is to elucidate the complex phenomenon of expressive music performance with the help of machine learning and automated discovery methods. The general research questions that guide the project are laid out, and some of the most important results achieved so far are briefly summarized (with an emphasis on the most recent and still very speculative work). A broad view of the discovery process is given, from data acquisition issues through data visualization to inductive model building and pattern discovery. It is shown that it is indeed possible for a machine to make novel and interesting discoveries even in a domain like music. The report closes with a few general lessons learned and with the identification of a number of open and challenging research problems.
ISBN: 3540001883 (print)
A machine learning technique called Graph-Based Induction (GBI) extracts typical patterns from graph data by stepwise pair expansion (pairwise chunking). Because of its greedy search strategy, it is very efficient but suffers from incompleteness of search. Its search capability is improved, without adding much computational complexity, by 1) incorporating a beam search, 2) using a different evaluation function to extract patterns that are more discriminatory than those that simply occur frequently, and 3) adopting canonical labeling to enumerate identical patterns accurately. This new algorithm, called Beam-wise GBI (B-GBI for short), was tested on a small DNA dataset from the UCI repository and was shown to be successful in extracting discriminatory substructures.
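As a rough illustration of the pairwise chunking idea described above, the following minimal sketch performs one greedy chunking step on a labeled directed graph: it counts the label pairs joined by an edge, picks the most frequent pair, and merges each non-overlapping occurrence into a single node. It is not the B-GBI algorithm from the paper: it has no beam search, no discriminative evaluation function, and no canonical labeling. The function names `most_frequent_pair` and `chunk_pair` and the graph representation (a label dictionary plus an edge list) are assumptions made for this example.

```python
from collections import Counter


def most_frequent_pair(labels, edges):
    """Return ((label_u, label_v), count) for the most frequent edge label pair."""
    counts = Counter((labels[u], labels[v]) for u, v in edges)
    return counts.most_common(1)[0] if counts else None


def chunk_pair(labels, edges, pair):
    """Merge each non-overlapping occurrence of `pair` into one chunk node.

    The chunk keeps the id of the first node of the pair (hypothetical
    convention chosen for this sketch) and gets a combined label.
    """
    used = set()      # nodes already consumed by a chunk in this step
    absorbed = {}     # second node id -> id of the chunk that absorbed it
    for u, v in edges:
        if u != v and (labels[u], labels[v]) == pair \
                and u not in used and v not in used:
            used.update((u, v))
            absorbed[v] = u

    chunk_label = f"({pair[0]}-{pair[1]})"
    new_labels = {n: l for n, l in labels.items() if n not in absorbed}
    for u in absorbed.values():
        new_labels[u] = chunk_label

    # Redirect edges to the chunk nodes and drop edges internal to a chunk.
    new_edges = []
    for u, v in edges:
        a, b = absorbed.get(u, u), absorbed.get(v, v)
        if a != b:
            new_edges.append((a, b))
    return new_labels, new_edges


# Toy graph: two occurrences of a->b and one occurrence of b->c.
labels = {1: "a", 2: "b", 3: "a", 4: "b", 5: "c"}
edges = [(1, 2), (3, 4), (2, 5)]

pair, count = most_frequent_pair(labels, edges)
new_labels, new_edges = chunk_pair(labels, edges, pair)
```

Repeating these two calls on the rewritten graph gives the stepwise expansion; because only the single best pair is followed at each step, patterns that need a less frequent pair early on can be missed, which is the incompleteness the abstract refers to.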
ISBN: 3540440682 (print)
The proceedings contain 58 papers. The special focus in this conference is on OCR Features, Systems, and Handwriting Recognition. The topics include: relating statistical image differences and degradation features; script identification in printed bilingual documents; optimal feature extraction for bilingual OCR; machine recognition of printed Kannada text; an integrated system for the analysis and the recognition of characters in ancient documents; a complete Tamil optical character recognition system; distinguishing between handwritten and machine printed text in bank cheque images; multi-expert seal imprint verification system for bankcheck processing; automatic reading of traffic tickets; a stochastic model combining discrete symbols and continuous attributes and its application to handwriting recognition; top-down likelihood word image generation model for holistic word recognition; the segmentation and identification of handwriting in noisy document images; the impact of large training sets on the recognition rate of offline Japanese kanji character classifiers; automatic completion of Korean words for open vocabulary pen interface; using stroke-number-characteristics for improving efficiency of combined online and offline Japanese character classifiers; a new criterion for choosing the best prolongation; classifier adaptation with non-representative training data; a learning pseudo Bayes discriminant method based on difference distribution of feature vectors; a complementarity-based analysis; discovering rules for dynamic configuration of multi-classifier systems; revisiting the majority voting system and its variations; correcting for variable skew; two geometric algorithms for layout analysis; text/graphics separation revisited; and a study on the document zone content classification problem.
ISBN: 9729805067 (print)
In this paper, we synthesize the main findings of three repeat purchase modelling case studies using real-life direct marketing data. Historically, direct marketing (and, more recently, targeted web marketing) has been one of the most popular domains for exploring the feasibility and viable use of novel business intelligence techniques. Many data mining techniques have been field tested in the direct marketing domain. This can be explained by the (relatively) low-cost availability of recency, frequency, monetary (RFM) and several other customer relationship data, the (relatively) well-developed understanding of the task and the domain, the clearly identifiable costs and benefits, and the fact that the results can often be readily applied to obtain a high return on investment. The purchase incidence modelling cases reported in this paper were undertaken primarily to trial state-of-the-art supervised Bayesian learning multilayer perceptron (MLP) and least squares support vector machine (LS-SVM) classifiers. For each of the cases, we also aimed to explore the explanatory power (relevance) of the available operationalizations of RFM and other customer relationship variables for predicting purchase incidence in the context of direct marketing.
ISBN: 9810475241 (print)
In pattern recognition, the goal of classification can be achieved through two different types of learning strategy: discriminative learning and informative learning. Discriminative learning focuses on extracting the discriminative information between classes. Informative learning emphasizes learning class information such as class densities. We review major discriminative learning methods, namely principal component analysis (PCA), linear discriminant analysis (LDA), the minimum classification error (MCE) training algorithm, and the support vector machine (SVM), and one informative learning method, Gaussian mixture models (GMM). We also discuss the combination of the two types of learning and give the corresponding experimental results.
Recently, Independent Component Analysis (ICA) has been applied not only to problems of blind signal separation but also to feature extraction of patterns. However, the effectiveness of features extracted by ICA (ICA features) has not yet been verified. One likely reason is that ICA features are obtained by increasing their independence rather than their class separability. Hence, we can expect high-performance pattern features to be obtained by introducing supervision into conventional ICA algorithms so that the class separability of the features is enhanced. In this work, we propose SICA, which maximizes the Mahalanobis distance between classes. Moreover, we propose a new distance measure in which each ICA feature is weighted by the power of the principal components that constitute the ICA feature. In recognition experiments on two data sets from the UCI Machine Learning Repository, we demonstrate that better recognition accuracy is attained when using features extracted by the proposed SICA.
ISBN: 0780370783 (print)
Studies in machine learning, data mining, and pattern classification often use a technique to select relevant features from a large data set. This technique is known as feature subset selection. It is performed in order to reduce the hypothesis search space, reduce storage, and enhance the performance of data mining or machine learning algorithms. In recent years, researchers have been actively focusing on this problem from the perspective of machine learning. This paper briefly studies the existing approaches to feature selection. We also study the effectiveness of granular information for feature selection and propose a simple feature-elimination-based algorithm that uses granular information.
Information extraction (IE) from semi-structured Web documents is a critical issue for information integration systems on the Internet. Previous work in wrapper induction aims to solve this problem by applying machine ...
This paper describes a Nearest Neighbour procedure for variable selection in function approximation, pattern classification, and time series prediction. Given a training set of input/output vector pairs, the procedure ...
Author: Geurts, Pierre
University of Liège, Department of Electrical and Computer Engineering, Institut Montefiore, Sart-Tilman B28, Liège B4000, Belgium
In this paper, we propose some new tools to allow machine learning classifiers to cope with time series data. We first argue that many time-series classification problems can be solved by detecting and combining local...