ISBN: 9783642213441 (print)
In this paper we propose a novel data clustering algorithm based on the idea of treating the individual data items as cells of a one-dimensional cellular automaton. The proposed algorithm combines insights from social segregation models based on cellular automata theory, in which the data items themselves move autonomously in lattices, and from ant clustering algorithms, particularly the idea of distributing the data items to be clustered at random in lattices. We present a series of experiments with both synthetic and real datasets in order to study convergence and performance empirically. These experimental results are compared with those obtained by conventional clustering algorithms.
Because facial expression is an essential way of conveying human feelings, this paper proposes a dynamic selection ensemble learning method to analyze emotion automatically. First, a feature selection algorithm based on rough sets and the domain-oriented data-driven data mining theory is proposed, which yields multiple reducts and candidate classifiers. Then the nearest neighborhood of each unseen sample is found in a validation subset, and the most accurate classifier is selected from the candidate classifiers. Finally, the selected classifier is used to recognize unseen samples. Experimental results show that the proposed method is effective and suitable for emotion recognition.
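The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rough-set reducts are not modeled, the candidate classifiers are assumed to be plain Python callables, and the function name `dynamic_select_predict`, the Euclidean distance, and the parameter `k` are all hypothetical choices made for illustration.

```python
import math
from typing import Callable, List, Sequence

# Assumption: a "classifier" is any callable mapping a feature vector to a label.
Classifier = Callable[[Sequence[float]], str]


def dynamic_select_predict(
    x: Sequence[float],
    classifiers: List[Classifier],
    val_X: List[Sequence[float]],
    val_y: List[str],
    k: int = 3,
) -> str:
    """Predict the label of x with the classifier that is most accurate
    on the k validation samples nearest to x (val_X must be non-empty)."""
    # Indices of the k validation samples closest to x (Euclidean distance).
    order = sorted(range(len(val_X)), key=lambda i: math.dist(x, val_X[i]))
    neighbors = order[:k]

    def local_accuracy(clf: Classifier) -> float:
        # Fraction of the neighbors that this classifier labels correctly.
        hits = sum(clf(val_X[i]) == val_y[i] for i in neighbors)
        return hits / len(neighbors)

    # Ties are broken in favor of the earlier classifier in the list.
    best = max(classifiers, key=local_accuracy)
    return best(x)
```

In this sketch the validation subset plays the role of a local competence estimate: a classifier that is weak overall can still be chosen when it is the most reliable one in the region around the unseen sample.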
ISBN: 9783642217869 (print)
The classical learning problem of pattern recognition in a finite-dimensional linear space of real-valued features is studied under the conditions of a non-stationary universe. The training criterion of non-stationary pattern recognition is formulated as a generalization of the classical support vector machine. The respective numerical algorithm has computational complexity proportional to the length of the training time series.
ISBN: 9783642217869 (print)
The paper describes a method for approximating the mass of coal moving on a conveyor belt under an ultrasonic sensor that measures the height of the coal pile. A process for defining the set of variables that affect the approximated coal mass is presented. A multiple regression model and a regression rule induction algorithm based on the M5 algorithm are used to relate momentary values of the coal pile to the mass of moving coal.
The proceedings contain 15 papers. The topics discussed include: estimating probability of failure of a complex system based on partial information about subsystems and components, with potential applications to aircraft maintenance; stepwise feature selection using multiple kernel learning; empirical reconstruction of fuzzy model of experiment in the Euclidean metric; SVM based offline handwritten Gurmukhi character recognition; obtaining of a minimal polygonal representation of a curve by means of a fuzzy clustering; KDDClus: a simple method for multi-density clustering; intelligent data mining for turbo-generator predictive maintenance: an approach in real-world; handwritten script identification from a bi-script document at line level using Gabor filters; image recognition using Kullback-Leibler information discrimination; beyond analytical modeling, gathering data to predict real agents' strategic interaction; and construction of enzyme network of Arabidopsis thaliana using graph theory.
ISBN: 9783642217869 (print)
In this paper, we consider the problem of extracting opinions from natural language texts, which is one of the tasks of sentiment analysis. We provide an overview of existing approaches to sentiment analysis, including supervised (Naive Bayes, maximum entropy, and SVM) and unsupervised machine learning methods. We apply three supervised learning methods (Naive Bayes, KNN, and a method based on the Jaccard index) to a dataset of Internet user reviews about cars and report the results. When learning a user's opinion on a specific feature of a car, such as speed or comfort, it turns out that training on full unprocessed reviews decreases the classification accuracy. We experiment with different approaches to preprocessing reviews in order to obtain representations that are relevant to the feature one wants to learn, and show the effect of each representation on the accuracy of classification.
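The abstract does not specify how the Jaccard-based method works, so the following is only one plausible reading, offered as a sketch: each review is reduced to a set of lowercase whitespace-separated tokens, and an unseen review receives the label of the most similar labeled review. The helper names `tokenize`, `jaccard`, and `classify` and the example reviews are hypothetical.

```python
from typing import List, Set, Tuple


def tokenize(text: str) -> Set[str]:
    """Assumed preprocessing: lowercase, split on whitespace, keep unique tokens."""
    return set(text.lower().split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |A & B| / |A | B|, taken as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def classify(review: str, labeled: List[Tuple[str, str]]) -> str:
    """Return the label of the labeled review most similar to `review`.

    `labeled` is a non-empty list of (text, label) pairs; ties go to the
    earliest pair in the list.
    """
    tokens = tokenize(review)
    best_text, best_label = max(
        labeled, key=lambda pair: jaccard(tokens, tokenize(pair[0]))
    )
    return best_label
```

Because the score depends on every token in both reviews, sentences unrelated to the target feature dilute the similarity, which is consistent with the abstract's observation that full unprocessed reviews hurt accuracy.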
Nowadays computer scientists are faced with fast-growing and permanently evolving data, which are represented as observations made sequentially in time. A common problem in the data mining community is the recognition...
ISBN: 9783642218804 (print)
A disjunctive model of box bicluster and tricluster analysis is considered. A least-squares locally optimal one-cluster method is proposed, oriented towards the analysis of binary data. The method involves a parameter, the scale shift, and is proven to lead to "contrast" box bi- and tri-clusters. An experimental study of the method is reported.
ISBN: 9783642217869 (print)
We propose a combinatorial technique for obtaining tight data-dependent generalization bounds based on a splitting and connectivity graph (SC-graph) of the set of classifiers. We apply this approach to a parametric set of conjunctive rules and propose an algorithm for effective SC-bound computation. Experiments on six data sets from the UCI ML Repository show that the SC-bound helps to learn more reliable rule-based classifiers as compositions of less overfitted rules.
ISBN: 9789038625379 (print)
We provide and illustrate a methodology for taking data into account in a knowledge diagnosis method for orthopaedic surgery, using Bayesian networks and machine learning techniques. We aim to make the design of the student model less time-consuming and less subjective. A first Bayesian network was built like an expert system, in which experts (in didactics and surgery) provide both the structure and the probabilities. However, learning the probability distributions of the variables allows moving from an expert network toward a more data-centric one. We compare and analyze various learning algorithms with regard to experimental data. We then point out some crucial issues, such as the lack of data.