ISBN: 9781424409723 (print)
Building accurate and reliable complex machines is not trivial, yet it is necessary for most real-life problems. Typical ensembles are often unsatisfactory. Meta-learning techniques can be much more powerful in composing optimal or close-to-optimal solutions to given tasks. Efficient meta-learning is possible only within a versatile and flexible data mining framework that provides uniform procedures for dealing with different kinds of methods, together with tools for thorough analysis of learning processes and their results. We propose a methodology for information exchange between machines of different abstraction levels. Inter-machine communication is based on a uniform representation of gained knowledge. Implemented in a general data mining framework, it provides tools for sophisticated analysis of the adaptive processes of heterogeneous machines. The resulting meta-knowledge is a valuable information source for further meta-learning.
ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
Transduction is an inference mechanism "from particular to particular". Applied to classification tasks, it uses both labeled (training) data and unlabeled (working) data to build a classifier whose main goal is to classify (only) the unlabeled data as accurately as possible. Unlike the classical inductive setting, no general rule valid for all possible instances is generated. Transductive learning is best suited to applications where the examples for which a prediction is needed are already known when the classifier is trained. Several approaches have been proposed in the literature for building transductive classifiers from data stored in a single table of a relational database. Nonetheless, no attention has been paid to applying the transduction principle in a (multi-)relational setting, where data are stored in multiple tables of a relational database. In this paper we propose a new transductive classifier, named TRANSC, which is based on a probabilistic approach to making transductive inferences from relational data. This new method works in a transductive setting and employs principled probabilistic classification in multi-relational data mining to face the challenges posed by some spatial data mining problems. Probabilistic inference allows us to compute the class probability and to return, in addition to the result of the transductive classification, the confidence in that classification. The predictive accuracy of TRANSC has been compared to that of its inductive counterpart in an empirical study involving a benchmark relational dataset and two spatial datasets. The results obtained are generally in favor of TRANSC, although the improvements are small.
ISBN: 9781424409723 (print)
With the extensive use of database systems, a large volume of historical circulation data has accumulated in university libraries. We apply data mining technology to discover useful knowledge in circulation data. Mining association rules with the Apriori algorithm has some shortcomings. This paper introduces two methods for improving the efficiency of the algorithm: filtering the basic item set, and ignoring transaction records that are useless for generating frequent itemsets. To meet the requirements of a personalized book recommendation service, we apply the improved algorithm to mine association rules from the circulation records of a university library. A service model is introduced that may be used to offer recommendation information to readers. The recommendation model can also be used in other fields, for example bookstores, information retrieval systems, and network reference databases.
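The two efficiency measures mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' procedure, so the code below is only one plausible reading: infrequent single items are filtered out before any joining, and transactions too short to contain a k-itemset are dropped before the k-th pass. The function name `apriori_with_pruning` and the use of an absolute support count are assumptions made for this illustration.

```python
from itertools import combinations


def apriori_with_pruning(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    Hypothetical sketch of two pruning ideas (not the paper's code):
      1. filter the basic item set: remove infrequent single items up front;
      2. skip transactions that are too short to contain any k-itemset.
    min_support is an absolute count of transactions.
    """
    # Pass 1: count single items.
    counts = {}
    for t in transactions:
        for item in set(t):
            counts[item] = counts.get(item, 0) + 1
    frequent_items = {i for i, c in counts.items() if c >= min_support}
    result = {frozenset([i]): counts[i] for i in frequent_items}

    # Pruning 1: keep only frequent items inside each transaction.
    active = [frozenset(t) & frequent_items for t in transactions]

    current = set(result)  # frequent itemsets of the previous size
    k = 1
    while current:
        k += 1
        # Pruning 2: a transaction with fewer than k items cannot
        # support any k-itemset, so it is ignored from now on.
        active = [t for t in active if len(t) >= k]

        # Join frequent (k-1)-itemsets and keep candidates whose
        # (k-1)-subsets are all frequent (the Apriori property).
        candidates = set()
        for a in current:
            for b in current:
                union = a | b
                if len(union) == k and all(
                    frozenset(s) in current for s in combinations(union, k - 1)
                ):
                    candidates.add(union)

        # Count support only over the remaining transactions.
        level = {}
        for c in candidates:
            n = sum(1 for t in active if c <= t)
            if n >= min_support:
                level[c] = n
        result.update(level)
        current = set(level)
    return result


# Toy circulation records: each list is the set of books one reader borrowed.
loans = [["A", "B", "C"], ["A", "B"], ["A", "C"], ["B", "C"], ["A", "B", "C", "D"]]
frequent = apriori_with_pruning(loans, min_support=3)
```

Because transactions only ever shrink or disappear, each later pass scans less data than the one before it; the resulting itemsets are the same as those of plain Apriori on the same input.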
ISBN: 9780769530697 (print)
Unexpected failures of complex equipment such as trains or aircraft introduce superfluous costs, disrupt operations, affect customer satisfaction, and potentially decrease safety. One of the objectives of Prognostics and Health Management (PHM) systems is to help reduce the number of unexpected failures by continuously monitoring the components of interest and predicting their failures sufficiently in advance to allow for proper planning. In other words, PHM systems may help turn unexpected failures into expected ones. Recent research has demonstrated the usefulness of data mining for building prognostic models for PHM, but has also identified the need for new model evaluation methods that take into account the specificities of prognostic applications. This paper investigates this problem. First, it reviews classical and recent methods for evaluating data mining models and explains their deficiencies with respect to prognostic applications. The paper then proposes a novel approach that overcomes these deficiencies. This approach integrates the various costs and benefits involved in prognostics to quantify the cost saving expected from a given prognostic model. From the end user's perspective, the formula is practical because it is easy to understand and requires realistic inputs. The paper illustrates the usefulness of the method through a real-world case study involving data mining prognostic models and realistic cost/benefit information. The results show the feasibility of the approach and its applicability to various prognostic applications.
ISBN: 9781424409723 (print)
Clustering of relational databases is an active research topic in data mining. In this paper, we introduce a Multi-relational Hierarchical Clustering Algorithm Based on Shared Nearest Neighbor Similarity (MHSNNS). First, the algorithm joins the tables through tuple ID propagation. Then, it groups objects into a large number of relatively small sub-clusters using the shared nearest neighbor algorithm and cluster cohesion. Finally, it finds the genuine clusters by repeatedly combining these sub-clusters using cluster separation. Experiments show the efficiency and scalability of this approach.
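Shared nearest neighbor (SNN) similarity, on which the sub-clustering step relies, can be sketched in a few lines. The example below is only an illustration on a single, already-joined table of numeric points; it does not cover tuple ID propagation, cluster cohesion, or cluster separation. It uses the simplest definition of SNN similarity (the number of neighbors two points have in common); the function names and the choice of Euclidean distance are assumptions, not details taken from the paper.

```python
import math


def k_nearest(points, k):
    """For each point, return the set of indices of its k nearest
    neighbours (Euclidean distance, the point itself excluded)."""
    neighbours = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: math.dist(p, points[j]))
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(neighbours, i, j):
    """Shared nearest neighbour similarity: how many of their
    k nearest neighbours points i and j have in common."""
    return len(neighbours[i] & neighbours[j])


# Two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
nn = k_nearest(points, k=2)
same_group = snn_similarity(nn, 0, 1)       # points from the same group
different_group = snn_similarity(nn, 0, 3)  # points from different groups
```

Points in the same dense region share neighbors and so score higher than points from different regions, which is what makes the measure useful for forming small, tight sub-clusters before any merging.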
ISBN: 9780769528410 (print)
In the last decade, spoken language processing has made significant advances; however, work on emotion recognition has not progressed as far and achieves only 50% to 60% accuracy [1]. This is because the majority of researchers in this field have focused on the synthesis of emotional speech rather than on automating human emotion recognition. Many research groups have focused on improving the performance of the classifier they use for emotion recognition, and little work has been done on data pre-processing, such as extracting and selecting a set of specific acoustic features instead of using all the available ones. Working with well-selected acoustic features does not delay the overall job; on the contrary, it saves time and resources by removing irrelevant information and reducing high-dimensional computation. In this paper, we develop an automatic feature selector based on the RF2TREE algorithm and the traditional C4.5 algorithm. RF2TREE helps address the problem of having too few data examples. The ensemble learning technique is applied to enlarge the original data set by building a bagged random forest that generates many virtual examples; the new data set is then used to train a single decision tree, which selects the most effective features for representing the speech signals for emotion recognition. Finally, the output of the selector is a set of specific acoustic features, produced by RF2TREE and a single decision tree.
ISBN: 9783540734994 (digital)
ISBN: 9783540734987 (print)
A Classification Association Rule (CAR), a common type of mined knowledge in data mining, describes an implicative co-occurring relationship between a set of binary-valued data attributes (items) and a pre-defined class, expressed in the form of an "antecedent ⇒ consequent-class" rule. Classification Association Rule Mining (CARM) is a recent Classification Rule Mining (CRM) approach that builds an Association Rule Mining (ARM) based classifier using CARs. Regardless of the particular methodology used to build it, a classifier is usually presented as an ordered CAR list, based on an applied rule ordering strategy. Five existing rule ordering mechanisms can be identified: (1) Confidence-Support-Size-of-Antecedent (CSA), (2) Size-of-Antecedent-Confidence-Support (ACS), (3) Weighted Relative Accuracy (WRA), (4) Laplace Accuracy, and (5) χ² Testing. In this paper, we divide these mechanisms into two groups: (i) those following the pure "support-confidence" framework, and (ii) those that assign an additive score. We then propose a hybrid rule ordering approach that combines one approach taken from (i) with another taken from (ii). The experimental results show that the proposed rule ordering approach performs well with respect to classification accuracy.
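To make the rule ordering mechanisms more concrete, here is a minimal sketch of the two "support-confidence" orderings (CSA and ACS) and of the Laplace accuracy score. The abstract names these mechanisms but does not define them, so the tie-break directions below (smaller antecedents first for CSA, larger antecedents first for ACS) follow the usual descriptions in the CARM literature and should be read as assumptions. The `Rule` class and the function names are hypothetical, and the paper's hybrid combination is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    antecedent: frozenset  # set of items
    label: str             # consequent class
    support: float         # fraction of records matching antecedent and class
    confidence: float      # support divided by antecedent coverage


def order_csa(rules):
    """Confidence desc, then support desc, then antecedent size asc
    (tie-break direction assumed)."""
    return sorted(rules, key=lambda r: (-r.confidence, -r.support, len(r.antecedent)))


def order_acs(rules):
    """Antecedent size desc (more specific rules first, assumed),
    then confidence desc, then support desc."""
    return sorted(rules, key=lambda r: (-len(r.antecedent), -r.confidence, -r.support))


def laplace_accuracy(n_correct, n_covered, n_classes):
    """Laplace expected accuracy of a rule: (n_c + 1) / (n_tot + k),
    where n_tot records match the antecedent, n_c of them carry the
    rule's class, and k is the number of classes."""
    return (n_correct + 1) / (n_covered + n_classes)


r1 = Rule(frozenset({"a"}), "x", support=0.3, confidence=0.9)
r2 = Rule(frozenset({"a", "b"}), "y", support=0.2, confidence=0.9)
r3 = Rule(frozenset({"c"}), "x", support=0.4, confidence=0.8)

csa_list = order_csa([r3, r2, r1])  # r1, r2, r3
acs_list = order_acs([r3, r2, r1])  # r2, r1, r3
```

The same three rules end up in a different order under each strategy, which is why the choice of ordering can change which rule fires first for a given record and therefore the classifier's accuracy.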
ISBN: 9781424409723 (print)
Clustering analysis is used to analyze the clustering phenomena that occur in a data structure. This paper proposes a new GA-based clustering method whose stopping conditions take the clustering accuracy on the datasets into account. Experimental results on the UCI datasets WINE and IRIS indicate that the accuracy of the proposed method is better than that of the listed methods and that convergence is very fast.
ISBN: 9781424409723 (print)
In this paper we describe the application of morphological shared-weight probabilistic neural networks (MSPNN) to the problem of pattern classification in synthetic aperture radar (SAR) images. The feature extraction process is learned through interaction with the classification process. Feature extraction is performed using gray-scale hit-miss transforms that are independent of gray-level shifts. Classification is performed by probabilistic neural networks (PNN). Classification experiments were carried out on SAR images of military objects. The classification results show that the MSPNN architecture optimizes object recognition with respect to processing time and accuracy.
ISBN: 9781424409723 (print)
This paper constructs an Online Learning Behavior Assessment System based on data mining techniques. The process of constructing an "online learning behavior - effect" model with the C4.5 algorithm is described, and the model is put into practice for grading students' online learning behavior. The experimental results show that over 90% of teachers and students are satisfied with the assessment results. The system provides an objective and reasonable method for assessing online learning.
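C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below computes this criterion for a single categorical attribute. The behavior attribute and grade values in the usage lines are hypothetical placeholders, since the abstract does not list the features the system uses, and a full C4.5 implementation also handles continuous attributes, missing values, and pruning, which are omitted here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    # Group the class labels by attribute value.
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: entropy before the split minus the weighted
    # entropy of the groups after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical data: forum activity level versus learning effect grade.
forum_activity = ["high", "high", "low", "low"]
grade = ["good", "good", "poor", "poor"]
informative = gain_ratio(forum_activity, grade)

# An attribute unrelated to the grade gives no gain.
unrelated = gain_ratio(["a", "b", "a", "b"], grade)
```

At each node the tree builder would evaluate every candidate attribute this way and split on the one with the highest gain ratio.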