ISBN: 9781934272084 (print)
This paper presents classification results for infrasonic events using practically all well-known machine learning algorithms, with wavelet transforms for preprocessing. We show that there are large differences between groups of classification algorithms and that nearest neighbor classifiers are superior to all others for accurate classification of infrasonic events.
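The pipeline described above (wavelet preprocessing followed by a nearest neighbor classifier) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the wavelet family, the features, or the distance measure, so the Haar wavelet, the per-level energy features, the Euclidean 1-NN rule, and the toy signals are all hypothetical choices.

```python
import math

def haar_dwt(signal):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def wavelet_features(signal, levels=2):
    """Detail-coefficient energy at each level, then the final approximation energy."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        features.append(sum(d * d for d in detail))
    features.append(sum(a * a for a in current))
    return features

def nearest_neighbor(train, query):
    """Return the label of the training feature vector closest to `query` (1-NN)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(train, key=lambda item: sq_dist(item[0], query))[1]

# Toy signals (assumed, not from the paper): one slowly varying, one rapidly alternating.
smooth = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
rough = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
train = [(wavelet_features(smooth), "smooth"), (wavelet_features(rough), "rough")]

# A noisy alternating signal should land next to the "rough" training example.
query = [0.9, -1.1, 1.0, -0.9, 1.1, -1.0, 0.9, -1.0]
predicted = nearest_neighbor(train, wavelet_features(query))
```

Because the Haar transform used here is orthonormal, the features of a signal sum to its total energy, which gives a quick sanity check on the preprocessing step.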
ISBN: 9783540734994 (digital), 9783540734987 (print)
Over the past several years, machine learning and data mining techniques have received considerable attention among intrusion detection researchers as a way to address the weaknesses of knowledge-based detection techniques. This has led to the application of various supervised and unsupervised techniques for the purpose of intrusion detection. In this paper, we conduct a set of experiments to analyze the performance of unsupervised techniques with respect to their main design choices. These include the heuristics proposed for distinguishing abnormal data from normal data and the distribution of the dataset used for training. We evaluate the performance of the techniques with various distributions of training and test datasets, which are constructed from the KDD99 dataset, a widely accepted resource for IDS evaluations. This comparative study is not only a blind comparison between unsupervised techniques, but also gives some guidelines to researchers and practitioners on applying these techniques to the area of intrusion detection.
ISBN: 9783540734994 (digital), 9783540734987 (print)
Description logics have emerged as one of the most successful formalisms for knowledge representation and reasoning. They are now widely used as a basis for ontologies in the Semantic Web. To extend and analyse ontologies, automated methods for knowledge acquisition and mining are being sought. Despite its importance for knowledge engineers, the learning problem in description logics has not been investigated as deeply as its counterpart for logic programs. We propose the novel idea of applying evolution-inspired methods to solve this task. In particular, we show how Genetic Programming can be applied to the learning problem in description logics and combine it with techniques from Inductive Logic Programming. We base our algorithm on thorough theoretical foundations and present a preliminary evaluation.
ISBN: 9783540734994 (digital), 9783540734987 (print)
The recently introduced transductive confidence machines (TCMs) framework allows classifiers to be extended so that they satisfy the calibration property. This means that the error rate can be set by the user prior to classification. An analytical proof of the calibration property was given for TCMs applied in the on-line learning setting. However, the nature of this learning setting restricts the applicability of TCMs. In this paper we provide strong empirical evidence that the calibration property also holds in the off-line learning setting. Our results extend the range of applications in which TCMs can be applied. We may conclude that TCMs are appropriate in virtually any application domain.
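To make the idea of a user-chosen error rate concrete, here is a minimal sketch of how a transductive confidence machine might assign p-values to candidate labels and build a prediction set at significance level `epsilon`. The nearest-neighbor nonconformity measure, the one-dimensional toy data, and all function names are assumptions made for illustration; the abstract does not say which underlying classifiers or measures the authors used.

```python
def nonconformity(index, examples):
    """Distance to the nearest same-label example divided by the distance
    to the nearest other-label example (larger means stranger)."""
    x, y = examples[index]
    same = [abs(x - xj) for j, (xj, yj) in enumerate(examples) if j != index and yj == y]
    other = [abs(x - xj) for j, (xj, yj) in enumerate(examples) if yj != y]
    if not same:
        return float("inf")  # no same-label neighbor at all: maximally strange
    if not other:
        return 0.0           # no competing label present: not strange
    d_same, d_other = min(same), min(other)
    return float("inf") if d_other == 0.0 else d_same / d_other

def p_values(train, x_new, labels):
    """For each candidate label, tentatively add (x_new, label) to the training
    data and report the fraction of examples at least as strange as the new one."""
    result = {}
    for label in labels:
        extended = train + [(x_new, label)]
        scores = [nonconformity(i, extended) for i in range(len(extended))]
        new_score = scores[-1]
        result[label] = sum(1 for s in scores if s >= new_score) / len(extended)
    return result

def prediction_set(train, x_new, labels, epsilon):
    """Keep every label whose p-value exceeds the chosen significance level."""
    return {label for label, p in p_values(train, x_new, labels).items() if p > epsilon}

# Toy one-dimensional data (assumed): two well-separated classes.
train = [(0.0, "a"), (1.0, "a"), (2.0, "a"), (3.0, "a"),
         (10.0, "b"), (11.0, "b"), (12.0, "b"), (13.0, "b")]
pvals = p_values(train, 1.5, ["a", "b"])
confident = prediction_set(train, 1.5, ["a", "b"], epsilon=0.2)
cautious = prediction_set(train, 1.5, ["a", "b"], epsilon=0.05)
```

Lowering `epsilon` asks for fewer errors, so the prediction set can only grow: here `confident` contains a single label while `cautious` keeps both.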
ISBN: 9783540734994 (digital), 9783540734987 (print)
Traditional methods in data mining cannot be applied to all types of data with equal success. Innovative methods for model creation are needed to address the lack of model performance for data from which it is difficult to extract relationships. This paper proposes a set of algorithms that allow the integration of data from multiple related datasets, and reports results from implementing these techniques on data from the field of Predictive Toxicology. The results show significant improvements when related data is used to aid the model creation process, both overall and in specific data ranges. The proposed algorithms have potential for use within any field where multiple datasets exist, particularly in fields combining computing, chemistry and biology.
ISBN: 9783540734994 (digital), 9783540734987 (print)
Current metrics for evaluating the performance of Bayesian network structure learning include order statistics of the data likelihood of learned structures, the average data likelihood, and the average convergence time. In this work, we define a new metric that directly measures a structure learning algorithm's ability to correctly model causal associations among variables in a data set. By treating membership in a Markov blanket as a retrieval problem, we use ROC analysis to compute a structure learning algorithm's efficacy in capturing causal associations at varying strengths. Because our metric moves beyond error rate and data likelihood by also measuring stability, it gives a better characterization of structure learning performance. Because the structure learning problem is NP-hard, practical algorithms are either heuristic or approximate. For this reason, an understanding of a structure learning algorithm's stability and boundary value conditions is necessary. We contribute to the state of the art in the data mining community with a new tool for understanding the behavior of structure learning techniques.
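Treating Markov blanket membership as a retrieval problem could look roughly like the sketch below, which compares the blanket of one variable in a reference structure against the blanket in a learned structure and reports a single (true positive rate, false positive rate) point. The edge-list representation, the function names, and the toy graphs are assumptions; the abstract does not describe how the authors sweep association strength to trace a full ROC curve, so that step is not shown.

```python
def markov_blanket(node, edges):
    """Parents, children, and the children's other parents of `node`
    in a DAG given as a collection of (parent, child) pairs."""
    parents = {p for p, c in edges if c == node}
    children = {c for p, c in edges if p == node}
    spouses = {p for p, c in edges if c in children and p != node}
    return parents | children | spouses

def retrieval_rates(node, nodes, true_edges, learned_edges):
    """One ROC point: how well the learned blanket retrieves the true one."""
    truth = markov_blanket(node, true_edges)
    found = markov_blanket(node, learned_edges)
    negatives = set(nodes) - {node} - truth
    tpr = len(found & truth) / len(truth) if truth else 0.0
    fpr = len(found & negatives) / len(negatives) if negatives else 0.0
    return tpr, fpr

# Toy structures (assumed): the learned graph misses B -> C and E -> D,
# and wrongly adds F -> C.
nodes = ["A", "B", "C", "D", "E", "F", "G"]
true_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("E", "D"), ("F", "E"), ("F", "G")]
learned_edges = [("A", "C"), ("C", "D"), ("F", "C")]

true_blanket = markov_blanket("C", true_edges)
rates = retrieval_rates("C", nodes, true_edges, learned_edges)
```

Repeating this computation over a range of thresholds (for example, on edge confidence) would yield the set of points needed for an ROC curve.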
ISBN: 9783540734994 (digital), 9783540734987 (print)
Fractal theory has been used for computer graphics, image compression and different fields of pattern recognition. In this paper, a fractal-based method for recognition of both on-line and off-line Farsi/Arabic handwritten digits is proposed. Our main goal is to verify whether fractal theory is able to capture discriminatory information from digits for the pattern recognition task. The digit classification problem (on-line and off-line) deals with patterns which do not have complex structure. So, a general-purpose fractal coder, introduced for image compression, is simplified for use in this application. To do that, the contrast and luminosity information of each point in the input pattern is ignored during the coding process. Therefore, this approach can deal with on-line data and binary images of handwritten Farsi digits. In fact, our system represents the shape of the input pattern by searching for a set of geometrical relationships between its parts. Some fractal-based features are directly extracted by the fractal coder. We show that the resulting features have invariant properties which can be used for object recognition.
ISBN: 9781424410651 (print)
According to the load properties of electric power, four kinds of component forecasting models are chosen, and a new combination forecasting model based on a self-organizing data mining algorithm is introduced in this paper. The forecasted results of the component forecasting models are used as the input of the self-organizing data mining algorithm, and the output is the result of the combination forecasting. To verify the validity and operability of the model, a load forecasting example is given, and the results show that this model improves forecasting ability remarkably compared with optimal combination forecasting and artificial neural network combination forecasting.
ISBN: 9781424410651 (print)
In this paper, a data mining technique is applied to search for a stable feature set and build authentication rules for handwritten signatures. Supervised by the data mining technique, 10 stable features, including maximum speed, maximum acceleration, and the number and positions of inflection points, have been selected from 61 original signature features. Taking the selected feature set as the input attributes, clusters of true and false signature samples are trained and learned to build authentication rules, supervised by the data mining technique, in order to test the validity of the selected feature set. The result of the test shows that the selected feature set is effective for identifying handwritten signatures and that the average accuracy of Chinese signature authentication is up to 92%. This shows that the data mining technique is an effective method for identifying handwritten signatures.
ISBN: 9781424410651 (print)
In this paper, the fractal property of rotating machinery vibration signals and the principle of fractal data compression are reviewed. Based on the fractal property, an approach for vibration signal data compression and reconstruction is proposed. In this method, a signal is represented by the parameters of affine maps and is reconstructed according to the self-similarity represented by the iterated function system (IFS) parameters. The total data size of such a representation is far less than the original time-domain data size. To demonstrate the effectiveness of this method in resolving the bottleneck in remote transmission of large volumes of signal data and in improving the capability of remote equipment fault diagnosis systems, the presented method has been applied to some actual vibration signals as well as simulated signals.