The steel industry is one of the pillar industries of the Chinese national economy and has made an active contribution to its sustained development. Predicting steel output has therefore become an important task. In this paper, after reviewing the common existing prediction methods, we combine wavelets with neural networks, propose a data mining method based on a self-adaptive wavelet neural network, and build a machine learning mechanism for the data mining process to improve its problem-solving capability. The demonstration results indicate that, compared with a general artificial neural network, data mining with a self-adaptive wavelet neural network is both effective and feasible.
State failure has traditionally been defined as the collapse of national authority, which may be reflected in disasters such as wars and disruptive regime transitions. The availability of comprehensive datasets and the limitations exhibited by previous forecasting analyses led us to integrate different predictive resources and models through statistical analysis and machine learning. Here we demonstrate the predictive ability of unsupervised and supervised learning approaches for detecting meaningful relationships between country cases, encoded by several socio-economic indicators, and the emergence of violent conflicts. Two clustering-based analyses (Kohonen maps and a network-based approach) provided the basis for exploratory analyses that confirmed hypotheses about the relevance of the data and the differences between state failure types. We also illustrate the potential of a novel network-based clustering approach for sub-class discovery in the area of political instability analysis. Furthermore, we show significant relationships between the emergence of violent conflicts and a dataset of quantitative indicators of good governance, which allows the design of effective supervised and unsupervised classifiers. This study contributes to the development of intelligent data analysis techniques for supporting hypothesis generation and testing in international conflict analyses.
Most clustering algorithms focus on datasets with only numeric or only categorical attributes. Recently, the problem of clustering mixed data has drawn interest because many real-life applications involve mixed data. In this work, we propose a clustering algorithm called ACEM that is able to deal with mixed data. The algorithm first performs a pre-clustering on the purely categorical data. Then, using all of the mixed data, it evaluates the clusters with an entropy-based criterion in order to verify the cluster membership of the data. As a result, we obtain a clustering algorithm for mixed data whose main idea is to extend a categorical clustering algorithm by introducing an entropy criterion that measures cluster heterogeneity. We compare it with other clustering algorithms on real-life datasets to illustrate its performance.
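The abstract does not spell out ACEM's entropy criterion, so the following is only a minimal sketch of one common way to score the heterogeneity of a cluster of categorical records: the Shannon entropy of each attribute's value distribution, summed over attributes. The function names and the toy records are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def attribute_entropy(values):
    """Shannon entropy (in bits) of the value distribution of one attribute."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def cluster_entropy(rows):
    """Sum of per-attribute entropies for a cluster of categorical records.

    `rows` is a list of equal-length tuples; 0 means every record is identical,
    larger values mean a more heterogeneous cluster.
    """
    columns = zip(*rows)  # transpose records into attribute columns
    return sum(attribute_entropy(list(col)) for col in columns)


# Toy cluster (illustrative values only): colour is constant, size is split 50/50.
toy_cluster = [("red", "s"), ("red", "m"), ("red", "s"), ("red", "m")]
heterogeneity = cluster_entropy(toy_cluster)
```

A lower score indicates a more homogeneous cluster, so a membership check could compare the score before and after tentatively adding a record.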
Prostate cancer remains one of the leading causes of cancer death worldwide, with a reported incidence of 650,000 cases per annum. The causal factors of prostate cancer remain to be determined. In this paper, we investigate a medical dataset containing clinical information on 502 prostate cancer patients using the machine learning technique of rough sets. Our preliminary results yield a classification accuracy of 90%, with high sensitivity and specificity (both approximately 91%), a positive predictive value (PPV) of 81%, and a negative predictive value (NPV) of 95%. In addition to the high classification accuracy of our system, the rough set approach provides a rule-based inference mechanism for information extraction that is suitable for integration into a rule-based system. The generated rules relate directly to the attributes and their values and provide a direct mapping between them.
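For readers unfamiliar with the reported measures, the sketch below shows how accuracy, sensitivity, specificity, and the positive and negative predictive values follow from a binary confusion matrix. The counts used are hypothetical and chosen only for illustration; they are not the study's data.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical counts, NOT taken from the paper.
example = diagnostic_metrics(tp=40, fp=10, tn=45, fn=5)
```

Because the predictive values depend on how common the positive class is, they can differ noticeably from sensitivity and specificity on the same classifier.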
ISBN: 3540287574 (print)
This study investigates the effectiveness of probability forecasts output by standard machine learning techniques (Neural Network, C4.5, K-Nearest Neighbours, Naive Bayes, SVM and HMM) when tested on time series datasets from various problem domains. Raw data was converted into a pattern classification problem using a sliding window approach, and the respective target prediction was set as some discretised future value in the time series sequence. Experiments were conducted in the online learning setting to model the way in which time series data is presented. The performance of each learner's probability forecasts was assessed using ROC curves, square loss, classification accuracy and Empirical Reliability Curves (ERC) [1]. Our results demonstrate that effective probability forecasts can be generated on time series data, and we discuss the practical implications of this.
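The sliding window conversion mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the window length, the forecast horizon, and the up/down discretisation rule are hypothetical choices, since the abstract does not say how the future value was discretised.

```python
def sliding_window_dataset(series, window, horizon=1, threshold=0.0):
    """Turn a 1-D series into (features, label) pairs for classification.

    Each sample uses `window` consecutive values as features. The label is 1
    if the value `horizon` steps after the window exceeds the last value in
    the window by more than `threshold`, else 0 (an assumed discretisation).
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        features = list(series[start:start + window])
        future = series[start + window + horizon - 1]
        label = 1 if future - features[-1] > threshold else 0
        samples.append((features, label))
    return samples


# Toy series (illustrative values only).
pairs = sliding_window_dataset([1, 2, 3, 2, 5], window=2)
# pairs == [([1, 2], 1), ([2, 3], 0), ([3, 2], 1)]
```

In an online setting, the pairs would be presented to the learner in order, with each forecast made before its label is revealed.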
ISBN: 3540287574 (print)
Hierarchical hidden Markov models (HHMMs) can be used for time series segmentation. However, it is difficult to obtain a desirable segmentation result because learning for HHMMs is unsupervised. In this paper, we present a semi-supervised learning algorithm for HHMMs. It is semi-supervised in the sense that the supervisor teaches segmentation boundaries but not segment labels. The learning performance of the proposed algorithm is demonstrated through an experiment using music data.
ISBN: 0769523161 (print)
Extension data mining is a new method based on the extension analysis method of Extenics. Extenics is a new discipline and a new branch of artificial intelligence. Data mining techniques have their origins in methods from statistics, pattern recognition, databases, artificial intelligence, high-performance and parallel computing, and visualization. This paper presents how to deal with multiple data formats and unify data representation based on Extenics. Keywords: matter-element, databases, extension data mining, association rules.
Recent advancement and wide use of high-throughput technologies for biological research are producing enormous biological datasets distributed worldwide. Data mining techniques and machine learning methods prov...
ISBN: 3540287574 (print)
Gaussian mixture models are increasingly used in pattern recognition applications. However, for a given set of data, other distributions can give better results. In this paper, we consider Dirichlet mixtures, which offer many advantages [1]. The use of the ECM algorithm and the minimum message length (MML) approach to fit this mixture model is described. Experimental results involve the summarization of texture image databases.
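For reference, a finite Dirichlet mixture can be written as below. This is the textbook form of the Dirichlet density on the simplex; the symbols ($M$ components, mixing weights $\pi_j$, parameter vectors $\boldsymbol{\alpha}_j$, dimension $D$) are notation assumed here, and the paper may use a different parameterization.

```latex
% Dirichlet density for x = (x_1, ..., x_D), with x_d > 0 and \sum_d x_d = 1
p(\mathbf{x} \mid \boldsymbol{\alpha})
  = \frac{\Gamma\!\left(\sum_{d=1}^{D} \alpha_d\right)}{\prod_{d=1}^{D} \Gamma(\alpha_d)}
    \prod_{d=1}^{D} x_d^{\alpha_d - 1},
  \qquad \alpha_d > 0

% Finite mixture of M Dirichlet components
p(\mathbf{x} \mid \Theta)
  = \sum_{j=1}^{M} \pi_j \, p(\mathbf{x} \mid \boldsymbol{\alpha}_j),
  \qquad \pi_j \ge 0, \quad \sum_{j=1}^{M} \pi_j = 1
```

Fitting the mixture means estimating the $\pi_j$ and $\boldsymbol{\alpha}_j$, while a model selection criterion such as MML is used to choose $M$.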
ISBN: 3540287574 (print)
Nonlinear Multiuser Detection (MUD) in a Direct Sequence Code Division Multiple Access (DS/CDMA) system can be viewed as a two-class classification task. A new classification method called Probabilistic Tangent Subspace (PTS) is introduced for use as an MUD. Because communicators are mobile, wireless communication channels are in fact time-variant. The uncertainty in the coefficients of the time-varying channel causes uncertainty in the Multiuser Interference (MUI). The probabilistic tangent subspace method, in turn, is designed to encode pattern variations. We are therefore motivated to adopt this method to develop a classifier that serves as a multiuser detector for time-varying channels. Simulation results show that this MUD performs better than one based on a Support Vector Machine (SVM) for a Rayleigh fading channel in a DS/CDMA system.