Reliable forecasting of crude oil prices has received considerable attention from both investment companies and governments. Motivated by this issue, this paper proposes a new hybrid forecasting model for crude oil price trend prediction. The crude oil price series is first decomposed by the variational mode decomposition algorithm, and multi-modal data features are extracted from the decomposed modes. The volatility of crude oil prices is simultaneously converted into trend symbols through symbolic time series analysis. Machine learning multi-classifiers are then trained with the multi-modal data features and historical volatility as input and the trend symbols as output. The trained models are used to predict the trend symbols of the West Texas Intermediate crude oil futures price. Empirical results demonstrate that the proposed hybrid forecasting model outperforms its counterparts. Among the classifiers used, the hybrid prediction model using a support vector machine classifier exhibits superior predictive ability. The proposed model predicts high volatility of crude oil prices more accurately than low volatility. (c) 2021 Elsevier Ltd. All rights reserved.
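The symbolization step can be illustrated with a minimal sketch. It is not the paper's procedure: the abstract does not say how symbols are defined, so the log-return input, the fixed threshold, the three-letter alphabet, and the name `trend_symbols` are all assumptions made for illustration.

```python
import math

def trend_symbols(prices, threshold=0.01):
    """Map each log return to 'U' (up), 'D' (down), or 'F' (flat).

    The threshold and the alphabet are illustrative assumptions,
    not values taken from the paper.
    """
    symbols = []
    for prev, cur in zip(prices, prices[1:]):
        r = math.log(cur / prev)  # log return between consecutive prices
        if r > threshold:
            symbols.append("U")
        elif r < -threshold:
            symbols.append("D")
        else:
            symbols.append("F")
    return symbols

# Hypothetical price series, for demonstration only.
prices = [60.0, 61.5, 61.6, 59.0, 59.1]
print(trend_symbols(prices))  # ['U', 'F', 'D', 'F']
```

A symbol sequence of this kind could serve as the class labels for a multi-class classifier, with features taken from the decomposed modes.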
For imbalanced classification problems, most traditional classification models focus only on finding a classifier that maximizes classification accuracy under a fixed misclassification cost, without considering that the misclassification cost can change with the sample probability distribution. Cost-sensitive learning is known to be effective for imbalanced data classification. We therefore propose AdaC-TANBN, which integrates TANBN with a cost-sensitive classification algorithm, to overcome this drawback and improve classification accuracy. The AdaC-TANBN algorithm trains the classifier with a variable misclassification cost determined by the sample distribution probability, and then classifies imbalanced data in medical diagnosis. The effectiveness of the proposed approach is examined on the Cleveland heart dataset (Heart), the Indian liver patient dataset (ILPD), the Dermatology dataset, and the Cervical cancer risk factors dataset (CCRF) from the UCI Machine Learning Repository. The experimental results indicate that the AdaC-TANBN algorithm can outperform other state-of-the-art comparative methods.
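The idea of a distribution-dependent misclassification cost can be sketched as follows. This is a generic cost-sensitive decision rule, not AdaC-TANBN itself: the inverse-frequency cost, the function names `class_costs` and `min_cost_class`, and the example data are assumptions chosen for illustration.

```python
from collections import Counter

def class_costs(labels):
    """Cost of misclassifying a sample of class c, set to the inverse
    class frequency (an assumed rule; rarer classes cost more)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / k for c, k in counts.items()}

def min_cost_class(posterior, costs):
    """Pick the class with the lowest expected misclassification cost.

    Predicting class c incurs costs[t] whenever the true class t != c.
    """
    def expected_cost(c):
        return sum(p * costs[t] for t, p in posterior.items() if t != c)
    return min(posterior, key=expected_cost)

# Hypothetical imbalanced training labels: 90 healthy, 10 sick.
labels = ["healthy"] * 90 + ["sick"] * 10
costs = class_costs(labels)

# A posterior that plain accuracy maximization would label "healthy".
posterior = {"healthy": 0.8, "sick": 0.2}
print(min_cost_class(posterior, costs))  # sick
```

With these costs, the minority class is predicted even at a posterior of 0.2, because missing a rare case is weighted about nine times more heavily than a false alarm.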
In this study, a classification algorithm based on complex-number features is proposed. Specifically, the SVM framework is reformulated so that each example is classified in the unitary space. The cost function is redefined by considering the maximum margin of the real and imaginary units of the complex-number feature at the same time. The cost function is based on the expectation of the hinge loss, and its derivatives can be calculated in closed form, so the method can be implemented efficiently with a stochastic gradient descent (SGD) algorithm. For complex-number features, the example uncertainty is modeled by a sample preprocessing method based on within-class Euclidean distance Gaussian distribution sample (DGS). In addition, a complex-number feature selection method based on improved hybrid discrimination analysis (HDA) is proposed by considering the correlation between the real and imaginary units of the complex-number feature. The proposed classification algorithm is tested on synthetic data and three popular, publicly available datasets, namely MNIST, WDBC, and Voc2012. Experimental results verify the effectiveness of the proposed method. The code is available at: https://***/luckysomebody/paper-code
Classification learning on accumulated big data in nonstationary environments has many typical applications, so algorithms that can carry out classification learning efficiently in these environments are urgently needed. The recently proposed Learn++.NSE algorithm is an important breakthrough and one of the major research achievements in this field. However, Learn++.NSE adopts a serial ensemble mechanism, and its execution efficiency needs further improvement when facing long-term accumulated big data. This paper proposes a Parallel and Reverse Learn++.NSE algorithm, abbreviated as PRLearn++.NSE, which changes the ensemble mechanism of the base classifiers so that the old base classifiers serve as a supplement to the new base classifier. This yields a fast, parallel ensemble mechanism. Experimental results on an artificially generated dataset and a real dataset show that PRLearn++.NSE greatly improves the efficiency of ensemble classification learning while achieving classification accuracy close to that of Learn++.NSE, and that it is well suited to fast ensemble classification learning on long-term accumulated big data.
Given the large number of classification algorithms proposed in the literature, selecting suitable classifier(s) for a given problem is an important and practical question. Existing methods address it with a single learner built on one type of meta-features or on a simple combination of several types of meta-features. In this paper, we propose a two-layer classification algorithm recommendation method called EML (Ensemble of ML-KNN for classification algorithm recommendation) to leverage the diversity of different sets of meta-features. The proposed method can automatically recommend different numbers of appropriate algorithms for different datasets, rather than a fixed number of appropriate algorithm(s) as done by the ML-KNN, SLP-based, and OBOE methods. Experimental results on 183 public datasets show the effectiveness of the EML method compared to these three baseline methods. (c) 2021 Elsevier B.V. All rights reserved.
In this paper, we propose a classification algorithm based on the Recency-Frequency-Monetary (RFM) model and the K-means data mining method. The designed algorithm is verified by experiments on the member data of a large shopping mall. The experimental results show that the proposed algorithm can provide an accurate classification of the ***, some marketing strategies for different classes of members are given according to the classification results.
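The combination of RFM features and K-means can be sketched in a few lines. This is a generic illustration under stated assumptions, not the paper's algorithm: the min-max scaling, the deterministic initialization from the first k points, the function names, and the member records are all hypothetical.

```python
def scale_columns(rows):
    """Min-max scale each column to [0, 1] so no RFM field dominates."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; the first k points are the initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(range(k), key=lambda j: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centroids

# Hypothetical members: (recency in days, purchase count, total spend).
members = [(5, 20, 900.0), (300, 1, 30.0), (7, 18, 850.0), (280, 2, 45.0)]
assign, _ = kmeans(scale_columns(members), k=2)
print(assign)  # [0, 1, 0, 1]
```

Here cluster 0 groups the recent, frequent, high-spending members and cluster 1 the lapsed ones, which is the kind of segmentation that class-specific marketing strategies would build on.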
The sparse representation classification method has been widely studied in pattern recognition because of its good recognition and classification performance. When the sparse coefficients are solved by l1-norm minimization, all the training samples are used as the redundant dictionary, which leads to high computational complexity. To address the high computational complexity of the l1-norm-based solver, an l2-norm local sparse representation classification algorithm is proposed. The algorithm first uses the minimum l2-norm method to select a local dictionary, and then solves the sparse coefficients over this dictionary by l1-norm minimization to perform classification. The algorithm is verified on gesture recognition using a constructed gesture database. The experimental results show that the algorithm effectively reduces the calculation time while maintaining the recognition rate, and that its performance is slightly better than that of the KNNSRC algorithm.
Learning from imbalanced data samples to achieve accurate classification is an important research topic in the data mining field. It is difficult for a classification algorithm to achieve high accuracy because the uneven distribution of data samples leaves some categories with few samples. This article proposes an imbalanced data classification algorithm based on support vector machines (KE-SVM). The algorithm first obtains an initial classification of the data samples by training a maximum-margin SVM model, then derives a new kernel extension function based on the chi-square test and weight coefficient calculation, and finally retrains on the samples using the SVM with the new kernel function to improve classification accuracy. Simulation experiments on real and artificial data sets show that the proposed method achieves higher classification accuracy and faster convergence on unevenly distributed data.
Customer relationship management is a good business strategy for hotels, and applying a classification algorithm to a customer relationship management system can promote a hotel's development. First, a classification algorithm based on a support vector machine is studied, the nearest-neighbor sample density is used, and the corresponding mathematical model is constructed. Second, the procedure of a classification algorithm based on a fuzzy support vector machine is designed. Third, a customer acquisition plan based on the classification algorithm is analyzed. Finally, a hotel is taken as the research object and a customer acquisition analysis is carried out. The results show that the new method has faster training speed and higher classification accuracy.
Recommending appropriate classification algorithm(s) for a given classification problem is of great significance and is one of the challenging problems in the field of data mining. It is usually viewed as a meta-learning problem. Multi-label learning has been adopted and validated as an effective meta-learning method for classification algorithm recommendation. However, the multi-label learning method used in previous work relies only on the relationship between data sets and their direct neighbours, ignoring the impact of other data sets. In this paper, a new classification algorithm recommendation method based on link prediction between data sets and classification algorithms is proposed. Taking advantage of link prediction in heterogeneous networks, this method considers the impact of all data sets and makes full use of the interactions between data sets as well as between data sets and algorithms. First, the meta data of the training data sets is collected. Then, a heterogeneous network called the DAR (Data and Algorithm Relationship) Network is constructed from the meta data. Finally, link prediction is used to recommend appropriate algorithm(s) for a given data set on the basis of the DAR Network. To evaluate the proposed link-prediction-based recommendation method, extensive experiments with 131 data sets and 21 classification algorithms are conducted. Results on 5 performance measures indicate that the proposed method is more effective than the baseline classification algorithm recommendation method and can be used in practice.