Software Defect Prediction (SDP) supports reliable software by identifying defect-prone modules before the testing stage, which allows quality assurance resources to be used efficiently and effectively. Most predictive models are trained on historical data from the same project or a comparable project. These models perform satisfactorily because they are trained under settings similar to those of the target project. Their limitation is that they are effective only when adequate historical data are available to train a predictive model. In reality, however, such historical data are minimal for some projects and absent for new projects. Defect prediction for projects that lack historical data can be accomplished by training prediction models on data from different projects. This process is known as Cross-Project Defect Prediction (CPDP). Software defect datasets also suffer from class imbalance, which further degrades model performance. In this research work, the authors propose a multi-objective random forest (MO-RF) algorithm with a data resampling technique to minimize the probability of false alarm, maximize the probability of detection, and overcome the class imbalance problem. The study also evaluates the performance of other prediction models. The proposed method shows percentage improvements (in terms of AUC) of 2.78 and 3.46 over MONB and MONBNN, respectively.
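The two objectives named in this abstract, probability of detection (PD) and probability of false alarm (PF), can be illustrated with a minimal Python sketch. It assumes the conventional definitions PD = TP / (TP + FN) and PF = FP / (FP + TN), with label 1 marking a defective module. The abstract does not say which resampling technique is used, so the random oversampling shown here is only an illustrative stand-in, and the function names are hypothetical.

```python
import random


def detection_and_false_alarm(y_true, y_pred):
    """Return (PD, PF) for binary labels, where 1 means 'defective'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # share of defective modules caught
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # share of clean modules flagged
    return pd, pf


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one.

    Assumption: random oversampling stands in for the unspecified
    resampling technique mentioned in the abstract.
    """
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels
```

A multi-objective learner would try to raise PD while lowering PF at the same time. Resampling would normally be applied to the training data only, so that evaluation still reflects the original class distribution.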
To extract useful information from X-ray fluorescence (XRF) spectra and establish a high-accuracy prediction model of soil heavy metal contents, a hybrid model combining a deep belief network (DBN) with a tree-based model was proposed. The DBN was first applied to feature extraction from XRF spectral data, yielding deep-layer features of the spectra. Because the tree-based model has strong regression ability, it can offset the DBN's weakness in prediction, so it was used to predict heavy metal contents from the extracted features. To further improve performance, the model parameters were optimized according to the prediction error using the sparrow search algorithm and grid search. The hybrid model was applied to predict the contents of As and Pb from spectral data with overlapping peaks. The R² values for As and Pb reached 0.9884 and 0.9358, and the mean square errors were as low as 0.0011 and 0.0058, respectively, outperforming other commonly used models. These results indicate that combining a DBN with a tree-based model yields more accurate predictions.
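The two reported metrics can be made concrete with a short Python sketch. It assumes the standard definitions, MSE as the mean of squared residuals and R² as one minus the ratio of the residual sum of squares to the total sum of squares. The abstract does not state how the contents were scaled before scoring, and the function names below are hypothetical.

```python
def mean_squared_error(y_true, y_pred):
    """Mean of the squared differences between measured and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

An R² close to 1 together with a small MSE means the predictions track the measured contents closely. The two are related, since R² equals one minus the MSE divided by the variance of the measured values.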
Accurate prediction of hydrocarbon production is crucial for the oil and gas industry. However, the strong heterogeneity of underground formations, the inconsistency in oil-gas-water distribution, and the complex flow mechanisms make hydrocarbon production forecasting (HPF) difficult, which leads to a high level of uncertainty in the prediction results. The explosion of machine learning (ML) methodologies capable of analyzing big data has shed new light on HPF using production data. This article provides an in-depth review of HPF using ML methodologies. First, the merits and drawbacks of traditional HPF methods are analyzed and summarized. Then, the applications of ML algorithms in HPF are reviewed in detail, with a focus on artificial neural networks, support vector machines, and ensemble learning. For each algorithm, the basic theory and its variants are introduced first, followed by a comprehensive account of its applications in HPF. Finally, the article presents the challenges and prospects of machine-learning-based HPF. Sophisticated ML proxy models can be constructed and employed to handle a wider range of input data types, thereby improving the efficacy of data utilization. In addition, deep learning models designed to handle time-series data may gain more attention. Modeling approaches for multivariate time-series hydrocarbon production data using deep neural networks with functionality similar to LSTM may lead to more accurate and computationally efficient production forecasting.