The main purpose of the present study is to use three state-of-the-art data mining techniques, namely logistic model tree (LMT), random forest (RF), and classification and regression tree (CART) models, to map landslide susceptibility. Long County was selected as the study area. First, a landslide inventory map was constructed using historical reports, interpretation of aerial photographs, and extensive field surveys. A total of 171 landslide locations were identified in the study area. Twelve landslide-related parameters were considered for landslide susceptibility mapping: slope angle, slope aspect, plan curvature, profile curvature, altitude, NDVI, land use, distance to faults, distance to roads, distance to rivers, lithology, and rainfall. The 171 landslides were randomly separated into two groups with a 70/30 ratio for training and validation purposes, and different ratios of non-landslide to landslide grid cells were tested to obtain the highest classification accuracy. The linear support vector machine (LSVM) algorithm was used to evaluate the predictive capability of the 12 landslide conditioning factors. Second, LMT, RF, and CART models were constructed using the training data. Finally, the applied models were validated and compared using receiver operating characteristic (ROC) and predictive accuracy (ACC) methods. Overall, all three models exhibit reasonably good performance; the RF model exhibits the highest predictive capability compared with the LMT and CART models. The RF model, with a success rate of 0.837 and a prediction rate of 0.781, is a promising technique for landslide susceptibility mapping. Therefore, these three models are useful tools for spatial prediction of landslide susceptibility. © 2016 Elsevier B.V. All rights reserved.
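The 70/30 split and the ROC-based comparison described in this abstract can be illustrated with a minimal sketch. It is not the authors' workflow: the function names `split_70_30` and `roc_auc`, the fixed seed, and the toy scores are assumptions made for illustration, and the AUC is computed with the pairwise (Mann-Whitney) definition rather than with any particular GIS or statistics package.

```python
import random

def split_70_30(items, seed=0):
    """Randomly split items into roughly 70% training and 30% validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

def roc_auc(labels, scores):
    """Area under the ROC curve, computed as the probability that a
    randomly chosen positive scores higher than a randomly chosen
    negative (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical IDs standing in for the 171 mapped landslide locations.
train_ids, valid_ids = split_70_30(range(171))

# Toy susceptibility scores: 1 = landslide cell, 0 = non-landslide cell.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Applied to the training cells, such an AUC corresponds to what the abstract calls the success rate; applied to the held-out validation cells, it corresponds to the prediction rate.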
Objective: To prospectively validate a previously developed classification and regression tree (CART) model that predicts the likelihood of a good outcome among patients undergoing inpatient cardiopulmonary resuscitation (CPR). Design: Prospective validation of a clinical decision rule. Setting: Skane University Hospital in Malmo, Sweden. Patients: All adult patients (N = 287) experiencing in-hospital cardiopulmonary arrest and undergoing cardiopulmonary resuscitation between 2007 and 2010. Interventions: Patients from Skane University Hospital who underwent CPR (N = 287) were classified using the CART models to predict their likelihood of surviving neurologically intact or with minimal deficits, based on a cerebral performance category score of 1. Discrimination and classification accuracy of the score in the Swedish population were compared with those in the original (derivation and internal validation) populations. Measurements and Main Results: For model 1, the area under the receiver-operating characteristic curve (AUROCC) was 0.77, compared with 0.76 and 0.73 in the original derivation and validation populations, respectively. Model 1 classified 71 (2.8%) of 287 patients as being at a very low risk of a good neurologic outcome compared with 157 (26.1%) of 287 patients predicted to be at an above average risk of a good neurologic outcome. Model 2 had an AUROCC of 0.71, similar to that in the original validation population but lower than that in the original derivation population. Model 2 performed similarly to Model 1 with regard to its ability to correctly classify patients as having a very low or higher than average likelihood of a good neurologic outcome. Conclusion: Two CART models validated well in a different population, displaying similar discrimination and classification accuracy compared to the original population. Although additional validation in larger populations is desirable before widespread adoption, these results are very encouraging.
Objective: To examine whether distinct participant groupings for changes in fruit intake (FI) levels between ages 23 and 31 years are identifiable based on both time-varying and time-invariant sociodemographic and behavioral variables. Methods: Data were derived from the National Longitudinal Survey of Youth-1997, US. Change in FI frequency constituted the dependent variable. For 21 variables, changes and averages in 2007-2011 were calculated. Classification and regression tree analysis was conducted using Generalized, Unbiased, Interaction Detection, and Estimation software. Results: Analysis isolated 5 variables (changes in smoking, drinking alcohol, and television viewing, plus 5-year mean of income-to-poverty ratio and computer use) and associated cutoff values to identify 7 groups of participants with differing degrees of FI change. Conclusions and Implications: Multiple groupings existed within upper social strata; a majority maintained healthy behaviors whereas some adopted substance use as a stress-coping mechanism. Some low-income individuals demonstrated a capacity to adopt healthy behaviors. Dietary interventions could identify behavioral clustering, with emphasis on drinking, smoking, and screen time.
The alteration of surrounding rock is an important prospecting indicator in mineral exploration, but some important minerals are unclassified or misclassified when using hyperspectral remote sensing mineral recognition. A method for mineral recognition mapping is proposed. In this method, a decision tree discrimination rule is established based on the classification and regression tree data-mining algorithm and the absorption characteristics of field-measured spectra. Compared with spectral angle mapping and mixture-tuned matched filtering (MTMF), this method is shown to be efficient for mineral recognition mapping using hyperspectral images; its accuracy is 85.06%, which is greater than that of the MTMF method (83.91%). The advantages of the proposed method include a reduction in errors caused by manually set thresholds for mineral mapping and easier training. Furthermore, the hierarchical structure of the decision tree reflects the recognition process clearly, and the rule nodes are closely related to the spectra of the minerals; therefore, both the results and the process are interpretable. This method could be used for mineral recognition and classification using hyperspectral images. © 2016 Society of Photo-Optical Instrumentation Engineers (SPIE)
The purpose of the current study is to produce landslide susceptibility maps using different data mining models. Four modeling techniques, namely random forest (RF), boosted regression tree (BRT), classification and regression tree (CART), and general linear (GLM) models, are used, and their results are compared for landslide susceptibility mapping at the Wadi Tayyah Basin, Asir Region, Saudi Arabia. Landslide locations were identified and mapped from the interpretation of different data types, including high-resolution satellite images, topographic maps, historical records, and extensive field surveys. In total, 125 landslide locations were mapped using ArcGIS 10.2, and the locations were divided into two groups: training (70%) and validating (25%). Eleven layers of landslide-conditioning factors were prepared: slope aspect, altitude, distance from faults, lithology, plan curvature, profile curvature, rainfall, distance from streams, distance from roads, slope angle, and land use. The relationships between the landslide-conditioning factors and the landslide inventory map were calculated using the four models mentioned above (RF, BRT, CART, and GLM). The models' results were compared with landslide locations that were not used during the models' training. The receiver operating characteristic (ROC) curve, including the area under the curve (AUC), was used to assess the accuracy of the models. The success (training data) and prediction (validation data) rate curves were calculated. The results showed that the AUC values for the success rates are 0.783 (78.3%), 0.958 (95.8%), 0.816 (81.6%), and 0.821 (82.1%) for the RF, BRT, CART, and GLM models, respectively. The prediction rates are 0.812 (81.2%), 0.856 (85.6%), 0.862 (86.2%), and 0.769 (76.9%) for the RF, BRT, CART, and GLM models, respectively. Subsequently, landslide susceptibility maps were divided into four classes: low, moderate, high, and very high susceptibility. The results re
A quantitative structure-solubility relationship was developed to predict the solubility of some statin drugs in supercritical carbon dioxide (SC-CO2). The solubilities of lovastatin, simvastatin, atorvastatin, rosuvastatin, and fluvastatin in SC-CO2 at 225 different states of temperature and pressure were predicted. Classification and regression tree (CART) was successfully used as a descriptor selection method. Three descriptors (pressure, temperature, and molecular weight) were selected and used as inputs for an adaptive neuro-fuzzy inference system (ANFIS). The root mean square errors for the calibration, prediction, and validation sets were 0.09, 0.14, and 0.11, respectively. In comparison with other methods, CART-ANFIS is a powerful model for predicting the solubilities of these statins in SC-CO2.
Using groundwater level monitoring data from the Shuping landslide in the Three Gorges Reservoir area, the factors influencing groundwater level were selected based on the response relationship between candidate factors, such as rainfall and reservoir level, and the change in groundwater level. A classification and regression tree (CART) model was then constructed from this subset of factors and used to predict the groundwater level. In verification, the predicted results for the test sample were consistent with the measured values, with a mean absolute error of 0.28 m and a relative error of 1.15%. For comparison, a support vector machine (SVM) model constructed using the same set of factors yielded a mean absolute error of 1.53 m and a relative error of 6.11%. These results indicate that the CART model not only has better fitting and generalization ability, but also offers strong advantages in analyzing the dynamic characteristics of landslide groundwater and in screening important variables. It is an effective method for predicting groundwater level in landslides.
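The two error measures used to compare the CART and SVM models can be sketched as follows. The abstract does not give the exact formula for its relative error, so the mean of |observed − predicted| / |observed| used here is an assumption, and the groundwater levels are invented illustrative numbers, not data from the Shuping landslide.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference, in the units of the data (e.g. metres)."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mean_relative_error(observed, predicted):
    """Average of |error| / |observed| (assumed definition); multiply by 100 for percent."""
    return sum(abs(o - p) / abs(o)
               for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical groundwater levels in metres (not from the study).
observed_levels = [150.0, 160.0, 170.0]
predicted_levels = [150.3, 159.6, 170.5]

mae = mean_absolute_error(observed_levels, predicted_levels)
mre_percent = 100 * mean_relative_error(observed_levels, predicted_levels)
```

Under this definition, a model with lower values on both measures for the same test sample fits the measured levels more closely, which is the basis on which the abstract ranks CART above SVM.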
ISBN: 9781509041022 (print)
Omni-directional wheelchairs can greatly help users who have difficulty walking to move as they wish. To help users who have problems controlling wheelchairs through physical contact, this paper presents a set of algorithms, including an improved classification and regression tree (CART), cross validation separation (CVS), and a relative angle model (RAM), to control a wheelchair with eye movements. Research worldwide and some earlier work on eye movement data processing are introduced, revealing the difficulty of applying eye movement data processing methods to motion control tasks. An improved CART algorithm is then proposed to efficiently reduce noisy data and extract useful information from raw data. The CVS and RAM methods are presented to complete the recognition tasks and improve the robustness of the system to nonstandard user-generated scanpaths. Finally, an experiment was conducted to verify the effectiveness of the algorithms, and future work based on the results is proposed.
Introduction: Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. Aims: The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. Materials & methods: The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. Results: The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two examples using CART analysis and previously published in this field are didactically explained in detail. Conclusion: There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable.
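The "repeated partitioning of a sample into subgroups based on a certain criterion" can be made concrete with a single partitioning step. The sketch below assumes Gini impurity as the criterion, a common choice for classification trees but not one the abstract names, and uses invented toy data; `gini` and `best_split` are hypothetical helper names. A full tree would apply the same search recursively to each resulting subgroup.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 0 for a pure group, larger for mixed groups."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Search one numeric predictor for the cutoff that minimises the
    size-weighted Gini impurity of the two resulting subgroups.
    Returns (threshold, weighted_impurity); threshold is None when no
    cutoff improves on the unsplit sample."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, gini(labels))
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cutoff can separate identical values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best

# Invented toy data: one numeric predictor and a binary outcome.
predictor = [1, 2, 3, 10, 11, 12]
outcome = ["no", "no", "no", "yes", "yes", "yes"]
cutoff, impurity = best_split(predictor, outcome)
```

For a regression tree, the same search would typically minimise the within-group variance of a continuous outcome instead of the Gini impurity.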
Groundwater is considered one of the most valuable fresh water resources. The main objective of this study was to produce groundwater spring potential maps in the Koohrang Watershed, Chaharmahal-e-Bakhtiari Province, Iran, using three machine learning models: boosted regression tree (BRT), classification and regression tree (CART), and random forest (RF). Thirteen hydrological-geological-physiographical (HGP) factors that influence the locations of springs were considered in this research: slope degree, slope aspect, altitude, topographic wetness index (TWI), slope length (LS), plan curvature, profile curvature, distance to rivers, distance to faults, lithology, land use, drainage density, and fault density. Subsequently, groundwater spring potential was modeled and mapped using the CART, RF, and BRT algorithms. The predicted results from the three models were validated using the receiver operating characteristic (ROC) curve. Of the 864 springs identified, 605 (≈70%) locations were used for the spring potential mapping, while the remaining 259 (≈30%) springs were used for model validation. The area under the curve (AUC) for the BRT model was calculated as 0.8103, and for CART and RF the AUC values were 0.7870 and 0.7119, respectively. Therefore, it was concluded that the BRT model produced the best results in predicting the locations of springs, followed by the CART and RF models, respectively. Geospatially integrated BRT, CART, and RF methods proved to be useful in generating the spring potential map (SPM) with reasonable accuracy.