Gully erosion is a type of water erosion that causes the loss of fertile soil, the depletion of soil moisture, and related damage. Because identifying the areas prone to this type of erosion is important for managing and controlling it, the present study determines the areas prone to gully erosion in the southern regions of Fars province using multiple-criteria decision-making (MCDM). First, fuzzy maps were prepared for each parameter using membership functions, and then the analytic hierarchy process (AHP) was used to determine the weight coefficient of each parameter by pairwise comparison, based on its importance in determining areas prone to gully erosion. The map of gully-prone areas was prepared by multiplying the fuzzy layers by the weight coefficients. Because many parameters influence gully erosion (19 parameters), a feature selection algorithm was used to select the most prominent ones. Best subsets regression was also applied to determine the strength of the relationship between the parameters as well as their effects on gully erosion. Finally, land use changes (i.e., urbanization) and their effect on gully erosion were investigated. The results showed that the areas in the central part of the study area (about 15%) are more susceptible to erosion. The results of the feature J48 algorithm (i.e., the Best-First and Greedy-Stepwise methods) showed that the most important parameters in gully erosion are the normalized difference vegetation index (NDVI), slope, topographic wetness index (TWI), altitude, terrain ruggedness index (TRI), lithology and land use. The gully erosion map was then produced from these data. A comparison of the map prepared from the selected data with the map prepared from all data shows that the gully erosion map based on the selected data has high accuracy (AUC = 0.80). The best subsets regress
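The AHP weighting and fuzzy-layer overlay described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the three-criterion pairwise matrix, the membership values, and the function names `ahp_weights` and `weighted_overlay` are all assumptions made for the example, and the row geometric-mean method is used here as a common approximation to the AHP principal-eigenvector weights.

```python
import math


def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_overlay(fuzzy_layers, weights):
    """Combine per-cell fuzzy memberships (0..1) into one score per cell."""
    n_cells = len(fuzzy_layers[0])
    return [
        sum(w * layer[i] for w, layer in zip(weights, fuzzy_layers))
        for i in range(n_cells)
    ]


# Hypothetical pairwise comparison of three criteria (illustrative values only):
# entry [i][j] says how much more important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 4.0],    # NDVI
    [0.5, 1.0, 2.0],    # slope
    [0.25, 0.5, 1.0],   # TWI
]
weights = ahp_weights(pairwise)

# Hypothetical fuzzy membership values for three map cells per criterion.
fuzzy_layers = [
    [0.9, 0.2, 0.5],    # NDVI-derived membership
    [0.7, 0.1, 0.6],    # slope-derived membership
    [0.4, 0.3, 0.8],    # TWI-derived membership
]
scores = weighted_overlay(fuzzy_layers, weights)
```

Because the weights sum to 1 and every membership lies in [0, 1], each overlay score also stays in [0, 1], so the result can be classified into susceptibility zones directly.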
The ongoing water deficiencies in arid and semi-arid regions, together with certain nutritional requirements, highlight the importance of pinpointing the optimal locations for cultivating agricultural products with the highest yield. Given the importance of this issue, this study prepares soil fertility maps for corn (Zea mays L.) production in Fars province, Iran. Initially, fuzzy membership functions (FMs) are employed to prepare fuzzy maps for each layer in the geographic information system (GIS), after which feature selection algorithms are deployed to designate the most relevant layers. The layers are then assigned specific weights obtained using a combination of analytic network process (ANP) and analytic hierarchy process (AHP) methods to prepare soil fertility maps. The input data consist of organic content (OC), phosphorus (P), potassium (K), iron (Fe), zinc (Zn), manganese (Mn), and copper (Cu). Inverse distance weighting (IDW) is used to produce interpolation maps for each layer. Thereafter, zoning maps are obtained using FMs. Finally, the ANP and AHP models are once again deployed to generate the final overlaid fertility maps for corn production. The results show that combining the ANP method with feature selection (OC, K, Fe, and P) yields higher accuracy than applying the AHP method alone. Thus, combining feature selection and ANP methods with both inter- and intra-group pairwise comparison would produce more accurate fertility maps for corn production with lower cost and time complexity.
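The inverse distance weighting step mentioned in this abstract can be illustrated with a short sketch. The sample coordinates, the values, the function name `idw`, and the default power of 2 are assumptions for the example; the abstract does not state which settings the authors used.

```python
import math


def idw(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (xi, yi, value) samples by IDW."""
    numerator = 0.0
    denominator = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            # The query point coincides with a sample: return it exactly.
            return value
        weight = 1.0 / distance ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical soil samples: (x, y, organic content in arbitrary units).
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
midpoint_estimate = idw(samples, 1.0, 0.0)
at_sample_estimate = idw(samples, 0.0, 0.0)
```

A higher `power` makes nearby samples dominate the estimate, while a lower one produces a smoother surface.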
We present the results of short-term forecasting of Henry Hub spot natural gas prices based on the performance of classical time series models and machine learning methods, specifically neural networks (NN) and strategic seasonality-adjusted support vector regression machines (SSA-SVR). We introduce several improvements to the forecasting method based on SVR. A procedure for generating model inputs and selecting them using feature selection (FS) algorithms is suggested. The use of FS algorithms for automatic selection of model inputs and the use of the advanced global optimization technique PSwarm for the optimization of SVR hyperparameters reduce the subjective inputs. Our results show that the machine learning results reported in the literature often exaggerate the success of these models since, in some cases, we record only slight improvements over the time series approaches. We have to emphasize that our findings apply to Henry Hub, a market which is known among traders as the "widow maker". We find definite advantages of using FS algorithms to preselect the variables in both NN and SVR. Machine learning models without the preselection of variables are often inferior to time series models in forecasting spot prices, and in this case FS algorithms show their usefulness and strength. (C) 2017 Elsevier Ltd. All rights reserved.
Understanding protein function supports research in advanced fields such as gene therapy and the development and design of new drugs. The prerequisite for understanding the function of a protein is determining its tertiary structure. Protein structure classification is indispensable for this problem, and fold recognition is a commonly used method of protein structure classification. In the current work, protein sequences of 40% identity in the ASTRAL protein classification database are used for fold recognition research to predict 27 folding types, which mostly belong to four protein structural classes: alpha, beta, alpha/beta and alpha + beta. We extract features from the primary structure of proteins using methods covering DSSP, PSSM and HMM, which are based on secondary structure and evolutionary information, to convert protein sequences into feature vectors that can be recognized by machine learning algorithms. We then combine the LightGBM feature selection algorithm with the incremental feature selection (IFS) method to find the optimal classifiers, each constructed with a tree-based machine learning algorithm: Random Forest, XGBoost or LightGBM. Bayesian optimization is used to tune the hyperparameters of the machine learning algorithms, bringing the accuracy of fold recognition to as high as 93.45%. The result obtained by the proposed model is outstanding in the study of protein fold recognition.
Background Skin cutaneous melanoma (SKCM) is the most common skin tumor with high mortality. The unfavorable outcome of SKCM calls for the discovery of prognostic biomarkers for accurate therapy. The present study aimed to explore novel prognosis-related signatures of SKCM and determine the significance of immune cell infiltration in this pathology. Methods Four gene expression profiles (GSE130244, GSE3189, GSE7553 and GSE46517) of SKCM and normal skin samples were retrieved from the GEO database. Differentially expressed genes (DEGs) were then screened, and the feature genes were identified by LASSO regression and the Boruta algorithm. Survival analysis was performed to filter the potential prognostic signature, and GEPIA was used for preliminary validation. The area under the receiver operating characteristic curve (AUC) was obtained to evaluate discriminatory ability. Gene Set Variation Analysis (GSVA) was performed, and the composition of the immune cell infiltration in SKCM was estimated using CIBERSORT. Finally, paraffin-embedded specimens of primary SKCM and normal skin tissues were collected, and the signature was validated by fluorescence in situ hybridization (FISH) and immunohistochemistry (IHC). Results In total, 823 DEGs and 16 feature genes were screened. IFI16 was identified as the signature associated with overall survival of SKCM, with strong discriminatory ability (AUC > 0.9 for all datasets). GSVA indicated that IFI16 might be involved in apoptosis and ultraviolet response in SKCM, and the immune cell infiltration associated with IFI16 was evaluated. Finally, FISH and IHC both validated the differential expression of IFI16 in SKCM. Conclusions Our comprehensive analysis identified IFI16 as a signature associated with overall survival and immune infiltration of SKCM, which may play a critical role in the occurrence and development of SKCM.
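The AUC used in this abstract to evaluate discriminatory ability can be computed directly from its probabilistic definition: the chance that a randomly chosen positive sample scores higher than a randomly chosen negative one. The sketch below uses made-up labels and scores, and the function name `auc` is an assumption for the example; it is not the tool used in the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (e.g. tumor), 0 for negative (e.g. normal).
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical expression-based scores for four samples.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
example_auc = auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values above 0.9 are read as strong discrimination.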
In agriculture, crop yield prediction is critical. Crop yield depends on various features, which can be categorized as geographical, climatic, and biological. Geographical features consist of cultivable land in hectares, canal length to cover the cultivable land, and the number of tanks and tube wells available for irrigation. Climatic features consist of rainfall, temperature, and radiation. Biological features consist of seeds, minerals, and nutrients. In total, 15 features were considered in this study to understand the impact of each feature on paddy crop yield for all seasons of each year. To select the vital features, five filter and wrapper approaches were applied. A Multiple Linear Regression (MLR) model was used to assess the prediction accuracy of each feature selection algorithm. The RMSE, MAE, R, and RRMSE metrics were used to evaluate the performance of the feature selection algorithms. Data used for the analysis were drawn from secondary sources of the state Agriculture Department, Government of Tamil Nadu, India, for over 30 years. Seventy-five percent of the data was used for training and 25% was used for testing. Low computational time was also considered in selecting the best feature subset. All feature selection algorithms gave similar results in the RMSE, RRMSE, R, and MAE values. The adjusted R² value was used to find the optimum feature subset despite all the deviations. The evaluation of the dataset used in this work shows that total area of cultivation, number of tanks and open wells used for irrigation, length of canals used for irrigation, and average maximum temperature during the season of the crop are the best features for crop yield prediction in the study area. The MLR gives 85% model accuracy for the selected features with low computational time.
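The four evaluation metrics named in this abstract can be written out as follows. The definitions are assumptions where the abstract is silent: RRMSE is taken here as RMSE divided by the mean of the observed values, and R as the Pearson correlation between observed and predicted values. The yield numbers are invented for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rrmse(y_true, y_pred):
    """Relative RMSE, assumed here to be RMSE over the mean observed value."""
    return rmse(y_true, y_pred) / (sum(y_true) / len(y_true))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    mean_t = sum(y_true) / len(y_true)
    mean_p = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical observed and predicted yields (arbitrary units).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 2.0, 3.0, 4.0, 7.0]
metrics = {
    "RMSE": rmse(observed, predicted),
    "MAE": mae(observed, predicted),
    "RRMSE": rrmse(observed, predicted),
    "R": pearson_r(observed, predicted),
}
```

RMSE penalizes the single large error more heavily than MAE does, which is one reason studies like this one report both.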
In agriculture, crop yield prediction is critical. Crop yield depends on various features, including geographic, climatic and biological ones. This research article discusses five feature selection (FS) algorithms, namely Sequential Forward FS, Sequential Backward Elimination FS, Correlation-based FS, Random Forest Variable Importance and the Variance Inflation Factor algorithm. Data used for the analysis were drawn from secondary sources of the Tamil Nadu state Agriculture Department for a period of 30 years. 75% of the data was used for training and 25% was used for testing. The performance of the feature selection algorithms is evaluated by Multiple Linear Regression (MLR). RMSE, MAE, R and RRMSE metrics are calculated for the feature selection algorithms. The adjusted R² was used to find the optimum feature subset. The time complexity of the algorithms was also considered. The selected features are applied to Multiple Linear Regression, Artificial Neural Network and M5Prime. MLR gives 85% accuracy using the features selected by the SFFS algorithm.
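Sequential forward selection scored by adjusted R² on a multiple linear regression can be sketched as below. This is a simplified illustration under stated assumptions, not the study's procedure: it fits on all rows rather than a 75/25 split, solves the normal equations directly (which fails for collinear features), and uses invented data and hypothetical helper names (`solve`, `adjusted_r2`, `forward_select`).

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def adjusted_r2(rows, y, cols):
    """Fit y on an intercept plus the chosen columns; return adjusted R^2."""
    x = [[1.0] + [row[c] for c in cols] for row in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    n, p = len(y), len(cols)
    return 1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1)


def forward_select(rows, y):
    """Greedily add the feature that most improves adjusted R^2."""
    selected, best = [], float("-inf")
    remaining = list(range(len(rows[0])))
    while remaining:
        score, col = max((adjusted_r2(rows, y, selected + [c]), c) for c in remaining)
        if score <= best:
            break
        selected.append(col)
        remaining.remove(col)
        best = score
    return selected, best


# Hypothetical data: the target depends on features 0 and 2 only.
rows = [
    [1.0, 3.0, 2.0], [2.0, 1.0, 1.0], [3.0, 4.0, 2.0], [4.0, 1.0, 1.0],
    [5.0, 5.0, 2.0], [6.0, 9.0, 1.0], [7.0, 2.0, 2.0], [8.0, 6.0, 1.0],
]
y = [2.0 * r[0] + 3.0 * r[2] for r in rows]
selected, best = forward_select(rows, y)
```

Unlike plain R², adjusted R² penalizes each extra feature, so the loop stops once an added feature no longer pays for itself.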
Hypertension is a potentially unsafe health ailment, which can be indicated directly from the blood pressure (BP). Hypertension always leads to other health complications. Continuous monitoring of BP is very important; however, cuff-based BP measurements are discrete and uncomfortable for the user. To address this need, a cuff-less, continuous, and noninvasive BP measurement system is proposed that applies machine learning (ML) algorithms to the photoplethysmograph (PPG) signal and demographic features. PPG signals were acquired from 219 subjects and underwent preprocessing and feature extraction steps. Time, frequency, and time-frequency domain features were extracted from the PPG and their derivative signals. Feature selection techniques were used to reduce the computational complexity and to decrease the chance of over-fitting the ML algorithms. The features were then used to train and evaluate ML algorithms. The best regression models were selected for systolic BP (SBP) and diastolic BP (DBP) estimation individually. Gaussian process regression (GPR) along with the ReliefF feature selection algorithm outperforms other algorithms in estimating SBP and DBP, with a root mean square error (RMSE) of 6.74 and 3.59, respectively. This ML model can be implemented in hardware systems to continuously monitor BP and avoid any critical health conditions due to sudden changes.
To reduce the negative effects of tourism on the environment, ecotourism is receiving increasing attention because this form of tourism helps to protect the environment and supports the sustainable development of an area. It is therefore important to determine suitable places for tourism to better manage the study area. The aim of this study is to identify potential ecotourism sites using ordered weighted averaging (OWA) and fuzzy quantifier algorithms in the eastern and central parts of Fars province, Iran. The required spatial data, such as geology, soil, land slope, topographic roughness index (TRI), vegetation, surface water, elevation, protected area, climate, distance to road, and distance to the village, were utilized. To prepare ecotourism maps with different confidence levels, eleven ordered weights were applied corresponding to the eleven parameters that were rank-ordered for each parameter after the modified factor weights were applied. Also, the feature selection algorithm (random search and genetic search methods) was used to select the most important parameters for determining the ecotourism map. The results showed that, with decreasing risk (alpha = 0), almost all of the study area was unsuitable for ecotourism while, with increasing risk (alpha = 20), all of the study area was suitable for ecotourism. One of the ecotourism maps prepared with different confidence levels can be suggested based on the different conditions of tourists so that, if the tourist has limited time, ecotourism maps with higher confidence levels are recommended, and vice versa. This is one of the innovations of the present research. Also, the results of the random search method with the least error show that slope, elevation, climate, distance to river, and distance to road are the most important parameters in preparing the ecotourism map of the region. So, using the results of the research, many economic problems, such as unemployment, will be solved by managers by preparing tour
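The ordered weighted averaging operator at the core of this abstract can be shown in a few lines. In OWA the weights attach to rank positions rather than to specific criteria, which is what lets one operator move between a pessimistic (minimum-like) and an optimistic (maximum-like) map. The suitability values and the function name `owa` are assumptions for the example, and the mapping from the paper's alpha values to order weights is not reproduced here.

```python
def owa(values, order_weights):
    """Ordered weighted average: weights apply to values sorted high to low."""
    if len(values) != len(order_weights):
        raise ValueError("values and order_weights must have the same length")
    ranked = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(order_weights, ranked))


# Hypothetical standardized suitability of one cell on three criteria.
cell = [0.2, 0.9, 0.5]

optimistic = owa(cell, [1.0, 0.0, 0.0])           # all weight on the best value
pessimistic = owa(cell, [0.0, 0.0, 1.0])          # all weight on the worst value
neutral = owa(cell, [1.0 / 3, 1.0 / 3, 1.0 / 3])  # plain average
```

The pessimistic setting marks a cell suitable only if every criterion is favorable, which matches the abstract's observation that the low-risk map leaves almost no suitable area.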
Purpose - The customer insurance coverage sales plan problem, in which loyal customers are recognized and offered special plans, is an essential problem facing insurance companies. In addition, loyal customers who have enough potential to renew their insurance contracts at the end of the contract term should be persuaded to repurchase or renew their contracts. The aim of this paper is to propose a three-stage data-mining approach to recognize high-potential loyal insurance customers and to predict/plan special insurance coverage sales. Design/methodology/approach - The first stage addresses data cleansing. In the second stage, several filter and wrapper methods are implemented to select proper features. In the third stage, the K-nearest neighbor algorithm is used to cluster the customers. The approach aims to select a compact feature subset with the maximal prediction capability. The proposed approach can detect the customers who are more likely to buy a specific insurance coverage at the end of a contract term. Findings - The proposed approach has been applied in a real case study of an insurance company in Iran. On the basis of the findings, the proposed approach is capable of recognizing the customer clusters and planning suitable insurance coverage sales plans for loyal customers with a proper accuracy level. Therefore, the proposed approach can help the insurance company identify its potential clients. Consequently, insurance managers can consider appropriate marketing tactics and allocate the company's resources appropriately to their high-potential loyal customers and prevent them from switching to competitors. Originality/value - Despite the importance of recognizing high-potential loyal insurance customers, little research has been done in this area. In this paper, data-mining techniques were developed for the prediction of special insurance coverage sales on the basis of customers' characteristics. The method allows