Accurate medium- and long-term precipitation forecasting plays a vital role in disaster prevention and mitigation and in the rational allocation of water resources. In recent years, various methods for medium- and long-term precipitation forecasting based on machine learning algorithms have been developed. However, machine learning places high demands on the size of the sample data. Therefore, this article proposes a data augmentation algorithm based on the K-means clustering algorithm and the synthetic minority oversampling technique (SMOTE), which can effectively enhance sample information. In addition, random forest (RF), extreme gradient boosting (XGB), recurrent neural network (RNN), and long short-term memory (LSTM) models are each constructed to forecast monthly grid precipitation of the Danjiangkou River Basin. This study aims to improve the accuracy of medium- and long-term precipitation forecasting. The main results fall into two aspects: 1) in most years, the anomaly correlation coefficient and Pg score of SMOTE-km-XGB and SMOTE-km-RF exceed those of XGB and RF; furthermore, compared with the other three methods, the SMOTE-km-XGB method is more suitable for precipitation forecasting in the studied basin; and 2) the forecasting results of the two deep learning methods (RNN and LSTM) show that sample data processed by the K-means clustering algorithm and the SMOTE data augmentation algorithm did not achieve considerable results in deep learning. This study improves the accuracy of precipitation forecasting by expanding and balancing the information in the sample data, and provides a new research idea for improving the accuracy of medium- and long-term hydrological forecasting.
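The abstract does not say how K-means and SMOTE are combined, so the following is only a minimal sketch of the general idea: SMOTE-style interpolation applied to the members of a single cluster. It assumes the cluster memberships have already been computed (for example by K-means); the function name `smote_within_cluster`, its parameters, and the toy data are hypothetical and not taken from the article.

```python
import math
import random


def smote_within_cluster(samples, n_new, k=2, seed=0):
    """Generate n_new synthetic samples by SMOTE-style interpolation.

    `samples` is assumed to hold the members of one cluster (e.g. from
    K-means), each a list of floats. Generic sketch, not the paper's code.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # Rank the other cluster members by Euclidean distance to the base.
        others = sorted(
            (s for s in samples if s is not base),
            key=lambda s: math.dist(base, s),
        )
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # between the base sample and the chosen neighbour.
        u = rng.random()
        synthetic.append([b + u * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical cluster of four two-feature samples.
cluster = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
new_points = smote_within_cluster(cluster, n_new=5)
```

Because every synthetic point is a convex combination of two existing members, it stays inside the range already covered by the cluster, which is why restricting interpolation to one cluster at a time may help avoid implausible samples.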
This research studies the implementation of artificial neural networks (ANN) in predicting the concentration of total suspended solids (TSS) for the Fei Tsui reservoir in Taiwan. The prediction model developed in this study is designed to be used for monitoring the water quality in the Fei Tsui reservoir. High concentrations of TSS have been a crucial problem in the Fei Tsui reservoir for decades. As the Fei Tsui reservoir is a primary water source for Taipei City, this issue impacts the drinking water supply for the city due to eutrophication problems in the reservoir. Ten-year average monthly records and 13-year average annual records were collected for 26 parameters and correlated with the TSS concentrations to determine which parameters have a strong relationship with the TSS concentrations. The parameters shown to have a strong correlation with the TSS concentration are the trophic state index (TSI), nitrate (NO3) concentration, total phosphorus (TP) concentration, iron concentration (IRON), and turbidity. Linear regression was used to develop the model that estimates the TSS concentration in the Fei Tsui Reservoir. The results show that model 3, a three-layer ANN model with five neurons that uses three input parameters, namely NO3 concentration, TP concentration, and turbidity, to predict the output parameter, TSS concentration, produces the highest coefficient of determination (R²) and Willmott Index (WI), which are 0.9589 and 0.9933 respectively, and the lowest root mean square error, which is 0.4753. Based on these performance criteria, model 3 is concluded to be the best model for predicting TSS concentrations in this study. © 2020 The Authors. Published by Elsevier B.V. on behalf of Faculty of Engineering, Ain Shams University.
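The three performance criteria named above can be computed as in the sketch below. It assumes the common definitions: R² as one minus the ratio of residual to total sum of squares (some studies instead report the squared Pearson correlation), and the Willmott index in its original index-of-agreement form. The observed and predicted values are made up for illustration and are not data from the study.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def willmott_index(obs, pred):
    """Willmott index of agreement, bounded above by 1 (perfect match)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - num / den


# Hypothetical TSS values, for illustration only.
observed = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 2.5, 4.5, 4.5]

print(rmse(observed, predicted))
print(r_squared(observed, predicted))
print(willmott_index(observed, predicted))
```

Higher R² and WI together with a lower RMSE indicate a better fit, which matches the ranking logic the abstract applies to model 3.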
As an important indicator of soil quality, soil organic matter (SOM) significantly contributes to land productivity and ecosystem health. Accurately mapping SOM at regional scales is of critical importance for sustainable agriculture and soil utilization management and remains a grand challenge. Many studies have used soil sampling data and machine learning algorithms to predict SOM at regional scales for a given year, while few studies have mapped SOM for multiple years and examined its temporal dynamics. We compared the performance of four machine learning algorithms, decision tree (DT), bagging decision tree (BDT), random forest (RF), and gradient boosting regression trees (GBRT), in mapping SOM in Hubei province, China over the 18-year period from 2000 to 2017. Our results showed that RF and DT had the highest coefficient of determination (R²) (0.61) and the lowest potential bias (9.48 g/kg), respectively, while GBRT had the lowest mean error (ME) (1.26 g/kg) and root mean squared error (RMSE) (5.41 g/kg) and a Lin's concordance correlation coefficient (LCCC) of 0.72. The SOM map based on GBRT better captured the distribution of the soil sample data than that based on RF. The trained GBRT model and the spatially explicit data on explanatory variables (e.g., climate, terrain, remote sensing) were used to predict SOM for each 500 m x 500 m grid cell in Hubei for the period from 2000 to 2017. Our results showed that the SOM content of cropland was relatively high in the southeast and relatively low in the north. The SOM content in the topsoil varied from 0.89 to 58.86 g/kg and averaged 20.52 g/kg. The mean cropland SOM content of the province exhibited an increasing trend from 2000 to 2017, with an increase of 0.26 g/kg and a growth rate of 1.28%. Spatially, the SOM content increased in southern Hubei and decreased in central and northern parts of the province. A large portion of the areas with decreasing SOM content in northern Hubei was reclaimed cropland, while a large part of t
At present, the automatic attendance mode of distance education is not conducive to the confirmation and analysis of information after class. To study an effective automatic recognition algorithm for the remote education classroom, this study takes the intelligent innovation and entrepreneurship classroom of Internet+ as an example for analysis. This paper adopts facial features as the basis of recognition, establishes corresponding positioning points, and constructs precise positioning methods for real-time feature capture. The ASM algorithm is used to extract facial features, and the algorithm is refined to improve the extraction effect. In addition, this paper proposes a Gabor-wavelet packet set and a Gabor beamlet set for auxiliary recognition, which improves the recognition rate. Finally, this paper designs experiments to analyze the performance of the proposed algorithm. The results show that the proposed algorithm has certain practical effects and can provide a theoretical reference for subsequent related research.
Soil is a complicated historical natural continuum that presents gradual changes in its properties and geographic area. Conventional soil survey and cartography methods on a macroscopic scale based on grids with a coarse resolution are inadequate for the rapid development of precision agriculture. The demand for soil mapping content and accuracy has increased as more convenient methods of acquiring multi-source geo-spatial data have been developed, and such data are commonly employed to extract basic mapping units and environmental variables in related algorithms. We employ geo-objects as basic units of soil property mapping, which are extracted from high-resolution remote sensing images using a convolutional-neural-network-based learning algorithm. Multi-source geo-spatial data are transferred into each geo-object as environmental variables, and the relationships between soil properties and environmental variables are mined using powerful tree-based machine learning algorithms, including regressions with random forests and XGBoost. A data set that includes soil sample points and multi-source geo-spatial data is used to evaluate the effectiveness of the proposed method. The experimental results demonstrate that the method allows for better soil organic matter mapping than state-of-the-art interpolation-based and linear-regression-based methods. The proposed procedure has the potential to be a general method for mapping other soil properties. Its advantages are embodied in the modeling of relatively miscellaneous data with implicitly associated non-linear relationships between soil properties and environmental variables. The spatial scale and accuracy of the finer maps capture more detailed characteristics of the soil properties and are applicable to the micro-domain fields required for refined soil mapping with small variations.
Software-Defined Networking (SDN) is a new type of technology that embraces high flexibility and adaptability. The applications in SDN have the ability to manage and control networks while ensuring load balancing, access control, and routing. These are considered the most significant benefits of SDN. However, SDN can be influenced by several types of conflicting flows, which may lead to deterioration in network performance in terms of efficiency and optimisation. SDN conflicts also occur due to the impact and adjustment of certain features such as priority and action. Moreover, applying machine learning algorithms to the identification and classification of conflicting flows has limitations. As a result, this paper presents several machine learning algorithms, including Decision Tree (DT), Support Vector Machine (SVM), Extremely Fast Decision Tree (EFDT), and Hybrid (DT-SVM), for detecting and classifying conflicting flows in SDNs. The EFDT and hybrid DT-SVM algorithms were designed and deployed based on the DT and SVM algorithms to achieve improved performance. The performance of the proposed algorithms was evaluated for efficiency and effectiveness across a variety of evaluation metrics, using flow counts ranging from 1000 to 100000 in increments of 10000 flows per step in two network topologies, namely Fat Tree and Simple Tree, which were created using the Mininet simulator and connected to the Ryu controller. The experimental results for the detection of conflict flows show that the DT and SVM algorithms achieve accuracies of 99.27% and 98.53% respectively, while the EFDT and hybrid DT-SVM algorithms achieve respective accuracies of 99.49% and 99.27%. In addition, the proposed EFDT algorithm achieves 95.73% accuracy on the task of classifying conflict flow types. The proposed EFDT and hybrid DT-SVM algorithms show a high capability of SDN applications to offer fast detection and classification of conflict flows.
Antalya is one of the provinces with the highest number of forest fires in Türkiye. In 2021, 278 forest fires occurred within the administrative boundaries of the Antalya Regional Directorate of Forestry. The main objective of this study is to produce forest fire susceptibility (FFS) maps of Antalya province using machine learning (ML) models. In addition to forest fire inventory data, 16 factors, including topographic, environmental, meteorological, and human-driven factors, were used in the study. Inventory data included 2166 fire ignition points from the General Directorate of Forestry. 70% of the inventory dataset was used to train the ML models and 30% to validate them. Overall accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC) were used as validation metrics. FFS maps of Antalya were produced using the stand-alone ML algorithms K-Nearest Neighbors and Support Vector Machines, as well as the tree-based Conditional Inference Trees (CTREE), Random Forest (RF), Gradient Boosting Machines (GBM), and Extreme Gradient Boosting (XGBoost) algorithms. To the best of our knowledge, this is the first study using the CTREE algorithm for forest fire susceptibility mapping. Therefore, this study is important for the related literature. The validation results revealed that the XGBoost model outperformed the other models. It is thought that the FFS map produced using the XGBoost model will guide forest engineers, wildland firefighting teams, and firefighters in minimizing damage and controlling forest fires. © 2024 COSPAR. Published by Elsevier B.V. All rights reserved.
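As a rough illustration of the evaluation setup described above, the sketch below performs a random 70/30 split and computes overall accuracy, precision, recall, and F1-score for binary labels (1 = fire, 0 = no fire). The abstract does not state how the split was drawn, so a simple shuffle is assumed; the function names and the label vectors are hypothetical, and AUC-ROC is left out for brevity.

```python
import random


def split_70_30(items, seed=0):
    """Shuffle a copy of items and cut it into 70% training, 30% validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Index list standing in for the 2166 inventory points.
train_ids, valid_ids = split_70_30(range(2166))

# Hypothetical validation labels and model predictions.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Reporting precision and recall alongside accuracy matters in susceptibility mapping because a model can reach high accuracy while still missing many true ignition points.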
Recommender systems use algorithms to provide users with product or service recommendations. Recently, these systems have been using machine learning algorithms from the field of artificial intelligence. However, choosing a suitable machine learning algorithm for a recommender system is difficult because of the number of algorithms described in the literature. Researchers and practitioners developing recommender systems are left with little information about the current approaches in algorithm usage. Moreover, the development of recommender systems using machine learning algorithms often faces problems and raises questions that must be resolved. This paper presents a systematic review of the literature that analyzes the use of machine learning algorithms in recommender systems and identifies new research opportunities. The goals of this study are to (i) identify trends in the use or research of machine learning algorithms in recommender systems; (ii) identify open questions in the use or research of machine learning algorithms; and (iii) assist new researchers in positioning new research activity in this domain appropriately. The results of this study identify existing classes of recommender systems, characterize adopted machine learning approaches, discuss the use of big data technologies, identify types of machine learning algorithms and their application domains, and analyze both main and alternative performance metrics. © 2017 Elsevier Ltd. All rights reserved.
Most current cancer treatment approaches are invasive and carry a broad spectrum of side effects. Furthermore, cancer drug resistance, known as chemoresistance, is a huge obstacle during treatment. This study aims to predict the resistance of several cancer cell lines to the drug Cisplatin. In this paper, data were obtained from the NCBI GEO database, then normalized, and their batch effects were corrected with the ComBat software. To select appropriate features for machine learning, feature selection/reduction was performed based on the Fisher Score method. Six different machine learning algorithms were then used to detect Cisplatin-resistant and Cisplatin-sensitive samples in cancer cell lines. Moreover, Differentially Expressed Genes (DEGs) between all the sensitive and resistant samples were harvested. The selected genes were enriched in biological pathways using the Enrichr database. Topological analysis was then performed on the constructed networks using Cytoscape software. Finally, the biological description of the output genes from the performed analyses was investigated through a literature review. Among the six classifiers trained to distinguish between Cisplatin-resistant samples and sensitive ones, the KNN and Naive Bayes algorithms were proposed as the most suitable classifiers according to several calculated measures. Furthermore, the results of the systems biology analysis identified several potential chemoresistance genes, among which PTGER3, YWHAH, CTNNB1, ANKRD50, EDNRB, ACSL6, IFNG, and CTNNB1 are topologically more important than others. These predictions pave the way for further experimental research.
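For readers unfamiliar with Fisher Score feature selection, a minimal sketch follows. It assumes the usual definition, in which each feature is scored by the ratio of between-class scatter to within-class scatter, and higher scores mark features that separate the classes better. The function name `fisher_scores` and the tiny expression matrix are hypothetical and are not drawn from the study's data or code.

```python
def fisher_scores(X, y):
    """Return one Fisher score per feature (column of X).

    X is a list of rows (samples), y the matching class labels.
    Score = sum_c n_c * (mean_c - mean)^2 / sum_c n_c * var_c.
    """
    n_features = len(X[0])
    classes = sorted(set(y))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        # A feature with zero within-class variance separates perfectly.
        scores.append(between / within if within > 0 else float("inf"))
    return scores


# Hypothetical expression values for two genes across six samples.
X = [
    [1.0, 5.0], [1.2, 3.0], [0.8, 4.0],   # sensitive
    [3.0, 4.5], [3.2, 3.5], [2.8, 4.0],   # resistant
]
y = ["sensitive"] * 3 + ["resistant"] * 3

scores = fisher_scores(X, y)
# Rank feature indices from most to least discriminative.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In this toy data the first gene differs clearly between the two groups while the second has identical class means, so the first gene ranks higher and would be kept by a top-k selection.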
This study proposes an applicable driver identification method using machine learning algorithms with driving information. The driving data are collected by a 3-axis accelerometer, which records the lateral, longitudinal and vertical accelerations. In this research, a data transformation method is developed to extract interpretable statistical features from raw 3-axis sensor data, and machine learning algorithms are then used to identify drivers. To eliminate the bias caused by the sensor installation and ensure the applicability of their approach, the authors present a data calibration method, which a comparative test shows to be necessary. Four basic supervised classification algorithms are run on the data set for comparison. To improve classification performance, the authors propose a multiple classifier system, which combines the outputs of several classifiers. Experimental results based on real-world data show that the proposed algorithm is effective in solving the driver identification problem. Among the four basic algorithms, the random forests (RFs) algorithm performs best in accuracy, recall and precision. With the proposed multiple classifier system, better performance can be achieved for groups with a small number of drivers. The RFs algorithm takes the lead in running speed. In the experiment, ten drivers are involved and over 5,500,000 driving records per driver are collected.
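The abstract does not specify how the multiple classifier system combines the outputs of its classifiers. One common option is plain majority voting, sketched below under that assumption. Apart from random forests, the classifier names, the driver IDs, and the function `majority_vote` are hypothetical.

```python
from collections import Counter


def majority_vote(predictions_per_classifier):
    """Combine several classifiers' predictions by majority vote.

    Each inner list holds one classifier's predicted driver ID for the
    same sequence of trips; the most frequent ID per trip wins.
    """
    combined = []
    for votes in zip(*predictions_per_classifier):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from three classifiers for four trips.
rf_pred = ["d1", "d2", "d3", "d1"]
knn_pred = ["d1", "d3", "d3", "d2"]
svm_pred = ["d2", "d2", "d3", "d1"]

combined = majority_vote([rf_pred, knn_pred, svm_pred])
print(combined)  # ['d1', 'd2', 'd3', 'd1']
```

Voting tends to help most when the individual classifiers make different mistakes; with ties, this sketch simply keeps the ID that appears first among the tied votes, which a real system would likely replace with a confidence-weighted rule.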