Daylight confers extensive benefits for building occupants and improves energy efficiency; thus, its prediction and performance are significant for design decision-making on buildings and lighting. However, real-time prediction with high precision is difficult since daylight is spatiotemporally correlated and highly dynamic (e.g., periodic, irregular). In recent years, numerous studies have focused on machine learning algorithms due to their high generalization ability, robustness, and accuracy in dealing with complex problems. However, selecting the optimal algorithm for a specific research problem is time-consuming and challenging, especially for new learners. This review attempts to develop a systematic guide for algorithm selection and optimization by: (i) discussing the main principles, model structures, and characteristics of 36 of the most representative algorithms; (ii) reviewing and conducting a whole-process statistical analysis of the machine-learning-based daylight-prediction studies, while abstracting the statistical issues (regression and classification) and prediction tasks (time series, spatial, and spatiotemporal predictions) of different research topics and purposes; and (iii) integrating the findings to develop selection and optimization methodologies for the algorithms. In conclusion, the following ranking of machine learning algorithms is recommended for different problem types: (i) for regression, Back Propagation Neural Network (recommendation level = 0.82) > Convolutional Neural Network (0.78) > Kernel Support Vector Regression (0.72); (ii) for classification, Kernel Support Vector Machine (0.8) > Radial Basis Function Neural Network (0.56) > Random Forest (0.53); (iii) for time series prediction, Back Propagation Neural Network (0.66) > Random Forest (0.61) > Gated Recurrent Unit Network (0.52) = EXtreme Gradient Boosting (0.52); (iv) for spatial prediction, Back Propagation Neural Network (0.82) > Convolutional Neural Network (0.78
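The selection methodology this review describes boils down to benchmarking several candidate algorithms on the same daylight dataset before committing to one. A minimal scikit-learn sketch of such a comparison is given below; the feature matrix, target, and the specific candidates (an MLP standing in for a BPNN, kernel SVR, and Random Forest) are illustrative assumptions, not the review's own benchmark.

# Minimal sketch: comparing candidate regressors for a daylight-prediction task.
# X (daylight-related features) and y (daylight metric) are placeholders.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor

X = np.random.rand(500, 8)   # hypothetical features (sky condition, time, geometry, ...)
y = np.random.rand(500)      # hypothetical daylight metric

candidates = {
    "BPNN (MLPRegressor)": MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=2000),
    "Kernel SVR": SVR(kernel="rbf"),
    "Random Forest": RandomForestRegressor(n_estimators=200),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R2 = {scores.mean():.3f}")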
This study was aimed at evaluating the effect of spontaneous lacto-fermentation of carrot slices on flesh structure using different machine learning approaches. The textures computed from digital images of lacto-fermented and fresh carrot slices were compared using neural networks and other algorithms from different groups. In the case of the Multilayer Perceptron, accuracies for training, testing and validation were considered. For some of the networks, lacto-fermented and fresh samples were completely distinguished; the accuracies for training, testing and validation were equal to 100%. For models built using other algorithms (LDA (Linear Discriminant Analysis), Multi Class Classifier, LMT (Logistic Model Tree), KStar, Naive Bayes, PART), the following metrics were used for the evaluation of model effectiveness: accuracy, time taken to build the model, Kappa statistic, mean absolute error, root mean squared error, PRC (Precision-Recall) Area, ROC (Receiver Operating Characteristic) Area, MCC (Matthews Correlation Coefficient), F-Measure, Recall, Precision, FP (False Positive) Rate and TP (True Positive) Rate. The most satisfactory results were obtained for LDA. The lacto-fermented and fresh carrot slices were distinguished with an average accuracy of 99%, low values of errors (mean absolute error: 0.0117, root mean squared error: 0.1014) and FP Rate (0.010). The weighted averages of the other metrics were greater than or equal to 0.98 (Kappa statistic: 0.98, PRC Area: 0.987, ROC Area: 0.991, MCC: 0.980, F-Measure: 0.990, Recall: 0.990, Precision: 0.990, TP Rate: 0.990). The obtained results demonstrated the usefulness of different machine learning approaches for evaluating the effect of fermentation on changes in the carrot flesh structure.
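The original analysis appears to rely on WEKA-style classifiers; the sketch below reproduces the general idea with scikit-learn instead, training an LDA model on placeholder texture features and reporting several of the metrics listed above. The 200-sample size, the 30 texture features, and the class labels are assumptions.

# Sketch: LDA-based discrimination of two sample classes from image-texture features,
# reporting several of the metrics listed above. Feature values are placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             matthews_corrcoef, f1_score, roc_auc_score)

X = np.random.rand(200, 30)          # hypothetical texture features per slice image
y = np.random.randint(0, 2, 200)     # 0 = fresh, 1 = lacto-fermented (labels assumed)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
lda = LinearDiscriminantAnalysis().fit(X_tr, y_tr)
pred = lda.predict(X_te)
proba = lda.predict_proba(X_te)[:, 1]

print("accuracy:", accuracy_score(y_te, pred))
print("kappa:", cohen_kappa_score(y_te, pred))
print("MCC:", matthews_corrcoef(y_te, pred))
print("F-measure:", f1_score(y_te, pred))
print("ROC AUC:", roc_auc_score(y_te, proba))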
Crop acreage analysis and yield estimation are of prime importance in field-level agricultural monitoring and management. This enables prudent decision making during any crop failure event and for ensuing crop insurance. The free availability of the high-resolution Sentinel-2 satellite datasets has created new possibilities for mapping and monitoring agricultural lands in this regard. In the present study, conducted on the Tamluk Subdivision of the Purba Medinipur District of West Bengal, the heterogeneous crop area was mapped according to the respective crop type using Sentinel-2 multi-spectral images and two machine learning algorithms: K Nearest Neighbour (KNN) and Random Forest (RF). Plot-level field information was collected from different cropland types to frame the training and validation datasets (comprising 70% and 30% of the total dataset, respectively) for cropland classification and accuracy assessment. Through this, the major summer crop acreage was identified (Boro rice, vegetables and betel vine, the three main crops in the study area). The extracted maps had an overall accuracy of 97.16% and 97.22%, respectively, in the KNN and RF classifications, with respective Kappa index values of 95.99% and 96.08%, and the RF method proved to be more accurate. This study was particularly useful in mapping the betel leaf acreage, since scant information exists for this crop and it is cultivated by many smallholder farmers in the region. The methods used in this paper can be readily applied elsewhere for accurately enumerating the respective crop acreages. (c) 2022 National Authority of Remote Sensing & Space Science. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
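As a rough illustration of the classification-and-validation step, the sketch below trains KNN and RF on placeholder Sentinel-2 band values with a 70/30 split and reports overall accuracy and the Kappa index; the band count, class labels, and sample sizes are assumptions, not the study's data.

# Sketch: KNN vs RF crop-type classification from Sentinel-2 band values,
# with a 70/30 split, overall accuracy (OA), and Kappa. Samples are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, cohen_kappa_score

rng = np.random.default_rng(0)
X = rng.random((3000, 10))       # 10 Sentinel-2 band reflectances per sampled plot (assumed)
y = rng.integers(0, 3, 3000)     # 0 = Boro rice, 1 = vegetables, 2 = betel vine (labels assumed)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
for name, clf in {"KNN": KNeighborsClassifier(n_neighbors=5),
                  "RF": RandomForestClassifier(n_estimators=300, random_state=0)}.items():
    clf.fit(X_tr, y_tr)
    pred = clf.predict(X_te)
    print(f"{name}: OA={accuracy_score(y_te, pred):.4f}, Kappa={cohen_kappa_score(y_te, pred):.4f}")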
In this study, three new regression models are created for magnitude-type conversion with different machine learning algorithms (linear regression, regression trees, support vector machines, Gaussian process regression models, and ensembles of trees), using the earthquakes (M >= 4.0) that occurred in Turkey between 1900 and 2020. Additionally, eight new equations are formed with linear and orthogonal regression methods. The developed equations and models are compared with equations selected from the literature using test data. As a result of the study, it is observed that the machine learning algorithms create better models and provide results closer to the real values than the developed and literature-selected equations.
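A hedged sketch of the regression setup is shown below: one magnitude type is converted to another by fitting SVR and Gaussian process models to a catalogue-style table and comparing RMSE on held-out test data. The mb-to-Mw conversion pair, the synthetic relation, and the noise level are assumptions for illustration only.

# Sketch: magnitude-type conversion (e.g., body-wave mb -> moment magnitude Mw)
# treated as a 1-D regression problem; the catalogue values here are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
mb = rng.uniform(4.0, 7.0, 300)                     # hypothetical source magnitudes
mw = 0.85 * mb + 1.03 + rng.normal(0, 0.15, 300)    # assumed linear relation + noise

X_tr, X_te, y_tr, y_te = train_test_split(mb.reshape(-1, 1), mw, test_size=0.25, random_state=0)
for name, model in {"SVR": SVR(), "GPR": GaussianProcessRegressor()}.items():
    model.fit(X_tr, y_tr)
    rmse = mean_squared_error(y_te, model.predict(X_te)) ** 0.5
    print(f"{name}: RMSE = {rmse:.3f}")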
Predicting hospital length of stay is critical for efficient hospital management, enabling proactive resource allocation, the optimization of bed availability, and optimal patient care. This paper explores the potential of machine learning algorithms to revolutionize hospital length-of-stay predictions, contributing to healthcare efficiency and patient care. The main objective is to identify the most effective machine learning algorithm for building a predictive model capable of predicting hospital length of stay. A bibliographic search of the existing literature on machine learning algorithms applied to hospital length-of-stay prediction highlighted the most relevant papers within this area of research. The papers were analyzed in terms of the model types and metrics that contributed to a considerable impact on healthcare decision making. We also discuss the challenges and limitations of machine learning algorithms for predicting length of stay, and the importance of data quality and ethical considerations.
5G technology is a key factor in delivering faster and more reliable wireless connectivity. One crucial aspect of 5G network planning is coverage prediction, which enables network providers to optimize infrastructure deployment and deliver high-quality services to customers. This study conducts a comprehensive analysis of machine learning algorithms for 5G coverage prediction, focusing on dominant feature parameters and accuracy. Notably, the Random Forest algorithm demonstrates superior performance with an RMSE of 1.14 dB, MAE of 0.12, and R2 of 0.97. The CNN model, the standout among deep learning algorithms, achieves an RMSE of 0.289, MAE of 0.289, and R2 of 0.78, showcasing high accuracy in 5G coverage prediction. Random Forest models exhibit near-perfect metrics with 98.4% accuracy, precision, recall, and F1-score. Although CNN outperforms other deep learning models, it slightly trails Random Forest in performance. The research highlights that the final Random Forest and CNN models outperform other models and surpass those developed in previous studies. Notably, 2D Distance Tx Rx emerges as the most dominant feature parameter across all algorithms, significantly influencing 5G coverage prediction. The inclusion of horizontal and vertical distances further improves prediction results, surpassing previous studies. The study underscores the relevance of machine learning and deep learning algorithms in predicting 5G coverage and recommends their use in network development and optimization. In conclusion, while the Random Forest algorithm stands out as the optimal choice for 5G coverage prediction, deep learning algorithms, particularly CNN, offer viable alternatives, especially for spatial data derived from satellite images. These accurate predictions facilitate efficient resource allocation by network providers, ensuring high-quality services in the rapidly evolving landscape of 5G technology. A profound understanding of coverage prediction remains pivotal for suc
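The sketch below illustrates the Random Forest part of such a workflow: a regression on placeholder propagation features, with RMSE/MAE/R2 and feature importances used to check whether the Tx-Rx distance dominates. The feature names, the log-distance target, and the sample size are assumptions, not the study's dataset.

# Sketch: Random Forest regression for coverage (e.g., RSRP in dBm) prediction,
# plus feature importances to see which inputs dominate. All data are synthetic.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "dist_2d_m": rng.uniform(50, 2000, 1000),     # 2D Tx-Rx distance (assumed feature)
    "height_diff_m": rng.uniform(0, 60, 1000),
    "azimuth_deg": rng.uniform(0, 360, 1000),
})
rsrp = -60 - 30 * np.log10(df["dist_2d_m"] / 50) + rng.normal(0, 3, 1000)  # assumed path-loss-style target

X_tr, X_te, y_tr, y_te = train_test_split(df, rsrp, test_size=0.2, random_state=0)
rf = RandomForestRegressor(n_estimators=300, random_state=0).fit(X_tr, y_tr)
pred = rf.predict(X_te)
rmse = mean_squared_error(y_te, pred) ** 0.5
print("RMSE:", rmse, "MAE:", mean_absolute_error(y_te, pred), "R2:", r2_score(y_te, pred))
print(dict(zip(df.columns, rf.feature_importances_.round(3))))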
For the utility to plan resources accurately and balance electricity supply and demand, accurate and timely forecasting is required. The proliferation of smart meters in the grids has resulted in an explosion of energy datasets. Processing such data is challenging and usually takes longer than the requirement of a short-term load forecast. The paper addresses this concern by utilizing parallel computing capabilities to minimize the execution time while maintaining highly accurate load forecasting models. In this paper, a thousand smart meter energy datasets are analyzed to perform a day-ahead, hourly short-term load forecast (STLF). The paper utilizes multi-processing to reduce the overall execution time of the forecasting models by submitting simultaneous jobs to all the available processors. The paper demonstrates the efficacy of the proposed approach through the choice of machine learning (ML) models, execution time, and scalability. The proposed approach is validated on real energy consumption data collected at the distribution transformer level in the Spanish electrical grid. Decision trees have outperformed the other models, accomplishing a tradeoff between model accuracy and execution time. The methodology takes only 4 minutes to train 1,000 transformers for an hourly day-ahead forecast (~24 million records) utilizing 32 processors.
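A minimal sketch of the multiprocessing idea is given below: one decision-tree forecaster is fitted per transformer, and the per-meter jobs are distributed across a worker pool. The lag-feature design, the synthetic load series, and the 32-worker pool are assumptions, not the paper's actual pipeline.

# Sketch: training one short-term load-forecast model per transformer in parallel.
from multiprocessing import Pool
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def train_one(args):
    """Fit a day-ahead model for one transformer from its hourly load history."""
    meter_id, load = args                          # load: 1-D hourly consumption array
    # Use the previous day's 24 hourly values to predict the next hour (simple lag design).
    X = np.array([load[i - 24:i] for i in range(24, len(load) - 1)])
    y = load[25:]
    model = DecisionTreeRegressor(max_depth=8).fit(X, y)
    return meter_id, model

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    meters = [(i, rng.random(24 * 90)) for i in range(1000)]   # 1,000 synthetic meters
    with Pool(processes=32) as pool:                           # assumes 32 CPUs available
        models = dict(pool.map(train_one, meters))
    print(len(models), "models trained")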
The current study aimed at evaluating the capabilities of seven advanced machine learning techniques (MLTs), including Support Vector Machine (SVM), Random Forest (RF), Multivariate Adaptive Regression Spline (MARS), Artificial Neural Network (ANN), Quadratic Discriminant Analysis (QDA), Linear Discriminant Analysis (LDA), and Naive Bayes (NB), for landslide susceptibility modeling and comparison of their performances. Coupling machine learning algorithms with spatial data types for landslide susceptibility mapping is a vitally important issue. This study was carried out using GIS and R open source software at Abha Basin, Asir Region, Saudi Arabia. A total of 243 landslide locations were identified at Abha Basin to prepare the landslide inventory map using different data sources. All the landslide areas were randomly separated into two groups, with a ratio of 70% for training and 30% for validation. Twelve landslide variables were generated for landslide susceptibility modeling, which include altitude, lithology, distance to faults, normalized difference vegetation index (NDVI), landuse/landcover (LULC), distance to roads, slope angle, distance to streams, profile curvature, plan curvature, slope length (LS), and aspect. The area under the curve (AUC-ROC) approach has been applied to evaluate, validate, and compare the MLTs' performance. The results indicated that AUC values for the seven MLTs range from 89.0% for QDA to 95.1% for RF. The findings showed that the RF (AUC = 95.1%) and LDA (AUC = 941.7%) produced the best performances in comparison to the other MLTs. The outcome of this study and the landslide susceptibility maps would be useful for environmental protection.
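The study itself uses R; the sketch below is a Python analogue of the comparison step, fitting several of the named classifiers on synthetic conditioning factors with a 70/30 split and ranking them by AUC-ROC. The 12 synthetic factors and the balanced 243/243 presence/absence design are assumptions.

# Sketch (Python analogue of the R workflow): comparing classifiers for landslide
# susceptibility by ROC AUC on a 70/30 split. Conditioning factors are synthetic.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis, QuadraticDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X = rng.random((486, 12))                 # 12 conditioning factors (altitude, slope, NDVI, ...)
y = np.r_[np.ones(243), np.zeros(243)]    # 243 landslide and 243 non-landslide points (assumed)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
for name, clf in {"RF": RandomForestClassifier(n_estimators=300), "LDA": LinearDiscriminantAnalysis(),
                  "QDA": QuadraticDiscriminantAnalysis(), "NB": GaussianNB()}.items():
    clf.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")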
Internet of Things (IoT) refers to a wide variety of embedded devices connected to the Internet, enabling them to transmit and share information with each other in smart environments. The regular monitoring of IoT network traffic generated from IoT devices is important for their proper functioning and for the detection of malicious activities. One such crucial activity is the classification of IoT devices in the network traffic. It enables the administrator to monitor the activities of IoT devices, which can be useful for proper implementation of Quality of Service, detection of malicious IoT devices, etc. In the literature, various methods are proposed for IoT traffic classification using various machine learning algorithms. However, the accuracy of these machine learning algorithms depends on the data generated from various IoT devices, the features extracted from network traffic, the site at which the IoT is deployed, etc. Moreover, the selection of features and machine learning algorithms are manual operations that are prone to error. Therefore, it is important to study the network traffic characteristics as well as suitable machine learning algorithms for accurate and optimized IoT traffic classification. In this article, we perform an in-depth comparative analysis of various popular machine learning algorithms using different effective features extracted from IoT network traffic. We utilize a public data set containing 20 days of network traces generated from 20 popular IoT devices. Network traces are first processed to extract the significant features. We then select state-of-the-art machine learning algorithms, based on recent survey papers, for IoT traffic classification and comparatively evaluate their performance on the basis of classification accuracy, speed, training time, etc. Finally, we provide a few suggestions for selecting the machine learning algorithm for different use cases based on the obtained results.
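The comparative-evaluation step can be pictured with the sketch below, which times the training of a few common classifiers on placeholder flow features and reports their accuracy; the feature set, the 20 device classes, and the chosen classifiers are assumptions rather than the article's exact selection.

# Sketch: comparing classifiers for IoT device identification from flow features,
# logging accuracy and training time. Feature values and device labels are synthetic.
import time
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.random((4000, 12))       # e.g., packet sizes, inter-arrival stats, port counts (assumed)
y = rng.integers(0, 20, 4000)    # 20 IoT device classes (assumed)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
for name, clf in {"Random Forest": RandomForestClassifier(n_estimators=200),
                  "k-NN": KNeighborsClassifier(), "Naive Bayes": GaussianNB()}.items():
    t0 = time.perf_counter()
    clf.fit(X_tr, y_tr)
    train_s = time.perf_counter() - t0
    acc = accuracy_score(y_te, clf.predict(X_te))
    print(f"{name}: accuracy={acc:.3f}, training time={train_s:.2f}s")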
The rapid spread of coronavirus disease (COVID-19) has become a worldwide pandemic, affecting more than 15 million patients reported in 27 countries. Therefore, the computational biology of this virus and its correlation with the human population urgently needs to be understood. In this paper, the classification of the human protein sequences of COVID-19 according to country is presented based on machine learning algorithms. The proposed model distinguishes 9238 sequences using three stages: data preprocessing, data labeling, and classification. In the first stage, a data preprocessing function converts the amino acids of the COVID-19 protein sequences into eight groups of numbers based on the amino acids' volume and dipole; it is based on the conjoint triad (CT) method. In the second stage, there are two methods for labeling data from the 27 countries from 0 to 26. The first method selects one number for each country according to the country code numbers, while the second method uses binary elements for each country. In the last stage, machine learning algorithms are used to distinguish the COVID-19 protein sequences according to their countries. The obtained results demonstrate 100% accuracy, 100% sensitivity, and 90% specificity via the country-based binary labeling method with a linear support vector machine (SVM) classifier. Furthermore, with its large amount of infection data, the USA is more prone to correct classification than other countries with less data. The unbalanced data for COVID-19 protein sequences is considered a major issue, especially as the USA's available data represents 76% of the total of 9238 sequences. The proposed model will act as a prediction tool for the COVID-19 protein sequences in different countries.
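A rough sketch of a conjoint-triad-style encoding followed by a linear SVM is given below. The 8-group amino-acid clustering, the toy sequences, and the country labels are illustrative assumptions; the paper's actual grouping by volume and dipole and its labeling scheme may differ.

# Sketch: conjoint-triad (CT) style encoding of protein sequences followed by a
# linear SVM. The 8-group amino-acid mapping below is illustrative, not the paper's.
from itertools import product
import numpy as np
from sklearn.svm import LinearSVC

GROUPS = {aa: i for i, cluster in enumerate(
    ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C", "X"]) for aa in cluster}  # assumed grouping
TRIADS = {t: i for i, t in enumerate(product(range(8), repeat=3))}             # 512 triad features

def ct_features(seq):
    """Frequency vector of amino-acid-class triads over a sliding window of length 3."""
    codes = [GROUPS.get(aa, 7) for aa in seq]
    vec = np.zeros(len(TRIADS))
    for i in range(len(codes) - 2):
        vec[TRIADS[tuple(codes[i:i + 3])]] += 1
    return vec / max(len(codes) - 2, 1)

# Hypothetical toy data: short sequence fragments labeled by a country index
seqs = ["MFVFLVLLPLVSSQCVNLT", "MKIILFLALITLAACSAAD", "MFVFLVLLPLVSSQCVNFT"] * 10
labels = [0, 1, 0] * 10
X = np.array([ct_features(s) for s in seqs])
clf = LinearSVC().fit(X, labels)
print(clf.predict(X[:3]))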