Nonlinear optical (NLO) crystals are a key class of functional materials in laser technology because of their excellent frequency-conversion effects and physicochemical stability. This research aims to find NLO crystals with superior stability by predicting their formation energy. Only compositional information is used as input features, and models are built with regression algorithms such as Random Forest Regression (RFR), Support Vector Regression (SVR), and Gradient Boosting Regression (GBR). Notably, the GBR model exhibited outstanding predictive performance, with an R² value of 0.935 and a root mean square error (RMSE) of 0.248 eV per atom. Additionally, SHapley Additive exPlanations (SHAP) analysis is employed to elucidate the principles behind the predictions by assessing the contribution of each feature to the formation energy. To validate the reliability of the models, first-principles calculations are conducted to predict the formation energies of GaP, ZnGeP2, and CdSiP2. The difference between the model predictions and the values calculated with the Generalized Gradient Approximation (GGA) is approximately 0.1 eV per atom, confirming the accuracy of the models.
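The two figures of merit quoted above, R² and RMSE, can be computed as in the following minimal sketch. It is an illustration only: the function names and the sample formation energies are hypothetical and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the same units as the targets."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical formation energies in eV per atom (made-up values).
reference = [-1.0, -0.5, 0.0, 0.5]
predicted = [-0.9, -0.6, 0.1, 0.4]

print(f"RMSE = {rmse(reference, predicted):.3f} eV per atom")
print(f"R^2  = {r2_score(reference, predicted):.3f}")
```

RMSE carries the units of the target (here eV per atom), whereas R² is dimensionless, which is why the abstract reports both.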
ISBN: 9780791885055 (print)
Cenit Transporte y Logistica de Hidrocarburos (CENIT), which operates about 7000 km of hydrocarbon transport systems and is therefore the largest operator in Colombia, has developed a strategic alliance to structure an adaptive geotechnical susceptibility zoning using supervised learning algorithms. Through this exercise, operational decision inferences with simple linguistic values have been implemented. The difficulties the method addresses include the hydroclimatology of Colombia, which is conditioned by several climate variability phenomena that affect the atmosphere at different scales, such as the oscillation of the Intertropical Convergence Zone (ITCZ; seasonal scale) and macroclimatic phenomena such as the El Niño-La Niña Southern Oscillation (ENSO; interannual scale). Likewise, the method considers the geotechnical complexity derived from the different geological formation environments, the extent and geographical dispersion of the infrastructure, and its interaction with the climatic regimes, in order to differentiate areas of interest, grouped into five clusters, based on geohazards of hydrometeorological origin. The results highlight the importance of keeping a robust record of the events that affect the infrastructure of hydrocarbon transportation systems and of using data-guided intelligence techniques to improve the tools that support decision-making in asset management.
In bridges, monitoring data usually correspond to normal operational and environmental conditions, resulting in a lack of damage-related data. For this reason, machine learning algorithms for damage detection are typically unsupervised. Conversely, numerical models are employable as surrogates for extreme events rarely encountered during the existence of a bridge and for common damage scenarios, enabling the training of supervised machine learning algorithms. In this paper, hybrid data obtained by integrating monitoring and numerical observations from the Z-24 Bridge benchmark are used for training supervised machine learning algorithms to classify damage. Special attention is dedicated to the numerical model that, while being simple enough to be used on thousands of runs, must capture the complex nonlinear behavior that typifies damaged conditions. A model updating technique is used for the preliminary calibration of the finite-element model. Numerical data are generated in a probabilistic manner starting from the initial finite-element model, assuming the Gaussian distribution of the uncertain parameters. Three undamaged scenarios and three damaged scenarios are modeled. Subsequently, several supervised learning classifiers are trained with the same hybrid database and a comparative summary of their effectiveness is presented. The paper introduces a novel soft classification method that accounts for the overlapping of observations belonging to different structural conditions in the feature space. This study proves that data generated from probabilistic-based finite-element models can be used for structural health monitoring and damage identification in the context of bridges, therefore providing a hybrid supervised approach that can be easily applied in practice. Because bridges are one-of-a-kind, expensive structures, considerable efforts are made to ensure they remain in service for as long as possible. Consequently, monitoring systems are being installed on mor
Objective: The interpretation of electrophysiological findings may lead to misdiagnosis in polyneuropathies. We investigated the electrodiagnostic accuracy of three supervised learning algorithms (SLAs): shrinkage discriminant analysis, multinomial logistic regression, and support vector machine (SVM), and three expert and three trainee neurophysiologists. Methods: We enrolled 434 subjects with the following diagnoses: chronic inflammatory demyelinating polyneuropathy (99), Charcot-Marie-Tooth disease type 1A (124), hereditary neuropathy with liability to pressure palsy (46), diabetic polyneuropathy (67), and controls (98). In each diagnostic class, 90% of subjects were used as the training set for SLAs to establish the best performing SLA by a tenfold cross-validation procedure, and 10% of subjects were employed as the test set. Performance indicators were accuracy, precision, sensitivity, and specificity. Results: SVM showed the highest overall diagnostic accuracy both in training and test sets (90.5 and 93.2%) and ranked first in a multidimensional comparison analysis. Overall accuracy of neurophysiologists ranged from 54.5 to 81.8%. Conclusions: This proof of principle study shows that SVM provides a high electrodiagnostic accuracy in polyneuropathies. We suggest that the use of SLAs in electrodiagnosis should be exploited to possibly provide a diagnostic support system especially helpful for the less experienced practitioners.
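The four performance indicators named above can be derived from predicted and true labels as sketched below. This is a minimal illustration, assuming a one-vs-rest reading of precision, sensitivity, and specificity for a multiclass problem; the function names and the six sample labels are hypothetical and do not come from the study.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of subjects whose predicted class matches the true class."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def one_vs_rest_metrics(y_true, y_pred, positive):
    """Precision, sensitivity and specificity for one class against all others.

    Assumes every denominator is nonzero (the class occurs and is predicted
    at least once, and at least one subject belongs to another class).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Hypothetical labels for six subjects (made-up, for illustration only).
truth = ["CIDP", "CIDP", "CMT1A", "CMT1A", "control", "control"]
guess = ["CIDP", "CMT1A", "CMT1A", "CMT1A", "control", "CIDP"]

print(overall_accuracy(truth, guess))
print(one_vs_rest_metrics(truth, guess, positive="CIDP"))
```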
Several considerations, such as the no free lunch theorem, indicate that any learning algorithm combined with a specific feature selection (FS) technique may give more accurate estimates than other learning algorithms. Therefore, no universal approach outperforms all other algorithms. Moreover, because of the large number of FS techniques, some recommended solutions, such as using synthetic datasets or combining different FS techniques, are tedious and time consuming. To tackle the issue of more accurate estimation of nuclear power plant (NPP) parameters, this study considers the performance of the major supervised learning algorithms in combination with the different FS techniques that are appropriate for parameter estimation. The target parameters/transients of the Bushehr nuclear power plant (BNPP) are examined as the case study. Three major supervised learning algorithms (the MLP-BR, the MLP-LM, and the SVM) are compared in combination with six principal FS techniques (the NCA, the F-test, Kendall's tau, Pearson, Spearman, and Relief) for the estimation of three important NPP parameters (FMT, CMT, and DNBR); the BR learning algorithm gives the most accurate results. The results therefore show that if the number of FS techniques is m and the number of learning algorithms is n, the search space for more accurate estimation of the important NPP parameters can be reduced from n × m to 1 × m. © 2021 Elsevier Ltd. All rights reserved.
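The reduction of the search space from n × m to 1 × m can be made concrete by enumerating the combinations. The sketch below only counts candidate pairs and trains nothing; it assumes that the "BR learning algorithm" corresponds to the MLP-BR learner, and the variable names are hypothetical.

```python
from itertools import product

# Learners and feature-selection techniques as listed in the abstract.
learners = ["MLP-BR", "MLP-LM", "SVM"]                      # n = 3
fs_techniques = ["NCA", "F-test", "Kendall's tau",
                 "Pearson", "Spearman", "Relief"]           # m = 6

# Full search space: every learner paired with every FS technique (n x m).
full_grid = list(product(learners, fs_techniques))

# Reduced search space: one fixed learner, all FS techniques (1 x m).
# Fixing MLP-BR is this sketch's reading of "the BR learning algorithm".
reduced_grid = [("MLP-BR", fs) for fs in fs_techniques]

print(len(full_grid), "combinations in the full grid")
print(len(reduced_grid), "combinations after fixing the learner")
```

With the three learners and six FS techniques of the case study, the full grid holds 18 candidate pairs and the reduced one 6.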
Accurate screening of sewer conditions from monitoring data helps keep sewer operations safe (in terms of water quality and quantity) and reduces the associated operation and maintenance costs. This study was designed to assess performance deterioration in sewer systems using a series of data classification tools, namely a classical classification algorithm and novel supervised learning algorithms. Daily hydraulic data for four sewer systems in Jinju City, Korea, covering the 2013-2017 monitoring period, were provided as example data sets to those algorithms, which were evaluated independently on randomly divided 70% training and 30% test data sets. A self-organizing map (SOM), which specializes in extracting hidden patterns in data, was used to classify the data sets into three warning levels in the absence of any definite warning criteria for individual parameters. Our findings showed that three supervised learning algorithms achieved performance comparable to the existing classification algorithm in predicting the SOM-defined warning levels, in terms of accuracy and error rate. The network architecture optimized for the supervised learning algorithms varied significantly depending on the data set, including the one with additional variables on top of the original data set. In contrast, the existing classification algorithm unexpectedly produced high error rates when the hydraulic parameters had low coefficient of variation values reaching as high as 16%. Overall, these results demonstrated that the novel supervised learning algorithms were more universally applicable for the assessment of hydraulic and/or water quality conditions in sewer systems than the classical classification algorithm, regardless of the amount of variability in the data sets.
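Two quantities mentioned above, the random 70%/30% division and the coefficient of variation, can be sketched with the Python standard library. This is an illustration under stated assumptions: the helper names are hypothetical, the population standard deviation is an arbitrary choice (the abstract does not say which estimator was used), and the sample flow values are made up.

```python
import random
import statistics

def random_split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and divide them into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form assumed)."""
    return statistics.pstdev(values) / statistics.mean(values)

# Hypothetical daily flow readings (made-up values, arbitrary units).
daily_flow = [10.0, 12.0, 8.0, 10.0]
train, test = random_split(range(10))

print(len(train), len(test))
print(f"CV = {coefficient_of_variation(daily_flow):.1%}")
```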
The Prophet's Hadith is important to Muslims all over the world: it is the second source of Islam after the Quran and the fundamental resource of legislation in the Islamic community. This study focuses on automatically classifying hadith into different categories according to content, based on the hadith text. The objective is to build a classifier model that can differentiate hadith categories and predict topics such as prayer, fasting, and zakat, using data mining and machine learning techniques. Many supervised learning algorithms, plus combination methods such as the stacking algorithm, were used to improve classification accuracy. The three best classifiers were the main focus of the evaluation: Decision Tree (DT), Random Forest (RF), and Naive Bayes (NB), which achieved accuracies of up to 0.965, 0.956, and 0.951, respectively. In addition, Binary (Boolean algebra) and TF-IDF term weighting methods were applied to determine the frequency of each word in the hadith text, and the most significant features in the training dataset were identified using Information Gain (IG) and Chi-square (CHI). The experimental results showed that retraining these classifiers after applying IG and CHI for feature selection gave better accuracy than the previous results. The most accurate classifier was DT, which achieved higher accuracy in most test cases, whether with Boolean or TF-IDF weighting, because it can deal with missing values and identify the most essential features from the training dataset, known as feature engineering.
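The difference between Binary (Boolean) and TF-IDF term weighting can be illustrated as follows. This is a minimal sketch, assuming raw term counts for TF and log(N / df) for IDF; TF-IDF has several variants and the abstract does not say which one was used. The function names and the three tokenized toy documents are hypothetical.

```python
import math
from collections import Counter

def binary_weights(doc, vocabulary):
    """Boolean weighting: 1 if the term occurs in the document, else 0."""
    present = set(doc)
    return {term: 1 if term in present else 0 for term in vocabulary}

def tfidf_weights(doc, corpus, vocabulary):
    """TF-IDF with raw counts for TF and log(N / df) for IDF (assumed variant)."""
    n_docs = len(corpus)
    counts = Counter(doc)
    weights = {}
    for term in vocabulary:
        df = sum(1 for d in corpus if term in d)      # document frequency
        idf = math.log(n_docs / df) if df else 0.0
        weights[term] = counts[term] * idf
    return weights

# Hypothetical tokenized documents (made-up, not from the hadith dataset).
corpus = [
    ["prayer", "night", "prayer"],
    ["fasting", "night"],
    ["zakat", "night"],
]
vocabulary = sorted({term for doc in corpus for term in doc})

print(binary_weights(corpus[0], vocabulary))
print(tfidf_weights(corpus[0], corpus, vocabulary))
```

In this toy corpus the term "night" appears in every document, so its TF-IDF weight is zero even though its Boolean weight is 1; this is the kind of difference the two weighting schemes expose to a classifier.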
ISBN: 9781538666500 (print)
Facial landmarks may be used to localize the movement of facial muscles that help identify an emotion. It is important that these points are appropriately represented to achieve a successful emotion recognition rate. In this paper, the extraction of 68 facial landmarks, normalization methods, and the classification of 7 basic emotions are presented. The Cohn-Kanade Database is used as a test bed for the different emotion recognition tasks. The images are normalized by transforming the inputs based on similarity (CKCT) and the mean shape (CKMS). Forward Search and Principal Component Analysis are used to identify the most important features among the 68 facial points. Decision Tree, Logistic Regression, K-Nearest Neighbor, and Multilayer Perceptron algorithms are used to build classifiers on the reduced and complete feature sets. Notably, facial points in the mouth area are found to be significant in the classification of emotions.
Authors: Kamoun, Aicha; Boujelbane, Rahma; Boujelben, Saoussen
Univ Sfax, Fac Econ & Management, Res Lab Econ & Financial Anal & Modeling LAMEF, Sfax, Tunisia
Univ Sfax, Fac Econ & Management, Multimedia InfoRmat Syst & Adv Comp Lab Miracl, ANLP Res Grp, Sfax, Tunisia
Univ Sfax, Sfax Business Sch ESCS, Dept Accounting, Res Lab Econ & Financial Anal & Modeling LAMEF, Airport Rd Km 4-5, Sfax 3018, Tunisia
This study aims to develop a tax non-compliance prediction model in Tunisia using supervised machine learning algorithms. A data mining analysis was conducted following the Knowledge Discovery in Databases (KDD) process, utilizing a dataset of 20,930 labeled observations from 2013 to 2017, comprising 110 attributes. We employed supervised learning algorithms, including K-Nearest Neighbors, Decision Trees, Naïve Bayes, Gradient Boosting, and Random Forest, to identify the most accurate model. Notably, Random Forest outperformed the other algorithms, achieving a prediction accuracy of 83%. Furthermore, through a combined interpretation of feature importance derived from Random Forest, SHAP value analysis, and ANOVA, our findings provide tax auditors with insights into the most influential attributes for predicting tax non-compliance. This study holds significant practical implications by enhancing the efficiency of tax audits and supporting tax authorities in their efforts to combat tax non-compliance.
Background: The current paradigm for evaluating computed tomography (CT) system performance relies on a task-based approach. As the Hotelling observer (HO) provides an upper bound of observer performances in specific signal detection tasks, the literature advocates HO use for optimization purposes. However, computing the HO requires calculating the inverse of the image covariance matrix, which is often intractable in medical applications. As an alternative, dimensionality reduction has been extensively investigated to extract the task-relevant features from the raw images. This can be achieved by using channels, which yields the channelized-HO (CHO). The channels are only considered efficient when the channelized observer (CO) can approximate its unconstrained counterpart. Previous work has demonstrated that supervised learning-based methods can usually benefit CO design, either for generating efficient channels using partial least squares (PLS) or for replacing the Hotelling detector with machine-learning (ML) methods. Purpose: Here we investigated the efficiency of a supervised ML-algorithm used to design a CO for predicting the performance of unconstrained HO. The ML-algorithm was applied either (1) in the estimator for dimensionality reduction, or (2) in the detector function. Methods: A channelized support vector machine (CSVM) was employed and compared against the CHO in terms of ability to predict HO performances. Both the CSVM and the CHO were estimated with channels derived from the singular value decomposition (SVD) of the system operator, principal component analysis (PCA), and PLS. The huge variety of regularization strategies proposed by CT system vendors for statistical image reconstruction (SIR) makes the generalization capability of an observer a key point to consider ahead of implementation in clinical practice. To evaluate the generalization properties of the observers, we adopted a 2-step testing process: (1) achieved with the same regularization strat