Background: Cardiovascular diseases (CVDs) are the leading cause of death worldwide. The World Health Organization has estimated that deaths caused by heart disease will rise to 23 million by 2030. Data mining algorithms could therefore be useful in predicting coronary artery disease (CAD). The present study aimed to compare the positive predictive value (PPV) for CAD of artificial neural network (ANN) and support vector machine (SVM) algorithms, and how the two differ in predicting CAD in the selected ***. Methods: The present study was conducted using data mining techniques. The research sample consisted of the medical records of patients with coronary artery disease who were hospitalized in three hospitals affiliated with AJA University of Medical Sciences between March 2016 and March 2017 (n=1324). The dataset and the predictor variables were the same for both data mining techniques. In total, 25 variables affecting CAD were selected and the related data were extracted. After normalization and cleaning, the data were entered into SPSS (V23.0) and Excel 2013. R 3.3.2 was then used for statistical analysis. Results: The SVM model had a lower MAPE (112.03), a higher Hosmer-Lemeshow test result (16.71), and higher sensitivity (92.23). Moreover, the variables affecting CAD (74.42) yielded a better goodness of fit in the SVM model and gave more accurate results than in the ANN model. In addition, because the area under the receiver operating characteristic (ROC) curve was larger for the SVM algorithm than for the ANN model, it could be concluded that the SVM model was more accurate than the ANN model. Conclusion: According to the results, the SVM algorithm showed higher accuracy and better performance than the ANN model, with higher power and sensitivity. Overall, it provided a better classification for the prediction of CAD. The use of other data mining algorithms is suggested to improve the positive predictive value o...
ISBN: (print) 9781728109275
The smart grid is an emerging and promising technology. It uses information technologies to deliver electrical power to customers intelligently, and it allows the integration of green technology to meet environmental requirements. Unfortunately, information technologies have inherent vulnerabilities and weaknesses that expose the smart grid to a wide variety of security risks. The intrusion detection system (IDS) plays an important role in securing smart grid networks and detecting malicious activity, yet it suffers from several limitations. Many research papers have addressed these issues using a range of algorithms and techniques, so a detailed comparison between these algorithms is needed. This paper presents an overview of four data mining algorithms used by IDSs in the smart grid. The performance of these algorithms is evaluated on several metrics, including the probability of detection, probability of false alarm, probability of missed detection, efficiency, and processing time. Results show that Random Forest outperforms the other three algorithms in detecting attacks, with a higher probability of detection, lower probability of false alarm, lower probability of missed detection, and higher accuracy.
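The detection metrics named in this abstract can be derived from a confusion matrix. The sketch below is illustrative only: the function name `detection_rates`, the label encoding (1 = attack, 0 = normal), and the sample data are assumptions, not taken from the paper, and the paper's "efficiency" and "processing time" metrics are not covered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionRates:
    detection: float    # TP / (TP + FN): share of attacks that were flagged
    false_alarm: float  # FP / (FP + TN): share of normal traffic that was flagged
    missed: float       # FN / (TP + FN): share of attacks that were not flagged
    accuracy: float     # (TP + TN) / all samples


def detection_rates(y_true, y_pred):
    """Compute IDS-style rates from binary labels (assumed: 1 = attack, 0 = normal)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    attacks, normals = tp + fn, fp + tn
    if attacks == 0 or normals == 0:
        raise ValueError("need at least one attack sample and one normal sample")
    return DetectionRates(
        detection=tp / attacks,
        false_alarm=fp / normals,
        missed=fn / attacks,
        accuracy=(tp + tn) / len(pairs),
    )


# Hypothetical labels: four attacks followed by four normal samples.
rates = detection_rates([1, 1, 1, 1, 0, 0, 0, 0], [1, 1, 1, 0, 1, 0, 0, 0])
print(rates)
```

Under these definitions the probability of detection and the probability of missed detection always sum to 1, so reporting both is a convenience rather than two independent measurements.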
Driving daily through traffic congestion has been recognised as a major cause of stress. High levels of stress while driving negatively affect the driver's decisions, which could lead to accidents and to long-term health hazards. Accordingly, there is a great need to determine drivers' stress levels by measuring and predicting the major causes (features or classes) that increase stress. In this paper, the problem of predicting automobile drivers' stress levels, as experienced during actual driving, is investigated by applying five data mining algorithms: K-Nearest Neighbour (KNN), Decision Tree (J48), Random Forest (RF), Support Vector Machine (SVM), and Artificial Neural Networks (ANN). An experiment was conducted on 14 drivers taking various routes in Amman, Jordan, with a wearable biomedical device attached to the driver to collect physiological data in real time. The collected dataset is grouped into two categories: 'Yes' to signify the presence of stress and 'No' to signify its absence. To apply the data mining algorithms effectively, oversampling was used to limit the negative effect of the under-represented class on stress prediction. The results are evaluated for stress prediction and compared, using the Friedman rank test, against standard reference approaches that do not apply oversampling and/or feature selection. The proposed approach, in combination with RF, outperformed all others in terms of accuracy, AUC, specificity, and sensitivity. The accuracy, AUC, specificity, and sensitivity achieved by RF with the proposed approach were 98.92%, 99.91%, 98.46%, and 99.36%, respectively.
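The abstract does not say which oversampling method was used. As a minimal sketch, assuming plain random oversampling (duplicating minority-class samples until every class matches the largest one), the idea looks as follows; the function name `random_oversample` and the sample readings are hypothetical.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen samples of smaller classes until all classes are equal in size.

    Assumption: plain random oversampling; the paper may have used another method.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for label, count in counts.items():
        # Draw only from the original samples of this class.
        pool = [row for row, lab in zip(rows, labels) if lab == label]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical single-feature readings: three 'No' samples and one 'Yes' sample.
rows = [[0.1], [0.2], [0.3], [0.9]]
labels = ["No", "No", "No", "Yes"]
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```

Oversampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.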
For determining a limited number of analytes in samples of complex chemical composition with an unknown matrix, a combination of data mining algorithms (clustering and regression) is proposed. This makes it possible to compensate for the influence of the matrix components on the intensity of the analytical line of the element being determined. The technology is tested in the X-ray fluorescence determination of S, Fe, Cu, Zn, and As in flotation concentrate samples from the processing of polymetallic ores, and of V and Fe in synthetic film samples whose physicochemical properties match those of welding fumes deposited on a filter. The error of the analytical results decreased by a factor of 1.5-5 compared with the classical Lucas-Tooth regression equation. The technology considerably speeds up analysis when used with sequential X-ray spectrometers.
ISBN: (print) 9781479989997
The article describes an approach to building parallel data mining algorithms for execution on multicore processors of various architectures. The suggested method represents an algorithm as a sequence of pure functions with unified interfaces. For parallel execution, additional functions are introduced to share data and models between the parallel threads. These functions also make it possible to obtain various parallel algorithm structures and to implement different execution strategies for different environment conditions. The method is illustrated with the Naive Bayes algorithm.
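The structure described in this abstract can be sketched, under assumptions, as pure functions that each take data and return a new model, plus a merge function that combines partial models built in parallel threads. The names `count_chunk`, `merge_models`, and `train`, and the sample rows, are hypothetical and are not the authors' interfaces. Only the counting phase of Naive Bayes training is shown.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def count_chunk(chunk):
    """Pure step: build partial Naive Bayes counts from one chunk of (features, label) rows."""
    class_counts = Counter()
    feature_counts = Counter()
    for features, label in chunk:
        class_counts[label] += 1
        for index, value in enumerate(features):
            feature_counts[(label, index, value)] += 1
    return class_counts, feature_counts


def merge_models(left, right):
    """Pure step: combine two partial models into a new one without mutating either."""
    return left[0] + right[0], left[1] + right[1]


def train(rows, workers=2):
    """Split the data, count each chunk in its own thread, then merge the partial models."""
    size = max(1, -(-len(rows) // workers))  # ceiling division, at least 1
    chunks = [rows[i:i + size] for i in range(0, len(rows), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_chunk, chunks))
    return reduce(merge_models, partials, (Counter(), Counter()))


# Hypothetical categorical data: (weather, temperature) -> label.
rows = [
    (("sunny", "hot"), "no"),
    (("rainy", "mild"), "yes"),
    (("sunny", "mild"), "yes"),
    (("rainy", "hot"), "no"),
]
class_counts, feature_counts = train(rows, workers=2)
print(class_counts, feature_counts[("yes", 1, "mild")])
```

Because the counting and merging steps share no mutable state, the result does not depend on how the data is split across threads. In CPython the global interpreter lock limits the actual speed-up of threads for CPU-bound work like this, so the sketch illustrates the structure rather than the performance.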
ISBN: (print) 9783319151472; 9783319151465
The present paper describes a formal model of data mining algorithms. The model treats each data mining algorithm as a sequence of operations, which makes it possible to determine ways of executing data mining algorithms in parallel. The formal model is implemented in Java. Several data mining algorithms were developed on the basis of the suggested formal model. The k-means algorithm is described in the paper as an example.
ISBN: (print) 9783319219097; 9783319219080
The article describes an extension of lambda calculus for creating parallel data mining algorithms. The proposed approach represents the algorithm as a sequence of pure functions with unified interfaces. For parallel execution, a special function is used that makes it possible to change the structure of the algorithm and to implement various strategies for processing the data set and the model.
ISBN: (print) 9781479973064
The present paper describes a framework for creating data mining algorithms from thread-safe functional blocks. The framework requires algorithms to be decomposed into independently functioning blocks. These blocks must have unified interfaces and implement pure functions. The framework makes it possible to create new data mining algorithms from existing blocks and to improve existing algorithms by optimizing single blocks or the whole structure of the algorithm. This is possible because of a number of important properties, such as the thread safety inherent in pure functions and hence in functional blocks.
The purpose of this study was to ascertain the fresh herbage yield, fertilizer dosage, and plant characteristics of the Sorghum-Sudangrass hybrid grown in arid and semi-arid regions, as well as their interrelationships. To this end, data from the Sorghum-Sudangrass hybrid were used to assess the predictive performance of several data mining techniques, including CHAID, CART, MARS, and Bagging MARS. Plant traits were measured in Konya and Sanliurfa during 2021 and 2022. The descriptive statistics (average values) were: plant height 306.27 cm, stem diameter 9.47 mm, fresh herbage yield 10852.51 kg/da, crude protein ratio 9.66%, acid detergent fiber 33.39%, neutral detergent fiber 51.85%, acid detergent lignin 9.76%, dry matter digestibility 62.88%, dry matter intake 2.34%, and relative feed value 114.68. The predictive capacities of the fitted models were assessed using model fit statistics: the coefficient of determination (R²), adjusted R², root mean square error (RMSE), mean absolute percentage error (MAPE), standard deviation ratio (SD ratio), and Akaike Information Criterion (AIC). With the lowest values for RMSE, MAPE, SD ratio, and AIC (246, 1.926, 0.085, and 845, respectively) and the highest R² (0.993) and adjusted R² (0.989), the MARS algorithm was determined to be the best model for characterizing fresh herbage yield and the most appropriate of the data mining techniques compared for forecasting fresh herbage production.
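Four of the fit statistics named in this abstract have widely used textbook definitions, sketched below. The function name `fit_statistics` and the sample values are assumptions for illustration; the paper's exact formulas may differ, and the SD ratio and AIC are omitted because their definitions vary between sources.

```python
import math


def fit_statistics(observed, predicted, n_predictors):
    """Return RMSE, MAPE (%), R^2 and adjusted R^2 using common textbook definitions.

    Assumes no observed value is zero (needed for MAPE) and that the observed
    values are not all identical (needed for R^2).
    """
    n = len(observed)
    if n != len(predicted) or n - n_predictors - 1 <= 0:
        raise ValueError("need equal-length inputs and n > n_predictors + 1")
    residuals = [o - p for o, p in zip(observed, predicted)]
    sse = sum(r * r for r in residuals)               # sum of squared errors
    mean_obs = sum(observed) / n
    sst = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - sse / sst
    return {
        "rmse": math.sqrt(sse / n),
        "mape": 100.0 * sum(abs(r / o) for r, o in zip(residuals, observed)) / n,
        "r2": r2,
        "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1),
    }


# Hypothetical yields and predictions from a one-predictor model.
stats = fit_statistics([2.0, 4.0, 6.0, 8.0], [2.0, 4.0, 6.0, 10.0], n_predictors=1)
print(stats)
```

Lower RMSE and MAPE and higher R² and adjusted R² indicate a better fit, which is the direction of comparison the abstract uses to rank the models.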
Software testing plays a crucial role in enhancing software quality. A significant portion of the time and cost of software development is dedicated to testing. Automation, particularly of test case generation, can greatly reduce that cost. Model-based testing aims to generate test cases automatically from models. Several model-based approaches use model checking tools to automate test case generation. However, this technique faces challenges such as state space explosion and duplicate test cases. This paper introduces a novel solution based on data mining algorithms for systems specified using graph transformation systems. To overcome these challenges, the proposed method explores only a portion of the state space, selected according to the test objectives. The method is implemented using the GROOVE tool set for model checking graph transformation system specifications. Empirical results on widely used case studies in service-oriented architecture, as well as a comparison with related state-of-the-art techniques, demonstrate the efficiency and superiority of the proposed approach in terms of coverage and test suite size.