ISBN: 9781467382861 (print)
This paper predicts diabetes using data mining classification algorithms. Classification algorithms and tools may reduce the heavy workload on doctors. The paper evaluates classification algorithms on several diabetes patient datasets. Classification is one of the main techniques in data mining. The algorithms examined are the decision tree algorithm, the Bayes algorithm, and a rule-based algorithm. They are compared by error rate, and patients are identified using an evaluation function that measures the accuracy of the results.
ISBN: 9781424446148 (print)
This paper presents the results of classifying Arabic text documents using a decision tree algorithm. Experiments are performed on two self-collected corpora, and the results show that the suggested hybrid approach, Document Frequency Thresholding combined with the embedded information gain criterion of the decision tree algorithm, is the preferable feature selection criterion. The study concludes that the improved classifier is very effective, with a generalization accuracy of about 0.93 for the scientific corpus and 0.91 for the literary corpus. It also concludes that the effectiveness of the decision tree classifier increases with the training set size and that the nature of the corpus influences classifier performance.
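The document frequency thresholding step mentioned above can be illustrated with a minimal sketch. The toy documents, the function names, and the `min_df` cutoff are hypothetical and are not taken from the paper; the sketch only shows the general idea of discarding terms that occur in too few documents before a decision tree applies its own information gain criterion.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set(): count each term once per document
    return df


def select_by_df(docs, min_df=2):
    """Keep terms whose document frequency is at least min_df.

    min_df is a hypothetical cutoff chosen for illustration only.
    """
    df = document_frequency(docs)
    return {term for term, count in df.items() if count >= min_df}


# Hypothetical tokenized documents (not data from the paper).
docs = [
    ["tree", "entropy", "split"],
    ["tree", "poem", "verse"],
    ["entropy", "tree", "tree"],
]

vocabulary = select_by_df(docs, min_df=2)
print(sorted(vocabulary))
```

In this toy example only the terms that appear in at least two documents survive, which shrinks the feature space that the classifier has to search.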
ISBN: 9781509048601 (print)
Forest disturbances caused by Pantana phyllostachysae killed extensive bamboo stands in Hubei province, China, in 2015. Field surveys are time-consuming and costly, which makes it hard for them to satisfy forest management requirements. Satellite remote sensing, which offers landscape-scale coverage, convenience, and fast information acquisition, is one of the most important and effective ways of discriminating the damage. Using a decision tree algorithm, the study compared the results obtained with segmentation variables at different scales and colors and proposed that GF-2 images are suitable for discriminating the damage as an alternative to field survey.
ISBN: 9783319220475; 9783319220468 (print)
Currently, most back-end web databases cannot be indexed by traditional hyperlink-based search engines because they require users to submit interactive queries through page forms. To make hidden-Web information more easily accessible, this paper proposes a hierarchical classifier to locate domain-specific hidden-Web entries at a large scale. The classifier is trained on appropriately selected page form features to filter out non-relevant domains and non-searchable forms. Experiments conducted on eight different topics demonstrate that the technique can discover deep web interfaces accurately and efficiently.
ISBN: 9781479941735 (print)
This paper presents a new algorithm that uses genetic algorithms to speed up the threshold search in C4.5. In the C4.5 threshold search, the values of a numerical attribute are first sorted, and then the midpoint between every two consecutive values is calculated and designated as a candidate threshold. This process can be time-consuming and is not practical for large data sets. Our algorithm generates a population of possible thresholds and converges rapidly to the best threshold value. Our experimental results show that the algorithm significantly reduces the time spent in the threshold search.
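The baseline midpoint search described above can be sketched as follows. This is only an illustration of the exhaustive procedure that the paper aims to speed up, not the genetic algorithm itself. The sample data and function names are hypothetical, duplicate values are collapsed before taking midpoints, and candidates are scored with plain information gain, which is an assumption of this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def candidate_thresholds(values):
    """Sort the distinct values and return midpoints of consecutive pairs."""
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]


def best_threshold(values, labels):
    """Try every candidate and return (threshold, information gain)."""
    base = entropy(labels)
    n = len(labels)
    best_t, best_gain = None, -1.0
    for t in candidate_thresholds(values):
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        gain = (base
                - (len(left) / n) * entropy(left)
                - (len(right) / n) * entropy(right))
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain


# Hypothetical numerical attribute and class labels.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]

threshold, gain = best_threshold(values, labels)
print(threshold, gain)
```

Each candidate requires a pass over the data, so the cost grows with both the number of distinct values and the number of records, which is the motivation the abstract gives for a faster search.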
ISBN: 9798350315431 (print)
Photovoltaic-thermal (PV-T) systems are expected to fulfil an increasingly vital role in future energy production. The current research endeavors to showcase machine learning modeling and control of a water-based PV-T collector. In this work, the PV-T collector is modeled using a decision tree algorithm and artificial neural network (ANN). The predicted outputs are compared with the actual outputs to validate the models. The ANN-based model performed better and proved its efficacy in training and testing. Further, various control strategies are implemented and their performance is compared. All the techniques presented are illustrated through simulation results.
ISBN: 9781538635247 (print)
In data classification mining, the decision tree method is a key algorithm. The ID3 (Iterative Dichotomiser 3) algorithm, presented by Quinlan, is a well-known decision tree algorithm, but it has shortcomings such as the high computational cost of evaluating the information entropy expression, the multi-value bias problem when selecting an optimal attribute, and large tree scale. To solve these problems, an improved ID3 algorithm is proposed that combines a simplified information entropy with the coordination degree from rough set theory. The experimental results demonstrate the feasibility of the optimized approach.
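The entropy computation and the multi-value bias mentioned above can be made concrete with a small sketch of standard ID3 information gain. It does not implement the paper's improved algorithm (the simplified entropy and the rough-set coordination degree are not specified in the abstract). The toy rows and attribute names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Standard ID3 gain: entropy of the target minus the weighted
    entropy of the target within each value of the attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    n = len(rows)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


# Hypothetical toy data set; "id" takes a different value in every row.
rows = [
    {"id": 1, "outlook": "sunny", "play": "no"},
    {"id": 2, "outlook": "sunny", "play": "no"},
    {"id": 3, "outlook": "rain", "play": "yes"},
    {"id": 4, "outlook": "rain", "play": "no"},
]

gain_id = information_gain(rows, "id", "play")
gain_outlook = information_gain(rows, "outlook", "play")
print(gain_id, gain_outlook)
```

Because every value of `id` isolates a single row, its partitions are trivially pure and it receives the highest gain even though it cannot generalize. This is the multi-value bias that motivates corrections to plain ID3.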
ISBN: 9781509066643 (print)
Based on the random forest classification algorithm, a warning model for water bloom is proposed. Using the collected data, water quality and meteorological factors such as chlorophyll a (Chl-a), water temperature (T), pH, nitrogen-to-phosphorus ratio (TN:TP), chemical oxygen demand (COD), total nitrogen (TN), total phosphorus (TP), dissolved oxygen, and light (E) are selected as impact factors and used to establish the warning model. Its prediction accuracy is compared with that of a neural network model and an SVM model. The results show that the water bloom warning model built with the random forest classification algorithm achieves slightly higher prediction accuracy than the other algorithms. The random forest algorithm is also highly robust, performs well, and is practical, so it can effectively support early warning of water bloom.
ISBN: 9781665416504 (print)
Machine learning is widely used in the medical field for disease diagnostics and *** area of machine learning is mainly classified into three parts: supervised, unsupervised, and reinforcement *** machine learning (ML) algorithms are used in this paper to model and show the impact of increased testing on the number of daily confirmed cases of COVID-19. The algorithms used to carry out this study are decision tree regression and random forest regression. Machine learning for modeling has proven significant for forecasting and hence for decision making about future courses of action. In this paper, Gaussian process regression is used for modeling as well as forecasting the daily confirmed cases in South Korea. The results show that if the number of tests conducted is increased to the population of South Korea, approximately 51,286,183, the peak in daily cases occurs earlier, and hence the overall number of daily cases is lower than the current count.
ISBN: 9781479988259 (print)
Social networking services (SNS) have increased in popularity over the last decade. They have become major platforms for e-commerce, personal branding, socialization and information. The success of social networking services like Facebook and Twitter as well as LinkedIn, LiveJournal and Foursquare, and the variety of their usages, lead users to create a set of profiles on different SNS. Recently, social networking service aggregators have proposed centralizing the multiple social networking profiles of a given user in order to facilitate interactions with social networking services. Such aggregators allow the messages received by a profile over multiple SNS to be retrieved, edited and posted with much less effort. Despite their obvious advantages, we highlight in this paper the risk of potential data leaks due to the inexperienced use of such tools. For this purpose, we provide a classification of online SNS and present their specificities with regard to the publicly exposed data of a user. Based on this classification, we investigate the possible insecure use of aggregators with an inappropriate set of SNS, which could render sensitive data accessible to people for whom it was not intended. We present a decision tree approach for identifying a possible data leak based on the following three criteria: opinion, interest and location. Finally, we show the result of this approach on popular social networking aggregators.