ISBN: 9781538684450 (print)
The introduction of machine learning in large-scale utility networks extends the room for improvement in quality of service and maintenance costs. The ever-expanding network of smart meters allows for a more accurate estimation of the state of water distribution systems, while at the same time requiring modern data processing solutions. By fusing machine learning with the more traditional approaches in this field of research, it is possible to enhance the existing capabilities for network analysis and to extend the algorithms to the level of cognitive abilities that form a basis for a more efficient decision support system. In this paper we extend fault sensitivity analysis for water distribution systems with the insights provided by state-of-the-art machine learning algorithms for data clustering and anomaly detection.
ISBN: 9783031628801; 9783031628818 (print)
There are thousands of products with hundreds of reviews on major e-commerce sites such as Amazon and eBay. Customers often browse through positive and negative reviews before making a purchase decision. Reading hundreds of reviews for a single product can be time-consuming and overwhelming for customers. Sentiment analysis has been identified as an approach to address this issue. This study uses several machine learning algorithms to perform sentiment analysis on Amazon product reviews. For this purpose, supervised learning, online learning, and ensemble learning algorithms were applied to Amazon product reviews obtained from the Kaggle database. Natural language processing and data mining techniques were applied to the dataset. First, natural language processing techniques were applied for data preprocessing. The dataset was split into 80% for training and 20% for testing. Term Frequency-Inverse Document Frequency (TF-IDF) vectorization was employed to create word vectors. Passive Aggressive (PA), Support Vector Machine (SVM), Random Forest (RF), AdaBoost, K-Nearest Neighbor (KNN), and XGBoost algorithms were employed in model implementation, which was the crucial step. Accuracy rates, cross-validation scores, confusion matrices, and classification report results were compared. The Random Forest algorithm provided the highest accuracy, at 96.13%.
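The 80/20 split and TF-IDF vectorization steps described in this abstract can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `split_train_test` and `tfidf`, the whitespace tokenizer, the fixed random seed, and the unsmoothed weighting `tf * log(N / df)` are all choices made here, and the abstract does not say which TF-IDF variant or library was used.

```python
import math
import random
from collections import Counter


def split_train_test(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def tfidf(docs):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    # Assumption: lowercase whitespace tokenization stands in for real preprocessing.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors
```

With this weighting, a term that occurs in every document (for example "product" in a corpus of product reviews) receives weight 0, while terms that separate reviews, such as "good" or "bad", keep a positive weight. Library implementations often use a smoothed variant, so their numbers will differ from this sketch.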
ISBN: 9783319483085; 9783319483078 (print)
Big Data brings not only increasing data sizes but also complex and varied processing and analytical tools. This research compares selected machine learning algorithms on datasets of different types and sizes using Apache Spark, in order to make a fair judgment about which one fits best. The algorithms were compared on a few parameters, mainly accuracy and training time. The algorithms were applied to three datasets from different fields: marketing, packing and statistics, and security. The findings of this experiment show that the decision tree algorithm is the most suitable for the marketing and security datasets. Additionally, the logistic regression algorithm had the highest accuracy on the packing and statistics dataset.
ISBN: 9781665480482 (print)
The aim is to use machine learning methods to select indicators of undergraduates' development. This paper adopts a questionnaire survey. We collected nearly 3000 pieces of data over 2 months and used the lasso regression model (LASSO), ridge regression model (RRM) and partial least squares regression model (PLSRM) to select the more relevant indicators among 15 indicators. According to the results, the main indicators of undergraduate development include scores, business projects, papers, patents, academic competitions, honors, public service, English level and professional qualification certificates. These results not only lay the foundation for subsequent research but also help mentors guide undergraduates in the right way.
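To make concrete how a lasso model can select indicators, the following minimal sketch fits a lasso by coordinate descent and keeps the indicators whose coefficients are not driven to zero. Everything in it is an assumption made for illustration: the abstract does not describe the solver, the penalty strength, or the data, and the names `soft_threshold`, `lasso_coordinate_descent`, `selected_indicators` and `alpha` are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=100):
    """Minimize (1/(2n)) * ||y - Xw||^2 + alpha * ||w||_1 by coordinate descent.

    X is a list of rows. Columns are assumed to be centered, so no intercept is fit.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(n_iter):
        for j in range(d):
            rho = 0.0   # correlation of feature j with the residual excluding j
            norm = 0.0  # mean squared value of feature j
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(d) if k != j)
                rho += X[i][j] * (y[i] - partial)
                norm += X[i][j] ** 2
            rho /= n
            norm /= n
            w[j] = soft_threshold(rho, alpha) / norm if norm > 0 else 0.0
    return w


def selected_indicators(weights):
    """Indices of indicators whose lasso coefficient is non-zero."""
    return [j for j, wj in enumerate(weights) if wj != 0.0]
```

On a made-up toy set where the response depends only on the first of two centered indicators, for example `X = [[-2, 1], [-1, -1], [0, 0], [1, -1], [2, 1]]` and `y = [-6, -3, 0, 3, 6]` with `alpha = 0.5`, the second coefficient is set exactly to zero and only indicator 0 is selected. Ridge regression, by contrast, shrinks coefficients without zeroing them, which is why the lasso is the more natural choice for selection.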
ISBN: 9783031349522; 9783031349539 (print)
Discovering functionalities for unknown enzymes has been one of the most common bioinformatics tasks. Functional annotation methods based on phylogenetic properties have been the gold standard in every genome annotation process. However, these methods only succeed if the minimum requirements for expressing similarity or homology are met. Alternatively, machine learning and deep learning methods have proven helpful for this problem, yielding functional classification systems in various bioinformatics tasks. Nevertheless, there needs to be a clear strategy for elaborating predictive models and for deciding how amino acid sequences should be represented. In this work, we address the problem of functional classification of enzyme sequences (EC number) via machine learning methods, exploring various alternatives for training predictive models and numerical representation methods. The results show that the best performances are achieved by applying representations based on pre-trained models. However, there needs to be a clear strategy for training models. Therefore, after exploring several alternatives, we observe that the methods based on CNN architectures proposed in this work show a greater capacity for learning and pattern extraction in complex systems, achieving performances above 97% with binary cross-entropy error below 0.05. Finally, we discuss the strategies explored and analyze future work to develop integrated methods for functional classification and the discovery of new enzymes to support current bioinformatics tools.
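For readers unfamiliar with the error measure quoted in this abstract, binary cross-entropy can be computed as in the sketch below. It is a generic, standard-library illustration of the standard formula, not the authors' evaluation code; the function name `binary_cross_entropy` and the clipping constant `eps` are assumptions introduced here.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip probabilities away from 0 and 1 so that log() stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

An uninformative predictor that always outputs 0.5 scores ln 2 (about 0.693) per example, so a value below 0.05 corresponds to predictions that are both correct and confident on most examples.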
Wireless sensor networks suffer from a wide range of faults and anomalies which hinder their smooth working. These faults are even more significant for medical wireless sensor networks, which simply cannot afford such inconsistencies. To combat this issue, various fault detection mechanisms have been developed. We sought to enhance the performance of one such mechanism, and our findings are presented in this paper. Using machine learning algorithms, we show through experiments on real medical datasets that our approach gives more accurate results than other existing fault detection mechanisms. This research will be critical in detecting sensor faults quickly, accurately and with a low false alarm ratio. (C) 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/)
ISBN: 9789532330922 (print)
Cloud computing has become very popular in the past few years, and most business and home users rely on its services. Because of their wide usage, cloud computing services have become a common target of cyber-attacks executed by insiders and outsiders. Therefore, cloud computing vendors and providers need to implement strong information security protection mechanisms on their cloud infrastructures. One approach to successful threat detection, which in turn leads to successful attack prevention in cloud computing infrastructures, is the application of machine learning algorithms. To understand how machine learning algorithms can be applied to cloud computing threat detection, we propose a cloud computing threat classification model based on the feasibility of detecting threats with machine learning algorithms. In this paper, we consider three classification criteria: a) type of learning algorithm, b) input features and c) cloud computing level. The results presented in this paper can contribute to further studies in the field of cloud threat detection with machine learning algorithms. More specifically, they will help in selecting appropriate input features, or machine learning algorithms, to obtain higher classification accuracy.
ISBN: 9789532330991 (print)
Absenteeism, the habitual or recurrent absence from work, continuously disrupts the smooth running of business, affecting organizational performance and productivity and impacting employees' morale. The Oil Refinery in Albania (ARMO), which employs 1200 people, is facing a high rate of absences. If the necessary measures are not taken seriously, absenteeism may jeopardize operation and production. Predicting absenteeism is complex because it is influenced by many factors. Data mining and machine learning algorithms are a good way to predict and analyze it. The aim of this paper is to identify and evaluate appropriate ML algorithms to predict and analyze absenteeism in the workplace. The dataset consists of attributes such as age, education, employment category, day, month and length of service, and 125000 records are considered. The algorithms are analyzed and compared in terms of accuracy, precision and sensitivity using the Weka tool.
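The three comparison criteria named in this abstract (accuracy, precision and sensitivity) can be derived from the counts of a binary confusion matrix. The sketch below is a plain Python illustration and not the Weka workflow used in the paper; the function name `binary_metrics`, the `positive` label convention, and the choice to return 0.0 when a denominator is empty are assumptions made here.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision and sensitivity (recall) for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        # Share of all records that were classified correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that were found (also called recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
    }
```

When absences are much rarer than attendances, accuracy alone can look high for a model that never predicts an absence, which is one reason to report precision and sensitivity alongside it.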
ISBN: 9781665417143 (print)
The quality of life for upper limb amputees can be greatly improved by the adoption of poly-articulated myoelectric prostheses. Typically, in these applications, a pattern recognition algorithm is used to control the system by converting the recorded electromyographic (EMG) activity into complex movements with multiple degrees of freedom (DoFs). However, there is currently a trade-off between the intuitiveness of the control and the number of active DoFs. We address this challenge by performing simultaneous multi-joint control of the Hannes system and testing several state-of-the-art classifiers to decode hand and wrist movements. The algorithms discriminated multi-DoF movements from forearm EMG signals of 10 healthy subjects reproducing hand opening-closing, wrist flexion-extension and wrist pronation-supination. We first explored the effect of the number of EMG electrodes on device performance by optimizing the classifiers in terms of F1 score. We further improved the classifiers by tuning their respective hyperparameters in terms of the Embedding Optimization Factor. Finally, three mono-lateral amputees tested the optimized algorithms to intuitively and simultaneously control the Hannes system. We found that the algorithms' performance was similar to that obtained with healthy subjects, and we identified the Non-Linear Regression classifier as the ideal candidate for prosthetic applications.
ISBN: 9781457720710 (print)
With the increase in radio access network (RAN) services, the spectrum resources available for mobile communications are decreasing, and efficient use of radio resources is becoming a very important issue. In order to optimize radio resource usage and maximize throughput and quality of service (QoS), link aggregation technologies that utilize multiple different available RANs have been studied. However, in such heterogeneous wireless networks, it is difficult to improve throughput by aggregation because of the differences among the QoS levels of the different RANs. In this paper, we propose an autonomous parameter optimization scheme using a machine learning algorithm, which maximizes the throughput of the heterogeneous RAN aggregation system. We evaluate the performance of the proposed scheme implemented on a cognitive wireless network system called the Cognitive Wireless Cloud (CWC) system, connected to real wireless network services such as HSDPA, WiMAX and W-CDMA. Our experimental results show that the aggregation throughput improves as the number of training samples, which are collected autonomously, increases.