ISBN (print): 9781538659335
The quality of software can be improved by identifying its faulty portions in the early phases of the product lifecycle. Various machine learning algorithms proposed in the literature can be used to predict faulty classes, using object-oriented metrics as predictors. These models allow developers to predict faulty classes and concentrate constrained resources on testing these weaker portions of the software. This study evaluates and compares the predictive capability of six machine learning algorithms against one another and against logistic regression, a statistical algorithm, for determining faulty portions of software. The results are validated using seven open source software systems.
ISBN (print): 9789811302121; 9789811302114
In recent years, as the popularity of mobile phone devices has increased, the short message service (SMS) has grown into a multi-billion dollar industry. At the same time, a reduction in the cost of messaging services has resulted in the growth of unsolicited messages, known as spam, a major problem that not only causes financial damage to organizations but also annoys those who receive them. Findings: The increasing volume of such unsolicited messages has generated the need to classify and block them. Although humans have the cognitive ability to readily identify a message as spam, doing so remains an uphill task for computers. Objectives: This is where machine learning comes in handy, offering a data-driven, statistical method for designing algorithms that help computer systems identify an SMS as a desirable message (HAM) or as junk (SPAM). However, the lack of real databases for SMS spam, the limited features and the informal language of the message body are probable factors that may have caused existing SMS filtering algorithms to underperform when classifying text messages. Methods/Statistical Analysis: In this paper, a corpus of real SMS texts made available by the University of California, Irvine (UCI) Machine Learning Repository is used, and a weighting method based on the ability of individual words in the corpus to point towards the different target classes (HAM or SPAM) is applied to classify new SMSs as SPAM or HAM. Additionally, different supervised machine learning algorithms, such as support vector machine, k-nearest neighbours, and random forest, are compared on the basis of their performance in classifying SMSs. Applications/Improvements: The results of this comparison are shown at the end of the paper, along with a desktop application, developed and executed in Python, that helps classify messages as SPAM or HAM.
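The abstract does not state the weighting formula, so the following is only a minimal sketch of one plausible word-weighting scheme, not the authors' method. It assumes a smoothed log-odds weight per word (positive values lean SPAM, negative lean HAM) and classifies a message by the sign of the summed weights. The function names, the smoothing constant and the four-message toy corpus are all hypothetical; the toy messages are not taken from the UCI corpus.

```python
import math
from collections import Counter


def train_word_weights(messages, smoothing=1.0):
    """Return a dict mapping each word to a smoothed log-odds weight.

    messages: list of (label, text) pairs, label being "spam" or "ham".
    A positive weight means the word is relatively more frequent in spam.
    The log-odds form is an assumption made for illustration.
    """
    counts = {"spam": Counter(), "ham": Counter()}
    for label, text in messages:
        counts[label].update(text.lower().split())

    vocab = set(counts["spam"]) | set(counts["ham"])
    # Additive smoothing keeps the logarithm finite for words seen in one class only.
    totals = {
        label: sum(counts[label].values()) + smoothing * len(vocab)
        for label in counts
    }
    return {
        word: math.log((counts["spam"][word] + smoothing) / totals["spam"])
        - math.log((counts["ham"][word] + smoothing) / totals["ham"])
        for word in vocab
    }


def classify(text, weights):
    """Label a message by the sign of its summed word weights.

    Words not seen during training contribute nothing; a score of zero
    falls back to "ham".
    """
    score = sum(weights.get(word, 0.0) for word in text.lower().split())
    return "spam" if score > 0 else "ham"


# Hypothetical toy corpus, for illustration only.
toy_corpus = [
    ("spam", "win a free prize now"),
    ("spam", "free cash claim now"),
    ("ham", "are we meeting for lunch"),
    ("ham", "see you at lunch tomorrow"),
]
weights = train_word_weights(toy_corpus)
```

With these toy weights, `classify("claim your free prize", weights)` leans SPAM and `classify("lunch tomorrow", weights)` leans HAM. A real evaluation would need the full corpus and a held-out test split.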
ISBN (print): 9781538649800
Internet of Things (IoT) systems produce large amounts of raw data in the form of log files. This raw data must then be processed to extract useful information. Machine learning (ML) has proved to be an efficient technique for such tasks, but there are many different ML algorithms available, each suited to different types of scenarios. In this work, we compare the performance of 22 state-of-the-art supervised ML classification algorithms on different IoT datasets when applied to the problem of anomaly detection. Our results show that there is no dominant solution and that, for each scenario, several candidate techniques perform similarly. Based on our results and a characterization of our datasets, we propose a recommendation framework that guides practitioners towards the subset of the 22 ML algorithms that is likely to perform best on their data.
ISBN (print): 9781538647226
The progressive deployment of smart meters in Spain since 2015 has changed the retail electricity sector imposing a strong operational impact for agents participating in the wholesale market. Each residential customer is characterized based on its real hourly consumption instead of the monthly aggregated consumption. New models are needed to forecast the hourly consumption of residential customers to buy the energy in the markets. This paper presents a robust and scalable methodology to predict the household's hourly energy consumption based on smart meters' data using machine learning algorithms such as Neural Gas, Classification Trees, Multilayer Perceptron Networks and XGBoost. First, a novel clustering methodology to aggregate consumers is presented. Secondly, a model to forecast the hourly consumption in the day-ahead is proposed for every cluster. Finally, a case example is used to illustrate the results, accuracy and robustness of the methodology.
ISBN (print): 9781538680759
In commercial projects, the number of defects raised depends directly on the releases. To gauge the progress of a project, it is necessary to estimate the average time required to fix a bug. This time is referred to as the bug estimation time. Estimating it is essential for proper project planning. In this paper, a new algorithm is proposed to determine the bug estimation time with the help of machine learning algorithms. A new developer is predicted, and that developer's average bug estimation time is calculated from the bug estimation times of existing developers. A comparative study of the accuracy of different machine learning algorithms is carried out.
ISBN (print): 9781538665657
Polyvinylidene fluoride (PVDF) has been widely used in detecting interplanetary dust. The output signals of the PVDF are quite different in the cases of penetration and non-penetration, so detecting whether particles penetrate the PVDF is a crucial issue. We built a set of experimental equipment for collecting the signals from the PVDF. The equipment consists of a particle emitter, a shield, conditioning circuits and data acquisition equipment. 600 experiments were conducted; in 200 of them, the particles penetrated the PVDF. We successfully distinguish penetration of the PVDF using four machine learning algorithms: Anomaly Detection (AD), Artificial Neural Network (ANN), K-Nearest-Neighbors (KNN), and Support Vector Machines (SVM). We propose a unique evaluation criterion, OP, to evaluate the performance of the four classifiers, including their accuracy and computational time. The results show that ANN is the best machine learning algorithm for our problem and that AD is not suitable for it.
The article presents a study that designed, developed and implemented novel machine learning algorithms (MLAs) as a support tool for lymphocyte-related diagnosis using cell population data (CPD), along with absolute lymphoid count, age and gender. Topics include the stages of building the model; the classification algorithms tested, including decision trees, random forests, and k-nearest neighbour; and findings on the neural network (NN) algorithm, based on CPD and absolute lymphoid counts.
ISBN (digital): 9781728185262
ISBN (print): 9781728185279
Machine learning based acute stress detection systems use physiological sensor data to objectively predict acute stress. However, machine learning algorithms developed for stress detection do not consider how their performance may be affected by changes in the deployment environment. In this study, the deployment environment changes investigated are sensor type and sensor placement. Electrodermal activity (EDA) and skin temperature (TEMP) data from two different sensors, the RespiBAN Professional (RespiBAN) and the Empatica E4, are used to train three different machine learning models. The RespiBAN records the EDA data from the rectus abdominis and the skin TEMP data from the sternum. The Empatica E4 sensor records both EDA and skin TEMP data from the wrist. Three different support vector machine (SVM) models were trained to classify no-stress versus stress states using EDA and skin TEMP data. The first model was trained using data from the RespiBAN wearable sensor (SVM-R), the second using data from the Empatica E4 sensor (SVM-E), and the third using data from both sensors (SVM-RE). The accuracy of SVM-R on a test set recorded by the RespiBAN sensor was 100%. The accuracy of SVM-E on a test set recorded by the Empatica E4 sensor was 99%. The accuracy of SVM-RE on a test set recorded by both the RespiBAN and Empatica E4 sensors was 82%. The accuracy of SVM-R on a test set recorded by the Empatica E4 was 64%. These results suggest that research and development cannot be hardware or placement agnostic with wearable sensing data. Sensor type and placement must be taken into consideration when reporting performance metrics of physiology-based stress detection machine learning algorithms.
Heart-related disorders are growing rapidly throughout the world. Artificial Intelligence with computational methods plays a significant role in early detection and diagnosis. This study is devoted to finding the best classifiers for different valvular heart problems using popular CNN-based deep learning models and machine learning algorithms written in Python 3.8. In this research, the CNN-based Xception network model is proposed for the first time for valvular heart sound analysis; it achieved an accuracy of 99.45% on the test dataset, with a sensitivity of 98.5% and a specificity of 98.7%. Compared with other deep learning models such as LeNet-5, AlexNet, VGG16, VGG19, DenseNet121, Inception Net, and Residual Net, the proposed modified Xception network model has the highest accuracy in predicting valvular heart disease and the lowest testing time. The features are Root Mean Square, Energy, Power, Zero Crossing Rate, Total Harmonic Distortion, Skewness, and Kurtosis in the time domain. The analysis was made on heart sounds of normal and diseased patients available from the standard heart sound data repository. Finally, all the evaluated results were compared, and the SVM and Random Forest algorithms were found to be the most effective among the machine learning methods. The proposed modified CNN-based Xception model works best among all the deep learning methods.
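The abstract names the time-domain features but gives no formulas, so the sketch below uses common textbook definitions as an assumption rather than the authors' implementation. The function name `time_domain_features` is hypothetical. Energy is taken as the sum of squared samples, power as energy per sample, and skewness and kurtosis as standardized central moments using the population variance. Total Harmonic Distortion is omitted because it requires a spectral estimate.

```python
import math


def time_domain_features(signal):
    """Compute simple time-domain features of a sampled signal.

    signal: sequence of at least two numeric samples that are not all equal
    (a constant signal has zero standard deviation, which would make
    skewness and kurtosis undefined here).
    """
    n = len(signal)
    mean = sum(signal) / n

    energy = sum(x * x for x in signal)  # sum of squared samples
    power = energy / n                   # mean squared amplitude
    rms = math.sqrt(power)               # root mean square

    # Zero crossing rate: fraction of adjacent sample pairs whose sign differs.
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    zero_crossing_rate = crossings / (n - 1)

    # Standardized third and fourth central moments (population form).
    variance = sum((x - mean) ** 2 for x in signal) / n
    std = math.sqrt(variance)
    skewness = sum((x - mean) ** 3 for x in signal) / (n * std ** 3)
    kurtosis = sum((x - mean) ** 4 for x in signal) / (n * std ** 4)

    return {
        "rms": rms,
        "energy": energy,
        "power": power,
        "zero_crossing_rate": zero_crossing_rate,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# A short alternating signal as a sanity check.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

Other conventions exist, for example excess kurtosis (subtracting 3) or sample rather than population variance, so feature values should be compared only across pipelines that share the same definitions.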
ISBN (digital): 9781728181790
ISBN (print): 9781728181806
Artificial intelligence technologies are being actively introduced in various fields of science and technology. On the one hand, researchers publishing their achievements as open source can be called a force for progress. On the other hand, some studies have a dual purpose, such as the analysis of computer traffic using machine learning methods. A number of questions arise regarding the ethics of publishing such information in the public domain. The paper discusses the ethical and legal foundations of publishing open source dual-purpose machine learning algorithms, as well as the problems associated with doing so.