Software Engineering is a discipline that encompasses processes associated with the development of interactive systems. The perceived quality of an interactive system is heavily influenced by the user interface design, which may result in many challenges. One such challenge is design-level requirements analysis. The success of a software system depends largely on how well users' requirements have been understood and translated into appropriate functionalities. During the interactive system design process, it is common to find recurring problems in human-computer interaction, for which reusing solutions is highly feasible. Interaction design patterns seek to support designers in decision making during the design of interactive systems, because the design task tends to be subjective and prone to errors. This work presents and evaluates an interaction design pattern recommendation model based on design-level requirements classification, through the application of supervised machine learning algorithms. A study was carried out to compare the performance of four classification algorithms, in which the linear support vector machine proved the most suitable for this problem. The results of this work can be used to implement frameworks that better support designers' decision making when designing user interfaces.
The emergence of Industry 4.0, also known as the fourth industrial revolution, has brought forth the concept of prognostics and health management (PHM) as an inevitable trend in the realm of industrial big data and smart manufacturing. This study presents a proof of concept that illustrates how machine learning can be employed to analyze industrial facility data and anticipate the condition of industrial machines. Specifically, a comprehensive case study focusing on vibration monitoring is conducted. The proposed models aim to predict maintenance requirements for the forced blower of a chemical plant by utilizing vibration data obtained during the manufacturing process. To validate the methodology, five different machine learning algorithms, namely logistic regression (LR), support vector machine (SVM), K-nearest neighbor (KNN), extreme gradient boosting (XGBoost), and random forest (RF), are employed. The evaluation metrics used include the Matthews correlation coefficient (MCC) and the receiver operating characteristic (ROC) curve. The study seeks to establish a relationship between machine failures caused by vibration and the prediction of both healthy and faulty bearings using the machine learning approaches. The findings indicate that the XGBoost algorithm outperforms the other approaches, with an MCC of 0.800 and a higher area under the ROC curve.
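The two evaluation metrics named in this abstract can be illustrated with a minimal sketch in plain Python. The function names `matthews_corrcoef` and `roc_auc` and the toy inputs are assumptions made for illustration; they are not taken from the study, and the numbers below are not its data. MCC is computed from the four cells of a binary confusion matrix, and the area under the ROC curve is computed as the fraction of (positive, negative) pairs that the classifier ranks correctly, with ties counted as half.

```python
import math
from typing import Sequence


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when a row or column of the matrix is empty.
    return numerator / denominator if denominator else 0.0


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise ranking (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical counts and scores, chosen only to exercise the functions.
    print(matthews_corrcoef(tp=40, tn=50, fp=0, fn=10))
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

MCC uses all four confusion-matrix cells, which is one reason it is often preferred over accuracy when healthy and faulty samples are imbalanced.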
ISBN: 9781665440868 (print)
Food is essential for life. The food we take should be pure, nutritious and free from any type of adulteration for proper maintenance of human health. In this paper, an IoT-based food and formalin detection technique is developed to detect the presence of formalin using machine learning approaches. A volatile compound HCHO gas sensor connected to a Raspberry Pi 3 was used to extract the concentration of formalin in any fruit or vegetable as a function of output voltage, and different machine learning algorithms were used to classify the fruit or vegetable based on the extracted features. Supervised machine learning algorithms have been incorporated in our system to accurately predict the concentration of formalin at all temperatures; the system is also able to distinguish between artificially added and naturally formed formalin.
ISBN: 9781538627150 (print)
As the amount of available digital documents keeps growing rapidly, extracting useful information from them has become a major challenge. Data mining, natural language processing, and machine learning are powerful techniques that can be used together to deal with this problem. Depending on the task at hand, there are many different approaches that can be used. The methods available are continuously improved, but not all of them have been tested and compared in a set of coherent problems using supervised machine learning algorithms. For example, what happens to the quality of the methods if we increase the training data size from, say, 100 MB to over 1 GB? Moreover, are quality gains worth it when the rate of data processing diminishes? Can we trade quality for time efficiency and recover the quality loss by just being able to process more data? We attempt to answer these questions in a general way for text processing tasks, considering the trade-offs involving training data size, learning time, and quality obtained. For this, we propose a performance trade-off framework and apply it to three important tasks: Named Entity Recognition, Sentiment Analysis and Document Classification. These problems were also chosen because they have different levels of object granularity: words, paragraphs, and documents. For each problem, we selected several supervised machine learning algorithms and evaluated their trade-offs on large publicly available data sets (news, reviews, patents). To explore these trade-offs, we use different data subsets of increasing size, ranging from 50 MB to several GB. For the last two tasks, we also consider similar algorithms with two different data sets and two evaluation techniques, to study their impact on the resulting trade-offs. We find that the results do not change significantly and that most of the time the best algorithms are the ones with the fastest processing time. However, we also show that the results for small data (say less than 1
ISBN: 9798350345650 (print)
This paper describes studies carried out to detect objectionable expressions in text. Experiments were performed with Sentence Transformers, supervised machine learning algorithms, and the BERT transformer architecture trained on English, and the results were observed. The paper explains the natural language processing and machine learning methodologies used to prepare the experimental dataset from labeled toxic and non-toxic text data obtained from the Kaggle platform, and then summarizes the methods and performance of the models trained on this dataset.
ISBN: 9781538661475 (print)
In recent years, online reviews have been playing an important role in purchase decisions, because these reviews can provide customers with large amounts of useful information about goods or services. However, to artificially promote or discredit products or services, spammers may forge and post fake reviews. Customers can be misled by such behavior and make wrong decisions, so detecting fake (spam) reviews is a significant problem. In this paper, we propose two types of features and apply supervised machine learning algorithms to perform classification on Yelp's real-life data. The two new semantic feature sets are readability features and topic features. Our results show that the proposed features are more effective than n-gram features in detecting spam reviews. To improve classification on the real Yelp review data, we use a set of behavioral features about reviewers and their reviews for learning, which dramatically improves the classification result on real-life opinion spam data. For further improvement, we also balance the number of reviewers rather than the number of reviews.
ISBN: 9783031281839 (digital)
ISBN: 9783031281822; 9783031281839 (print)
Cardiac diseases affect people across the globe, and cardiac failure can occur without any warning. Identification of cardiac diseases at an early stage remains a challenge for researchers in the health domain. Machine learning frameworks and algorithms are effectively used in the current medical field to predict and classify various diseases accurately. In this paper, we explore the traditional supervised machine learning techniques and algorithms and their cardiac disease classification accuracy. We further investigate the feature extraction technique Kernel Principal Component Analysis with a pipelined framework. The proposed framework overcomes the issue of overfitting and increases the prediction accuracy. Random Forest produced the best result, and the Extreme Gradient Boost technique achieved an accuracy of 99.02%. The other boosting classifiers, Gradient Boosting and Light Gradient Boosted Machine, produced accuracies of 94.16% and 98.38%, respectively.
This paper describes an empirical research work based on the use of a suitable data structure, named Flow Graph (FG), that can be induced from a supervised training data set. A FG can be approached as a weighted and l...
Computer network security and integrity are severely impacted by network attacks. The ability to predict and prevent these attacks is crucial for maintaining a secure network environment. supervised ML (machine Learni...
ISBN: 9798350324136 (print)
The development of 5G networks and beyond has led to an explosion of data generation. It is therefore crucial to have an intrusion detection system (IDS) to detect malicious packets and prevent them from entering the network. This paper presents an IDS based on a Feature Selection approach which applies Recursive Feature Elimination and a Random Forest Classifier with 10-fold Cross Validation to classify malicious and benign traffic on the publicly available UNSW-NB15 dataset. Most existing Feature Selection approaches on this dataset are directed at enhancing the performance of a limited number of algorithms. Our proposed Feature Selection approach was tested on six well-known supervised machine learning (ML) algorithms performing binary classification: Artificial Neural Network (ANN), Random Forest (RF), Decision Tree (DT), K-Nearest Neighbor (KNN), Support Vector Machine (SVM) and Logistic Regression (LR). In addition, we performed hyperparameter tuning to get the best possible parameters for each ML algorithm. Unlike hyperparameter tuning in most studies, we perform both Manual Search and Grid Search. The performance of the selected ML algorithms is evaluated based on Accuracy, Recall, Precision, and F1 score. The results from our experiments indicate that the most robust algorithm is ANN, whereas the weakest performing algorithm is LR. RF is the second-best performing algorithm; however, its runtime is much lower than that of ANN. In particular, ANN excels with (testing accuracy, F1 score) of (88.62%, 96.473%), RF with (87.40%, 89.60%), DT with (87.266%, 89.414%), KNN with (87.11%, 88.7%), SVM with (81.835%, 86.959%) and LR with (81.835%, 85.632%). In addition, the over-fitting problems are eliminated based on our proposed Feature Selection and hyperparameter tuning. Compared with existing works with the same ML algorithms on UNSW-NB15 dataset, our proposed Feature Selection approach achieved better results in most cases and more stable among different
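
The four evaluation measures named in this abstract (Accuracy, Recall, Precision, and F1 score) can be sketched for the binary case in plain Python. This is an illustrative sketch only: the function name `binary_metrics`, the convention that label 1 marks malicious traffic, and the toy labels are assumptions, not details from the paper or from the UNSW-NB15 dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed positives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical labels: 1 = malicious, 0 = benign.
    truth = [1, 1, 1, 0, 0, 0, 1, 0]
    preds = [1, 1, 0, 0, 0, 1, 1, 0]
    for name, value in binary_metrics(truth, preds).items():
        print(f"{name}: {value:.3f}")
```

Because F1 is a harmonic mean, it always lies between precision and recall, while accuracy also counts true negatives and can therefore diverge from F1 on imbalanced traffic.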