ISBN (digital): 9781665467704
ISBN (print): 9781665467704
Breast cancer is the most common cancer type worldwide. In cancer studies, histopathological breast images are used in the process of diagnosis. In this paper, we define three sets of features that represent the characteristics of the cell nuclei in order to detect malignant cases. A total of 33 geometric, directional, and intensity-based features are derived and evaluated using breast cancer histopathological images from the BreaKHis database. Four machine learning algorithms, Decision Tree, Support Vector Machines, K-Nearest Neighbor, and Narrow Neural Networks (NNN), are used to assess the effectiveness of the sets. The preliminary results show that the proposed methodology achieves high performance in classifying cancerous cells, with the directional feature set being the most effective of the three. The combination of all three sets achieved the best performance with the NNN, which reached an accuracy, recall, precision, AUC, and F1 score of 96.9%, 97.4%, 98%, 98.8%, and 97.7%, respectively.
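The metrics reported in this abstract (other than AUC, which needs ranked scores) can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch of those standard definitions; the function name and the counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision, and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the counts of true positives, false positives,
    false negatives, and true negatives for the "malignant" class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting recall and precision alongside accuracy matters here because a missed malignant case (a false negative) is usually costlier than a false alarm.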
The identification of digital market segments to make value-creating propositions is a major challenge for entrepreneurs and marketing managers. New technologies and the Internet have made it possible to collect huge volumes of data that are difficult to analyse using traditional techniques. The purpose of this research is to address this challenge by proposing the use of AI algorithms to segment customers. Specifically, the proposal is to compare the suitability of a supervised algorithm (XGBoost) with that of an unsupervised algorithm (K-means) for segmenting the digital market. To do so, both algorithms were applied to a sample of 5 million Spanish users captured between 2010 and 2022 by a lead generation start-up. The results show that, with this type of data, supervised learning is more useful for segmenting markets than unsupervised learning, as it provides solutions that are better suited to entrepreneurs' commercial objectives.
ISBN (print): 9798350360806; 9798350360790
The increasing number of online communities and social platforms like Twitter and Facebook has facilitated a level of information sharing never before seen in human history. Consumers are generating and sharing more data than ever before thanks to the proliferation of social media platforms, and some of it is deceptive and has no basis in reality. Automatically determining whether a text contains misleading data or misinformation is difficult. Before passing judgment on the accuracy of a piece, even a subject matter specialist needs to examine it from a number of different angles. This research presents machine learning strategies for distinguishing between false and genuine news. We collected data by web scraping and trained a collection of machine learning algorithms using several different techniques, then compared how well they perform on our datasets. Five machine learning algorithms were applied to find the best performer. After evaluating the models, we found that the decision tree achieved the highest accuracy in this study, 99.84%.
Machine learning techniques have been widely used in the oil and gas industry to improve the qualitative and quantitative characterization of subsurface reservoirs. Because rock properties are strongly influenced by lithological and sedimentological information, lithofacies classification is an important step in 3D reservoir modeling. The aim of this study is to use supervised classification algorithms to predict the spatial distribution pattern of lithofacies classes using borehole and seismic data. In this study, lithofacies classes are distributed away from the wells using a machine-learning classifier. Seismic data attributes extracted from well locations are utilized as training data features in various supervised classification algorithms. Machine learning classifiers trained and evaluated for lithofacies classification include K-nearest neighbors, support vector machine, Gaussian naive Bayes, decision tree, Gradient Boosting, and Random Forests. A number of parameters are optimally determined in order to achieve the highest value of classification accuracy in the model. Comparing machine learning classifiers based on evaluation metrics reveals that ensemble-based decision tree approaches such as Random Forests and Gradient Boosting are the most effective for supervised classification. The results are validated using testing data and have an 80% classification accuracy. The predicted volume of lithofacies classes contributes to improved 3D reservoir modeling for the pre-salt carbonate reservoir.
The plywood industry is one of the most significant sub-sectors of the forestry industry and serves as a cornerstone of sustainable construction within a bioeconomy framework. Plywood is a panel composed of multiple layers of wood sheets bonded together. While automation and process monitoring have played a crucial role in improving efficiency, data-driven decision-making remains underutilized in the industrial sector. Many industrial processes continue to rely heavily on the expertise of operators rather than on data analytics. However, advancements in data storage capabilities and the availability of high-speed computing have paved the way for data-driven algorithms that can support real-time decision-making. Due to the biological nature of wood and the numerous variables involved, managing manufacturing operations is inherently complex. The multitude of process variables and the presence of non-linear physical phenomena make it challenging to develop accurate and robust analytical predictive models. As a result, data-driven approaches, particularly Artificial Intelligence (AI), have emerged as highly promising modeling techniques. Leveraging industrial data and exploring the application of AI algorithms, particularly Machine Learning (ML), to predict key performance indicators (KPIs) in process plants represent a novel and expansive field of study. The processing of industrial data and the evaluation of AI algorithms best suited for plywood manufacturing remain key areas of research. This study explores the application of supervised ML algorithms in monitoring key process variables to enhance quality control in veneer and plywood production. The analysis included Random Forest, XGBoost, K-Nearest Neighbors (KNN), Support Vector Machine (SVM), Lasso, and Logistic Regression. An initial dataset comprising 49 variables related to the maceration, peeling, and drying processes was refined to 30 variables using correlation analysis and Lasso variable
Various diseases severely affect maize, leading to a significant reduction in yield and crop quality. Therefore, the identification of genes responsible for tolerance to biotic stress is important in maize breeding programs. In the present study, a meta-analysis of microarray gene expression data from maize exposed to various biotic stresses, induced by fungal pathogens or pests, was performed to identify key tolerance genes. Correlation-based Feature Selection (CFS) was performed to obtain a smaller set of differentially expressed genes (DEGs) discriminating control and stress conditions. As a result, 44 genes were selected and their performance was confirmed in the Bayes Net, MLP, SMO, KStar, Hoeffding Tree, and Random Forest models. Bayes Net outperformed the other algorithms, with an accuracy of 97.1831%. Pathogen recognition genes, decision tree models, co-expression analysis, and functional enrichment were applied to these selected genes. A robust co-expression was observed among 11 genes responsible for defense response, diterpene phytoalexin biosynthetic process, and diterpenoid biosynthetic process in terms of biological process. This study could provide new information on the genes responsible for resistance to biotic stress in maize, for use in biological research or maize breeding.
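The abstract does not say which CFS implementation was used. As an illustration only, the sketch below computes the merit heuristic commonly associated with Correlation-based Feature Selection, which rewards subsets whose features correlate with the class but not with each other. The function name and input values are hypothetical, and correlations are assumed to be absolute values in [0, 1].

```python
import math


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Merit of a feature subset under the standard CFS heuristic.

    k: number of features in the subset
    mean_feature_class_corr: average |correlation| between features and class
    mean_feature_feature_corr: average |correlation| among the features
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    numerator = k * mean_feature_class_corr
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return numerator / denominator


# Hypothetical values: the same relevance, with low vs. high redundancy.
low_redundancy = cfs_merit(4, 0.6, 0.2)
high_redundancy = cfs_merit(4, 0.6, 0.8)
print(low_redundancy > high_redundancy)  # redundant subsets score lower
```

A search procedure would evaluate this merit over candidate subsets and keep the highest-scoring one; the search strategy is outside the scope of this sketch.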
Software plays an important role in controlling complex systems. Monitoring the proper functioning of the components of such systems is the principal role of software. Often, a minor fault in one of the subsystems may cause irreparable damage; therefore, it is of great importance to be able to predict software faults and estimate software reliability. In this survey, we present a classification of various methods proposed in the literature to predict software reliability. This study summarizes the results of more than 200 research papers in the field. We also discuss the challenges involved in prediction methods along with proposed partial solutions (i.e., Bayesian methods) to improve the accuracy of such predictions. Moreover, we review numerous evaluation measures introduced so far to assess the performance of prediction models, the datasets they are based on, and the results they yield.
The semi-automatic and automatic extraction of land features such as buildings, trees, and roads from aerial laser scan data is crucial in land use change studies and urban management. This research introduces the "BTR" extractor, a novel software package designed to enhance the classification accuracy of phenomena identified in the super points obtained from aerial laser scanners. Our method focuses on (1) comparing classification methods using airborne laser scanning data, (2) implementing supervised algorithms for high-accuracy classification, and (3) evaluating performance against existing software such as TerraSolid. The user-friendly interface supports data entry, training data collection, and selection of classification methods. We employed five methods (Bayesian algorithms, support vector machine, K-nearest neighbor, C-Tree, and discriminant analysis) to classify land features. Comparative results show that the BTR extractor outperforms TerraSolid, particularly in supervised classification, demonstrating high accuracy and reliable implementation in the studied area. Our findings advocate the use of supervised algorithms for classifying point cloud data to improve accuracy and efficiency in remote sensing applications.
India's agricultural sector has been grappling with the adverse effects of climate change over the last two decades, resulting in the diminished performance of various crops. Predicting crop yields well in advance...
ISBN (print): 9798400708169
Parkinson's disease is one of the most common chronic neurodegenerative diseases and can affect a patient's quality of life by causing several motor and non-motor impairments. Freezing of gait is one such motor impairment, which can cause an inability to move forward despite the intention to walk. Identifying freezing-of-gait events using sensor technology and machine-learning algorithms can improve quality of life and decrease the risk of falls in Parkinson's patients. Our study focuses on a systematic performance evaluation of machine learning algorithms for developing a well-fitted and generalizable model. In this work, we train fully connected artificial neural network (ANN) and deep neural network (DNN) algorithms on time-domain and frequency-domain-transform-based features to classify freezing-of-gait events in patients using accelerometer data. We evaluate these algorithms across hyperparameters such as batch size, optimizer type, and window size in a step-wise process. We identify an optimal combination of parameters for the ANN and DNN according to accuracy and model-fit optimality metrics. We achieved a classification accuracy of 89%-90% with the Adam optimizer, batch sizes (BS) of 256 and 8, and 60 and 40 epochs for the ANN and DNN, respectively.
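The abstract does not list the specific features or window sizes used. The sketch below only illustrates the general idea of cutting an accelerometer signal into fixed-size windows and computing one time-domain and one frequency-domain feature per window. All function names, the sampling rate, and the window parameters are hypothetical, and a plain discrete Fourier transform stands in for whatever transform the authors applied.

```python
import cmath
import math


def sliding_windows(signal, size, step):
    """Split a 1-D signal into overlapping windows of `size` samples."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, step)]


def time_features(window):
    """Time-domain features: mean and (population) standard deviation."""
    mean = sum(window) / len(window)
    variance = sum((x - mean) ** 2 for x in window) / len(window)
    return {"mean": mean, "std": math.sqrt(variance)}


def dominant_frequency(window, sampling_rate):
    """Frequency-domain feature: frequency (Hz) of the strongest DFT bin.

    The mean is removed first so the DC component does not dominate.
    """
    n = len(window)
    mean = sum(window) / n
    best_bin, best_power = 0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(
            (x - mean) * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(window)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_bin, best_power = k, power
    return best_bin * sampling_rate / n


# Hypothetical example: a 4 Hz oscillation sampled at 64 Hz for 2 seconds.
fs = 64
signal = [math.sin(2 * math.pi * 4 * i / fs) for i in range(2 * fs)]
windows = sliding_windows(signal, size=fs, step=fs // 2)
features = [
    {**time_features(w), "dominant_hz": dominant_frequency(w, fs)}
    for w in windows
]
print(len(features), features[0]["dominant_hz"])
```

Each per-window feature dictionary would then form one input row for a classifier; the window size controls the trade-off between frequency resolution and how quickly an event can be detected.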