ISBN (print): 9789819756919; 9789819756926
The continuous advancement of next-generation sequencing (NGS) technology has significantly improved the accuracy with which researchers can identify somatic mutations from NGS data. With the continuous advancement of machine learning (ML) technology, researchers have gained more confidence in using it for data prediction. This article proposes combining the Tree-structured Parzen Estimator (TPE) algorithm with the Support Vector Machine (SVM) algorithm to detect somatic mutations in matched tumor and normal paired sequencing data. The method is applied to real biological data from exome capture data and whole-genome shotgun data. The results indicate a significant improvement in the detectability of somatic mutations using the proposed integrated approach compared to conventional methods.
Explainable Artificial Intelligence (XAI) is a young but very promising field of research. Unfortunately, progress in this field is currently slowed down by divergent and incompatible goals. We separate the various threads tangled within the area of XAI into two complementary cultures: human/value-oriented explanations (BLUE XAI) and model/validation-oriented explanations (RED XAI). This position paper argues that the area of RED XAI is currently under-explored, i.e., more methods for explainability are desperately needed to question models (e.g., to extract knowledge from well-performing models and to spot and fix bugs in faulty models), and that the area of RED XAI hides great opportunities and potential for important research necessary to ensure the safety of AI systems. We conclude this paper by presenting promising challenges in this area.
Cardiovascular diseases (CVDs) are a primary contributor to mortality. Timely and precise diagnosis of heart disease is crucial to safeguard patients from additional harm. Recent studies show that data-driven approaches, such as deep learning (DL) and machine learning (ML) techniques, help diagnose heart disease accurately and in less time. However, statistical learning and traditional ML approaches require feature engineering to generate robust and effective features from data, which are then used in the prediction models. For large, complex data, both processes pose many challenges. In contrast, DL techniques learn features automatically from the data and handle large and intricate datasets effectively while outperforming the ML models. This study focuses on the accurate prediction of CVDs, considering the patient’s health and socio-economic conditions while mitigating the challenges presented by imbalanced data. The Adaptive Synthetic Sampling Technique is used for data balancing, while the Point Biserial Correlation Coefficient is used as a feature selection technique. In this study, two DL models, Ensemble based Cardiovascular Disease Detection Network (EnsCVDD-Net) and Blending based Cardiovascular Disease Detection Network (BlCVDD-Net), are proposed for accurate prediction and classification of CVDs. EnsCVDD-Net is built by applying an ensemble technique to LeNet and Gated Recurrent Unit (GRU), and BlCVDD-Net is built by blending LeNet, GRU and Multilayer Perceptron. SHapley Additive exPlanations is used to provide a clear understanding of the influence different factors have on CVD diagnosis. The networks’ performance is evaluated using various performance metrics. The results indicate that the EnsCVDD-Net outperforms all base models with 88% accuracy, 88% F1-score, 91% precision, 85% recall, and 777s execu
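The feature selection step named in this abstract, the point-biserial correlation coefficient, can be illustrated with a minimal sketch. It assumes a binary target (1 = disease, 0 = no disease) and numeric features; the feature names, toy values, and the 0.3 cut-off are hypothetical and are not taken from the paper.

```python
import math
from statistics import mean, pstdev

def point_biserial(feature, target):
    """Point-biserial correlation between a numeric feature and a 0/1 target."""
    group1 = [x for x, y in zip(feature, target) if y == 1]
    group0 = [x for x, y in zip(feature, target) if y == 0]
    n1, n0, n = len(group1), len(group0), len(feature)
    spread = pstdev(feature)  # population standard deviation of the feature
    if n1 == 0 or n0 == 0 or spread == 0:
        return 0.0  # correlation is undefined here; treat as uninformative
    return (mean(group1) - mean(group0)) / spread * math.sqrt(n1 * n0 / n**2)

# Hypothetical toy data: one informative feature and one uninformative feature.
target = [0, 0, 0, 1, 1, 1]
features = {
    "resting_bp": [1, 2, 3, 4, 5, 6],  # rises with the target
    "noise": [3, 1, 2, 2, 3, 1],       # same mean in both groups
}

THRESHOLD = 0.3  # assumed cut-off, chosen only for illustration
scores = {name: point_biserial(values, target) for name, values in features.items()}
selected = [name for name, r in scores.items() if abs(r) >= THRESHOLD]
print(scores, selected)
```

The coefficient equals the Pearson correlation computed with the binary variable coded as 0/1, so a library routine for Pearson correlation would give the same ranking.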
The increasing growth of the aviation industry has been accompanied by increasing problems of flight delays. This research aims to predict flight delays by employing machine learning models to enhance the effectivenes...
A machine learning method was applied for rapid and precise prediction of flow and heat transfer characteristics within a co-rotating disk cavity with a finned vortex reducer. A new data preprocessing scheme was developed to reduce modeling cost. With this scheme, a classical radial basis function neural network (RBFNN) shows better prediction performance than a deconvolutional neural network. Furthermore, the RBFNN has a simpler topological structure and fewer hyperparameters. In testing, the relative root mean square errors for total pressure, total temperature, and swirl ratio are 0.51%, 0.32%, and 1.99%, and the corresponding coefficients of determination reach 99.68%, 96.55%, and 99.16%. The effects of input parameters on the outputs of the RBFNN were analyzed, and a sensitivity analysis was conducted. Within the investigated parameter range, the total pressure loss increases as the dimensionless mass flow rate, rotational Reynolds number, and radial position of the fin increase. The total temperature drop increases with increasing dimensionless mass flow rate and rotational Reynolds number, but with decreasing radial position of the fin. Additionally, an increase in dimensionless mass flow rate or a decrease in rotational Reynolds number increases the swirl ratio. This research provides an efficient modeling method for components of the secondary air system in gas turbines.
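The two error measures quoted in this abstract can be sketched as follows. The abstract does not define its relative root mean square error, so the per-sample normalization used here is an assumption; the coefficient of determination follows the standard definition. The sample values are invented for illustration only.

```python
import math

def relative_rmse(y_true, y_pred):
    """Root mean square of per-sample relative errors (assumed definition)."""
    rel_sq = [((p - t) / t) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(rel_sq) / len(rel_sq))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    avg = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - avg) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference values and surrogate-model predictions.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [101.0, 198.0, 303.0, 396.0]

rrmse = relative_rmse(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
print(f"relative RMSE = {rrmse:.2%}, R^2 = {r2:.2%}")
```

Other normalizations (by the mean or by the range of the reference values) are also common, so figures computed this way are comparable only to results that use the same convention.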
This study introduces a robust machine learning pipeline for event classification, utilizing a large-scale dataset from eight Google Borg compute clusters (May 2019) available via Google BigQuery. The dataset includ...
ISBN (print): 9798400709234
Multi-label classification is a task with diverse applications, but current algorithms rely heavily on accurately labeled data, which makes data collection time-consuming and labor-intensive. Multi-label classification with partial labels, however, presents significant challenges. In this study, we propose Multi-modal Contextual Prompt Learning (MCPL), a novel approach that leverages large-scale visual-language models and exploits the strong image-text alignment in CLIP to address the scarcity of label annotations. We pre-train the visual-language model's encoder on a large number of image-text pairs. We introduce multi-modal contextual prompt learning in both images and labeled text to better utilize the image-label correspondence within CLIP, resulting in enhanced multi-label classification performance, even when faced with partial labels. We also use a coupling function to couple the two modalities and realize an interactive connection between the two modal prompts. Extensive experiments on the MS-COCO and VOC2007 datasets demonstrate its superiority and competitive performance.
ISBN (print): 9781665457194
Data quality assessment is critical for distributed machine learning (DML). Data collected from heterogeneous Internet of Things (IoT) devices may contain biased information that decreases the prediction accuracy of DML models. To address these challenges, we propose a blockchain-based approach to assess the quality of data that are not independent and identically distributed (non-IID). A blockchain running atop mobile edge computing (MEC) helps protect the privacy, security, and integrity of healthcare data when IoT devices are connected to MEC servers. Therefore, it is critical to integrate a data quality assessment module on the blockchain when building a blockchain-enabled DML system. In this paper, we jointly consider information loss and marginal utility of non-IID data samples. Specifically, we use Kullback-Leibler (KL) divergence to evaluate the information loss between IID and non-IID data samples and apply the reciprocal of data quantity to model the marginal utility of data samples. Human activity recognition and handwritten digit recognition data sets are used for performance evaluations. Experiments show that our proposed scheme outperforms benchmarks regarding model test accuracy on various non-IID data samples.
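The two quantities named in this abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' scheme: the IID reference is taken to be a uniform label distribution, the divergence is computed as KL(device || reference), add-one smoothing is applied to the label counts, and the device data are invented. The abstract does not say how the two terms are combined, so they are reported separately.

```python
import math
from collections import Counter

def label_distribution(labels, classes):
    """Empirical class frequencies with add-one smoothing (assumed choice)."""
    counts = Counter(labels)
    total = len(labels) + len(classes)
    return [(counts[c] + 1) / total for c in classes]

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def marginal_utility(n_samples):
    """Reciprocal of data quantity, as described in the abstract."""
    return 1.0 / n_samples

classes = [0, 1, 2, 3]
iid_reference = [1 / len(classes)] * len(classes)  # assumed uniform IID reference

# Hypothetical label sets held by two IoT devices, 100 samples each.
balanced_device = [0, 1, 2, 3] * 25
skewed_device = [0] * 85 + [1] * 5 + [2] * 5 + [3] * 5

kl_balanced = kl_divergence(label_distribution(balanced_device, classes), iid_reference)
kl_skewed = kl_divergence(label_distribution(skewed_device, classes), iid_reference)

print(f"balanced: KL = {kl_balanced:.4f}, utility = {marginal_utility(len(balanced_device))}")
print(f"skewed:   KL = {kl_skewed:.4f}, utility = {marginal_utility(len(skewed_device))}")
```

A device whose labels match the reference scores a divergence of zero, while the skewed device scores higher, which is the property a quality assessment module would need from an information-loss term.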
This research develops scalable and interpretable learning tools—stealth assessment and a teacher dashboard—using a sophisticated logging system within the Mission HydroSci game-based learning environment. By integr...
In recent years, the external environment of enterprises has been characterized by significant uncertainty, and the increasingly complex situation has also posed numerous challenges to their supply chains. Demand forec...