Datasets used for diabetes classification commonly suffer from large numbers of missing values, outliers, and class imbalance. To examine how these issues are handled, this study analyzed 18 studies published over the past 5 years on diabetes classification with machine learning algorithms. The analysis revealed the important role of data pre-processing in building effective classification models: the same model can perform differently depending on the pre-processing techniques applied. The study identified K-Nearest Neighbor (KNN) and support vector machine (SVM) as superior methods for filling in missing values, achieving accuracies of 98.49% and 94.89%, respectively. These approaches outperformed traditional methods such as median or mean replacement. However, the challenge of imbalanced datasets remained in all studies reviewed. The evaluation metrics commonly used in previous studies included accuracy, precision, specificity, sensitivity/recall, and F1 score. Overall, this review showed that data pre-processing is no less important than algorithm selection for improving the performance of machine learning models in diabetes classification.
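To illustrate the difference between KNN-based and median-based filling of missing values, here is a minimal sketch in plain Python. It is not the procedure used in any of the reviewed studies: the function names `knn_impute` and `median_impute`, the distance definition, and the toy rows are all assumptions made for illustration.

```python
import math
import statistics

def knn_impute(rows, k=2):
    """Fill None entries with the mean of that column over the k nearest donor rows.

    Distance is a root-mean-square difference over the columns observed in
    both rows (an assumption; other definitions are possible).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue  # a donor must have column j observed
                shared = [(a - b) ** 2 for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue  # no common columns, distance undefined
                dist = math.sqrt(sum(shared) / len(shared))
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # no donor available; leave the value missing
            candidates.sort(key=lambda t: t[0])
            nearest = candidates[:k]
            filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

def median_impute(rows):
    """Fill None entries with the median of the observed values in that column."""
    filled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        observed = [r[j] for r in rows if r[j] is not None]
        med = statistics.median(observed)
        for r in filled:
            if r[j] is None:
                r[j] = med
    return filled

# Toy data (hypothetical): two well-separated groups, one missing value.
toy_rows = [
    [1.0, 10.0],
    [1.1, 12.0],
    [5.0, 50.0],
    [5.2, 54.0],
    [1.05, None],
]
knn_filled = knn_impute(toy_rows, k=2)
median_filled = median_impute(toy_rows)
```

In this toy case the KNN estimate is drawn from the two rows that resemble the incomplete one, whereas the median is pulled toward the middle of both groups, which is one plausible reason neighbor-based filling can help a downstream classifier.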
The COVID-19 pandemic has caused tremendous loss worldwide. Vaccines are now the primary weapon against the pandemic. Understanding how SARS-CoV-2, the virus that causes COVID-19, may mutate in the presence of the vac...
Several studies suggest that sleep quality is associated with physical activities. Moreover, deep sleep time can be used to determine the sleep quality of an individual. In this work, we aim to find the association be...
Tissue health is dictated by the capacity to respond to perturbations and then return to homeostasis. Mechanisms that initiate, maintain, and regulate immune responses in tissues are therefore essential. Adaptive immunity plays a key role in these responses, with memory and tissue residency being cardinal features. A corresponding role for innate cells is unknown. Here, we have identified a population of innate lymphocytes that we term tissue-resident memory-like natural killer (NKRM) cells. In response to murine cytomegalovirus infection, we show that circulating NK cells were recruited in a CX3CR1-dependent manner to the salivary glands where they formed NKRM cells, a long-lived, tissue-resident population that prevented autoimmunity via TRAIL-dependent elimination of CD4+ T cells. Thus, NK cells develop adaptive-like features, including long-term residency in non-lymphoid tissues, to modulate inflammation, restore immune equilibrium, and preserve tissue health. Modulating the functions of NKRM cells may provide additional strategies to treat inflammatory and autoimmune diseases.
A mechanistic interpretation of mutation effects in a protein can help predict the clinical implications of genetic variants. Hence, computational variant effect predictions that involve structural features of the mutated protein might be suitable in this case. In this work, we focus on the BRCT domains of the BRCA1 gene, which is widely studied in breast cancer research. We retrieved 88 selected missense variants found in the BRCT domains and annotated in both the ClinVar and gnomAD databases. To computationally characterize the pathogenic property of the mutations, we used two types of features extracted from protein structures: the change in Gibbs free energy and a set of features derived from molecular dynamics simulations of each mutant. Using dimensionality reduction and Gaussian mixture model (GMM)-based clustering, we demonstrate that the variants are segregated into two regions that may correspond to their pathogenic status. This method can be a potential computational pipeline for providing a preliminary mechanistic interpretation of mutation effects in terms of their thermodynamic and structural features.
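The clustering step can be sketched as follows, under strong simplifying assumptions: each variant is taken to be already reduced to a single coordinate, and a two-component, one-dimensional Gaussian mixture is fitted with expectation-maximization. The abstract does not specify the dimensionality, the initialisation, or the software used, so the function names and the toy scores below are hypothetical.

```python
import math

def fit_gmm_1d(xs, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture.

    Returns (weights, means, variances). Initial means are the data minimum
    and maximum, a deterministic choice made for this sketch.
    """
    grand_mean = sum(xs) / len(xs)
    overall_var = sum((x - grand_mean) ** 2 for x in xs) / len(xs)
    weights = [0.5, 0.5]
    means = [min(xs), max(xs)]
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance of each component.
        for c in range(2):
            nk = sum(r[c] for r in resp)
            weights[c] = nk / len(xs)
            means[c] = sum(r[c] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[c] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances

def assign(x, weights, means, variances):
    """Index (0 or 1) of the component with the higher weighted density at x."""
    dens = [w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)]
    return 0 if dens[0] >= dens[1] else 1

# Hypothetical reduced scores for eight variants (not data from the study).
scores = [-1.2, -0.9, -1.0, -1.1, 2.0, 2.2, 1.9, 2.1]
gmm_weights, gmm_means, gmm_vars = fit_gmm_1d(scores)
labels = [assign(x, gmm_weights, gmm_means, gmm_vars) for x in scores]
```

Whether the two fitted components line up with benign and pathogenic labels would still have to be checked against the database annotations; the mixture itself is unsupervised.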
ISBN (digital): 9798350371499
ISBN (print): 9798350371505
In recent years, long non-coding RNAs (lncRNAs) have emerged as potential regulators of biological processes and genes, and as potentially valuable biomarkers for cancer diagnosis and prognosis prediction. This work proposes an evolutionary learning-based method, EL-COAD, to identify a robust lncRNA signature for biomarker discovery and for predicting stages of colon adenocarcinoma (COAD). The COAD patient cohorts were obtained from both the Cancer Genome Atlas and the Gene Expression Omnibus (GSE17536) databases. EL-COAD incorporates a bi-objective combinatorial genetic algorithm with a support vector machine to select a minimal number of lncRNAs while maximizing prediction accuracy. EL-COAD identified a 15-lncRNA signature and achieved a five-fold cross-validation accuracy of 79.4% and an area under the receiver operating characteristic curve of 0.792. Using the 10 lncRNAs from the signature on the independent dataset GSE17536, the Sequential Minimal Optimization model achieved a test accuracy of 64.15%. The lncRNAs of the signature were then prioritized, with the top five being TMEM105, DUXAP8, APCDD1L-DT, PCAT6, and a novel transcript, ENSG00000226308. In addition, both Kyoto Encyclopedia of Genes and Genomes pathway and Disease Ontology analyses provided strong support for the viability of this model-independent signature, emphasising ENSG00000226308 as a promising biomarker.
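The two competing objectives described above (fewer lncRNAs, higher accuracy) can be illustrated with a Pareto-dominance check. This is only a sketch of the bi-objective idea, not the EL-COAD algorithm itself: the genetic search and the support vector machine are omitted, and the candidate tuples are made-up numbers.

```python
def dominates(a, b):
    """True if candidate a dominates b.

    Each candidate is (n_features, accuracy): fewer features is better,
    higher accuracy is better.
    """
    no_worse = a[0] <= b[0] and a[1] >= b[1]
    strictly_better = a[0] < b[0] or a[1] > b[1]
    return no_worse and strictly_better

def pareto_front(candidates):
    """Candidates that no other candidate dominates, in input order."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (n_features, accuracy) pairs for five candidate signatures.
candidates = [(8, 0.70), (5, 0.66), (12, 0.70), (6, 0.64), (20, 0.73)]
front = pareto_front(candidates)
```

A bi-objective search typically returns such a front rather than a single winner, and a final signature is then chosen from it according to the preferred trade-off.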
ISBN (print): 9798400708343
MERS-CoV, which belongs to the beta-coronaviruses together with SARS-CoV-2, has received relatively little attention during the COVID-19 pandemic, yet new MERS-CoV lineages and variants may well emerge. Previous studies have discussed the possibility of frequent recombination of MERS-CoV. We thus present a highly accurate method for the phylogenetic analysis and classification of MERS-CoV, including recombinant sequences. We collected S protein sequences from MERS-CoV and divided them into five phylogenetic groups, with recombinant sequences further divided into seven types. Physicochemical properties of amino acids were then calculated from the S protein sequences, and the results were used as input to a random forest model, Naïve Bayes classification, and the k-nearest neighbor method. We also constructed several feature subsets based on the ranked amino acid properties and applied them to the random forest model. In each dataset, the amino acid physicochemical properties were ranked differently. Using this information, classification of MERS-CoV with machine learning algorithms showed that the random forest model had the best accuracy and area under the curve compared with the k-nearest neighbor and Naïve Bayes classification methods. Several feature subsets were constructed using the correlation feature selection algorithm and applied to the random forest model. Overall, the performance of the classifier improved compared with using all features. Coronaviruses, including MERS-CoV, continue to evolve into new forms through recombination or mutation. We thus present a method to increase the accuracy of their classification using additional information from the viral protein sequence, and confirm that a subset of optimal prominent features can improve the performance of the classifier by removing unnecessary feature information.
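As a rough illustration of turning a protein sequence into physicochemical features, the sketch below computes the mean Kyte-Doolittle hydropathy and a simple net charge. The abstract does not say which properties were used, so this choice of two properties, the function name `sequence_features`, and the toy peptides are assumptions; they are not MERS-CoV S protein data.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Crude side-chain charge: +1 for K/R, -1 for D/E, 0 otherwise
# (histidine is ignored here, which is a simplification).
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}

def sequence_features(seq):
    """Return (mean hydropathy, net charge) for a protein sequence.

    Residues outside the 20 standard amino acids (e.g. 'X') are skipped.
    """
    residues = [aa for aa in seq.upper() if aa in HYDROPATHY]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    mean_hydropathy = sum(HYDROPATHY[aa] for aa in residues) / len(residues)
    net_charge = sum(CHARGE.get(aa, 0) for aa in residues)
    return mean_hydropathy, net_charge

# Toy peptides (hypothetical): one hydrophobic, one charged.
hydrophobic = sequence_features("AILV")
charged = sequence_features("DEKR")
```

Feature vectors of this kind, computed per sequence, are what a random forest, Naïve Bayes, or k-nearest neighbor classifier would take as input; ranking and selecting the most informative properties is a separate step.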
A combination of cloud-based deep learning (DL) algorithms with portable/wearable (P/W) devices has been developed as a smart health care system to support automatic cardiac arrhythmias (CAs) classification using elect...
Typhoid fever is an endemic disease that burdens Indonesia and is a potentially fatal multisystem infection. The bacterium Salmonella typhi is responsible for the disease. Poor sanitation, crowding, and slums a...
Detecting COVID-19 as early and as quickly as possible is one way to stop its spread. Machine learning can help diagnose COVID-19 more quickly and accurately. This report aims to find out how f...