Search Results - Inner Mongolia University Library

A review: Data pre-processing techniques used for diabetes prediction

Procedia Computer Science, 2024, vol. 245, pp. 667-676

Authors: Mahmud Isnan; Gregorius Natanael Elwirehardja; Bens Pardamean. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, BINUS Graduate Program - Master of Computer Science Program, Bina Nusantara University, Jakarta 11480, Indonesia

When processing datasets in diabetes classification, common problems included a large number of missing values, outliers, and dataset imbalance. To deal with those issues, this study analyzed 18 studies on diabetes classification with machine learning algorithms over the past 5 years. This revealed the important role of data pre-processing in creating effective classification models, as it was found that by using different data pre-processing techniques, the same model can provide different performance. The study identified K-Nearest Neighbor (KNN) and support vector machine (SVM) as superior methods for filling in missing values, achieving an accuracy of 98.49% and 94.89%, respectively. These approaches outperformed traditional methods such as median or mean replacement. However, the challenge of imbalanced data sets remains in all studies reviewed. The common evaluation metrics used to evaluate the created models in previous studies included accuracy, precision, specificity, sensitivity/recall, and F1 Score. Overall, this review showed that the role of data pre-processing is no less important than algorithm selection to improve the performance of machine learning models in diabetes classification.
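To make the contrast between KNN-based filling and mean replacement concrete, the following is a minimal sketch in plain Python. It is not taken from any of the reviewed studies: the function names `knn_impute` and `mean_impute`, the distance definition (root mean squared difference over the features both rows have observed), and the toy rows are all assumptions made for illustration.

```python
import math


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed) if observed else None)
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]


def knn_impute(rows, k=3):
    """Replace each None with the mean of that feature over the k nearest rows.

    Distance is computed only over features that both rows have observed
    (an assumption of this sketch, not a claim about any reviewed study).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_idx, other in enumerate(rows):
                if other_idx == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                candidates.append((dist, other[j]))
            candidates.sort(key=lambda c: c[0])
            nearest = [v for _, v in candidates[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled


# Made-up toy rows: (feature_a, feature_b); the last row is missing feature_b.
rows = [[1.0, 10.0], [1.1, 12.0], [5.0, 50.0], [1.05, None]]
knn_filled = knn_impute(rows, k=2)
mean_filled = mean_impute(rows)
```

On these toy rows the KNN fill uses only the two rows that resemble the incomplete one, while the mean fill is pulled toward the distant third row. This illustrates why the two strategies can lead to different downstream model performance, without saying anything about the accuracy figures quoted in the abstract.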

Keywords: Diabetes Classification; Data Pre-Processing; Machine Learning; Imbalanced Dataset

Evolution Of SARS-CoV-2 In The Presence Of Vaccines.

2022 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2022

Authors: Marathe, Vishwajeet; Yan, Changhui. Affiliations: North Dakota State University, Department of Computer Science, Fargo, ND, United States; North Dakota State University, Department of Computer Science, Genomics Phenomics and Bioinformatics Graduate Program, Fargo, ND, United States

ISBN: (print) 9781665468190

The COVID-19 pandemic has caused tremendous loss worldwide. Vaccines are now the primary weapon to combat the pandemic. Understanding how SARS-CoV-2, the virus that causes COVID-19, may mutate in the presence of the vaccines is critical for designing drugs and vaccines for future variants of the virus. In this study, we investigated the numbers of mutations that SARS-CoV-2 accumulated on each protein over time. We found that different proteins of the virus accumulated different levels of mutations and their mutation rates changed over time following different patterns. We also presented evidence that the mutation of the Spike protein might have been suppressed by the vaccines. This is the first time that such a relation was reported based on real-world data. Although the discovery was not meant to be conclusive, this study sheds light on how the virus may respond to the vaccines. If confirmed by further studies, the discovery will have significant impacts on many fields, including drug and vaccine design. © 2022 IEEE.

Keywords: Coronavirus

Long Short-Term Memory-based Models for Sleep Quality Prediction from Wearable Device Time Series Data

8th International Conference on Computer Science and Computational Intelligence, ICCSCI 2023

Authors: Hidayat, Alam Ahmad; Budiarto, Arif; Pardamean, Bens. Affiliations: Mathematics Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, BINUS Graduate Program - Master of Computer Science Program, Bina Nusantara University, Jakarta 11480, Indonesia

Several studies suggest that sleep quality is associated with physical activity. Moreover, deep sleep time can be used to determine the sleep quality of an individual. In this work, we aim to find the association between physical activity and deep sleep time by modeling time series data, such as heart rate and number of steps, captured from a commercial wearable device. Our previous study demonstrates that deep learning-based time series modeling is well suited to our problem, since the temporal patterns in the two physical parameters need to be captured to obtain more accurate results. We first preprocess our series data to have a time-step size of 10 minutes. To improve on our previous effort in this modeling, we compare four different variants of Long Short-Term Memory (LSTM)-based models, ranging from single-input to dual-input models. Our results show that the simple stacked LSTM model performs better for our data because the remaining models suffer from overfitting due to their larger number of trained parameters. © 2023 The Authors. Published by Elsevier B.V.
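The 10-minute preprocessing step mentioned in the abstract can be sketched as a simple binning routine. This is an illustrative assumption only: the abstract does not say how samples are aggregated, so averaging heart rate and summing steps within each bin, as well as the names `resample_10min` and `BIN_SECONDS` and the sample readings, are hypothetical choices.

```python
from collections import defaultdict

BIN_SECONDS = 600  # 10-minute time step, as stated in the abstract


def resample_10min(samples):
    """Aggregate raw readings into 10-minute bins.

    samples: iterable of (timestamp_seconds, heart_rate, steps).
    Returns a list of (bin_start_seconds, mean_heart_rate, total_steps),
    sorted by bin start. The aggregation rule is an assumption of this sketch.
    """
    bins = defaultdict(lambda: {"hr": [], "steps": 0})
    for ts, hr, steps in samples:
        start = (ts // BIN_SECONDS) * BIN_SECONDS
        bins[start]["hr"].append(hr)
        bins[start]["steps"] += steps
    return [(start, sum(b["hr"]) / len(b["hr"]), b["steps"])
            for start, b in sorted(bins.items())]


# Made-up readings: two fall in the first bin, one in the second.
raw = [(0, 60, 10), (300, 70, 20), (600, 80, 5)]
binned = resample_10min(raw)
# binned == [(0, 65.0, 30), (600, 80.0, 5)]
```

The resulting fixed-step sequence of (heart rate, steps) pairs is the kind of input a stacked LSTM would consume, although the model itself is outside the scope of this sketch.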

Keywords: deep learning; forecasting; sleep quality; time series; wearable devices

Infection induces tissue-resident memory NK cells that safeguard tissue health (vol 56, pg 531, 2023)

Immunity, 2023, vol. 56, no. 9, pp. 2173-2174

Authors: Schuster, Iona S.; Sng, Xavier Y. X.; Lau, Colleen M.; Powell, David R.; Weizman, Orr-El; Fleming, Peter; Neate, Georgia E. G.; Voigt, Valentina; Sheppard, Sam; Maraskovsky, Andreas I.; Daly, Sheridan; Koyama, Motoko; Hill, Geoffrey R.; Turner, Stephen J.; O'Sullivan, Timothy E.; Sun, Joseph C.; Andoniou, Christopher E.; Degli-Esposti, Mariapia A. Affiliations: Infection and Immunity Program and Department of Microbiology, Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia; Centre for Experimental Immunology, Lions Eye Institute, Nedlands, WA, Australia; Immunology Program, Memorial Sloan Kettering Cancer Center, New York, NY, USA; Monash Bioinformatics Platform, Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia; Translational Science and Therapeutics, Fred Hutchinson Cancer Center, Seattle, WA, USA; Department of Microbiology, Immunology and Molecular Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA

Tissue health is dictated by the capacity to respond to perturbations and then return to homeostasis. Mechanisms that initiate, maintain, and regulate immune responses in tissues are therefore essential. Adaptive immunity plays a key role in these responses, with memory and tissue residency being cardinal features. A corresponding role for innate cells is unknown. Here, we have identified a population of innate lymphocytes that we term tissue-resident memory-like natural killer (NKRM) cells. In response to murine cytomegalovirus infection, we show that circulating NK cells were recruited in a CX3CR1-dependent manner to the salivary glands where they formed NKRM cells, a long-lived, tissue-resident population that prevented autoimmunity via TRAIL-dependent elimination of CD4+ T cells. Thus, NK cells develop adaptive-like features, including long-term residency in non-lymphoid tissues, to modulate inflammation, restore immune equilibrium, and preserve tissue health. Modulating the functions of NKRM cells may provide additional strategies to treat inflammatory and autoimmune diseases.

Keywords: natural killer cells; memory; tissue residency; viral infection; inflammation; autoimmunity; immune regulation; cytomegalovirus; Sjogren's syndrome; CD4 T cells

Aggregating Structural Features for Computational Analysis of Mutations in BRCT Domains of BRCA1 Protein

International Conference on Informatics, Multimedia, Cyber and Information System (ICIMCIS)

Authors: Alam Ahmad Hidayat; Rudi Nirwantono; Mahmud Isnan; Joko Pebrianto Trinugroho; Bens Pardamean. Affiliations: Mathematics Department, School of Computer Science, Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia; Biotechnology Department, Faculty of Engineering, Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia; Computer Science Department, Bioinformatics and Data Science Research Center, BINUS Graduate Program, Jakarta, Indonesia

Understanding the mechanistic interpretation of mutation effects in a protein can help predict the clinical implications of genetic variants. Hence, computational variant effect predictions that involve structural features of the mutated protein might be suitable in this case. In this work, we focus on the BRCT domains of the BRCA1 gene, which is widely studied in breast cancer research. We retrieved 88 selected missense variants found in BRCT domains that are annotated in both the ClinVar and gnomAD databases. To computationally characterize the pathogenic property of the mutations, we used two types of features extracted from protein structures: the change in Gibbs free energy and a set of features derived from molecular dynamics simulations of each mutant. Using dimensionality reduction and Gaussian mixture model (GMM)-based clustering, we demonstrate that the variants are segregated into two regions that may correspond to their pathogenic status. This method can be a potential computational pipeline for providing a preliminary mechanistic interpretation of mutation effects in terms of their thermodynamic and structural features.

Evolutionary learning-derived lncRNA signature with biomarker discovery for predicting stage of colon adenocarcinoma

Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

Authors: Yann-Lin Ho; Yann-Jen Ho; Fang-Yu Ko; Shinn-Ying Ho. Affiliations: Arete Honors Program, National Yang Ming Chiao Tung University, Hsinchu, Taiwan; Genome and Systems Biology Degree Program, National Taiwan University, Taipei, Taiwan; Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan

ISBN: (digital) 9798350371499

ISBN: (print) 9798350371505

In recent years, long non-coding RNAs (lncRNAs) have emerged as potential regulators of biological processes and genes, with the potential to serve as valuable biomarkers for cancer diagnosis and prognosis prediction. This work proposes an evolutionary learning-based method, EL-COAD, to identify a robust lncRNA signature with biomarker discovery for predicting stages of colon adenocarcinoma (COAD). The COAD patient cohorts were obtained from both the Cancer Genome Atlas and Gene Expression Omnibus (GSE17536) databases. EL-COAD incorporates a bi-objective combinatorial genetic algorithm with a support vector machine for selecting a minimal number of lncRNAs while maximizing prediction accuracy. EL-COAD identified a 15-lncRNA signature and achieved a five-fold cross-validation accuracy and an area under the receiver operating characteristic curve of 79.4% and 0.792, respectively. Utilising the 10 lncRNAs from the signature for an independent dataset, GSE17536, the Sequential Minimal Optimization model achieved a test accuracy of 64.15%. Furthermore, the lncRNAs of the signature were prioritized, with the top five being TMEM105, DUXAP8, APCDD1L-DT, PCAT6, and a novel transcript, ENSG00000226308. In addition, both Kyoto Encyclopedia of Genes and Genomes pathway and Disease Ontology analyses provided strong support for the viability of this model-independent signature, emphasising ENSG00000226308 as a promising biomarker.

Keywords: Support vector machines; Accuracy; Regulators; RNA; Genomics; Receivers; Biomarkers; Colon; Bioinformatics; Cancer

A Study of Classification Techniques Based on Spike Protein Sequences of MERS-CoV

Proceedings of the 2023 10th International Conference on Biomedical and Bioinformatics Engineering

Authors: Hayeon Kim; Myeongji Cho; Hyeon S. Son. Affiliations: Laboratory of Computational Biology & Bioinformatics, Graduate School of Public Health, Seoul National University, Republic of Korea, and Institute of Health and Environment, Seoul National University, Republic of Korea; Laboratory of Computational Biology & Bioinformatics, Graduate School of Public Health, Seoul National University, Republic of Korea; Laboratory of Computational Biology & Bioinformatics, Graduate School of Public Health, Seoul National University, Republic of Korea, Institute of Health and Environment, Seoul National University, Republic of Korea, and Interdisciplinary Graduate Program in Bioinformatics, College of Natural Science, Seoul National University, Republic of Korea

ISBN: (print) 9798400708343

MERS-CoV belongs to the beta-coronaviruses together with SARS-CoV-2. Although it has received relatively less attention because of the COVID-19 pandemic, new MERS-CoV lineages and variants remain a real possibility. Previous studies have discussed the possibility of frequent recombination of MERS-CoV. We thus present a highly accurate method for the phylogenetic analysis and classification of MERS-CoV, including recombinant sequences. We collected the sequences of the S protein from MERS-CoV and divided them into five phylogenetic groups, of which recombinant sequences were divided into seven types. Physicochemical properties of amino acids were then calculated from the S protein sequences, and the results were used for the random forest model, Naïve Bayes classification, and the k-nearest neighbor method. We also constructed several feature subsets based on the ranked amino acid properties and applied them to the random forest model. In each dataset, the amino acid physicochemical properties were ranked differently. Using this information, classification of MERS-CoV based on machine learning algorithms showed that the random forest model had the best accuracy and area under the curve compared with the k-nearest neighbor and Naïve Bayes classification methods. Several feature subsets were constructed using the correlation feature selection algorithm and applied to the random forest model. Overall, the performance of the classifier improved compared with that obtained using all features. Coronaviruses, including MERS-CoV, continue to evolve into new forms through recombination or mutation. We thus present a method to increase the accuracy of their classification using additional information from the viral protein sequence, and confirm that a subset consisting of the most prominent features can improve the performance of the classifier by removing unnecessary characteristic information.

Keywords: MERS-CoV; classification; machine learning

SRECG: ECG Signal Super-Resolution Framework for Portable/Wearable Devices in Cardiac Arrhythmias Classification

IEEE Transactions on Consumer Electronics, 2023, vol. 69, no. 3, pp. 250-260

Authors: Chen, Tsai-Min; Tsai, Yuan-Hong; Tseng, Huan-Hsin; Liu, Kai-Chun; Chen, Jhih-Yu; Huang, Chih-Han; Li, Guo-Yuan; Shen, Chun-Yen; Tsao, Yu. Affiliations: National Taiwan University and Academia Sinica, Graduate Program of Data Science, Taipei 106319, Taiwan; Academia Sinica, Research Center for Information Technology Innovation, Taipei 115, Taiwan; Taiwan Artificial Intelligence Academy Foundation, Program of Taiwan AI Academy, New Taipei 22065, Taiwan; Artificial Intelligence Foundation, Technology Development Center, New Taipei 24158, Taiwan; Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 10617, Taiwan; Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan; National Taiwan University, Department of Mathematics, Taipei 10617, Taiwan

A combination of cloud-based deep learning (DL) algorithms with portable/wearable (P/W) devices has been developed as a smart health care system to support automatic cardiac arrhythmias (CAs) classification using electrocardiography (ECG). However, long-term and continuous ECG monitoring is challenging because of the limited battery capacity and transmission bandwidth of P/W devices when they are integrated with consumer electronics (CE). A feasible approach to address this challenge is to decrease sampling rates. However, low sampling rates lead to low-resolution signals that hinder CA classification performance. In this study, we propose a DL-based ECG signal super-resolution framework (called SRECG) to enhance low-resolution ECG signals by jointly considering the accuracies when applied to the DL-based high-resolution multiclass classifier (HMC) of CAs. In our experiments, we downsampled the ECG signals from the CPSC2018 dataset and evaluated their HMC accuracies with and without SRECG. Experimental results show that SRECG improves the HMC accuracies compared with traditional interpolation methods. Moreover, approximately half of the CA classification accuracies of the HMC were maintained on the ECG signals enhanced by SRECG. The promising results confirm that SRECG can be suitably used to enhance low-resolution ECG signals from P/W devices with CE to improve their cloud-based HMC performance. © 1975-2011 IEEE.
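For orientation, the "traditional interpolation" baseline that the abstract compares against can be sketched as naive downsampling followed by linear interpolation. This is not the SRECG method, and the abstract does not specify which interpolation schemes were used; the function names `downsample` and `linear_upsample` and the toy signal are assumptions for illustration.

```python
def downsample(signal, factor):
    """Keep every `factor`-th sample (naive decimation, no anti-alias filter)."""
    return signal[::factor]


def linear_upsample(low, factor):
    """Restore the original sampling rate by linear interpolation.

    Between each pair of neighbouring low-rate samples, `factor - 1`
    intermediate points are placed on the straight line joining them.
    """
    if not low:
        return []
    out = []
    for a, b in zip(low, low[1:]):
        for step in range(factor):
            out.append(a + (b - a) * step / factor)
    out.append(low[-1])
    return out


# Made-up toy signal: a straight ramp, which linear interpolation recovers exactly.
signal = [0.0, 1.0, 2.0, 3.0, 4.0]
low_res = downsample(signal, 2)
restored = linear_upsample(low_res, 2)
```

A straight ramp is recovered exactly, but sharp ECG features such as the QRS complex are not linear between samples, which is the kind of detail a learned super-resolution model is meant to restore.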

Keywords: Electrocardiography

Data mining for epidemiology: The correlation of typhoid fever occurrence and environmental factors

7th International Conference on Computer Science and Computational Intelligence, CSCI 2022

Authors: Asadi, Faisal; Trinugroho, Joko Pebrianto; Hidayat, Alam Ahmad; Rahutomo, Reza; Pardamean, Bens. Affiliations: Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia; Information System Department, School of Information System, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, BINUS Graduate Program - Master of Computer Science Program, Bina Nusantara University, Jakarta 11480, Indonesia

Typhoid fever is an endemic disease that burdens Indonesia and is a potentially fatal multisystem infection. The Salmonella typhi bacterium is responsible for typhoid fever. Poor sanitation, crowding, and slums are the main factors behind increasing typhoid fever incidence. Environmental factors directly connected to meteorological factors are the main factor in the breeding of the Salmonella typhi bacterium. This study aims to identify the correlation between meteorological parameters and typhoid fever occurrence. The study was carried out in Jakarta, Indonesia, and the Bureau of Meteorological, Climatology, and Geophysics (BMKG) provided the meteorological parameter data. In addition, the Jakarta health surveillance office provided information on typhoid fever hospitalizations from 2019 to 2021. Pearson's correlation was used to investigate the relationship between typhoid fever incidence and the meteorological parameters. Humidity, precipitation, and wind speed are the meteorological parameters that significantly contribute to the occurrence of typhoid fever. These findings might be used as a reference for Indonesia's government in making public policy to prevent typhoid fever in Indonesia. © 2022 Elsevier B.V. All rights reserved.
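The Pearson coefficient referred to in the abstract is the covariance of two series divided by the product of their standard deviations. Below is a minimal sketch in plain Python; the function name `pearson_r` and the monthly values are made up for illustration and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical monthly values, invented for this example only.
humidity_pct = [70.0, 75.0, 80.0, 85.0]
typhoid_cases = [20.0, 30.0, 40.0, 50.0]
r = pearson_r(humidity_pct, typhoid_cases)
```

A value of r near +1 or -1 indicates a strong linear association, and a value near 0 indicates little linear association. Judging whether a coefficient is statistically significant, as the abstract does, additionally requires a hypothesis test that this sketch leaves out.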

Keywords: Data mining

A systematic literature review of machine learning application in COVID-19 medical image classification

7th International Conference on Computer Science and Computational Intelligence, CSCI 2022

Authors: Daniel; Cenggoro, Tjeng Wawan; Pardamean, Bens. Affiliations: Computer Science Department, BINUS Graduate Program - Master of Computer Science Program, Bina Nusantara University, Jakarta 11480, Indonesia; Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta 11480, Indonesia

Detecting COVID-19 as early and as quickly as possible is one way to stop its spread. Machine learning development can help to diagnose COVID-19 more quickly and accurately. This report aims to find out how far research has progressed and what lessons can be learned for future research in this sector. By filtering titles, abstracts, and content in the Google Scholar database, this literature review found 19 related papers to answer two research questions: what medical images are commonly used for COVID-19 classification, and what methods are used for COVID-19 classification. According to the findings, chest X-rays were the most commonly used data to classify COVID-19, and transfer learning techniques were the methods used in the reviewed studies. Researchers also concluded that lung segmentation and the use of multimodal data could improve performance. © 2022 Elsevier B.V. All rights reserved.

Keywords: COVID-19
