Author affiliations: Research Institute of Biomolecular and Chemical Engineering, University of Pannonia, Veszprem, Hungary; Horváth Csaba Memorial Laboratory of Bioseparation Sciences, Research Center for Molecular Medicine, Doctoral School of Molecular Medicine, Faculty of Medicine, University of Debrecen, Debrecen, Hungary; Department of Computer Science and Systems Technology, University of Pannonia, Veszprem, Hungary; Department of Pulmonology, Borsod Academic County Hospital, Miskolc, Hungary
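Publication: Computers in Biology and Medicine (Comput. Biol. Med.)
Year/Volume: 2025, Vol. 186
Pages: 109681
Core indexing:
Subject classification: 0710 [Science - Biology]; 1004 [Medicine - Public Health and Preventive Medicine (degrees may be conferred in Medicine or Science)]; 1002 [Medicine - Clinical Medicine]; 1001 [Medicine - Basic Medicine (degrees may be conferred in Medicine or Science)]; 07 [Science]; 09 [Agriculture]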
Funding: The classifier is a type of machine learning algorithm designed to categorize the data into one or more predefined classes. The classification task, which aimed to explore the correlation between chemotherapy outcomes and structural changes in the N-glycome profiles, was transformed into three independent binary classification tasks to predict the effectiveness of chemotherapy, with the class labels 'regression', 'progression', and 'stationary'. To identify the most suitable classification method, the performance of 27 different classification algorithms was evaluated using their default parameters. In the initial phase, the tested models included linear classifiers (e.g., logistic regression), tree-based classifiers (e.g., decision tree), distance-based classifiers (e.g., k-nearest neighbors), probabilistic Bayes-based models (e.g., Gaussian Naive Bayes, Quadratic Discriminant Analysis (QDA)), support vector classifiers (SVC) with various kernels, ensemble models (e.g., Random Forest, Extra Trees, Gradient Boosting, eXtreme Gradient Boosting (XGBoost)), and a Neural Network-based classifier. Following the performance evaluation with the default parameters, we conducted an exhaustive hyperparameter tuning for the five best-performing machine learning algorithms: SVC, QDA, Random Forest, XGBoost, and Neural Network. For each algorithm, the hyperparameters that most significantly influenced learning were tuned as follows. For SVC, hyperparameter tuning focused on the type of kernel applied. In the case of QDA, the regularization parameter was optimized. For the Random Forest algorithm, the hyperparameter set included the splitting criterion, the number of estimators, the maximum tree depth, the minimum sample size required for a split, and the minimum sample size required at the leaf level. For XGBoost, the tuned hyperparameters were the number of estimators, the maximum tree depth, and the gamma parameter.
For the Neural Network, the structure of the network (number of hidden layers, number of neurons) and the activation function were optimized. Hyperparameter tuning was conducted using Bayesian Optimization, with 5-fold cross-validation applied in all cases. Quadratic Discriminant Analysis showed the best classification performance; thus, it was selected to construct the final model. Additionally, a combination of the Sequential Feature Selection (SFS) procedure and the brute force method was employed to minimize the influence of irrelevant features and to identify the relevant N-glycan peaks whose structural changes correlated most effectively with the chemotherapy response. Due to the limited number of records in the original dataset, the fine-tuned QDA classifier was run 1000 times for the final evaluation. Each run randomly partitioned the entire original dataset into separate training and test sets with an 80 %–20 % ratio, and the final performance metrics were calculated by averaging the test-set results. All in-house developed data analysis code was implemented in Python using Jupyter Notebook v7.1.1. We also utilized the Receiver Operating Characteristic (ROC) Area Under Curve (AUC) analysis method, which is a frequently used technique for analyzing the accuracy of diagnostic tests. The ROC curve plots the true positive rate (sensitivity) against the false positive rate (1 - specificity). A ROC curve that rises steeply towards the upper left corner of the graph indicates good (AUC > 0.8) or great (AUC > 0.9) discrimination properties. In other words, a higher AUC value of the ROC curve suggests greater discriminative power.
Authors gratefully acknowledge the support from the following sources: ATBG Korea V4 joint project of the National Research Development and Innovation Office of Hungary #2023-1.2.1-ERA_NET-2023-00015, the Andras Korany
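The repeated 80 %–20 % hold-out evaluation and the AUC metric described in the method text above can be illustrated with a minimal sketch. This is not the authors' code. It assumes a single synthetic feature, a one-feature Gaussian class model standing in for the fine-tuned QDA classifier, and it skips splits in which a class is missing. All function names (`roc_auc`, `fit_gaussian`, `repeated_holdout_auc`) and the data are hypothetical.

```python
import math
import random
import statistics


def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_gaussian(values):
    """Per-class mean and variance: the one-feature special case of QDA."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean) + 1e-9  # small floor avoids division by zero
    return mean, var


def log_density(x, mean, var):
    """Log of the normal density with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def repeated_holdout_auc(xs, ys, n_repeats=1000, test_fraction=0.2, seed=0):
    """Average test-set AUC over many random train/test partitions."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(2, round(test_fraction * len(xs)))
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test, train = indices[:n_test], indices[n_test:]
        train_pos = [xs[i] for i in train if ys[i] == 1]
        train_neg = [xs[i] for i in train if ys[i] == 0]
        test_labels = [ys[i] for i in test]
        # Assumption of this sketch: skip splits where a class is (nearly) absent.
        if len(train_pos) < 2 or len(train_neg) < 2 or len(set(test_labels)) < 2:
            continue
        pos_fit, neg_fit = fit_gaussian(train_pos), fit_gaussian(train_neg)
        # Score = log-likelihood ratio; class priors only shift it by a constant,
        # which does not change the ranking and therefore not the AUC.
        scores = [log_density(xs[i], *pos_fit) - log_density(xs[i], *neg_fit) for i in test]
        aucs.append(roc_auc(test_labels, scores))
    return statistics.fmean(aucs), len(aucs)


# Synthetic data only (not from the study): one feature, two classes.
data_rng = random.Random(42)
ys = [1] * 15 + [0] * 18
xs = [data_rng.gauss(2.0, 1.0) if y == 1 else data_rng.gauss(0.0, 1.0) for y in ys]

mean_auc, n_used = repeated_holdout_auc(xs, ys, n_repeats=1000)
print(f"mean test AUC over {n_used} usable splits: {mean_auc:.3f}")
```

Averaging over many random partitions reduces the dependence of the reported metric on any single split, which matters when the dataset is as small as the one described here.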
Abstract: An efficient novel approach is introduced to predict the effectiveness of chemotherapy treatment in lung cancer by monitoring the serum N-glycome of patients combined with artificial intelligence-based data analysis. The study involved thirty-three lung cancer patients undergoing chemotherapy treatments. Serum samples were taken before and after the treatment. The N-linked oligosaccharides were enzymatically released, fluorophore-labeled, and analyzed by capillary electrophoresis with laser-induced fluorescence detection. The resulting electropherograms were thoroughly processed and evaluated by artificial intelligence-based classifiers, i.e., utilizing a machine learning algorithm to categorize the data into two (binary) classes. The classifier analysis method revealed a strong association between the structural changes in the N-glycans and the outcomes of the chemotherapy treatments (ROC 0.9). This novel combination of bioanalytical and AI methods provided a precise and rapid tool for predicting the effectiveness of chemotherapy. © 2025 The Authors
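The record describes splitting the three-outcome problem ('regression', 'progression', 'stationary') into three independent binary tasks. One common way to do this is a one-vs-rest encoding, sketched below. Whether the authors used exactly this encoding is an assumption, and the helper name `to_binary_tasks` and the example labels are hypothetical.

```python
OUTCOMES = ("regression", "progression", "stationary")


def to_binary_tasks(labels, classes=OUTCOMES):
    """Build one binary target per class: 1 if the sample has that outcome, else 0."""
    unknown = set(labels) - set(classes)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return {c: [1 if y == c else 0 for y in labels] for c in classes}


# Illustrative labels only (not patient data).
labels = ["regression", "stationary", "progression", "regression", "stationary"]
tasks = to_binary_tasks(labels)
for name, target in tasks.items():
    print(name, target)
```

Each resulting target vector can then be passed to a separate binary classifier, so that performance is reported per outcome.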