Search results - Inner Mongolia University Library

11th International Conference on Knowledge and Systems Engineering (KSE)

Authors: Xuan Tho Dang, Duong Hung Bui, Thi Hong Nguyen, Tran Quoc Vinh Nguyen, Dang Hung Tran (Hanoi Natl Univ Educ, Fac Informat Technol, Hanoi, Vietnam; Hanoi Trade Union Univ, Fac Informat Technol, Hanoi, Vietnam; Univ Da Nang, Univ Sci & Educ, Fac Informat Technol, Danang, Vietnam)

ISBN: (print) 9781728130033

Autism is a neurological disorder that occurs in children. It has many causes, one of which is genetic. Therefore, in order to find effective treatments, we need to discover the genes that are related to autism. In this paper, we use a computational approach to train a model that can predict new autism-related candidate genes. The methodology combines different data sources, such as protein-protein interaction networks, a microRNA (miRNA) target network, and known autism-related genes, into an integrated network. The structural properties of this network are represented as a vector dataset, and a binary classification problem is formulated. However, because the number of known autism-related genes is very small, we face an imbalanced data classification problem. To solve this issue, a clustering-based under-sampling algorithm for data balancing is proposed. Training classifiers with machine learning models such as SVMs, k-NN, and RFs, we obtained G-mean values 1-3% higher than those obtained without any data balancing strategy. These results imply that our proposed model may contribute to finding new autism-related gene candidates.

Keywords: imbalance data; clustering-based under-sampling; classification; autism-related gene; protein-protein interaction network; miRNA-mRNA network
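The abstract above reports its improvement in terms of the G-mean measure. As a minimal sketch, assuming the usual definition of G-mean as the geometric mean of sensitivity and specificity (the record itself does not spell out the formula), it can be computed as follows. The function name `g_mean` and the toy labels are illustrative, not taken from the paper.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Toy imbalanced case: 2 positives, 8 negatives, classifier predicts all negative.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)                # 0.8
print(g_mean(y_true, y_pred))  # 0.0
```

The toy case shows why G-mean is a common choice for imbalanced data: a classifier that ignores the minority class still reaches 80% accuracy but scores a G-mean of zero.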


A gradient boosting-based mortality prediction model for COVID-19 patients


NEURAL COMPUTING & APPLICATIONS, 2023, vol. 35, no. 33, pp. 23997-24013

Authors: Keser, Sinem Bozkurt; Keskin, Kemal (Eskisehir Osmangazi Univ, Dept Comp Engn, TR-26040 Eskisehir, Turkiye; Eskisehir Osmangazi Univ, Dept Elect & Elect Engn, TR-26040 Eskisehir, Turkiye)

The COVID-19 pandemic has been a global public health concern since March 11, 2020. Healthcare systems struggled to meet patients' growing needs for diagnosis, treatment, and care. As healthcare industries struggled to cope with the overwhelming demands, advanced intelligence and computing technologies have become essential. Artificial intelligence techniques have become essential for identifying and triaging patients, predicting disease severity, and detecting outcomes. The aim of the paper is to propose a gradient boosting-based model to predict the mortality of COVID-19 patients and to improve the prediction accuracy by incorporating resampling strategies. A real COVID-19 dataset that includes patients' travel, health, geographical, and demographic information is obtained from a public repository. The dataset used in the study has a class imbalance problem, and several approaches are applied to solve it. In this study, a gradient boosting-based model for predicting the mortality of COVID-19 patients is proposed. This approach incorporates resampling strategies, such as the synthetic minority oversampling technique (SMOTE), random under-sampling, and clustering-based under-sampling, to address the imbalanced class distribution in the dataset. Then, gradient boosting machines (GBM) such as extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and categorical boosting (CatBoost) are analyzed in terms of accuracy and computational time. The random search method is used to find the optimal hyper-parameters for the algorithms. A stacking-based hybrid model that combines the XGBoost, LightGBM, and CatBoost algorithms is used for comparison in the experiments. In the experiments, the factors that can influence the mortality of COVID-19 patients are investigated. It is found that the age of the patient, whether the patient belonged to Wuhan, the difference between when they first noticed symptoms and when they visited the hospital

Keywords: COVID-19; Machine learning; Gradient-based boosting machines; SMOTE; Random under-sampling; clustering-based under-sampling
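Of the resampling strategies named in the abstract above, random under-sampling is the simplest to illustrate. The sketch below is a generic, standard-library version that trims every class to the size of the smallest one; it is not the authors' implementation, and the function name `random_under_sample`, the `seed` parameter, and the toy data are assumptions made for illustration.

```python
import random
from collections import defaultdict

def random_under_sample(rows, labels, seed=0):
    """Randomly drop samples so every class has as many rows as the smallest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(members) for members in by_class.values())
    out_rows, out_labels = [], []
    for label, members in by_class.items():
        kept = rng.sample(members, n_min)  # sampling without replacement
        out_rows.extend(kept)
        out_labels.extend([label] * n_min)
    return out_rows, out_labels

# Toy data: 2 minority rows (label 1) and 8 majority rows (label 0).
rows = list(range(10))
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced_rows, balanced_labels = random_under_sample(rows, labels)
print(len(balanced_rows))  # 4
```

Because the discarded majority rows are chosen at random, informative samples can be lost, which is one motivation for the clustering-based variant the abstract also lists.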


Improved hybrid resampling and ensemble model for imbalance learning and credit evaluation


Journal of Management Science and Engineering, 2022, vol. 7, no. 4, pp. 511-529

Authors: Gang Kou, Hao Chen, Mohammed A. Hefni (School of Business Administration, Southwestern University of Finance and Economics, Chengdu 610074, China; Department of Mining Engineering, Faculty of Engineering, Jeddah 21589, Saudi Arabia)

Clustering-based undersampling (CUS) and the distance-based near-miss method are widely used in current imbalanced learning algorithms, but both have certain drawbacks. In particular, CUS does not consider the influence of the distance factor on the majority-class instances, and the near-miss method omits the inter-class(es) within the majority-class samples. To overcome these drawbacks, this study proposes an undersampling method combining distance measurement and majority class clustering. Resampling methods are used to develop an ensemble-based imbalanced-learning algorithm called the clustering and distance-based imbalance learning model (CDEILM). This algorithm combines distance-based undersampling, feature selection, and ensemble learning. In addition, a cluster size-based resampling (CSBR) method is proposed for preserving the original distribution of the majority class, and a hybrid imbalanced learning framework is constructed by fusing various types of resampling methods. The combination of CDEILM and CSBR can be considered a specific case of this hybrid framework. The experimental results show that the CDEILM and CSBR methods achieve better performance than the benchmark methods, and that the hybrid model provides the best results under most circumstances. Therefore, the proposed model can be used as an alternative imbalanced learning method under specific circumstances, e.g., for providing a solution to credit evaluation problems in financial applications.

Keywords: Imbalanced learning; clustering-based under-sampling; Ensemble methods; Hybrid methods; Credit risk evaluation
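The general idea of combining majority-class clustering with a distance criterion, as described in the abstract above, can be sketched roughly as follows. This is not CDEILM or CSBR: the abstract does not give their details, so the plain k-means routine, the farthest-first initialisation, the quota proportional to cluster size, and all function names here are assumptions chosen only to illustrate the concept.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with farthest-first initial centers (illustrative)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign

def cluster_distance_under_sample(majority, minority, k=2):
    """Keep about len(minority) majority points: cluster the majority class, then
    take from each cluster, in proportion to its size, the points closest to
    the minority class."""
    assign = kmeans(majority, k)
    kept = []
    for c in range(k):
        members = [p for p, a in zip(majority, assign) if a == c]
        quota = round(len(minority) * len(members) / len(majority))
        members.sort(key=lambda p: min(dist(p, m) for m in minority))
        kept.extend(members[:quota])
    return kept

# Toy data: two majority blobs and a minority group lying between them.
majority = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
minority = [(5, 5), (5, 6), (6, 5), (6, 6)]
print(cluster_distance_under_sample(majority, minority))
```

Taking a per-cluster quota keeps every majority sub-group represented, while the distance ordering favours points near the class boundary. Because quotas are rounded, the number of retained points only approximately matches the minority class size.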
