Objective: Although numerous studies on cancer survival have been published, improving the accuracy of survival class prediction remains a challenge. Integrating different data sets, such as microRNA (miRNA) and mRNA expression, might increase that accuracy. We therefore propose a machine learning (ML) approach for integrating different data sets, and we developed a novel method based on feature selection with the Cox proportional hazards regression model (FSCOX) to improve the prediction of cancer survival time. Methods: FSCOX provides us with intermediate survival information, which is usually discarded when survival is separated into 2 groups (short- and long-term), and allows us to perform survival analysis. We used an ML-based protocol for feature selection, integrating information from miRNA and mRNA expression profiles at the feature level. To predict survival phenotypes, we used the following classifiers: first, the existing ML methods support vector machine (SVM) and random forest (RF); second, a new median-based classifier using FSCOX (FSCOX_median); and third, an SVM classifier using FSCOX (FSCOX_SVM). We compared these methods using 3 types of cancer tissue data sets: (i) miRNA expression, (ii) mRNA expression, and (iii) combined miRNA and mRNA expression. The last data set included features selected either from the combined miRNA/mRNA profile or independently from the miRNA and mRNA profiles (IFS). Results: In the ovarian data set, which was balanced with 22 short-term and 22 long-term survivors, the accuracy of survival classification using the combined miRNA/mRNA profiles with IFS was 75% using RF, 86.36% using SVM, 84.09% using FSCOX_median, and 88.64% using FSCOX_SVM. These accuracies are higher than those obtained using miRNA alone (70.45%, RF; 75%, SVM; 75%, FSCOX_median; and 75%, FSCOX_SVM) or mRNA alone (65.91%, RF; 63.64%, SVM; 72.73%, FSCOX_median; and 70.45%, FSCOX_SVM). Similarly in the glioblastoma multiforme data, the accuracy of m
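
The ovarian accuracies reported in the Results can be cross-checked with simple arithmetic. The sketch below assumes that each accuracy is the fraction of the 44 balanced samples (22 short-term plus 22 long-term) classified correctly; the abstract does not state the evaluation scheme, so this is an assumption. The function name `implied_correct` and the dictionary layout are illustrative only and are not taken from the paper.

```python
# Accuracies (percent) reported in the abstract for the ovarian data set.
REPORTED = {
    "combined miRNA/mRNA (IFS)": {"RF": 75.0, "SVM": 86.36,
                                  "FSCOX_median": 84.09, "FSCOX_SVM": 88.64},
    "miRNA alone": {"RF": 70.45, "SVM": 75.0,
                    "FSCOX_median": 75.0, "FSCOX_SVM": 75.0},
    "mRNA alone": {"RF": 65.91, "SVM": 63.64,
                   "FSCOX_median": 72.73, "FSCOX_SVM": 70.45},
}

# Assumption: accuracy is computed over all 22 + 22 = 44 samples.
N_SAMPLES = 44


def implied_correct(accuracy_pct, n=N_SAMPLES):
    """Return the whole number of correct calls out of n whose accuracy
    rounds to accuracy_pct (two decimals), or None if no such count exists."""
    k = round(accuracy_pct / 100 * n)
    # Accept k only if k/n reproduces the reported value to two decimals.
    return k if abs(100 * k / n - accuracy_pct) < 0.005 else None


if __name__ == "__main__":
    for profile, by_classifier in REPORTED.items():
        for classifier, acc in by_classifier.items():
            print(profile, classifier, acc, implied_correct(acc))
```

Under that assumption, every reported percentage corresponds to a whole number of correctly classified samples out of 44, which is consistent with the stated balanced design.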