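Author affiliation: Nanjing Agr Univ Coll Sci Nanjing 210095 Jiangsu Peoples R China
Publication: JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY
Year, volume, issue: 2016, Vol. 14, No. 1
Pages: 1650006-1650006
Core indexing:
Subject classification: 0710 [Science - Biology] 07 [Science] 09 [Agriculture] 0812 [Engineering - Computer Science and Technology (engineering or science degree may be awarded)]
Subject terms: RNA; germ cells; pattern recognition; algorithm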
Abstract: MicroRNAs (miRNAs) are short (21-24 nt) non-coding RNAs that play significant regulatory roles in cells. Triplet-SVM-classifier and MiPred (random forest, RF) can distinguish real pre-miRNAs from other hairpin sequences with similar stem-loop structures (pseudo pre-miRNAs). However, the 32-dimensional local contiguous structure-sequence features carry considerable redundant information, so a method for reducing the dimension of the feature space is needed. In this paper, we propose optimal features of local contiguous structure-sequences (OP-Triplet). These features effectively avoid the redundancy and reduce the dimension of the feature vector from 32 to 8. In addition, a hybrid feature is formed by combining minimum free energy (MFE) and structural diversity. We also introduce a neural network algorithm called the extreme learning machine (ELM). The results show that the specificity (Sp) and sensitivity (Sn) of our method are 92.4% and 91.0%, respectively. Compared with Triplet-SVM-classifier, the total accuracy (ACC) of our ELM method increases by 5%; compared with MiPred (RF) and miRANN, it increases by nearly 2%. Our method also reduces both the dimension of the feature space and the training time.
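The abstract does not spell out how the 32-dimensional local contiguous structure-sequence vector is built, so the following is only a minimal sketch. It assumes the triplet encoding commonly associated with Triplet-SVM-classifier: each position contributes its nucleotide plus the paired/unpaired state of itself and its two neighbours, giving 4 × 8 = 32 combinations. The function names `triplet_features` and `sn_sp_acc` are hypothetical, the dot-bracket structure is assumed to come from an external folding tool, and the selection of the 8 OP-Triplet features is not reproduced because it is not described in this record. The second helper shows the standard definitions of sensitivity, specificity and total accuracy.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# 8 structure patterns over three adjacent positions: paired "(" or unpaired "."
STRUCT_TRIPLETS = ["".join(p) for p in product("(.", repeat=3)]
# 32 feature names such as "A(((" or "U.(." (middle nucleotide + structure triplet)
FEATURES = [n + s for n in NUCLEOTIDES for s in STRUCT_TRIPLETS]


def triplet_features(seq, dot_bracket):
    """Return 32 normalized triplet frequencies for one hairpin (assumed encoding)."""
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    seq = seq.upper().replace("T", "U")
    struct = dot_bracket.replace(")", "(")  # treat both pairing partners alike
    counts = dict.fromkeys(FEATURES, 0)
    for i in range(1, len(seq) - 1):
        key = seq[i] + struct[i - 1:i + 2]
        if key in counts:  # skip ambiguous bases such as N
            counts[key] += 1
    total = sum(counts.values())
    return [counts[f] / total if total else 0.0 for f in FEATURES]


def sn_sp_acc(tp, fn, tn, fp):
    """Sensitivity, specificity and total accuracy from confusion-matrix counts."""
    sn = tp / (tp + fn)   # fraction of real pre-miRNAs recovered
    sp = tn / (tn + fp)   # fraction of pseudo hairpins rejected
    acc = (tp + tn) / (tp + fn + tn + fp)
    return sn, sp, acc
```

A call such as `triplet_features("GGGAAACCC", "(((...)))")` returns a list of 32 frequencies in the order given by `FEATURES`; a reduced feature set would then be a chosen subset of those entries.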