版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering Xiangtan University Hunan 411105 China. Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering Xiangtan University Hunan 411105 China. yuzuguo@***. School of Electrical Engineering and Computer Science Queensland University of Technology GPO Box 2434 Brisbane Q4001 Australia. yuzuguo@***. Advanced Analytics Institute Faculty of Engineering & IT University of Technology Sydney P.O Box 123 Broadway NSW 2007 Australia. School of Mathematical Sciences Queensland University of Technology GPO Box 2434 Brisbane Q4001 Australia.
出 版 物:《BMC bioinformatics》
年 卷 期:2018年第19卷第SUPPL 19期
页 面:521页
主 题:Hibert-Huang transform Network PSI-BLAST profiles Pre-microRNA SVM mRMR
摘 要:BACKGROUND:Distinction between pre-microRNAs (precursor microRNAs) and length-similar pseudo pre-microRNAs can reveal more about the regulatory mechanism of RNA biological processes. Machine learning techniques have been widely applied to deal with this challenging problem. However, most of them mainly focus on secondary structure information of pre-microRNAs, while ignoring sequence-order information and sequence evolution information. RESULTS:We use new features for the machine learning algorithms to improve the classification performance by characterizing both sequence order evolution information and secondary structure graphs. We developed three steps to extract these features of pre-microRNAs. We first extract features from PSI-BLAST profiles and Hilbert-Huang transforms, which contain rich sequence evolution information and sequence-order information respectively. We then obtain properties of small molecular networks of pre-microRNAs, which contain refined secondary structure information. These structural features are carefully generated so that they can depict both global and local characteristics of pre-microRNAs. In total, our feature space covers 591 features. The maximum relevance and minimum redundancy (mRMR) feature selection method is adopted before support vector machine (SVM) is applied as our classifier. The constructed classification model is named MicroRNA -NHPred. The performance of MicroRNA -NHPred is high and stable, which is better than that of those state-of-the-art methods, achieving an accuracy of up to 94.83% on same benchmark datasets. CONCLUSIONS:The high prediction accuracy achieved by our proposed method is attributed to the design of a comprehensive feature set on the sequences and secondary structures, which are capable of characterizing the sequence evolution information and sequence-order information, and global and local information of pre-microRNAs secondary structures. MicroRNA -NHPred is a valuable method for pre-microRNAs iden