Author affiliations: Shanghai East Hosp, Translat Med Ctr Stem Cell Therapy, Shanghai 200136, Peoples R China; Tongji Univ, Sch Med, Shanghai 200120, Peoples R China; Tongji Univ, Dept Comp Sci & Technol, Shanghai 201804, Peoples R China; Fudan Univ, Shanghai Key Lab Intelligent Informat Proc, Shanghai 200433, Peoples R China; Fudan Univ, Sch Comp Sci, Shanghai 200433, Peoples R China
Publication: IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS (IEEE/ACM Trans. Comput. Biol. Bioinf.)
Year, volume, issue: 2022, Vol. 19, No. 6
Pages: 3425-3434
Core indexing:
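Subject classification: 0710 [Science - Biology]; 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0714 [Science - Statistics (degrees in science or economics may be conferred)]; 0701 [Science - Mathematics]; 0812 [Engineering - Computer Science and Technology (degrees in engineering or science may be conferred)]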
Funding: National Natural Science Foundation of China (NSFC) [61972100, 61772367]; National Key Research and Development Program of China [2016YFC0901704]
Subject terms: Sparse matrices; Data models; Clustering methods; Computational modeling; Clustering algorithms; Matrix decomposition; Optimization; Single cell RNA-seq data; unsupervised sparse representation; spectral clustering; outlier detection
Abstract: Clustering analysis has been widely used in analyzing single-cell RNA-sequencing (scRNA-seq) data to study various biological problems at the cellular level. Although a number of scRNA-seq data clustering methods have been developed, most of them evaluate the similarity of pairs of cells while ignoring the global relationships among cells, and so sometimes fail to capture the latent structure of the cells. In this paper, we propose a new clustering method, SPARC, for scRNA-seq data. The most important feature of SPARC is a novel similarity metric that uses the sparse representation coefficients of each cell in terms of the other cells to measure the relationships among cells. In addition, we develop an outlier detection method to help parameter selection in SPARC. We compare SPARC with nine existing scRNA-seq data clustering methods on twelve real datasets. Experimental results show that SPARC achieves state-of-the-art performance. By further analyzing the cell similarity data derived from sparse representations, we find that SPARC is much more effective in mining high-quality clusters of scRNA-seq data than two traditional similarity metrics. In conclusion, this study provides a new way to effectively cluster scRNA-seq data and achieves more accurate clustering results than state-of-the-art methods.
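Note: The abstract does not state SPARC's exact formulation, so the following is only a minimal sketch of the general idea it describes: each cell is expressed as a sparse combination of the other cells, and the coefficients are used as a similarity measure. The L1-penalized least-squares objective, the coordinate-descent solver, the symmetrization |C| + |C|^T, the toy expression values, and all function and parameter names (`sparse_code`, `sparse_similarity`, `lam`, `n_iter`) are assumptions made for illustration, not details taken from the paper. The code is kept dependency-free so it can be run as is.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sparse_code(target, dictionary, lam=0.1, n_iter=200):
    """Coordinate descent for min_c 0.5*||target - sum_j c_j*d_j||^2 + lam*||c||_1."""
    coeffs = [0.0] * len(dictionary)
    residual = list(target)  # target minus the current reconstruction
    norms = [dot(atom, atom) for atom in dictionary]
    for _ in range(n_iter):
        for j, atom in enumerate(dictionary):
            if norms[j] == 0.0:
                continue  # an all-zero cell cannot explain anything
            # Correlation of atom j with the residual, excluding its own contribution.
            rho = dot(atom, residual) + coeffs[j] * norms[j]
            new_c = soft_threshold(rho, lam) / norms[j]
            delta = new_c - coeffs[j]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                coeffs[j] = new_c
    return coeffs


def sparse_similarity(cells, lam=0.1, n_iter=200):
    """Code every cell against all other cells, then symmetrize the coefficients."""
    n = len(cells)
    codes = [[0.0] * n for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]  # a cell never represents itself
        c = sparse_code(cells[i], [cells[j] for j in others], lam, n_iter)
        for j, value in zip(others, c):
            codes[i][j] = value
    return [[abs(codes[i][j]) + abs(codes[j][i]) for j in range(n)] for i in range(n)]


# Hypothetical toy data: rows are cells, columns are genes.
# Cells 0-1 mainly express gene 0; cells 2-3 mainly express gene 1.
cells = [
    [5.0, 0.1, 0.0],
    [4.0, 0.0, 0.2],
    [0.1, 6.0, 0.0],
    [0.0, 5.0, 0.3],
]
similarity = sparse_similarity(cells)
for row in similarity:
    print([round(value, 3) for value in row])
```

Because every cell is coded against all other cells at once, the coefficients reflect relationships across the whole population and not only a single pair of cells. A symmetric, non-negative matrix like `similarity` could then serve as a precomputed affinity for a spectral clustering routine (for example, scikit-learn's `SpectralClustering` with `affinity="precomputed"`); that pairing is likewise an assumption here.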