Symmetric nonnegative matrix factorization (SNMF) has been widely employed in many application areas. Symmetric nonnegative positive definite matrix factorization (SNPDMF) is a subproblem of SNMF. The authors propose a Lanczos-based initialization method for SNPDMF, which can be combined with existing SNPDMF algorithms to achieve higher efficiency. Experiments show that an SNPDMF algorithm combined with the proposed initialization method converges to a better solution.
With the rapid development of collection techniques, it is easy to gather data from different domains, such as images, videos, and documents. How to group these heterogeneous data becomes a res...
Nonnegative matrix factorization (NMF) provides a lower-rank approximation of a matrix as a product of two nonnegative factors. NMF has been shown to produce clustering results that are often superior to those of other methods such as K-means. In this paper, we provide further interpretation of NMF as a clustering method and study an extended formulation for graph clustering called symmetric NMF (SymNMF). In contrast to NMF, which takes a data matrix as input, SymNMF takes a nonnegative similarity matrix as input and computes a symmetric nonnegative lower-rank approximation. We show that SymNMF is related to spectral clustering, justify SymNMF as a general graph clustering method, and discuss the strengths and shortcomings of SymNMF and spectral clustering. We propose two optimization algorithms for SymNMF and discuss their convergence properties and computational efficiency. Our experiments on document clustering, image clustering, and image segmentation support SymNMF as a graph clustering method that captures latent linear and nonlinear relationships in the data.
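To make the SymNMF formulation concrete, the following is a minimal sketch that approximates a nonnegative similarity matrix A by H Hᵀ with H ≥ 0. It uses plain projected gradient descent, which is an assumption chosen for brevity and is not one of the two optimization algorithms the abstract refers to. The function names (`symnmf_pgd`, `objective`, `cluster_labels`), the step size, the iteration count, and the toy matrix are all illustrative.

```python
import random


def symnmf_pgd(A, k, iters=5000, step=0.01, seed=0):
    """Approximate symmetric A by H H^T with H >= 0 (projected gradient descent)."""
    n = len(A)
    rng = random.Random(seed)
    # Random nonnegative starting point (an illustrative choice).
    H = [[rng.random() for _ in range(k)] for _ in range(n)]
    for _ in range(iters):
        # Residual R = H H^T - A, an n x n matrix.
        R = [[sum(H[i][c] * H[j][c] for c in range(k)) - A[i][j]
              for j in range(n)] for i in range(n)]
        # For symmetric A, the gradient of ||A - H H^T||_F^2 is 4 R H.
        G = [[4.0 * sum(R[i][j] * H[j][c] for j in range(n))
              for c in range(k)] for i in range(n)]
        # Gradient step, then projection onto the nonnegative orthant.
        H = [[max(0.0, H[i][c] - step * G[i][c]) for c in range(k)]
             for i in range(n)]
    return H


def objective(A, H):
    """Squared Frobenius error ||A - H H^T||_F^2."""
    n, k = len(A), len(H[0])
    return sum((A[i][j] - sum(H[i][c] * H[j][c] for c in range(k))) ** 2
               for i in range(n) for j in range(n))


def cluster_labels(H):
    """Assign each item to the column of H holding its largest entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in H]


# Toy similarity matrix with two obvious groups: items {0, 1} and {2, 3}.
A = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]
H = symnmf_pgd(A, k=2)
labels = cluster_labels(H)
```

Reading the cluster of each item from the largest entry in its row of H is a common convention for NMF-style clustering. Which column index ends up representing which group depends on the random start.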
ISBN: (print) 9783319052694; 9783319052687
Conventional computational approaches to protein function prediction fundamentally predict one function at a time. As a result, the functions are treated as separate target classes. However, biological processes are highly correlated, so the functions assigned to proteins are not independent. It would therefore be beneficial to make use of function category correlations when predicting protein function. We propose a novel Maximization of Data-Knowledge Consistency (MDKC) approach to exploit function category correlations for protein function prediction. Our approach relies on the assumption that two proteins are likely to have a large overlap in their annotated functions if they are highly similar according to certain experimental data. We first establish a new pairwise protein similarity using protein annotations from a knowledge perspective. Then, by maximizing the consistency between the knowledge similarity derived from annotations and the data similarity derived from biological experiments, putative functions are assigned to unannotated proteins. Most importantly, function category correlations are incorporated through the knowledge similarity. Comprehensive experimental evaluations on Saccharomyces cerevisiae data demonstrate promising results that validate the performance of our methods.
Web link analysis methods such as PageRank, HITS, and SALSA have focused on obtaining the global popularity or authority of the set of Web pages in question. Although global popularity is useful for general queries, we find that it is less useful for queries about which the global population has less knowledge. By examining the many different communities that appear within a Web page graph, we are able to compute popularity or authority from the perspective of a specific community. Multiresolution popularity lists allow us to observe the popularity of Web pages with respect to communities at different resolutions within the Web. Multiresolution popularity lists have been shown to have high potential when compared against PageRank. In this paper, we generalize the multiresolution popularity analysis to use any form of Web page link relations. We provide results for both the PageRank relations and the In-degree relations. By utilizing the multiresolution popularity lists, we achieve a 13 percent and 25 percent improvement in mean average precision over In-degree and PageRank, respectively.
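The contrast between global and community-specific popularity can be illustrated with a small sketch. It computes In-degree and PageRank scores first on a whole toy graph and then on the subgraph induced by one community. This is only an illustration of the idea under stated assumptions: the abstract does not describe how multiresolution popularity lists are built, so restricting to an induced subgraph is an assumption, and the helper names, the toy graph, and the community membership are hypothetical.

```python
def induced_edges(edges, nodes):
    """Keep only the links whose source and target both lie in `nodes`."""
    keep = set(nodes)
    return [(s, d) for s, d in edges if s in keep and d in keep]


def in_degree(edges, nodes):
    """Count incoming links for each page."""
    score = {v: 0 for v in nodes}
    for _, dst in induced_edges(edges, nodes):
        score[dst] += 1
    return score


def pagerank(edges, nodes, damping=0.85, iters=100):
    """Power-iteration PageRank; dangling pages spread their rank uniformly."""
    nodes = list(nodes)
    n = len(nodes)
    out = {v: [] for v in nodes}
    for src, dst in induced_edges(edges, nodes):
        out[src].append(dst)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for dst in out[v]:
                new[dst] += damping * rank[v] / len(out[v])
        rank = new
    return rank


# Toy graph: page "a" is popular globally, page "e" only within {d, e, f}.
edges = [("b", "a"), ("c", "a"), ("d", "a"), ("e", "a"), ("f", "a"),
         ("a", "b"), ("d", "e"), ("f", "e")]
pages = ["a", "b", "c", "d", "e", "f"]
community = ["d", "e", "f"]  # hypothetical community membership

global_pr = pagerank(edges, pages)
community_pr = pagerank(edges, community)
global_top = max(global_pr, key=global_pr.get)
community_top = max(community_pr, key=community_pr.get)
community_in = in_degree(edges, community)
```

In this toy graph the globally top-ranked page and the top-ranked page within the community differ, which is the kind of distinction a community-level popularity list is meant to surface.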