With the success of graph neural networks (GNNs) on network data, several GNN-based representation learning methods for networks have emerged recently. The variational graph autoencoder (VGAE) is a basic GNN framework for network representation. Its purpose is to preserve both the topology and the node attribute information of the network when learning node representations, but it only reconstructs the network topology and does not consider the reconstruction of node features. As a result, node representations cannot preserve node feature information well, which impairs the ability of the VGAE method to learn higher-quality representations. To solve this problem, we propose a new network representation method that improves VGAE so that it retains both node features and network structure information. The method uses adversarial mutual information learning to maximize the mutual information (MI) between node features and node representations during the encoding process of the variational autoencoder, which forces the variational encoder to produce representations containing the most informative node features. The method consists of three parts: a variational graph autoencoder, comprising a variational encoder (the MI generator, G) and a decoder; a positive MI sample module (the MI-maximizing module); and an MI discriminator (D). Furthermore, we explain why maximizing MI between node features and node representations can reconstruct node attributes. Finally, we conduct experiments on seven representative public datasets for node classification, node clustering, and graph visualization tasks. Experimental results demonstrate that the proposed algorithm significantly outperforms current popular network representation algorithms on these tasks. The best improvement over the VGAE method is 17.13%.
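The observation that a standard VGAE objective reconstructs only topology can be illustrated with a minimal sketch of its loss. The example below assumes the usual formulation (inner-product decoder, standard normal prior) and is not code from the paper: the encoder is omitted, and `mu`, `log_var`, and all function names are hypothetical placeholders. Node features would enter only through the omitted encoder; neither loss term reconstructs them.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_latent(mu, log_var, rng):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, 1).
    return [
        [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(m_row, lv_row)]
        for m_row, lv_row in zip(mu, log_var)
    ]


def reconstruction_loss(z, adj):
    # Inner-product decoder: p(A_ij = 1) = sigmoid(z_i . z_j).
    # Unweighted mean binary cross-entropy over all node pairs (an assumption;
    # practical implementations often reweight the rarer positive edges).
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = sigmoid(sum(a * b for a, b in zip(z[i], z[j])))
            p = min(max(p, 1e-12), 1.0 - 1e-12)
            total -= adj[i][j] * math.log(p) + (1 - adj[i][j]) * math.log(1.0 - p)
    return total / (n * n)


def kl_divergence(mu, log_var):
    # KL( N(mu, sigma^2) || N(0, I) ), averaged over nodes.
    n = len(mu)
    total = 0.0
    for m_row, lv_row in zip(mu, log_var):
        total += -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(m_row, lv_row))
    return total / n


# Toy graph with 3 nodes and a 2-dimensional latent space.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
mu = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2]]          # hypothetical encoder output
log_var = [[-1.0, -1.0], [-1.0, -1.0], [-1.0, -1.0]]  # hypothetical encoder output

z = sample_latent(mu, log_var, random.Random(0))
loss = reconstruction_loss(z, adj) + kl_divergence(mu, log_var)
print(f"VGAE-style loss on the toy graph: {loss:.4f}")
```

Under these assumptions, a term that ties the latent `z` back to the node features (the paper uses adversarial MI maximization for this) would be added to this objective rather than replacing either term.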
Optimal integration of transcriptomics data and associated spatial information is essential to fully exploiting spatial transcriptomics to dissect tissue heterogeneity and map out inter-cellular communications. We present SEDR, which uses a deep autoencoder coupled with a masked self-supervised learning mechanism to construct a low-dimensional latent representation of gene expression, which is then simultaneously embedded with the corresponding spatial information through a variational graph autoencoder. SEDR achieved higher clustering performance on manually annotated 10x Visium datasets and better scalability on high-resolution spatial transcriptomics datasets than existing methods. Additionally, we show SEDR's ability to impute and denoise gene expression (URL: https://***/JinmiaoChenLab/SEDR/).
ISBN: 9781450393850 (print)
While the variational graph auto-encoder (VGAE) has shown a promising ability to learn representations for documents, most existing VGAE methods do not model a latent topic structure and therefore lack semantic interpretability. Exploring hidden topics within documents and discovering the key words associated with each topic allow us to develop a semantic interpretation of the corpus. Moreover, documents are usually associated with authors. For example, news reports have journalists who specialize in writing about certain types of events, and academic papers have authors with expertise in certain research topics. Modeling authorship information could benefit topic modeling, since documents by the same authors tend to reveal similar semantics. This observation also holds for documents published in the same venues. However, most topic models ignore the auxiliary authorship and publication venue information. Given these two challenges, we propose a Variational Graph Author Topic Model for documents that integrates semantic interpretability with authorship and venue modeling in a unified VGAE framework. For authorship and venue modeling, we construct a hierarchical multilayered document graph with both intra- and cross-layer topic propagation. For semantic interpretability, three word relations (contextual, syntactic, semantic) are modeled and constitute three word sub-layers in the document graph. We further propose three alternatives for variational divergence. Experiments verify the effectiveness of our model on supervised and unsupervised tasks.
Identifying key nodes is an important task in the study of complex networks. While many previous algorithms evaluate node importance from the perspective of topological properties of the network, such as degree or betweenness, they have not considered the low-dimensional features of nodes. In this paper, we propose a general improved approach based on deep learning, in which the node features are learned via a variational graph autoencoder (VGAE). Because it is trained in an unsupervised way, the VGAE does not rely on any external label information; its output is determined by the network structure alone. The extracted node features provide an essential complement to the topological properties of nodes and can be used to improve different algorithms. We employ the VGAE to improve ten distinct algorithms, which are evaluated by two accuracy indicators, using the susceptible-infected-recovered (SIR) model around the epidemic threshold as the benchmark, and by an indicator of the ability to discriminate between nodes of similar importance. Testing the algorithms on an ER random network and eight real networks, we demonstrate that the ten improved algorithms perform better than their original versions in most cases. Our work sheds new light on node importance identification from the latent feature point of view.
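The SIR benchmark mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the evaluation code of the paper: it uses a discrete-time SIR process, and the function names, the adjacency-dictionary format, and the parameters `beta` (per-contact infection probability) and `gamma` (per-step recovery probability) are hypothetical choices. A node's spreading influence is estimated as the average number of nodes ever infected when the epidemic starts from that node alone.

```python
import random


def sir_outbreak_size(adj, seed_node, beta, gamma, rng):
    """Run one discrete-time SIR epidemic from a single seed node.

    adj maps each node to a list of its neighbors. Returns the number of
    nodes that were ever infected (all of them end up recovered).
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive, otherwise the epidemic never ends")
    infected = {seed_node}
    recovered = set()
    while infected:
        newly_infected = set()
        for u in infected:
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and rng.random() < beta:
                    newly_infected.add(v)
        # Nodes infected at the start of this step may recover at its end.
        newly_recovered = {u for u in infected if rng.random() < gamma}
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    return len(recovered)


def spreading_influence(adj, node, beta, gamma, runs=100, seed=0):
    """Average outbreak size over repeated simulations seeded at `node`."""
    rng = random.Random(seed)
    total = sum(sir_outbreak_size(adj, node, beta, gamma, rng) for _ in range(runs))
    return total / runs


# Toy star graph: node 0 is the hub, nodes 1 to 3 are leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
for node in star:
    print(node, spreading_influence(star, node, beta=0.3, gamma=1.0))
```

A ranking produced by any node-importance algorithm can then be compared against the ranking induced by these simulated influence values.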
ISBN: 9781665421744 (print)
The graph neural network, with its powerful learning ability, has become a cutting-edge method for processing ultra-large-scale network data. To improve the representation accuracy of embeddings, the key is to find the intrinsic geometric metric of the complex network. Since most real data form scale-free networks, the embedding accuracy of traditional models is still limited by the dimensionality of Euclidean space and by computational complexity. Hyperbolic embedding, whose metric properties conform to the power-law distribution and tree-like hierarchical structure of complex networks, can therefore effectively approximate the latent low-dimensional manifold of the data distribution. This paper proposes an auto-encoder in hyperbolic space (HVGAE) that makes full use of hyperbolic graph convolutional networks (HGCN) and the idea of the variational autoencoder. Under the optimal combination of encoder modules, competitive results have been achieved in different real scenarios.
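To make the hyperbolic metric concrete, the sketch below computes the geodesic distance in the Poincaré ball, one common model of hyperbolic space. The abstract does not state which model HVGAE uses, so the choice of the Poincaré ball and the function name `poincare_distance` are assumptions made for illustration only. The formula is d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))).

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points inside the unit Poincaré ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))


# Two pairs with the same Euclidean separation (0.2) but different positions:
near_origin = poincare_distance([0.0, 0.0], [0.2, 0.0])
near_boundary = poincare_distance([0.75, 0.0], [0.95, 0.0])
print(f"near origin:   {near_origin:.4f}")
print(f"near boundary: {near_boundary:.4f}")
```

Distances grow rapidly toward the boundary of the ball, which is the property that lets tree-like hierarchies be embedded with low distortion in few dimensions.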
Background: Protein-protein interactions (PPIs) are central to many biological processes. Considering that the experimental methods for identifying PPIs are time-consuming and expensive, it is important to develop automated computational methods to better predict PPIs. Various machine learning methods have been proposed, including a sequence-based deep learning technique that has achieved promising results. However, it focuses only on sequence information and ignores the structural information of PPI networks. Structural information about the nodes of a PPI network, such as their degree, position, and neighboring nodes in the graph, has been shown to be informative in PPI prediction. Results: Facing the challenge of representing graph information, we introduce an improved graph representation learning method. Our model can perform PPI prediction based on both sequence information and graph structure. Moreover, our study takes advantage of a representation learning model and employs a graph-based deep learning method for PPI prediction, which shows superiority over existing sequence-based methods. Our method achieves a state-of-the-art accuracy of 99.15% on the Human Protein Reference Database (HPRD) dataset and also obtains the best results on the Database of Interacting Proteins (DIP) Human, Drosophila, Escherichia coli (E. coli), and Caenorhabditis elegans (C. elegans) datasets. Conclusion: Here, we introduce the signed variational graph auto-encoder (S-VGAE), an improved graph representation learning method, to automatically learn to encode graph structure into low-dimensional embeddings. Experimental results demonstrate that our method outperforms existing sequence-based methods on several datasets. We also demonstrate the robustness of our model on very sparse networks and its generalization to a new dataset that consists of four datasets: HPRD, ***, ***, and Drosophila.