ISBN: 9781450393850 (print)
While the variational graph auto-encoder (VGAE) has shown a promising ability to learn representations for documents, most existing VGAE methods do not model a latent topic structure and therefore lack semantic interpretability. Exploring hidden topics within documents and discovering the key words associated with each topic allows us to develop a semantic interpretation of the corpus. Moreover, documents are usually associated with authors. For example, news reports are written by journalists who specialize in certain types of events, and academic papers have authors with expertise in certain research topics. Modeling authorship information could benefit topic modeling, since documents by the same authors tend to reveal similar semantics. This observation also holds for documents published in the same venues. However, most topic models ignore auxiliary authorship and publication venue information. Given these two challenges, we propose a Variational Graph Author Topic Model for documents that integrates semantic interpretability with authorship and venue modeling in a unified VGAE framework. For authorship and venue modeling, we construct a hierarchical multilayered document graph with both intra- and cross-layer topic propagation. For semantic interpretability, three word relations (contextual, syntactic, semantic) are modeled and constitute three word sub-layers in the document graph. We further propose three alternatives for the variational divergence. Experiments verify the effectiveness of our model on supervised and unsupervised tasks.
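For orientation, the divergence term that a standard VGAE regularizes can be sketched as follows. This is the usual Gaussian KL term only, not one of the three alternatives the abstract mentions (those are not specified here); the function names `kl_to_standard_normal` and `reparameterize` are illustrative assumptions.

```python
import math
import random

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one latent vector.

    Assumption: a diagonal Gaussian posterior and a standard normal prior,
    as in the common VGAE setup. The abstract's own divergences may differ.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))

def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [1.0, 0.0], [0.0, 0.0]
    print(kl_to_standard_normal(mu, log_var))  # KL penalty for this posterior
    print(reparameterize(mu, log_var, rng))    # one sampled latent vector
```

The KL term is zero exactly when the posterior equals the prior, so it acts as a penalty on how far each document's latent code drifts from the prior.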
Identifying key nodes is an important task in complex network analysis. While many previous algorithms evaluate node importance from the perspective of topological properties of the network, such as degree or betweenness, they have not considered the low-dimensional features of nodes. In this paper, we propose a general improved approach based on deep learning, in which the node features are learned via a variational graph auto-encoder (VGAE). Because it is trained in an unsupervised way, the VGAE does not rely on any external label information; the learned features are determined by the network structure alone. The extracted node features provide an essential complement to the topological properties of nodes and can be generalized to improve different algorithms. We employ the VGAE to improve ten distinct algorithms, which are evaluated by two accuracy indicators that use the susceptible-infected-recovered (SIR) model around the epidemic threshold as the benchmark, and by an indicator of how well the algorithms discriminate between nodes of similar importance. Testing the algorithms on an ER random network and eight real networks, we demonstrate that the ten improved algorithms perform better than their original versions in most cases. Our work sheds new light on node importance identification from the latent feature point of view.
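The SIR benchmark mentioned above can be illustrated with a minimal sketch: a node's spreading influence is estimated as the average outbreak size over repeated simulations seeded at that node. The discrete-time update, the recovery probability `gamma`, and the mean-field threshold estimate <k>/(<k^2> - <k>) are assumptions made for illustration; the abstract does not state the exact simulation settings.

```python
import random

def sir_outbreak_size(adj, seed_node, beta, rng, gamma=1.0):
    """Run one discrete-time SIR cascade; return how many nodes were ever infected.

    adj maps each node to a list of neighbours. gamma must be > 0 so that
    every infected node eventually recovers and the loop terminates.
    """
    infected, recovered = {seed_node}, set()
    while infected:
        newly = set()
        for u in infected:
            for v in adj[u]:
                susceptible = v not in infected and v not in recovered
                if susceptible and v not in newly and rng.random() < beta:
                    newly.add(v)
        still_infected = set()
        for u in infected:
            if rng.random() < gamma:
                recovered.add(u)
            else:
                still_infected.add(u)
        infected = still_infected | newly
    return len(recovered)

def spreading_influence(adj, node, beta, runs=100, seed=0):
    """Average outbreak size over `runs` simulations seeded at `node`."""
    rng = random.Random(seed)
    return sum(sir_outbreak_size(adj, node, beta, rng) for _ in range(runs)) / runs

def epidemic_threshold(adj):
    """Mean-field estimate <k> / (<k^2> - <k>) (an assumed, commonly used form)."""
    degrees = [len(neighbours) for neighbours in adj.values()]
    k1 = sum(degrees) / len(degrees)
    k2 = sum(d * d for d in degrees) / len(degrees)
    return k1 / (k2 - k1)

if __name__ == "__main__":
    # Hypothetical toy graph: a path 0-1-2 plus an isolated node 3.
    adj = {0: [1], 1: [0, 2], 2: [1], 3: []}
    print(epidemic_threshold(adj))
    print({n: spreading_influence(adj, n, beta=0.5) for n in adj})
```

A ranking produced by any node-importance algorithm can then be compared against the ranking induced by these simulated influence scores.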
Embedding-based graph clustering aims to divide nodes with high similarity into several mutually disjoint groups, but embedding as much of the graph structure and node attributes as possible into a low-dimensional feature space is not a trivial task. Furthermore, most current advanced methods for graph node clustering treat graph embedding and the clustering algorithm as separate steps and ignore the potential relationship between them. Therefore, we propose an innovative end-to-end graph clustering framework with a joint strategy to handle this complex problem in a non-Euclidean space. To learn the graph embedding, we propose a new variational graph auto-encoder algorithm based on the Graph Convolutional Network (GCN), which takes into account the boosting influence that a joint generative model of graph structure and node attributes has on the embedding output. On the basis of the embedding representation, we implement a self-training mechanism by constructing an auxiliary distribution that further enhances the prediction of node categories, thereby realizing unsupervised clustering. In addition, the loss contribution of each cluster is normalized to prevent large clusters from distorting the embedding space. Extensive experiments on real-world graph datasets validate our design and demonstrate that our algorithm is highly competitive with state-of-the-art methods in graph clustering. (c) 2021 Elsevier Ltd. All rights reserved.
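One common way to realize self-training with an auxiliary distribution is the DEC-style scheme sketched below: soft assignments `q` are computed with a Student's t kernel, and the auxiliary target `p` squares them and divides by the soft cluster frequency, which is one way to normalize each cluster's contribution. The abstract does not confirm that this exact formulation is the one used, so treat the function names and formulas as assumptions.

```python
def soft_assignments(embeddings, centers):
    """q[i][j]: soft probability that node i belongs to cluster j (Student's t kernel)."""
    q = []
    for z in embeddings:
        kernel = [1.0 / (1.0 + sum((a - b) ** 2 for a, b in zip(z, c)))
                  for c in centers]
        total = sum(kernel)
        q.append([k / total for k in kernel])
    return q

def target_distribution(q):
    """Auxiliary distribution p: sharpen q and normalize by soft cluster size."""
    n_clusters = len(q[0])
    # Soft cluster frequency; dividing by it keeps large clusters from dominating.
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p

if __name__ == "__main__":
    # Hypothetical 2-D embeddings and two cluster centers.
    embeddings = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0]]
    centers = [[0.0, 0.0], [1.0, 1.0]]
    q = soft_assignments(embeddings, centers)
    p = target_distribution(q)
    for qi, pi in zip(q, p):
        print([round(x, 3) for x in qi], "->", [round(x, 3) for x in pi])
```

Training would then minimize a divergence such as KL(p || q) jointly with the auto-encoder loss, so that confident assignments pull the embedding toward cleaner clusters.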
ISBN: 9781665421744 (print)
The graph neural network, with its powerful learning ability, has become a cutting-edge method for processing ultra-large-scale network data. To improve the representation accuracy of the embedding, the key is to find the intrinsic geometric metric of the complex network. Since most real data form scale-free networks, the embedding accuracy of traditional models is still limited by the dimensionality of the Euclidean space and by computational complexity. Hyperbolic embedding, whose metric properties conform to the power-law distribution and tree-like hierarchical structure of complex networks, can therefore effectively approximate the latent low-dimensional manifold of the data distribution. This paper proposes a variational auto-encoder in hyperbolic space (HVGAE), making full use of hyperbolic graph convolution (HGCN) and the idea of the variational auto-encoder. Under the optimal combination of encoder modules, competitive results have been achieved in different real-world scenarios.
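To see why hyperbolic space suits tree-like data, consider the distance function of the Poincaré ball, sketched below. The abstract does not say which model of hyperbolic space HVGAE uses (the hyperboloid model is another common choice), so the Poincaré ball here is an assumption chosen for simplicity.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points strictly inside the unit Poincare ball.

    d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    """
    sq_norm_u = sum(a * a for a in u)
    sq_norm_v = sum(b * b for b in v)
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v)))

if __name__ == "__main__":
    origin = [0.0, 0.0]
    # Equal Euclidean steps toward the boundary cost more and more hyperbolic distance.
    for r in (0.5, 0.9, 0.99):
        print(r, poincare_distance(origin, [r, 0.0]))
```

Distances grow without bound near the boundary of the ball, which leaves room for the exponentially growing number of nodes at each depth of a hierarchy.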
Background: Protein-protein interactions (PPIs) are central to many biological processes. Because experimental methods for identifying PPIs are time-consuming and expensive, it is important to develop automated computational methods to better predict PPIs. Various machine learning methods have been proposed, including a sequence-based deep learning technique that has achieved promising results. However, it focuses only on sequence information and ignores the structural information of PPI networks. Structural information about the proteins in a PPI network, such as their degree, position, and neighboring nodes in the graph, has been shown to be informative for PPI prediction. Results: To address the challenge of representing graph information, we introduce an improved graph representation learning method. Our model predicts PPIs based on both sequence information and graph structure. Moreover, our study takes advantage of a representation learning model and employs a graph-based deep learning method for PPI prediction, which shows superiority over existing sequence-based methods. Our method achieves state-of-the-art accuracy of 99.15% on the Human Protein Reference Database (HPRD) dataset and also obtains the best results on the Database of Interacting Proteins (DIP) Human, Drosophila, Escherichia coli (E. coli), and Caenorhabditis elegans (C. elegans) datasets. Conclusion: We introduce the signed variational graph auto-encoder (S-VGAE), an improved graph representation learning method that automatically learns to encode graph structure into low-dimensional embeddings. Experimental results demonstrate that our method outperforms existing sequence-based methods on several datasets. We also demonstrate the robustness of our model on very sparse networks and its generalization to a new dataset that consists of four datasets: HPRD, ***, ***, and Drosophila.