Search Results - Inner Mongolia University Library

I/O-Efficient Algorithms for Degeneracy Computation on Massive Networks

IEEE Transactions on Knowledge and Data Engineering, 2022, Vol. 34, No. 7, pp. 3335-3348

Authors: Li, Ronghua; Song, Qiushuo; Xiao, Xiaokui; Qin, Lu; Wang, Guoren; Yu, Jeffrey; Mao, Rui. Affiliations: Beijing Inst Technol, Beijing 100811, Peoples R China; Natl Engn Lab Big Data Syst Comp Technol, Beijing, Peoples R China; Shenzhen Univ, Shenzhen Inst Comp Sci, Guangdong Prov Key Lab Popular High Performance C, Shenzhen 518060, Guangdong, Peoples R China; Natl Univ Singapore, Singapore 119077, Singapore; Univ Technol Sydney, NSW 2007, Australia; Chinese Univ Hong Kong, Hong Kong, Peoples R China

Degeneracy is an important concept for measuring the sparsity of a graph and has been widely used in many network analysis applications. Many network analysis algorithms, such as clique enumeration and truss decomposition, perform very well on graphs with small degeneracy. In this paper, we propose an I/O-efficient algorithm to compute the degeneracy of a massive graph that cannot be fully kept in main memory. The proposed algorithm uses only O(n) memory, where n denotes the number of nodes of the graph. We also develop an I/O-efficient algorithm to incrementally maintain the degeneracy on dynamic graphs. Extensive experiments show that our algorithms significantly outperform the state-of-the-art degeneracy computation algorithms in terms of both running time and I/O costs. The results also demonstrate the high scalability of the proposed algorithms. For example, on a real-world web graph with 930 million nodes and 13.3 billion edges, the proposed algorithm takes only 633 seconds and uses less than 4.5 GB of memory to compute the degeneracy.
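For readers unfamiliar with the quantity being computed: the degeneracy of a graph is the largest k for which the graph has a non-empty k-core, and it can be found by repeatedly removing a vertex of minimum degree. The sketch below is the classic in-memory peeling procedure, not the O(n)-memory I/O-efficient algorithm proposed in the paper; it keeps the whole adjacency structure in memory. The function name `degeneracy` and the edge-list input format are assumptions made for illustration.

```python
from collections import defaultdict


def degeneracy(edges):
    """Return the degeneracy of an undirected simple graph given as an edge list.

    In-memory baseline (assumption: the whole graph fits in RAM), shown only
    to illustrate the definition; it is not the paper's I/O-efficient method.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    if not adj:
        return 0

    deg = {v: len(ns) for v, ns in adj.items()}
    # buckets[d] holds the remaining vertices whose current degree is d.
    buckets = [set() for _ in range(max(deg.values()) + 1)]
    for v, d in deg.items():
        buckets[d].add(v)

    removed = set()
    k = 0
    d = 0
    for _ in range(len(deg)):
        # Removing a minimum-degree vertex lowers the minimum by at most one.
        d = max(d - 1, 0)
        while not buckets[d]:
            d += 1
        v = buckets[d].pop()
        k = max(k, d)  # degeneracy = largest minimum degree seen while peeling
        removed.add(v)
        for w in adj[v]:
            if w not in removed:
                buckets[deg[w]].discard(w)
                deg[w] -= 1
                buckets[deg[w]].add(w)
    return k
```

A triangle has degeneracy 2 and a path has degeneracy 1; attaching a pendant vertex to a 4-clique leaves the degeneracy at 3, because the pendant vertex is peeled first.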

Keywords: Degeneracy; I/O-efficient algorithm; k-core; massive graphs

I/O-Efficient Butterfly Counting at Scale

Proceedings of the ACM on Management of Data, 2023, Vol. 1, No. 1, pp. 1-27

Authors: Zhibin Wang; Longbin Lai; Yixue Liu; Bing Shui; Chen Tian; Sheng Zhong. Affiliations: Nanjing University, Nanjing, China; Alibaba Group, Hangzhou, China

Butterfly (a cyclic graph motif) counting is a fundamental task with many applications in graph analysis; it aims to compute the number of butterflies in a large graph. With the rapid growth of graph data, butterfly counting is increasingly challenging because of its super-linear time complexity and large memory consumption. In this paper, we study I/O-efficient algorithms for butterfly counting on hierarchical memory. Existing algorithms of this kind cannot guarantee I/O optimality. Observing that, in order to count butterflies, it suffices to "witness" a subgraph instead of the whole structure, we propose a new class of algorithms called semi-witnessing algorithms. We prove that a semi-witnessing algorithm is not restricted by the lower bound Ω(|E|^2/MB) of a witnessing algorithm, and give a new bound of Ω(min(|E|^2/MB, |E|/|V| √M B)). We further develop the IOBufs algorithm, which approaches the I/O lower bound, and thus claim its optimality. Finally, we parallelize IOBufs to further improve performance and scalability. Our experiments show that IOBufs significantly outperforms the state-of-the-art algorithms EMRC and BFC-EM. In addition, IOBufs can scale to butterfly counting on the Clueweb graph with 37 billion edges and quintillions (10^18) of butterflies.
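To make the counted object concrete: in a bipartite graph, a butterfly is a 4-cycle, that is, two left vertices and two right vertices that are all mutually connected. A minimal in-memory sketch follows; it counts, for each pair of left vertices, how many right neighbors they share, and adds "shared choose 2" butterflies per pair. This is a textbook wedge-based count, not the IOBufs algorithm from the paper, and it assumes the whole graph fits in memory. The function name `count_butterflies` and the `(left, right)` edge format are assumptions made for illustration.

```python
from collections import defaultdict


def count_butterflies(edges):
    """Count butterflies (2x2 bicliques) in a bipartite graph.

    `edges` is an iterable of (left, right) pairs. In-memory illustration
    only; it does not model the external-memory setting of the paper.
    """
    left_adj = defaultdict(set)
    right_adj = defaultdict(set)
    for left, right in edges:
        left_adj[left].add(right)
        right_adj[right].add(left)

    # Fix an arbitrary order on left vertices so each pair is handled once.
    order = {left: i for i, left in enumerate(left_adj)}

    total = 0
    for u in left_adj:
        shared = defaultdict(int)  # shared[w] = |N(u) & N(w)| for w after u
        for r in left_adj[u]:
            for w in right_adj[r]:
                if order[w] > order[u]:
                    shared[w] += 1
        # Any two common neighbors of (u, w) close one butterfly.
        total += sum(c * (c - 1) // 2 for c in shared.values())
    return total
```

On the complete bipartite graph with three vertices on each side, every pair of left vertices shares three neighbors, so the count is 3 pairs times 3 butterflies each, i.e. 9.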

Keywords: I/O-efficient algorithm; butterfly; graph; parallel algorithm

I/O efficient structural clustering and maintenance of clusters for large-scale graphs

Expert Systems with Applications, 2021, Vol. 168

Authors: Seo, Jung Hyuk; Kim, Myoung Ho. Affiliation: Korea Adv Inst Sci & Technol, Sch Comp, 291 Daehak Ro, Daejeon 34141, South Korea

In recent years, the size of graph data has increased significantly, but most existing graph clustering algorithms do not consider the case where main memory is too small to hold a large amount of graph data. Exploring the entire graph for clustering causes too many random disk accesses to data that are not loaded into memory, resulting in excessive disk I/O and thrashing. To address this problem, we propose an I/O-efficient algorithm for structural clustering of a graph, called pm-SCAN. In the proposed method, if memory is insufficient, the input graph is partitioned into several subgraphs smaller than memory, and clustering is first performed on each subgraph. The clusters from the subgraphs are then merged based on the connectivity between clusters, so that global results can be obtained from the point of view of the original input graph. Not only does pm-SCAN deliver scalable performance even for very large graphs, i.e., under a significant shortage of available memory, but its result is also the same as that of the original structural clustering algorithm SCAN. We also propose a cluster maintenance method for large-scale dynamic graphs that change over time. Instead of reclustering the whole graph, we first identify only the small set of nodes whose structural connectivities are subject to change by a given update operation, and we access only those nodes on disk and update their clusters to reduce maintenance costs. This dynamic graph handling mechanism shows significant performance improvement compared with the existing method and with the baseline that performs clustering from scratch.

Keywords: Graph; Structural graph clustering; I/O-efficient algorithm; Cluster maintenance; Dynamic graph

pm-SCAN: an I/O Efficient Structural Clustering Algorithm for Large-scale Graphs

ACM Conference on Information and Knowledge Management (CIKM)

Authors: Seo, Jung Hyuk; Kim, Myoung Ho. Affiliation: Korea Adv Inst Sci & Technol, Sch Comp, Daejeon, South Korea

ISBN: (print) 9781450349185

Most existing algorithms for graph clustering, including SCAN, are not designed to cope with large volumes of data that cannot fit in main memory. When there is not enough memory, those algorithms incur thrashing, i.e., huge I/O costs. We propose an I/O-efficient algorithm for structural clustering, pm-SCAN. The main idea of our scheme is to partition a large graph into several subgraphs that fit into main memory. We first find clusters in each subgraph and then merge them to produce the final clustering of the input graph. Experimental results show that, while other existing algorithms do not scale with the graph size, our proposed method delivers scalable performance under limited memory space.
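The partition-then-merge pattern described in this abstract can be sketched in a few lines. The example below is a deliberately simplified stand-in: it uses connected components in place of SCAN's structural clustering, so it illustrates only the two-phase control flow (cluster each partition using its internal edges, then merge local clusters along cross-partition edges), not pm-SCAN itself. The function name `partition_then_merge` and the `part_of` vertex-to-partition mapping are hypothetical.

```python
from collections import defaultdict


def partition_then_merge(edges, part_of):
    """Two-phase clustering sketch; connected components stand in for SCAN.

    `part_of` maps every vertex to a partition id (an assumed input).
    Returns (local_clusters, merged_clusters) as sorted lists of sorted lists.
    """
    parent = {v: v for v in part_of}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    def groups():
        g = defaultdict(list)
        for v in parent:
            g[find(v)].append(v)
        return sorted(sorted(members) for members in g.values())

    # Split edges into per-partition edges and cross-partition edges.
    by_part = defaultdict(list)
    cross = []
    for u, v in edges:
        if part_of[u] == part_of[v]:
            by_part[part_of[u]].append((u, v))
        else:
            cross.append((u, v))

    # Phase 1: cluster each partition on its own (one partition at a time).
    for part_edges in by_part.values():
        for u, v in part_edges:
            union(u, v)
    local = groups()

    # Phase 2: merge local clusters that are connected across partitions.
    for u, v in cross:
        union(u, v)
    return local, groups()
```

With this stand-in, the merged result equals the connected components of the whole graph, which mirrors the abstract's goal that the merged clustering matches what a single in-memory run would produce.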

Keywords: Graph; structural graph clustering; I/O-efficient algorithm

Finding influential communities in massive networks

VLDB Journal, 2017, Vol. 26, No. 6, pp. 751-776

Authors: Li, Rong-Hua; Qin, Lu; Yu, Jeffrey Xu; Mao, Rui. Affiliations: Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China; Univ Technol, Ctr QCIS, FEIT, Sydney, NSW, Australia; Chinese Univ Hong Kong, Hong Kong, Hong Kong, Peoples R China

Community search is the problem of finding densely connected subgraphs that satisfy the query conditions in a network, and it has attracted much attention in recent years. However, none of the previous studies on community search considers the influence of a community. In this paper, we introduce a novel community model called k-influential community, based on the concept of k-core, to capture the influence of a community. Based on this community model, we propose a linear-time online search algorithm to find the top-r k-influential communities in a network. To further speed up the influential community search algorithm, we devise a linear-space data structure that supports efficient search of the top-r k-influential communities in optimal time. We also propose an efficient algorithm to maintain the data structure when the network is frequently updated. Additionally, we propose a novel I/O-efficient algorithm to find the top-r k-influential communities in a disk-resident graph under the assumption of , where and n denote the size of the main memory and the number of nodes, respectively. Finally, we conduct extensive experiments on six real-world massive networks, and the results demonstrate the efficiency and effectiveness of the proposed methods.

Keywords: Influential community; Core decomposition; Tree-shape data structure; Dynamic graph; I/O-efficient algorithm

iTri: index-based triangle listing in massive graphs

Information Sciences, 2016, Vol. 336, pp. 1-20

Authors: Rase, Mostofa Kamal; Han, Yongkoo; Kim, Jinseung; Park, Kisung; Nguyen Anh Tu; Lee, Young-Koo. Affiliation: Kyung Hee Univ, Dept Comp Engn, Yongin 446701, Gyeonggi Do, South Korea

Triangle listing is a basic operator in many graph problems. However, in-memory algorithms do not work well with recently emerged massive graphs such as social networks, because these graphs cannot be accommodated in memory. External memory-based algorithms have therefore been proposed recently, but these approaches still require frequent scans of the whole graph on disk, and every iteration performs a large volume of calculations that involve the whole graph. In this study, we propose a novel index-based method for listing triangles in massive graphs. First, we present new notions for the vertex range index and the potential cone vertex index. Next, we propose an index join-based triangle listing algorithm. Our method accesses the indexed data asynchronously and joins them to list triangles using a multi-threaded parallel processing technique. Based on experiments, we demonstrate that our algorithm outperforms the state-of-the-art solution methods by three to eight times in terms of wall-clock time. (C) 2015 Elsevier Inc. All rights reserved.

Keywords: CPU parallelism; Graph indexing; Graph mining; I/O-efficient algorithm; Triangle listing

PDTL: Parallel and Distributed Triangle Listing for Massive Graphs

44th Annual International Conference on Parallel Processing Workshops (ICPPW)

Authors: Giechaskiel, Ilias; Panagopoulos, George; Yoneki, Eiko. Affiliations: Univ Cambridge, Cambridge, England; Univ Oxford, Oxford, England

ISBN: (print) 9781467375887

This paper presents the first distributed triangle listing algorithm with provable CPU, I/O, memory, and network bounds. Finding all triangles (3-cliques) in a graph has numerous applications for density and connectivity metrics, but the majority of existing algorithms for massive graphs are sequential, while distributed versions of algorithms do not guarantee their CPU, I/O, memory, or network requirements. Our Parallel and Distributed Triangle Listing (PDTL) framework focuses on efficient external-memory access in distributed environments instead of fitting subgraphs into memory. It works by performing efficient orientation and load-balancing steps, and replicating graphs across machines by using an extended version of Hu et al.'s Massive Graph Triangulation algorithm. PDTL suits a variety of computational environments, from single-core machines to high-end clusters, and computes the exact triangle count on graphs of over 6B edges and 1B vertices (e.g., Yahoo graphs), outperforming and using fewer resources than the state-of-the-art systems PowerGraph, OPT, and PATRIC by 2x to 4x. Our approach thus highlights the importance of I/O in a distributed environment.
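The orientation step mentioned in this abstract has a well-known single-machine counterpart that is easy to show. In the sketch below, every undirected edge is directed from its lower-ranked endpoint to its higher-ranked endpoint, where rank is by degree with ties broken by vertex id, and triangles are found by intersecting out-neighbor sets, so each triangle is reported exactly once. This is a generic in-memory illustration, not the PDTL framework: it has no external-memory access, load balancing, or distribution. The function name `list_triangles` is an assumption, and vertex ids are assumed to be mutually comparable.

```python
from collections import defaultdict


def list_triangles(edges):
    """List each triangle of an undirected simple graph exactly once.

    In-memory sketch of degree-based edge orientation; not the distributed,
    external-memory algorithm described in the abstract.
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)

    # Total order on vertices: by degree, ties broken by vertex id.
    ordered = sorted(adj, key=lambda x: (len(adj[x]), x))
    rank = {v: i for i, v in enumerate(ordered)}

    # Orient every edge from lower rank to higher rank.
    out = {u: {v for v in adj[u] if rank[v] > rank[u]} for u in adj}

    triangles = []
    for u in out:
        for v in out[u]:
            # w must follow both u and v in the order, so the triangle is
            # discovered only from its lowest-ranked vertex u.
            for w in out[u] & out[v]:
                triangles.append(tuple(sorted((u, v, w))))
    return sorted(triangles)
```

Orienting toward higher-degree vertices keeps out-neighbor sets small for hub vertices, which is the usual reason an orientation step precedes triangle listing.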

Keywords: Triangle Listing; Triangle Counting; Big Data; Massive Graphs; I/O-efficient algorithm; Distributed algorithm; Parallel algorithm
