The traditional PageRank algorithm cannot efficiently handle the web page scheduling problem for large data. This paper proposes an accelerated algorithm named topK-Rank, which is based on PageRank and runs on the MapReduce platform. With this algorithm, the top k nodes of a given graph can be found efficiently without sacrificing accuracy. It iteratively estimates lower and upper bounds of the PageRank scores and constructs a subgraph in each iteration by pruning unnecessary nodes and edges. Theoretical analysis shows that the method guarantees exact results. Experiments show that it finds the top k nodes much faster than existing approaches.
ISBN (digital): 9781510651890
ISBN (print): 9781510651890; 9781510651883
The rapid growth of information on the Internet has made web search technology the main means by which people obtain information. Measuring the importance of web pages efficiently and quickly has become an important topic, one that influences, to different degrees, many aspects of web processing, such as web crawling and web page grading and ranking. This paper discusses an important algorithm, PageRank, which is based on the Markov chain model, and then discusses its advantages and disadvantages. Finally, in light of the drawbacks, some corrections and improvements are also discussed.
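The Markov chain view mentioned above can be illustrated with a minimal sketch, not taken from the paper: PageRank scores are approximated as the stationary distribution of a random-surfer chain by power iteration. The function name `pagerank`, the damping value 0.85, the tolerance, and the toy graph `toy_web` are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Approximate PageRank by power iteration on the random-surfer chain.

    links maps each page to the list of pages it links to (hypothetical input format).
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}  # start from the uniform distribution
    for _ in range(max_iter):
        # Rank held by pages without out-links is spread uniformly over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                # Each page passes an equal share of its rank to every page it links to.
                new_rank[v] += damping * rank[u] / len(targets)
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:  # stop once the distribution has stabilised
            break
    return rank


# Toy graph (assumed for illustration): C is linked to by A, B and D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
scores = pagerank(toy_web)
```

In this toy graph, page C collects links from three pages and ends up with the highest score, while page D, which nothing links to, keeps only the teleportation share.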
ISBN (digital): 9783031288678
ISBN (print): 9783031288661; 9783031288678
Existing methods of business information mining are inflexible and cannot effectively mine the business information of media platforms. To summarize and manage massive amounts of business information effectively, a business information mining method for social media platforms based on the PageRank algorithm is proposed. Unlike existing methods, it optimizes the business information evaluation algorithm of social media platforms, increases the flexibility of information mining, and realizes business information mining for social media platforms. Experiments show that the proposed business information mining technique for social media platforms can effectively summarize and manage a large amount of business information.
ISBN (print): 9783030017682; 9783030017675
Numerous citizen science platforms aimed at monitoring biodiversity have emerged in recent years. These platforms collect biodiversity data from participants and allow them to increase their scientific knowledge and share it with other participants, experts and scientists. One key aspect of such platforms is quality control of the data, a task usually performed by a limited number of co-opted experts. With the amount of collected data increasing steeply, new experts need to be found. In this paper we propose a new graph-based expert finding approach for the citizen science platform SPIPOLL, which aims to collect data on pollinator diversity across France. We exploit both the quality of users' comments and users' social relations to calculate users' expertise for a specific insect family. Experimental results show that the proposed method performs better than state-of-the-art expert finding algorithms.
ISBN (print): 9789898111852
The amount of global information on the World Wide Web is growing at an incredible rate. Search engines return millions of results, so the rank of pages in the results is very important. One of the basic ranking algorithms is the PageRank algorithm. This paper proposes an enhancement of the PageRank algorithm that speeds up the computation by using the Ant algorithm. On average, this technique yields about 7.5 out of ten pages relevant to the query topic, and the total time is reduced by 19.9%.
Severe natural catastrophes may directly cause large-scale cascading failures in energy systems. This paper presents a comprehensive vulnerability assessment scheme for multi-energy systems that include both electricity and natural gas. Differing from cascading failure theory, this scheme employs the PageRank algorithm to measure the structural importance of the integrated system. A modified weighted PageRank methodology is applied to determine the importance of nodes in both the natural gas and electricity networks. The effectiveness of this scheme is demonstrated and analyzed on an integrated energy system in the case study. The algorithm can help identify weaknesses in the integrated energy system so that purposeful planning and operation strategies can be deployed to increase security. (C) 2019 The Authors. Published by Elsevier Ltd.
ISBN (print): 1424403316
The PageRank algorithm is a well-known algorithm for mining web structure, but it suffers from topic drift. To eliminate the topic drift of the PageRank algorithm, and following an analysis of existing algorithms, a new algorithm called TC-PageRank is put forward. The TC-PageRank algorithm is based on a fictitious file vector and the cosine correlation measure. Experimental results show that the TC-PageRank algorithm eliminates the topic-drift phenomenon effectively and thus improves retrieval quality.
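For reference, the cosine correlation measure named above can be sketched as follows. This shows only the standard cosine similarity between two term vectors; how TC-PageRank builds its fictitious file vector and combines the measure with PageRank is not given in the abstract, so the function name and the example vectors are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention assumed here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical term-frequency vectors for a topic and two pages.
topic = [2.0, 1.0, 0.0]
on_topic_page = [4.0, 2.0, 0.0]   # same direction as the topic vector
off_topic_page = [0.0, 0.0, 3.0]  # shares no terms with the topic

on_topic_score = cosine_similarity(topic, on_topic_page)
off_topic_score = cosine_similarity(topic, off_topic_page)
```

A page whose vector points in the same direction as the topic vector scores close to 1, while a page sharing no terms with the topic scores 0.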
The construction industry faces significant challenges with frequent accidents, largely due to the inefficient use of safety guidelines. These guidelines, which are often text and figure heavy, demand substantial human effort to identify the most relevant items for specific tasks and conditions. Additionally, the guidelines contain both central and peripheral elements, and central items are critical yet difficult to identify without extensive domain knowledge. This study proposes a novel recommendation framework to enhance the usability of these safety guidelines. By leveraging natural language processing (NLP) and knowledge graph (KG) modeling techniques, unstructured safety texts are transformed into a structured, interconnected KG. The PageRank and Louvain clustering algorithms are then employed to rank guidelines by their relevance and importance. A case study on "High-rise Building Construction (General) Safety and Health Guidelines", using 'scaffolding' as the keyword, demonstrates the framework's effectiveness in improving retrieval efficiency and practical application. The analysis highlighted key clusters such as 'fall', 'drop', and 'scaffolding', with critical safety measures identified through their interconnections. This research not only overcomes the fragmentation of safety management documents but also contributes to advancing hazard analysis and risk prevention practices in construction management.
ISBN (print): 9781665414555
The PageRank algorithm is a benchmark for many graph analytics workloads and is the underlying kernel for link prediction and recommendation systems. It is an iterative algorithm that updates the ranks of pages until the values converge. Implementing the PageRank algorithm on a shared-memory architecture while taking advantage of fine-grained parallelism on large-scale graphs is a challenging task. In this paper, we present parallel algorithms for computing PageRank that are suited to shared-memory systems. Initially, we present parallel implementations of PageRank algorithms using barrier and lock variants. Later, we propose new approaches that are lock-free and use barrier-less synchronization to overcome the issues of lock-based methods. A detailed experimental analysis of our approach is carried out using real-world web graphs from SNAP and synthetic graphs from RMAT on an Intel(R) Xeon E5-2660 v4 processor architecture with 56 threads using the POSIX thread library.
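The barrier-synchronized variant can be sketched roughly as below. This is an illustration of the synchronization pattern only, not the authors' implementation: the paper uses the POSIX thread library, whereas this sketch uses Python's `threading.Barrier`, assumes a pull-based update, a fixed iteration count, and a graph in which every page has at least one out-link. The function name, its parameters, and the toy graph are assumptions. CPython threads will not actually speed this loop up.

```python
import threading


def parallel_pagerank(in_links, out_degree, num_threads=2, damping=0.85, iterations=50):
    """Barrier-synchronised, pull-based PageRank over a shared rank array.

    in_links[v] lists the pages linking to v; out_degree[u] is u's out-link count.
    Assumes every page has at least one out-link (no dangling pages).
    """
    n = len(in_links)
    rank = [1.0 / n] * n       # values read during the current iteration
    new_rank = [0.0] * n       # values written during the current iteration
    barrier = threading.Barrier(num_threads)

    def worker(tid):
        my_nodes = range(tid, n, num_threads)  # static round-robin partition
        for _ in range(iterations):
            for v in my_nodes:
                incoming = sum(rank[u] / out_degree[u] for u in in_links[v])
                new_rank[v] = (1.0 - damping) / n + damping * incoming
            barrier.wait()     # every thread has finished reading rank
            for v in my_nodes:
                rank[v] = new_rank[v]
            barrier.wait()     # every thread has published its new values

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return rank


# Toy graph (assumed): edges 0->1, 0->2, 1->2, 2->0, 3->2.
toy_in_links = [[2], [0], [0, 1, 3], []]
toy_out_degree = [2, 1, 1, 1]
ranks = parallel_pagerank(toy_in_links, toy_out_degree)
```

The two barriers per iteration separate the read phase from the write phase, so no thread ever reads a rank that another thread is overwriting. Removing that wait is what lock-free and barrier-less designs aim for.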
ISBN (print): 9798350367782; 9798350367775
This study explores methods to efficiently summarize extensive Arabic texts, addressing the growing need to condense large volumes of content across various fields. Three primary techniques are evaluated: Word Frequency Analysis, K-means Clustering based on Sentence Proximity, and the PageRank algorithm. The research finds the PageRank algorithm to be the most effective, delivering higher compression ratios while maintaining strong recall and precision metrics. In particular, the PageRank method achieved the highest compression ratio among the techniques, 0.562, while maintaining a population standard deviation of 2.0. Evaluation metrics such as population standard deviation, F1 score, and compression ratio support these findings. The study also examines advanced approaches like fuzzy logic-based methods, transformers, and multi-document summarization, aiming to enhance Arabic text summarization.