ISBN: (print) 9781424442614
A web community is a collection of web pages on a common topic, created by individuals or associations that share an interest in that topic. Because the web now exceeds 3 billion pages and is still growing rapidly, identifying web communities has become increasingly difficult. This paper proposes a method for identifying web communities based on asynchronous cellular learning automata (ACLA). The method first uses an asynchronous cellular learning automaton to determine which pages are related and their degree of relevance (the relationship structure of the web pages). This structure is derived from hyperlink information and from users' behaviour in visiting the pages. An algorithm similar to HITS is then applied to the resulting structure to identify the web communities. One advantage of the proposed method is that the communities it finds do not depend on a specific web graph structure. To evaluate the approach, it was implemented and its results were compared with those of two existing methods: HITS and a method based on complete bipartite graphs. Experimental results show the superiority of the proposed method.
A survey of existing algorithms for analysing hyperlinks on the Web, such as PageRank, HITS, and modified variants of these algorithms, is presented. Particular attention is devoted to the quality of search-result ranking and to accelerating it. An "ant algorithm" is proposed as a means of accelerating the computation of the ranks of Web pages.
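For orientation, the baseline PageRank computation that such acceleration work targets can be sketched as a plain power iteration. This is a minimal illustration and not the ant algorithm proposed in the survey; the function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the three-page example graph are all assumptions made for the example.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank (illustrative sketch).

    links[i] is the list of pages that page i links to.
    damping=0.85 is a conventional choice, assumed here.
    """
    n = len(links)
    rank = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iterations):
        # Every page receives the "teleport" share first.
        new_rank = [(1.0 - damping) / n] * n
        for page, outgoing in enumerate(links):
            if outgoing:
                # Split this page's rank evenly among its out-links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank over all pages.
                share = damping * rank[page] / n
                for target in range(n):
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
example_links = [[1, 2], [2], [0]]
example_ranks = pagerank(example_links)
```

Each iteration touches every link once, which is why the cost of ranking grows with the size of the link graph and why faster schemes are of interest.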
Algorithms such as Kleinberg's HITS algorithm, the PageRank algorithm of Brin and Page, and the SALSA algorithm of Lempel and Moran use the link structure of a network of web pages to assign weights to each page in the network. The weights can then be used to rank the pages as authoritative sources. These algorithms share a common underpinning: they find a dominant eigenvector of a nonnegative matrix that describes the link structure of the given network and use the entries of this eigenvector as the page weights. We use this commonality to give a unified treatment, proving the existence of the required eigenvector for the PageRank, HITS, and SALSA algorithms, the uniqueness of the PageRank eigenvector, and the convergence of the algorithms to these eigenvectors. However, we show that the HITS and SALSA eigenvectors need not be unique. We examine how the initialization of the algorithms affects the final weightings produced. We give examples of networks that lead the HITS and SALSA algorithms to return nonunique or nonintuitive rankings. We characterize all such networks in terms of the connectivity of the related HITS authority graph. We propose a modification, Exponentiated Input to HITS, to the adjacency matrix input to the HITS algorithm. We prove that Exponentiated Input to HITS returns a unique ranking, provided that the network is weakly connected. Our examples also show that SALSA can give inconsistent hub and authority weights, due to nonuniqueness. We also mention a small modification to the SALSA initialization which makes the hub and authority weights consistent.
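The eigenvector view described above can be illustrated with a small sketch of the standard HITS iteration. With adjacency matrix A, repeatedly updating and normalizing the authority weights amounts to power iteration on the nonnegative matrix AᵀA, so the weights approach one of its dominant eigenvectors. The function name `hits`, the uniform initialization, the iteration count, and the four-page example network are assumptions for illustration; this is plain HITS, not the Exponentiated Input variant the paper proposes.

```python
from math import sqrt


def hits(adj, iterations=100):
    """Standard HITS iteration (illustrative sketch).

    adj[i][j] == 1 means page i links to page j.
    Returns (hub_weights, authority_weights), each with unit 2-norm.
    """
    n = len(adj)
    hubs = [1.0] * n  # uniform start; the result can depend on this choice
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority of j: sum of hub weights of the pages linking to j.
        auths = [sum(adj[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        norm = sqrt(sum(a * a for a in auths)) or 1.0
        auths = [a / norm for a in auths]
        # Hub weight of i: sum of authority weights of the pages i links to.
        hubs = [sum(adj[i][j] * auths[j] for j in range(n)) for i in range(n)]
        norm = sqrt(sum(h * h for h in hubs)) or 1.0
        hubs = [h / norm for h in hubs]
    return hubs, auths


# Hypothetical network: pages 0 and 1 link to page 2; page 0 also links to page 3.
example_adj = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
example_hubs, example_auths = hits(example_adj)
```

In this example the authority graph is connected, so the ranking does not depend on the starting vector. When that graph has several components with equal dominant eigenvalues, different initializations can converge to different weightings, which is the nonuniqueness the abstract refers to.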
Collaborative filtering recommender systems often suffer from the 'Matchmaker' problem, which stems from the false assumption that users should be judged only by their similarity and that high similarity implies a good adviser. To find good advisers for every user, a matchmaker reliability model based on an algorithm derived from HITS is constructed and applied in the proposed World Wide Web (WWW) collaborative recommendation system. Comparative experimental results show that the approach clearly improves performance.