The HITS algorithm is a popular and effective algorithm for ranking web documents based on the link information among a set of web pages. However, it assigns the same weight to every link, and this assumption results in topic drift. In this paper, we first define the generalized similarity between a query and a page, and the popularity of a web page. We then propose a weighted HITS algorithm that differentiates the importance of links using the query-page similarities and the popularity of web pages. Experimental results indicate that the improved HITS algorithm finds more relevant pages than HITS, improving relevance by 30%-50%. Furthermore, it avoids the problem of topic drift and effectively enhances the quality of web search.
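As a rough illustration of the idea of weighting links, the sketch below runs the usual HITS authority and hub updates over links that carry a weight. It is not the paper's method: the abstract does not give the formula that turns query-page similarity and page popularity into a link weight, so the weights here are simply supplied by the caller, and the function name `weighted_hits` and the `edges` layout are assumptions made for this example.

```python
import math


def weighted_hits(edges, iterations=50):
    """Run HITS on weighted links.

    edges maps (source, target) -> non-negative link weight.
    How the weight is derived (e.g. from query-page similarity and
    popularity) is NOT specified here; it is an input to this sketch.
    """
    nodes = sorted({node for edge in edges for node in edge})
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}

    for _ in range(iterations):
        # Authority update: weighted sum of hub scores of in-linking pages.
        new_auth = {node: 0.0 for node in nodes}
        for (src, dst), weight in edges.items():
            new_auth[dst] += weight * hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {node: v / norm for node, v in new_auth.items()}

        # Hub update: weighted sum of authority scores of linked-to pages.
        new_hub = {node: 0.0 for node in nodes}
        for (src, dst), weight in edges.items():
            new_hub[src] += weight * auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {node: v / norm for node, v in new_hub.items()}

    return auth, hub
```

With all weights equal to 1 this reduces to ordinary HITS; lowering the weight of a link reduces the authority that its target receives through that link.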
The explosive growth of Web service technologies for integrating virtual enterprises has resulted in serious information overload. Users therefore face increasing difficulty in selecting the correct manufacturing services from the vast number provided or recommended by collaborative partners for service-oriented supply chain deployment. In this paper, a novel approach is presented for recommending personalised manufacturing services by combining the Hyperlink-Induced Topic Search (HITS) algorithm and the Bayesian approach. The personalised service recommendation problem is modelled as determining the manufacturing services that are most probably the best selections given the user's preferences for some known manufacturing services. The Bayesian approach decomposes this posterior probability problem into two sub-problems: the prior probability of a manufacturing service hypothesis before any user preferences are considered, and the conditional probability of observing the aggregated preferences for some known manufacturing services as evidence when the user actually wants an unknown, most preferable manufacturing service. Next, the personalised HITS algorithm is adapted to the network of the service-oriented supply chain to rank the authority scores of manufacturing services, which determine the relative probabilities of service execution through personalised trust propagation. The experimental results show that the proposed method produces more accurate manufacturing service recommendations than existing approaches.
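The decomposition described in this abstract can be written out with Bayes' rule. The notation below is an assumption introduced for illustration (the abstract defines no symbols): $s$ is a candidate manufacturing service, $S$ the set of candidates, and $E$ the aggregated user preferences for the known services.

```latex
\[
  P(s \mid E)
  \;=\; \frac{P(E \mid s)\, P(s)}{\sum_{s' \in S} P(E \mid s')\, P(s')}
  \;\propto\; \underbrace{P(E \mid s)}_{\text{conditional}} \;
              \underbrace{P(s)}_{\text{prior}},
  \qquad
  s^{*} \;=\; \arg\max_{s \in S} \, P(E \mid s)\, P(s).
\]
```

Because the denominator is the same for every candidate, ranking services by the product of the conditional and the prior gives the same order as ranking by the posterior.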
Recently, location-based services (LBSs) have become increasingly popular as a way for people to experience new possibilities, for example, personalized point-of-interest (POI) recommendations that leverage the overlap of user trajectories to recommend POIs collaboratively. POI recommendation remains challenging because it suffers, to a much greater extent, from the problems known from conventional recommendation tasks, such as data sparsity and cold start. In the literature, most related works apply collaborative filtering to POI recommendation while overlooking personalized, time-variant human behavioral tendencies. In this article, we put forward a fourth-order tensor factorization-based ranking methodology that recommends locations of interest to users by considering their time-varying behavioral trends while capturing their long-term and short-term preferences simultaneously. We also propose categorizing the locations to alleviate the data sparsity and cold-start issues, so that new POIs that users have not visited can be bubbled up during the category ranking process. The tensor factorization is carefully studied in order to prune the factors that are irrelevant to the ranking results and thereby achieve efficient POI recommendations. The experimental results validate the efficacy of our proposed mechanism, which significantly outperforms the state-of-the-art approaches.
This article applies the HITS algorithm from Web mining to the evaluation of academic literature. Through an empirical analysis of academic articles on the credit risk of domestic banks, it demonstrates the feasibility and comprehensiveness of using the HITS algorithm in literature evaluation and finds that the generality and authority of the corresponding theses leave considerable room for further study and development.
The Hyperlink-Induced Topic Search (HITS) algorithm developed by Jon Kleinberg uses the link structure of pages on the Web to discover and rank web pages relevant to a particular topic. However, it takes account only of the hyperlink structure and completely excludes the contents of web pages, and it ignores the fact that hyperlinks on the Web may differ in importance. In this paper, to overcome topic drift, we proposed a novel page ranking algorithm that combines hyperlinks with triadic closure theory and takes full account of the Vector Space Model (VSM) and the TrustRank algorithm. The method first computed the relevance between any two web pages based on web page topic similarity and common reference degree. Then, using that model as a point of reference, a new adjacency matrix was constructed to iteratively calculate the authority and hub values of web pages. Next, we calculated the trust-degree of each web page in the basic set with the trust-score algorithm. Finally, the score of each web page was computed by linearly merging the authority and the trust-degree. In our experiments, we compared five classic HITS-based algorithms with our proposed page ranking algorithm, the PCThits (Web Page Topic Similarity, Common Reference Degree, Trust-degree) algorithm. The experimental results demonstrated that our proposed algorithm outperforms the four classic improved algorithms and the HITS algorithm.
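The final step, linearly merging the authority and the trust-degree, might look like the following minimal sketch. The abstract does not state the mixing coefficient or whether the two scores are normalized first, so the parameter `alpha`, its default value, and the function name `merge_scores` are hypothetical.

```python
def merge_scores(authority, trust, alpha=0.5):
    """Linearly combine per-page authority and trust-degree scores.

    alpha is a hypothetical mixing weight in [0, 1]; the abstract does not
    give the value used. Only pages present in both inputs are scored.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    pages = authority.keys() & trust.keys()
    return {
        page: alpha * authority[page] + (1.0 - alpha) * trust[page]
        for page in pages
    }
```

Setting `alpha=1.0` ranks by authority alone and `alpha=0.0` by trust-degree alone, which makes the contribution of each component easy to inspect.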
ISBN: 9781479917976 (print)
Web page ranking algorithms are used to score the uniform resource locators (URLs), or simply the online links, of web applications. The corporate world strives to develop web applications in such a way that they appear among the top results in the major search engines and search directories. A number of web page ranking algorithms have been developed with different scientific approaches and assorted mathematical formulations. Google PageRank, Yahoo Web Rank, the HITS algorithm, the Tag Rank algorithm, the Weighted Links Rank algorithm, the Relation Based algorithm, the Time Rank algorithm, and many other approaches are used in industry to achieve accurate and effective scoring. This research paper presents an algorithmic approach based on an enhancement of the HITS algorithm, focusing on the implementation of an augmentation of classical HITS. The results show that the proposed approach provides effective and long-term unbiased scoring of web applications. The work also implements deep link validity, and the results are reported in terms of deep scoring and execution time.
ISBN: 9781479957279 (print)
Opinion leaders are core users in online communities who can guide the direction of public opinion. With the rapid development of microblogs, identifying microblog opinion leaders has become a significant task. In this paper, we propose a hybrid data mining approach based on user features and the interaction network. It consists of three parts: a way to analyze users' authority, activity, and influence; a way to consider the orientation of sentiment in the interaction network; and a combined method based on the HITS algorithm for identifying microblog opinion leaders. Comparative results show that this mechanism provides effective mining of user features and a better recognition rate.
ISBN: 9782951740884 (print)
This paper presents an Ontology Learning From Text (OLFT) method that follows the well-known OLFT layer cake framework. Based on distributional similarity, the proposed method generates multi-level ontologies from comparatively small corpora with the aid of the HITS algorithm. Currently, the method covers term extraction, synonym recognition, concept discovery, and hierarchical concept clustering. Of these, both concept discovery and hierarchical concept clustering are aided by the HITS authority, which is obtained from the HITS algorithm through iterative recommendation. With this method, a set of diachronic ontologies is constructed, one for each year, from fifty years of People's Daily corpora (i.e., from 1947 to 1996). Preliminary experiments show that our algorithm outperforms Google's RNN and a K-means based algorithm in both concept discovery and hierarchical concept clustering.
ISBN: 9783038350125 (print)
As one of the new service models of Web 2.0, the Community Question-Answering system offers users a new way to obtain information. However, with the explosive growth of users and information, it is becoming hard for users to obtain information quickly and accurately. It is therefore important to find experts in Community Question-Answering systems in order to improve the accuracy and efficiency of information retrieval. This paper first analyzes the relationships among users, questions, and answers in a Community Question-Answering system and builds the user graph. It then applies a Web mining technique, the weighted HITS link analysis algorithm, to find experts. Finally, three evaluation indices are used to measure the validity of the expert-finding algorithm. Experimental results show the effectiveness of the weighted HITS algorithm.
ISBN: 9783319089768; 9783319089751 (print)
We propose new metrics for customers' purchasing behaviors on a group-buying coupon website, based on the HITS algorithm and information entropy: a popularity awareness index, a recency index, and purchase diversity. These indices are used to classify customers and predict their future behaviors. This paper includes definitions of the new indices for use on real group-buying websites, where adequate customer characteristics are strongly required and are critical for marketing purposes. We also provide experimental results on a real data set, including a customer segmentation for use in future marketing planning.
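To make the role of information entropy concrete, the sketch below computes the Shannon entropy of a customer's purchases over product categories. The abstract does not define its indices, so treating this quantity as a purchase diversity measure, along with the function name `purchase_diversity` and the category labels, is an assumption for illustration only.

```python
import math
from collections import Counter


def purchase_diversity(categories):
    """Shannon entropy (in bits) of a customer's purchase categories.

    categories is a list with one category label per purchase. This is a
    generic entropy measure, not the paper's exact index definition.
    """
    counts = Counter(categories)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in counts.values()
    )
```

A customer who buys from a single category scores zero, while purchases spread evenly over more categories give a higher value.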