Dependence is a common relationship between objects. Many works have paid attention to dependence, but most of them focus on constructing or exploiting dependence graphs in some specific domain. In this ...
ISBN (print): 9781424452422
Similarity calculation has many applications, such as information retrieval and collaborative filtering. Link-based similarity measures such as SimRank have been shown to be very effective in characterizing object similarities in networks such as the Web by exploiting object-to-object relationships. Unfortunately, computing link-based similarity on a relatively large graph is prohibitively expensive. In this paper, based on the observation that the link-based similarity scores of real-world graphs follow a power-law distribution, we propose a new approximate algorithm, Power-SimRank, that computes link-based similarity efficiently with a guaranteed error bound. We also prove the convergence of the proposed algorithm. Extensive experiments on real-world and synthetic datasets show that the proposed algorithm outperforms SimRank by a factor of four to five in efficiency, while the error introduced by the approximation is small.
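For readers unfamiliar with the baseline, the following is a minimal sketch of the standard iterative SimRank computation that the abstract compares against. It is not Power-SimRank, whose approximation details are not given in the abstract. The function name `simrank`, the `in_neighbors` dictionary layout, and the default decay and iteration values are assumptions made for illustration.

```python
from itertools import product


def simrank(in_neighbors, decay=0.8, iterations=10):
    """Naive iterative SimRank (baseline only, not Power-SimRank).

    in_neighbors maps every node to the list of nodes that link to it.
    Every node that appears as an in-neighbor must also be a key.
    Returns a dict mapping (a, b) pairs to similarity scores.
    """
    nodes = list(in_neighbors)
    # Initial scores: a node is fully similar to itself, 0 to all others.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, nodes)}

    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, nodes):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            ia, ib = in_neighbors[a], in_neighbors[b]
            if not ia or not ib:
                # Nodes without in-neighbors have similarity 0 by definition.
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of in-neighbors, scaled by decay.
            total = sum(sim[(x, y)] for x in ia for y in ib)
            new_sim[(a, b)] = decay * total / (len(ia) * len(ib))
        sim = new_sim
    return sim


# Hypothetical toy graph: node "u" links to both "a" and "b".
toy_graph = {"u": [], "a": ["u"], "b": ["u"]}
scores = simrank(toy_graph)
```

Each iteration touches every node pair and every pair of their in-neighbors, which illustrates why the exact computation becomes expensive on large graphs.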
ISBN (print): 9780769538174
Classification is an important subject in data mining and machine learning; it has been studied extensively and has a wide range of applications. Classification based on association rules is one of the most effective classification methods: compared with classical classification methods, its accuracy is higher and the discovered rules are easier to understand. However, current algorithms for classification based on association rules are single-table oriented, which means they can only be applied to data stored in a single relational table. Applying these algorithms directly in a multi-relational data environment causes many problems. This paper proposes a novel algorithm, MrCAR, for classification based on association rules in a multi-relational data environment. MrCAR mines relevant features in each table to predict the class label. The closed itemsets technique and the Tuple ID Propagation method are used to improve the performance of the algorithm. Experimental results show that MrCAR has higher accuracy and better understandability than a typical existing multi-relational classification algorithm.
Based on an analysis of the problems and difficulties in apparel quotation systems, this paper puts forward the combination of BPM and SOA as a new approach to analyzing such systems, drawing on their advantages in analyzing business goals and requirements and in defining, extracting, optimizing and integrating the corresponding services. Through this combination, system flexibility, rapidity and accuracy can be achieved. Establishing a Service Repository according to the business requirements is a crucial part of the architecture; however, there are no definite rules for service extraction. This paper illustrates the detailed activities and steps, as well as a specific establishment case. Finally, an architecture based on BPM and SOA for apparel trade quotation is put forward, and its composition and implementation are analyzed.
Based on an analysis of quotation business processes and the characteristics of SOA, an SOA-based quotation system for international apparel trade is proposed, in order to enhance the Quote's diversi...
Traditional web database cache techniques have a major disadvantage, namely poor data freshness, because they employ an asynchronous data refresh strategy. A novel web database cache, DB Façade, is proposed in th...
Since the Journal Impact Factor [1] was proposed in 1963, the classical citation-based ranking scheme has been a standard criterion for ranking journals and conferences. However, the references of a paper cannot list all...
In this paper, a rapid resynchronization method using intent logs is proposed for replicated in-memory databases supporting mobile communication applications. Both the identifiers of unsynchronized segments and the ...
Reverse skyline queries have proved very useful in business location selection, environmental monitoring and other applications. In this paper, we consider reverse skyline query processing on data streams, which deliver continuous, high-speed data elements. Specifically, we consider the latest objects in a sliding window. The challenge is that it is difficult to maintain a multidimensional index (for example, an R-tree) over a dynamic dataset. To address this challenge, we propose an algorithm that uses a DC-Tree as its index, together with effective pruning methods that reduce the search space of query processing and the cost of index maintenance. Extensive experiments show that our algorithms are efficient and effective for online reverse skyline queries.
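To make the query semantics concrete, here is a brute-force sketch of a reverse skyline query over a count-based sliding window. It uses the usual definition (a point p belongs to the reverse skyline of a query q if no other point in the window is at least as close to p as q in every dimension and strictly closer in at least one). It does not use the DC-Tree index or the pruning methods from the abstract; the names `dominates_wrt`, `reverse_skyline` and `SlidingWindow` are assumptions made for illustration.

```python
from collections import deque


def dominates_wrt(center, p1, p2):
    """True if p1 dynamically dominates p2 with respect to center."""
    d1 = [abs(x - c) for x, c in zip(p1, center)]
    d2 = [abs(x - c) for x, c in zip(p2, center)]
    no_worse = all(a <= b for a, b in zip(d1, d2))
    strictly_better = any(a < b for a, b in zip(d1, d2))
    return no_worse and strictly_better


def reverse_skyline(points, query):
    """Brute-force reverse skyline: O(n^2) pairwise checks, no index."""
    result = []
    for i, p in enumerate(points):
        # p qualifies if no other point dominates the query as seen from p.
        dominated = any(
            j != i and dominates_wrt(p, other, query)
            for j, other in enumerate(points)
        )
        if not dominated:
            result.append(p)
    return result


class SlidingWindow:
    """Keeps only the most recent `size` points of a stream."""

    def __init__(self, size):
        self.points = deque(maxlen=size)

    def insert(self, point):
        # The oldest point is evicted automatically once the window is full.
        self.points.append(point)

    def query(self, q):
        return reverse_skyline(list(self.points), q)


# Hypothetical stream of 2-D points with a window of size 2.
window = SlidingWindow(size=2)
for pt in [(1, 1), (2, 2), (10, 10)]:
    window.insert(pt)
answer = window.query((0, 0))
```

Because every arrival or expiry can change the answer, this naive version recomputes everything per query, which is the cost that an index with pruning is meant to avoid.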
Top-k queries in uncertain databases are popular and useful because of their wide range of applications. However, compared to Top-k queries in traditional databases, queries over uncertain databases are more complicated because of...