With data mining technology as its focus and drawing on an analysis of electronic commerce, this paper studies in depth the implementation of an e-commerce-oriented data mining system.
ISBN: 9783037859391 (print)
This paper introduces the concept of data mining and one of its important branches, association rules. It describes the basic concepts of association rules and the basic model for mining them, introduces the classical association rule algorithm, and then discusses association rule mining by category from several angles: width, depth, partitioning, sampling, and incremental updating. Finally, the paper considers the prospects for association rule mining.
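To make the basic association rule model concrete, here is a minimal sketch of the two standard measures, support and confidence. The toy transactions, the function names, and the restriction to single-item rules are illustrative assumptions and do not come from the paper.

```python
from itertools import combinations

# Hypothetical toy transactions; not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "diaper", "beer"},
    {"milk", "diaper", "beer"},
    {"bread", "milk", "diaper"},
    {"bread", "milk", "beer"},
]


def support(itemset, db):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(1 for t in db if itemset <= t) / len(db)


def confidence(antecedent, consequent, db):
    """support(antecedent union consequent) divided by support(antecedent)."""
    joint = set(antecedent) | set(consequent)
    return support(joint, db) / support(antecedent, db)


def rules(db, min_support, min_confidence):
    """Enumerate single-item -> single-item rules that meet both thresholds."""
    items = sorted(set().union(*db))
    found = []
    for a, b in combinations(items, 2):
        for x, y in ((a, b), (b, a)):
            if (support({x, y}, db) >= min_support
                    and confidence({x}, {y}, db) >= min_confidence):
                found.append((x, y))
    return found
```

A call such as `rules(transactions, 0.5, 0.7)` keeps only the rules whose item pair is frequent enough and whose confidence clears the threshold. Classical algorithms such as Apriori prune the candidate space instead of enumerating every pair.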
ISBN: 9781450322638 (print)
Graph stream classification concerns building learning models from continuously growing graph data, in which an essential step is to explore subgraph features to represent graphs for effective learning and classification. When representing a graph using subgraph features, all existing methods employ a coarse-grained feature representation, which only considers whether or not a subgraph feature appears in the graph. In this paper, we propose a fine-grained graph factorization approach for Fast Graph Stream Classification (FGSC). Our main idea is to find a set of cliques as a feature base and to represent each graph as a linear combination of the base cliques. To achieve this goal, we decompose each graph into a number of cliques and select discriminative cliques to generate a transfer matrix called the Clique Set Matrix (M). By using M as the base for formulating graph factorization, each graph is represented in a vector space in which each element denotes the degree to which the corresponding subgraph feature is related to the graph, so existing supervised learning algorithms can be applied to derive learning models for graph classification.
Mining maximal frequent itemsets is of paramount relevance in many data mining applications. The "traditional" algorithms address this problem by scanning the database many times. Recent research has focused on reducing the number of database scans, and with it the number of I/O accesses, in order to improve the overall efficiency of mining maximal frequent itemsets for association rules. In this paper, we present a directed itemsets graph that stores the frequent itemset information of a transaction database, and give a trifurcate linked list storage structure for this graph. Furthermore, we develop an algorithm for mining maximal frequent itemsets based on this structure. As a result, the database is scanned only once, and both the storage efficiency of the data structure and the time efficiency of the mining algorithm improve. © 2011 Elsevier Ltd. All rights reserved.
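The single-scan idea can be illustrated with a small sketch. It does not reproduce the directed itemsets graph or the trifurcate linked list described in the abstract. As a stand-in, it assumes a simpler vertical layout (each item mapped to the ids of the transactions containing it) that is built in one pass and then mined in memory. All names are hypothetical.

```python
def scan_once(db):
    """Single pass over the database: map each item to its transaction ids."""
    tidsets = {}
    for tid, transaction in enumerate(db):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)
    return tidsets


def maximal_frequent_itemsets(db, min_count):
    """Return the frequent itemsets that have no frequent proper superset."""
    tidsets = scan_once(db)  # the only place the database itself is read
    items = sorted(i for i, tids in tidsets.items() if len(tids) >= min_count)
    frequent = []

    def extend(prefix, prefix_tids, start):
        # Grow the prefix by intersecting transaction-id sets; no rescan needed.
        for k in range(start, len(items)):
            tids = prefix_tids & tidsets[items[k]]
            if len(tids) >= min_count:
                itemset = prefix + (items[k],)
                frequent.append(frozenset(itemset))
                extend(itemset, tids, k + 1)

    for k, item in enumerate(items):
        frequent.append(frozenset((item,)))
        extend((item,), tidsets[item], k + 1)

    # An itemset is maximal if no other frequent itemset strictly contains it.
    maximal = [s for s in frequent if not any(s < other for other in frequent)]
    return sorted(sorted(s) for s in maximal)
```

The final maximality filter is quadratic in the number of frequent itemsets, which is acceptable for a toy example but is exactly the kind of cost a dedicated structure is meant to avoid.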
ISBN: 9781424435296 (print)
The efficiency of alarm mining in large-scale networks is a hot topic in fault location research. Many approaches have been tried to improve it and some progress has been made, but the results are still not satisfactory. A large-scale telecom system generates large numbers of raw alarm records every day, many of which are redundant; the information actually useful for fault location is only a small part. To increase the efficiency of alarm mining, this paper proposes a solution called network topology constraint based transaction separation, and compares it with the method of adding topology constraints to the mining algorithms themselves. The experiments show that separating transactions under a network topology constraint deletes a large amount of redundant alarm information and many incorrect transaction sets before the mining algorithms run. It can therefore be more efficient and precise than a topology constraint based mining algorithm, which filters erroneous frequent sub-patterns during algorithm execution.
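One plausible reading of topology-constrained transaction separation is sketched below. The abstract does not specify the procedure, so everything here is an assumption: alarms in one time window are `(node, alarm_type)` pairs, the topology is an adjacency dictionary, and two alarms belong to the same transaction only if their nodes are linked through other alarming nodes.

```python
from collections import defaultdict


def separate_transactions(alarms, topology):
    """Split one time window of alarms into topologically connected transactions.

    alarms:   list of (node, alarm_type) pairs (hypothetical record format)
    topology: dict mapping each node to the set of its neighbouring nodes
    """
    by_node = defaultdict(set)
    for node, alarm_type in alarms:
        by_node[node].add(alarm_type)  # the set drops repeated identical alarms

    seen, transactions = set(), []
    for start in sorted(by_node):
        if start in seen:
            continue
        seen.add(start)
        stack, group = [start], set()
        while stack:  # depth-first walk restricted to nodes that raised alarms
            node = stack.pop()
            group |= {(node, alarm) for alarm in by_node[node]}
            for neighbour in topology.get(node, ()):
                if neighbour in by_node and neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        transactions.append(group)
    return transactions
```

Each returned group can then be passed to a frequent-pattern miner as one transaction, so alarms from unrelated parts of the network are never counted together.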
ISBN: 9781457707391 (print)
The emergence of cloud manufacturing (CMfg) provides a new opportunity for shifting manufacturing towards a service-oriented model. Cloud service composition (CSC), which realizes the added value of cloud services (CSs), is the core of implementing CMfg. Correlations always exist among CSs, and composable correlation (CoC) in particular can affect the construction of a CSC path. Hence, how to mine the CoC among CSs and judge which kind of CoC holds between them is a key issue. This paper presents a formalized description of CoC and designs decision algorithms, based on a bipartite graph, to judge the CoC between CSs. A case study illustrates the application of the proposed algorithms.
The mainstream of development in knowledge discovery is research on new high-performance, high-scalability mining algorithms. In fact, research on the process model and inner mechanism is more important, which h...
In this paper, we study a new problem: mining dynamic association rules with comments (DAR-C for short). A DAR-C contains not only the rule itself but also comments that specify when to apply the rule. To formalize this problem, we first present a method for expressing candidate effective time slots, and then propose several definitions concerning DAR-C. Subsequently, two algorithms, ITS2 and EFP-Growth2, are developed for mining DAR-C. ITS2 is an improved two-stage dynamic association rule mining algorithm, while EFP-Growth2 is based on the EFP-tree structure and is suitable for mining high-density mass data. Extensive experimental results demonstrate the efficiency and scalability of the two proposed algorithms on DAR-C mining tasks, and their practicality on a real retail dataset.
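The idea of a rule carrying a comment about when it applies can be sketched as follows. This is neither ITS2 nor EFP-Growth2, and the abstract does not give the formal definition of an effective time slot. The sketch assumes each transaction is tagged with a slot label and treats the "comment" as the list of slots in which the rule meets hypothetical support and confidence thresholds.

```python
from collections import defaultdict


def effective_slots(db, antecedent, consequent, min_support, min_confidence):
    """Return the time slots in which the rule antecedent -> consequent holds.

    db: list of (slot_label, set_of_items) pairs (hypothetical record format)
    """
    antecedent = frozenset(antecedent)
    both = antecedent | frozenset(consequent)
    total = defaultdict(int)   # transactions per slot
    n_ante = defaultdict(int)  # transactions containing the antecedent
    n_both = defaultdict(int)  # transactions containing the whole rule

    for slot, items in db:
        total[slot] += 1
        if antecedent <= items:
            n_ante[slot] += 1
        if both <= items:
            n_both[slot] += 1

    slots = []
    for slot in sorted(total):
        if n_ante[slot] == 0:
            continue  # the rule never fires in this slot
        support = n_both[slot] / total[slot]
        confidence = n_both[slot] / n_ante[slot]
        if support >= min_support and confidence >= min_confidence:
            slots.append(slot)
    return slots
```

A rule that looks weak over the whole dataset can still be strong within particular slots, which is the motivation for attaching the slot list to the rule.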