The classical algorithm for finding the association rules generated by a frequent itemset must generate all non-empty subsets of the itemset as the candidate set of consequents. Xiongfei Li addressed this and proposed an improved algorithm, which finds all consequents layer by layer and is therefore breadth-first. In this paper, we propose a new algorithm, Generate Rules by using Set-Enumeration Tree (GRSET), which uses the structure of a set-enumeration tree and a depth-first method to find the consequents of the association rules one by one and obtain all association rules corresponding to those consequents. Experiments show the GRSET algorithm to be practical and efficient.
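The depth-first idea in the abstract above can be illustrated with a minimal Python sketch. This is not the published GRSET procedure, whose details the abstract does not give. It only assumes the standard confidence definition, conf(A -> C) = supp(F) / supp(A) with A = F \ C, and the well-known fact that enlarging a consequent can never raise confidence, so a failing consequent lets the whole subtree below it be skipped. The function names `support_counts` and `rules_depth_first` are hypothetical.

```python
from itertools import combinations


def support_counts(transactions, itemset):
    """Brute-force support count for every non-empty subset of `itemset`."""
    items = sorted(itemset)
    counts = {}
    for r in range(1, len(items) + 1):
        for subset in combinations(items, r):
            s = frozenset(subset)
            counts[s] = sum(1 for t in transactions if s <= t)
    return counts


def rules_depth_first(itemset, counts, min_conf):
    """Enumerate consequents depth-first over a set-enumeration tree.

    Each tree node is a consequent; its children add one item that comes
    later in the fixed item order, so every subset is visited at most once.
    Assumes `itemset` is frequent, so every antecedent has support > 0.
    """
    items = sorted(itemset)
    full = frozenset(items)
    rules = []

    def visit(consequent, start):
        for i in range(start, len(items)):
            child = consequent | {items[i]}
            if child == full:
                continue  # the antecedent would be empty
            antecedent = full - child
            conf = counts[full] / counts[antecedent]
            if conf >= min_conf:
                rules.append((antecedent, child, conf))
                visit(child, i + 1)  # go deeper before trying siblings
            # otherwise every superset of `child` also fails: skip the subtree

    visit(frozenset(), 0)
    return rules


# Hypothetical toy data, for illustration only.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"),
                frozenset("a"), frozenset("bc")]
counts = support_counts(transactions, "abc")
for antecedent, consequent, conf in rules_depth_first("abc", counts, 0.6):
    print(sorted(antecedent), "->", sorted(consequent), round(conf, 3))
```

A breadth-first variant would instead hold all consequents of size k before producing those of size k + 1; the depth-first traversal only needs the current path in memory.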
Because mining the complete set of frequent patterns from a dense database can be impractical, an interesting alternative has recently been proposed: instead of mining the complete set of frequent patterns, the new model finds only the maximal frequent patterns, from which all frequent patterns can be generated. The FP-growth algorithm is one of the most efficient frequent-pattern mining methods published so far. However, because the FP-tree and the conditional FP-trees must be traversable in both directions, a great deal of memory is needed during mining. This paper proposes an efficient algorithm, Unid_FP-Max, for mining maximal frequent patterns based on a unidirectional FP-tree. Owing to the way the unidirectional FP-tree and the conditional unidirectional FP-trees are generated, the algorithm reduces space consumption as far as possible. Two further techniques, single-path pruning and header-table pruning, cut down many of the conditional unidirectional FP-trees generated recursively during mining, so Unid_FP-Max further lowers the cost in time and space.
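The claim that maximal frequent patterns can generate all frequent patterns follows from the downward-closure property: every subset of a frequent itemset is itself frequent. The Python sketch below illustrates only that relationship; it is not Unid_FP-Max and uses no FP-tree. The function names `maximal_itemsets` and `expand` and the toy data are hypothetical.

```python
from itertools import combinations


def maximal_itemsets(frequent):
    """Keep only the itemsets that have no frequent proper superset."""
    frequent = [frozenset(s) for s in frequent]
    return [s for s in frequent if not any(s < other for other in frequent)]


def expand(maximal):
    """Recover every frequent itemset as a non-empty subset of a maximal one."""
    recovered = set()
    for m in maximal:
        items = sorted(m)
        for r in range(1, len(items) + 1):
            recovered.update(frozenset(c) for c in combinations(items, r))
    return recovered


# Hypothetical frequent itemsets, for illustration only.
frequent = [{"a"}, {"b"}, {"c"}, {"d"}, {"a", "b"}, {"b", "c"}]
maximal = maximal_itemsets(frequent)
print(sorted(sorted(m) for m in maximal))
print(expand(maximal) == {frozenset(s) for s in frequent})
```

Note that the maximal patterns recover which itemsets are frequent but not their individual support counts; those would need another pass over the data.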
In this paper, we present the foundations for mining frequent tri-concepts, which extend the notion of closed itemsets to three-dimensional data to allow for mining folksonomies. We provide a formal definition of the problem and present an efficient algorithm for its solution, as well as experimental results on a large real-world example.
In this paper, we give algebraic independence measures for the values of Mahler-type functions in the complex number field and the p-adic number field, respectively.
Data mining (DM) draws on knowledge and theories from several fields, including databases, machine learning, optimization, statistics, and data visualization, and has been applied to various real-life applications. A large number of data mining articles have been published. The goal of this study is to establish an overview of past and current data mining research activities from the titles and abstracts of more than 1,400 textual documents collected from premier data mining journals and conference proceedings. Specifically, this study applied document clustering approaches to determine which subjects have been studied over the last several years and which subjects are currently popular, and to describe the longitudinal changes in data mining publications.
Health insurance fraud detection is an important and challenging task. Traditionally, insurance companies use human inspection and heuristic rules to detect fraud. As databases grow, these traditional approaches may miss a large portion of fraud for two main reasons. First, it is impossible to detect all health care fraud by manual inspection of large databases. Second, new types of health care fraud emerge constantly, and SQL operations based on heuristic rules cannot identify these newly emerging fraud schemes. Such a situation demands more sophisticated analytical methods and techniques that are capable of detecting fraudulent activity in large databases. The goal of this paper is to understand and detect suspicious health care fraud in large databases using clustering techniques. Specifically, this paper applies two clustering methods, SAS EM and CLUTO, to a large real-life health insurance dataset and compares the performance of the two methods.