ISBN: (Print) 9781479932610
Eclat is a classical algorithm for mining frequent itemsets that is based on a vertical database layout. It differs substantially from algorithms based on a horizontal database layout, such as Apriori and FP-Growth. To improve the efficiency of mining frequent itemsets from massive datasets, a parallel algorithm, MREclat, based on the Map/Reduce framework is presented. The algorithm also overcomes the insufficient memory and computational capacity encountered when mining frequent itemsets from massive datasets. This paper introduces the idea of MREclat and studies its performance. The experimental results show that MREclat has high scalability and good speedup.
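To illustrate the vertical layout mentioned in this abstract, here is a minimal single-machine sketch of Eclat-style mining in Python. It is not the MREclat algorithm and has no Map/Reduce component; the function names `to_vertical` and `eclat` and the toy transactions are assumptions made for illustration only. Each item maps to the set of transaction ids (its tidset) that contain it, and the support of a larger itemset is the size of the intersection of tidsets.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Turn a horizontal database (one item set per transaction)
    into a vertical one (item -> set of transaction ids)."""
    vertical = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def eclat(transactions, min_support):
    """Return {frozenset(itemset): support} for every itemset whose
    support (number of containing transactions) is >= min_support."""
    vertical = to_vertical(transactions)
    frequent = {}

    def extend(prefix, candidates):
        # candidates: (item, tidset) pairs already frequent together with prefix
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Intersect with the tidsets of later items to form the next level
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Hypothetical toy database, used only to exercise the sketch
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = eclat(transactions, min_support=3)
```

Because support counting reduces to set intersection, no rescan of the transactions is needed after the vertical layout is built, which is the main contrast with horizontal-layout algorithms such as Apriori.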
ISBN: (Print) 9781538635810
The explosive growth of data collection in business and scientific areas has created the need to analyze and mine the useful knowledge residing in these data. Data mining techniques have become indispensable for extracting useful and novel patterns or models from large datasets. In this context, frequent itemsets (patterns) play an essential role in many data mining tasks that try to find interesting patterns in datasets. However, conventional approaches to mining frequent itemsets in the Big Data era encounter significant challenges when computing power and memory space are limited. This paper proposes an efficient distributed frequent itemset mining algorithm, called parallelCharMax, that is based on a powerful sequential algorithm, called Charm, and computes the maximal frequent itemsets, which are considered perfect summaries of the frequent ones. The proposed algorithm has been implemented using the MapReduce framework. The experimental component of the study shows the efficiency and performance of the proposed algorithm compared with well-known algorithms such as MineWithRounds and HMBA.
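As a small illustration of why maximal frequent itemsets summarize the frequent ones, the Python sketch below filters a collection of frequent itemsets down to those with no frequent proper superset. It is a naive post-filter on the definition, not the Charm-based or MapReduce method of the paper; the function name `maximal_itemsets` and the sample data are assumptions.

```python
def maximal_itemsets(frequent_itemsets):
    """Keep only the itemsets that have no proper superset in the collection."""
    itemsets = [frozenset(s) for s in frequent_itemsets]
    # `s < t` is True when s is a proper subset of t
    return {s for s in itemsets if not any(s < t for t in itemsets)}


# Hypothetical collection of frequent itemsets, for illustration only
frequent = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
maximal = maximal_itemsets(frequent)
```

Every frequent itemset is a subset of at least one maximal itemset, so the maximal set identifies which itemsets are frequent, although it does not retain their individual support counts.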