ISBN: (Print) 0780379411
Induction of classification rules is a research area of machine learning and data mining that has received a lot of attention in recent years. ID3 and C4.5 are well-known algorithms for mining classification rules. However, the rules induced by ID3 are not optimal. Hu X.G. therefore proposed a novel approach that induces classification rules in the Extended Concept Lattice (ECL), which improves on the Concept Lattice by introducing the equivalent intent. The approach is superior to ID3 and C4.5. However, new features of databases, such as high dimensionality and heterogeneity, have made parallel/distributed data mining a hot research domain. Under these circumstances, we propose the parallel ECL for data mining from large-scale databases. In this paper, inducing classification rules in the parallel ECL is investigated both theoretically and experimentally.
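For readers unfamiliar with the ID3 baseline mentioned in this abstract, the following is a minimal sketch of the information-gain criterion that ID3 uses to pick a split attribute. It does not implement the ECL approach, which the abstract does not describe in detail. The function names and the toy data are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the rows on one attribute."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the partitions produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data set: "outlook" separates the classes perfectly,
# "windy" carries no information about the class.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["no", "no", "yes", "yes"]

# ID3 greedily chooses the attribute with the highest gain at each node.
best = max(["outlook", "windy"], key=lambda a: information_gain(rows, labels, a))
```

Because the choice is greedy and made one attribute at a time, the resulting rule set is not guaranteed to be optimal, which is the limitation the abstract refers to.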
Based on the characteristics of distributed databases and constraints, two algorithms for distributed mining of association rules with item constraints, called DMAIC and DAMICFP, are developed. The DMAIC algorithm is based on the Apriori algorithm, and DAMICFP is based on the FP-growth algorithm. Both algorithms are illustrated with an example and analyzed for their qualities. Their advantages, shortcomings, and suitable conditions are also given. The results show that DMAIC is an algorithm with high reliability and a simple communication protocol, and it is suitable for systems with low communication requirements. DAMICFP is an algorithm with high efficiency and excellent communication quality, and it is suitable for systems with high communication requirements. The two algorithms are effective ways to solve the problem of distributed mining of association rules with item constraints.
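The general idea behind Apriori-style distributed mining with an item constraint can be sketched as follows. This is not the authors' DMAIC algorithm, whose details the abstract does not give. It is a minimal illustration that assumes each site counts candidate itemsets locally, the local counts are summed into global counts, and the item constraint (here, "the itemset must contain at least one required item") is applied as a final filter. All names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """Count, at one site, how many local transactions contain each candidate."""
    counts = Counter()
    for t in transactions:
        for c in candidates:
            if c <= t:
                counts[c] += 1
    return counts


def mine_constrained(sites, min_support, required_items):
    """Level-wise frequent-itemset mining over several sites (a sketch)."""
    total = sum(len(s) for s in sites)
    threshold = min_support * total
    items = sorted({i for s in sites for t in s for i in t})
    candidates = [frozenset([i]) for i in items]
    frequent = {}
    k = 1
    while candidates:
        # Each site would count in parallel in a real system; only the
        # per-candidate counts need to be exchanged, not the transactions.
        global_counts = Counter()
        for site in sites:
            global_counts.update(local_counts(site, candidates))
        level = {c: n for c, n in global_counts.items() if n >= threshold}
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = list({a | b for a, b in combinations(level, 2) if len(a | b) == k})
    # Assumed form of the item constraint: keep itemsets that contain
    # at least one required item.
    return {c: n for c, n in frequent.items() if c & required_items}


# Hypothetical data: two sites, three transactions each.
site_a = [{"bread", "milk"}, {"bread", "butter"}, {"milk", "butter", "bread"}]
site_b = [{"bread", "milk"}, {"milk"}, {"bread", "milk", "butter"}]
result = mine_constrained([site_a, site_b], 0.5, frozenset({"butter"}))
```

Filtering at the end keeps the sketch short. A practical algorithm would more likely push the constraint into candidate generation to reduce both counting work and communication between sites.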