AIDCOR is an artificial-immunity-inspired, density-based clustering algorithm that identifies crisp clusters with a high degree of accuracy, even when the input dataset contains clusters of varied shape, varied density distributions, low inter-cluster separation, and noise or outliers. The algorithm works in two phases: a data preprocessing module and a clustering module. The preprocessing module is inspired by artificial immune systems and uses a novel approach of somatic hypermutation and affinity maturation with selective antigenic binding to reduce data redundancy while preserving the original data patterns. The clustering module follows a density-based approach that forms clusters from the compressed dataset and, in doing so, inherently identifies outliers. We thoroughly analyze both the theoretical aspects and the experimental results of the proposed algorithm on a wide variety of real and synthetic datasets. Compared with several current state-of-the-art algorithms, AIDCOR gives much higher clustering accuracy for nearly all types of datasets. Its time complexity is subquadratic when an indexing data structure is used for nearest-neighbor search and quadratic otherwise. AIDCOR requires three user-defined parameters, and a heuristic method is proposed to determine them automatically.
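The complexity remark above turns on how neighbors are found. The abstract does not say which indexing structure is meant, so the following is only a minimal sketch, assuming 2-D points and a uniform grid as a stand-in index. The names `brute_force_neighbors`, `build_grid`, and `grid_neighbors` are hypothetical and are not part of AIDCOR. A brute-force radius query inspects every point, which leads to quadratic cost over all queries, whereas the grid inspects only nearby cells and is typically much cheaper on well-spread data.

```python
from collections import defaultdict
from math import dist, floor


def brute_force_neighbors(points, q, eps):
    """Indices of all points within eps of q, checking every point (O(n) per query)."""
    return [i for i, p in enumerate(points) if dist(p, q) <= eps]


def build_grid(points, eps):
    """Bucket 2-D points into square cells of side eps (illustrative stand-in index)."""
    grid = defaultdict(list)
    for i, (x, y) in enumerate(points):
        grid[(floor(x / eps), floor(y / eps))].append(i)
    return grid


def grid_neighbors(points, grid, q, eps):
    """Same answer as brute force, but only the 3x3 block of cells around q is scanned."""
    cx, cy = floor(q[0] / eps), floor(q[1] / eps)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            # .get avoids creating empty cells in the defaultdict
            for i in grid.get((cx + dx, cy + dy), ()):
                if dist(points[i], q) <= eps:
                    found.append(i)
    return sorted(found)


points = [(0.0, 0.0), (0.5, 0.0), (3.0, 3.0), (0.9, 0.9), (-0.3, 0.2)]
grid = build_grid(points, eps=1.0)
print(brute_force_neighbors(points, (0.0, 0.0), 1.0))
print(grid_neighbors(points, grid, (0.0, 0.0), 1.0))
```

Both calls return the same index list; the difference is only in how many points each query has to inspect. A tree-based index would serve the same purpose in higher dimensions.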
ISBN: 9781479930807 (print)
In this paper we propose an algorithm that identifies clusters of varied shape in a wide variety of input datasets with a high degree of accuracy in the presence of noise. The initial data processing module adopts a novel artificial immune system approach to reduce data redundancy while preserving the original data patterns. The clustering module follows a density-based approach to identify clusters in the compressed dataset produced by the preprocessing module. We introduce several new concepts, namely selective antigenic binding, the Local Reachability Factor, and the Global Reachability Factor, to recognize clusters of varied shape, varied density, and low inter-cluster separation at acceptable computational cost. We evaluated the algorithm experimentally on a wide variety of real and synthetic datasets and obtained a higher cluster success rate than DBSCAN on all of them.
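For readers unfamiliar with the baseline, the following is a minimal sketch of the standard DBSCAN procedure that the abstract compares against. It is not the proposed algorithm: the abstract does not define the Local and Global Reachability Factors or the immune-inspired preprocessing, so none of that is reproduced here. The function name `dbscan`, the `NOISE` constant, and the sample points are illustrative assumptions, and neighbors are found by brute force for brevity.

```python
from math import dist

NOISE = -1  # label used here for outliers (illustrative choice)


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)

    def region(i):
        # Brute-force eps-neighborhood of point i (includes i itself).
        return [j for j, p in enumerate(points) if dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already visited
        neighbors = region(i)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = region(j)
            if len(more) >= min_pts:
                queue.extend(more)  # j is a core point, so keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))
```

On this toy input the two tight groups receive separate cluster ids and the isolated point is labeled as noise. Because a single global `eps` and `min_pts` are used, clusters of very different density are hard to capture with one setting, which is the kind of case the abstract says the proposed concepts are meant to address.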