To solve the overload problem of the root ONS in the EPC network, a load balancing algorithm based on multi-root ONS is proposed. Based on the proposed load balancing ONS (LB ONS) architecture, the ONS Root is de...
The formal model of spatial directional relations is one of the most important topics in spatial relation research. Most existing models are based on the Minimum Bounding Rectangle (MBR) and do not conform to the regular patterns of human cognition. To obtain conclusions about directional relationships that are closer to human cognition, this paper proposes the Angle Histogram model based on Double-projection and Rounded-subdivision (AHDPRS). The model uses maximum inscribed circles to find the largest parts of each object and calculates the directional relationship between the centers of the circles. It ignores inessential details so that the result is closer to human cognition. The experiments show that the model is feasible.
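As a rough illustration of the last step only, the sketch below computes an angle histogram over pairs of circle centers. It assumes the circle centers are already known; the maximum-inscribed-circle search, the double projection, and the rounded subdivision are not reproduced. The function names and the eight-sector labels are hypothetical and do not come from the paper.

```python
import math

# Hypothetical 8-sector labels; the abstract does not specify the partition.
SECTORS = ["E", "NE", "N", "NW", "W", "SW", "S", "SE"]

def direction_angle(ref_center, target_center):
    """Angle in degrees in [0, 360) of the target seen from the reference,
    measured counter-clockwise from east."""
    dx = target_center[0] - ref_center[0]
    dy = target_center[1] - ref_center[1]
    return math.degrees(math.atan2(dy, dx)) % 360.0

def angle_histogram(ref_centers, target_centers, bins=8):
    """Normalized histogram of direction angles over all center pairs.
    Both lists are assumed to be non-empty."""
    counts = [0] * bins
    width = 360.0 / bins
    for r in ref_centers:
        for t in target_centers:
            # Shift by half a bin so each sector is centered on its direction.
            shifted = (direction_angle(r, t) + width / 2) % 360.0
            counts[int(shifted // width)] += 1
    total = sum(counts)
    return [c / total for c in counts]

def dominant_direction(ref_centers, target_centers):
    """Label of the sector that collects the most center pairs."""
    hist = angle_histogram(ref_centers, target_centers)
    return SECTORS[hist.index(max(hist))]
```

For example, `dominant_direction([(0, 0)], [(0, 5)])` places the target north of the reference.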
Traditional clustering algorithms often fail to detect meaningful clusters in high-dimensional data spaces. To address this shortcoming, we propose GDRH-Stream, a clustering method for high-dimensional data streams based on attribute relativity and grid density, which consists of an online component and an offline component. First, the algorithm filters out redundant attributes by computing the relative entropy. We then define a weighted attribute relativity measure, estimate the relativity of the non-redundant attributes, and form the attribute triple. Finally, the most interesting subspaces are searched using the attribute triple. In the online component, GDRH-Stream maps each data object into a grid and updates the characteristic vector of the grid. In the offline component, when a clustering request arrives, the most interesting subspaces are generated by attribute relativity. The original grid structure is then projected onto the subspace to form a new grid structure, and clustering is performed on the new grid structure using a density-grid-based approach. Experimental results show that the GDRH-Stream algorithm achieves better quality and scalability.
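The online mapping and the offline projection can be pictured with a minimal sketch. It assumes a uniform cell width and a characteristic vector made of a point count plus per-dimension coordinate sums; the abstract does not define the vector's contents, so those fields and the class name `GridSummary` are assumptions. The relative-entropy filter and the attribute-triple subspace search are not shown; the subspace is simply passed in as a tuple of dimension indices.

```python
from collections import defaultdict

class GridSummary:
    """Online grid summary: maps each point to a cell and keeps per-cell statistics."""

    def __init__(self, cell_width):
        self.cell_width = cell_width  # assumed: same cell width in every dimension
        self.cells = {}               # cell key -> [point count, per-dimension sums]

    def cell_key(self, point):
        """Integer grid coordinates of the cell that contains the point."""
        return tuple(int(x // self.cell_width) for x in point)

    def insert(self, point):
        """Online step: update the statistics of the cell the point falls into."""
        key = self.cell_key(point)
        if key not in self.cells:
            self.cells[key] = [0, [0.0] * len(point)]
        entry = self.cells[key]
        entry[0] += 1
        for i, x in enumerate(point):
            entry[1][i] += x

    def project(self, dims):
        """Offline step: project the grid onto the given dimensions,
        merging the counts of cells that coincide in the subspace."""
        projected = defaultdict(int)
        for key, (count, _sums) in self.cells.items():
            projected[tuple(key[d] for d in dims)] += count
        return dict(projected)
```

Density-based clustering would then run on the projected counts; that step is omitted here.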
The problems of AGMA (Automatic Graph Mining Algorithm) are addressed, and a novel algorithm, CRMA (Clustering Re-clustering Merging Algorithm), is proposed, which can realize more reasonable community division for...
In recent years, closed frequent itemset mining has become a hot topic. In this paper, we present BCTCF, an algorithm based on the Bit Complementary Tree (BCTree) that mines closed frequent itemsets efficiently. First, we adopt bit vectors to compress the database and define a novel structure, the BCTree, in which each node stores two complementary bit vectors and each path is assigned a prime value. Based on the left-most bit of the bit vectors, we adopt a divide-and-conquer strategy that handles the itemsets separately. Then, using the uniqueness of the prime values, we obtain the closed frequent itemsets quickly, without first mining all frequent itemsets. Both the divide-and-conquer strategy and the prime uniqueness property reduce the runtime. Experimental results show that BCTCF is effective and scalable.
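The bit-vector compression mentioned above can be illustrated on its own. The sketch below stores one Python integer per item, with bit i set when transaction i contains the item, and computes support with a bitwise AND. It does not build the BCTree or use prime-valued paths; the closedness check is a plain brute-force definition check, and all function names are hypothetical.

```python
def build_bit_vectors(transactions):
    """Map each item to an int whose bit i is set if transaction i contains the item."""
    vectors = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors

def support(itemset, vectors, n_transactions):
    """Number of transactions containing every item, found by AND-ing bit vectors."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= vectors.get(item, 0)
    return bin(mask).count("1")

def is_closed(itemset, vectors, n_transactions):
    """An itemset is closed if no proper superset has the same support.
    Checking single-item extensions is enough because support never
    increases when items are added."""
    base = support(itemset, vectors, n_transactions)
    others = set(vectors) - set(itemset)
    return all(
        support(set(itemset) | {x}, vectors, n_transactions) < base
        for x in others
    )
```

With the transactions `{a, b}`, `{a, b, c}`, `{a, c}`, the itemset `{b}` is not closed because `{a, b}` has the same support of 2.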
Because grids are partitioned randomly, the edge points of clusters may fall into sparse grids. When a data stream is clustered by a grid-density-based algorithm, these points are treated as noise outside the clusters. We propose SDGCStream, a data stream clustering algorithm based on a spatial directed graph with core. It uses the spatial directed graph and the orthocenter of the sparse grids to handle the edge points of clusters. First, the algorithm defines a structure, SDGC (Spatial Directed Graph with Core), to store the summary statistics of the data stream. The vertices of SDGC are maintained as the stream arrives. When a clustering request arrives, the edge information is generated. The initial clustering results are obtained by clustering on SDGC; we then use the orthocenter information and the border vertices of SDGC to judge whether the points of sparse grids adjacent to the border of a cluster belong to that cluster. Finally, after the cluster borders have been handled, a strategy based on the distance between clusters is applied to adjust the clustering results. Experimental results on synthetic and real datasets show that SDGCStream handles the edge data points of clusters more effectively and scales as the length and dimensionality of the data stream increase.
In this paper, we present TKBT (top-k closed frequent mining based on TKTT), an algorithm that efficiently mines top-k closed frequent itemsets in data streams. First, according to the consecutive and changeable characteristics of data-stream data in a sliding window, we define a novel structure, the BWT (bit-vector window table). In the horizontal direction of the BWT, we use bit vectors to express the transactions and record the count of items in the oldest window, in the newest window, and in all windows at the current time, which reduces the time needed to calculate item counts when a new window slides in. In the vertical direction of the BWT, we set window partitions, so that when a new window arrives we only need to replace the information of the oldest window with that of the newest window. The TKTT (top-k temporary table) is constructed from the BWT, and the itemsets in the TKTT are ranked in descending order of count. TKBT obtains the top-k closed frequent itemsets by connecting the candidates in the TKTT using a top-down strategy. The number of candidates is reduced by letting a closed itemset replace its subsets, and the resulting fewer connections reduce the runtime. Experimental results show that TKBT is effective and scalable.
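The window-replacement idea (drop the oldest window's counts, add the newest) can be sketched as follows. This is a simplification under stated assumptions: it uses `collections.Counter` instead of bit vectors, tracks single items rather than itemsets, and does not check closedness or build the TKTT. The class name `SlidingItemCounts` is hypothetical.

```python
from collections import Counter, deque

class SlidingItemCounts:
    """Per-window item counts with an incrementally maintained running total."""

    def __init__(self, max_windows):
        self.max_windows = max_windows
        self.windows = deque()  # per-window item counts, oldest first
        self.total = Counter()  # item counts over all windows currently held

    def add_window(self, transactions):
        """Slide in a new window; evict the oldest one if the table is full."""
        newest = Counter(item for t in transactions for item in t)
        if len(self.windows) == self.max_windows:
            oldest = self.windows.popleft()
            self.total.subtract(oldest)  # only the oldest window is removed
        self.windows.append(newest)
        self.total.update(newest)        # only the newest window is added
        self.total = +self.total         # drop items whose count fell to zero

    def top_k(self, k):
        """Items ranked in descending order of count."""
        return self.total.most_common(k)
```

Because only the oldest and newest windows are touched, the cost of a slide does not depend on how many windows are held.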
Because of the introduction of attribute weights, most existing dissimilarity measures cannot accurately reflect the difference between two heterogeneous objects, which decreases clustering quality. In this paper, we present HIDK-means, an approach for clustering heterogeneous data based on information dissimilarity. First, the algorithm defines the heterogeneous information dissimilarity between two heterogeneous objects based on Kolmogorov information theory and approximates it using a universal probability of an object. Then, in the clustering process, the algorithm selects the initial cluster centers by the maximum sum of dissimilarity. After that, each remaining object is assigned to the cluster center with which it has the smallest dissimilarity, and the criterion function is calculated. The cluster centers are updated iteratively, and the process stops when the criterion function converges or the number of iterations reaches the preset threshold. The experimental results show that HIDK-means is effective in clustering heterogeneous objects and scales to large datasets.
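The center selection and assignment steps can be sketched with a pluggable dissimilarity function. The abstract does not spell out how "maximum sum of dissimilarity" is applied, so the greedy reading below is an assumption: the first center is the object with the largest total dissimilarity to all objects, and each later center maximizes its summed dissimilarity to the centers already chosen. The `dissim` argument stands in for the Kolmogorov-based measure, which is not implemented here, and the function names are hypothetical.

```python
def select_initial_centers(objects, k, dissim):
    """Greedy selection of k center indices (assumes k <= len(objects))."""
    n = len(objects)
    # First center: largest total dissimilarity to all objects.
    first = max(range(n), key=lambda i: sum(dissim(objects[i], o) for o in objects))
    centers = [first]
    while len(centers) < k:
        candidates = [i for i in range(n) if i not in centers]
        # Next center: largest summed dissimilarity to the chosen centers.
        nxt = max(
            candidates,
            key=lambda i: sum(dissim(objects[i], objects[c]) for c in centers),
        )
        centers.append(nxt)
    return centers

def assign(objects, centers, dissim):
    """For each object, the index of the center with the smallest dissimilarity."""
    return [
        min(centers, key=lambda c: dissim(obj, objects[c]))
        for obj in objects
    ]
```

A toy run with numbers and absolute difference as the placeholder measure: for `objects = [0, 1, 10, 11, 5]` and `k = 2`, the selected centers are the objects 11 and 0, and 5 is assigned to the center 0.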
Building on earlier research on load balancing strategies, and considering the load distribution, the load condition of service nodes, and the number of user connections, we propose a load balancing strate...
Community detection in signed social networks has received increasing attention. Research shows that the two-phase community detection algorithm for signed social networks cannot correctly divide the network. T...