To improve the efficiency of grouped aggregation computation, the data are compressed using binary encoding, and the dimension-hierarchy grouping attribute set encodings of each dimension table are cal...
During real-time multi-resolution terrain rendering, avoiding cracks between LOD (Level of Detail) levels has always been a tedious task that requires extra, complicated handling. In this paper, we propose a no...
Neuronal oscillations in the gamma frequency range and spikes have been reported in many cortical areas during information processing, but the role that spikes play in cortical processing remains unclear. The aim of this study was to examine the role of kainate in the generation of oscillatory field activity at gamma frequency in hippocampal slices prepared from rat brain, and to determine the phase relationship between spikes and gamma oscillations. Kainate induced large-amplitude field population spiking activity, which correlated linearly with the field gamma oscillations. The relationship between the timing of spikes and the phase of the gamma oscillation was determined by circular statistics. Further analysis with the Rayleigh test revealed that the phase-locking between spikes and the gamma rhythm is statistically significant. These data demonstrate that kainate-sensitive neuronal networks within the hippocampus are able to generate gamma oscillations that are modulated by large-amplitude population spikes, and the phase-locking of spikes to the gamma rhythm suggests a role in memory enhancement in the presence of kainate receptor activation.
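The Rayleigh test the abstract mentions can be sketched as follows. This is a minimal illustration of the statistic, not the authors' analysis pipeline; the spike-count, concentration, and first-order p-value approximation are assumptions of the sketch.

```python
import numpy as np

def rayleigh_test(phases):
    """Rayleigh test for non-uniformity of circular data.

    phases: spike times expressed as phases (radians) of the gamma cycle.
    Returns the mean resultant length R (0 = uniform, 1 = perfect
    phase-locking) and an approximate p-value p ~ exp(-n * R**2).
    """
    phases = np.asarray(phases)
    n = phases.size
    R = np.abs(np.mean(np.exp(1j * phases)))   # mean resultant length
    p = np.exp(-n * R ** 2)                    # first-order approximation
    return R, p

rng = np.random.default_rng(0)
# Phase-locked spikes: phases concentrated near pi/2 of the gamma cycle
R_locked, p_locked = rayleigh_test(rng.normal(np.pi / 2, 0.3, 200))
# Uniformly distributed spikes: no phase preference
R_uniform, p_uniform = rayleigh_test(rng.uniform(0, 2 * np.pi, 200))
```

A large R with a small p-value, as for the concentrated phases above, is what the abstract reports as statistically significant phase-locking.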
There are two major problems in weighted closed sequential pattern mining. First, existing methods consider only the weights of items and ignore the time-interval information of data elements during the mining process; second, existing weighted closed sequential pattern mining algorithms need to scan the sequence database many times or construct numerous intermediate databases. To address these problems, we propose a memory-based algorithm, MIWCSpan (Memory Indexing for Weighted Closed Sequential pattern mining). In the algorithm, we define a novel sequence weighting approach to find more interesting sequential patterns; it considers both the weights of sequence items and the time-intervals of the data elements. Moreover, an improved time-interval-based index set, p-iidx, is defined. This structure is a set of triples that store a pointer to the sequence containing p, the time-interval of p in that sequence, and the position where p occurs. In the mining process, MIWCSpan first scans the sequence database to read it into memory. It then recursively applies the find-then-index technique to find the items that can constitute a weighted closed sequential pattern and constructs a p-iidx for each candidate weighted closed sequential pattern. Finally, the algorithm uses close-detection to mine the weighted closed sequential patterns efficiently. Experimental results show that MIWCSpan outperforms existing algorithms in running time and has good scalability.
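The p-iidx triples described above can be sketched as below. The (timestamp, item) event representation and the reading of "time-interval" as elapsed time from the sequence's first event are assumptions of this sketch, not necessarily the paper's exact layout.

```python
def build_p_iidx(db, p):
    """Build the index set p-iidx for item p: one triple per occurrence,
    (index of the containing sequence, time-interval of p measured from
    the sequence's first event, position of p within the sequence)."""
    p_iidx = []
    for sid, seq in enumerate(db):
        start = seq[0][0]                      # timestamp of first event
        for pos, (t, item) in enumerate(seq):
            if item == p:
                p_iidx.append((sid, t - start, pos))
    return p_iidx

# Two in-memory sequences of (timestamp, item) events
db = [
    [(0, 'a'), (3, 'b'), (7, 'a')],
    [(0, 'b'), (2, 'a')],
]
# build_p_iidx(db, 'a') → [(0, 0, 0), (0, 7, 2), (1, 2, 1)]
```

With such an index per item, the find-then-index step can extend a pattern by consulting only the indexed occurrences rather than rescanning the whole database.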
Most tree-based algorithms for mining frequent patterns over uncertain data streams store a large number of tree nodes and record the corresponding stream information, which causes massive storage overhead. In this paper, we propose CTBVT, an algorithm based on a compressed tree and a bit vector table for mining frequent patterns over uncertain data streams. The uncertain data stream is first loaded into a probability-vector table in which items are indexed by transaction; unlike other bit vector tables, the occurrence probabilities of items are stored in it. When the window slides, all columns of the probability-vector table are left-shifted by m bits simultaneously, where m is the number of transactions by which the window slides. We also propose a compressed tree in which items with different probabilities are stored in the same tree node, which significantly reduces the number of tree nodes. The items and their probabilities in a tree node are then converted, via the bit vector table, into a binary bit vector; the number of 1s in this vector is the frequency of the tree node. Each leaf node of the tree is connected to an array that stores the combinations of all items on its path together with their expected supports, and the leaf nodes themselves are stored in a LeafList. Finally, we scan the arrays linked to the leaf nodes in the LeafList and compare the expected support stored in each array with a minimum support threshold minSup to obtain all the frequent itemsets, which reduces mining time dramatically. Experimental results show that CTBVT is efficient and scalable.
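The window-slide shift and the count-of-1s frequency can be sketched as below. The paper's table stores probabilities; this sketch shows only the derived binary bit vectors, and representing each column as a Python integer bitmask is a choice of the sketch.

```python
def slide_window(columns, m, window_size):
    """When the window slides by m transactions, left-shift every item's
    binary bit vector by m and truncate to window_size bits; the freed
    low-order bits are then filled in from the new transactions."""
    mask = (1 << window_size) - 1
    return {item: (bits << m) & mask for item, bits in columns.items()}

def frequency(bits):
    """The number of 1s in an item's binary bit vector is its support
    count within the current window."""
    return bin(bits).count('1')

# Window of 4 transactions; bit i set = item present in transaction i
columns = {'a': 0b1011, 'b': 0b0110}
shifted = slide_window(columns, 1, 4)
# shifted == {'a': 0b0110, 'b': 0b1100}
```

Because the shift and the popcount are cheap machine operations, updating all columns on a slide avoids reprocessing the transactions that remain in the window.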
In high-dimensional data spaces, the data are inherently sparse and clusters tend to exist in different subspaces, which makes traditional methods no longer suitable. In this paper, we present SCFES, a subspace clustering algorithm based on finding effective spaces. First, we define the effective dimension: by computing relative entropy we remove redundant dimensions that degrade clustering accuracy. Second, according to the data distribution in the effective dimensions, we obtain effective intervals by merging adjacent intervals; an effective space is composed of effective intervals. Third, we extend the density estimator based on an undirected acyclic connected graph with weights so as to estimate the expectation of clusters existing in a space, and combine it with the monotonicity of the clustering criterion from the CLIQUE algorithm to prune candidates; this yields the effective spaces. Finally, we adopt a sibling-tree structure to store all the effective spaces and use the density-based DBSCAN algorithm to generate maximal subspace clusters in the effective spaces. Experimental results show that SCFES effectively finds arbitrarily shaped and positioned clusters in different subspaces, and has better clustering quality and scalability.
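The relative-entropy filtering step can be sketched as follows: score each dimension by the KL divergence of its empirical histogram from the uniform distribution, so near-uniform (redundant) dimensions score near zero. The bin count and the uniform reference are assumptions of the sketch, not the paper's exact formulation.

```python
import numpy as np

def relative_entropy_per_dim(X, bins=10):
    """Relative entropy (KL divergence) of each dimension's empirical
    histogram against the uniform distribution over the same bins.
    Dimensions with near-zero scores carry little clustering structure."""
    n, d = X.shape
    scores = []
    for j in range(d):
        hist, _ = np.histogram(X[:, j], bins=bins)
        p = hist / n                 # empirical bin probabilities
        q = 1.0 / bins               # uniform reference probability
        nz = p > 0                   # 0 * log(0) terms contribute nothing
        scores.append(float(np.sum(p[nz] * np.log(p[nz] / q))))
    return scores

rng = np.random.default_rng(1)
# Dimension 0 carries structure (two clusters); dimension 1 is uniform noise
X = np.column_stack([
    np.concatenate([rng.normal(0.2, 0.02, 250), rng.normal(0.8, 0.02, 250)]),
    rng.uniform(0, 1, 500),
])
scores = relative_entropy_per_dim(X)
```

A threshold on these scores then separates effective dimensions from redundant ones before the interval-merging step.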
To extend the lifetime of a ZigBee network, the definition of a node's boundary is proposed. First, all the information for each node's boundary is stored when the ZigBee network is built. Then, th...
Most existing vulnerability taxonomies classify vulnerabilities by their idiosyncrasies, weaknesses, flaws, faults, and so on. The disadvantage of such taxonomies is that the classification standard is not unified and the resulting classes overlap. To solve this problem, we propose VUNClique, a virtual grid-based clustering algorithm for uncertain data applied to a vulnerability database. First, this paper transforms the vulnerability database into an uncertain dataset using an existing vulnerability database pretreatment model. Second, we define a virtual grid structure in which cells are divided into real cells and virtual cells, but only the real cells, which contain data objects, are stored in memory. The probability attribute value similarity is defined to handle non-numeric attributes: it compares the number of non-numeric attributes with the same value between tuples to measure their similarity. We provide a secondary partition algorithm to improve the similarity between tuples in the same cell; the algorithm merges a tuple into the high-density neighbor cell with which it has the maximum probability attribute value similarity. Then, a novel cluster identification algorithm is provided to cluster the high-density real cells; it can identify clusters of arbitrary shapes by traversing the real cells twice. Finally, we run performance experiments on the uncertain dataset transformed from the NVD vulnerability database. The experimental results show that VUNClique can find clusters of arbitrary shapes and greatly improves clustering efficiency.
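The non-numeric similarity described above, counting attributes on which two tuples agree, can be sketched as below. The NVD-style field names are hypothetical, not the paper's schema, and normalizing by the number of attributes is an assumption of the sketch.

```python
def attribute_value_similarity(t1, t2, categorical_keys):
    """Similarity of two tuples over their non-numeric attributes:
    the fraction of those attributes on which the tuples share a value."""
    matches = sum(1 for k in categorical_keys if t1.get(k) == t2.get(k))
    return matches / len(categorical_keys)

# Hypothetical NVD-style categorical attributes
keys = ['access_vector', 'severity', 'vuln_type']
v1 = {'access_vector': 'NETWORK', 'severity': 'HIGH', 'vuln_type': 'overflow'}
v2 = {'access_vector': 'NETWORK', 'severity': 'HIGH', 'vuln_type': 'sql_injection'}
sim = attribute_value_similarity(v1, v2, keys)   # agree on 2 of 3 attributes
```

In the secondary partition step, each tuple would be compared this way against its high-density neighbor cells and merged into the one with the highest score.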
High-dimensional data clustering is an important issue in data mining. First, the records in the dataset are mapped to the vertices of a hypergraph, and the hyperedges of the hypergraph are composed of the vertices which hav...
Traditional clustering algorithms often fail to detect meaningful clusters in high-dimensional data spaces. To address this shortcoming, we propose GDRH-Stream, a clustering method for high-dimensional data streams based on attribute relativity and grid density, which consists of an online component and an offline component. First, the algorithm filters out redundant attributes by computing relative entropy. Then we define a weighted attribute relativity measure, estimate the relativity of the non-redundant attributes, and form the attribute triples. Finally, the best interesting subspaces are found by searching the attribute triples. In the online component, GDRH-Stream maps each data object into a grid and updates the grid's characteristic vector. In the offline component, when a clustering request arrives, the best interesting subspaces are generated from the attribute relativity; the original grid structure is then projected onto each subspace to form a new grid structure, and clustering is performed on the new grid structure using a density-grid-based approach. Experimental results show that the GDRH-Stream algorithm has better clustering quality and scalability.
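The online step, mapping each arriving object to a grid cell and updating that cell's characteristic vector, can be sketched as below. Reducing the characteristic vector to (decayed density, last update time) and the decay factor are assumptions of the sketch, not values from the paper.

```python
import math

def grid_cell(point, cell_width):
    """Map a data object to its grid cell (one integer index per dimension)."""
    return tuple(math.floor(x / cell_width) for x in point)

def update_grid(grids, point, cell_width, t, decay=0.998):
    """Online step: decay the cell's density for the time elapsed since
    its last update, add the new point's contribution, and record t."""
    key = grid_cell(point, cell_width)
    density, last_t = grids.get(key, (0.0, t))
    grids[key] = (density * decay ** (t - last_t) + 1.0, t)
    return key

grids = {}
update_grid(grids, (0.15, 0.82), cell_width=0.1, t=0)
key = update_grid(grids, (0.12, 0.88), cell_width=0.1, t=5)  # same cell (1, 8)
```

Keeping only per-cell summaries like this is what lets the offline component project the grid onto a subspace and cluster dense cells without revisiting the raw stream.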