As product reviews accumulate on online shopping sites, customers increasingly need those reviews to be analyzed automatically. In some previous studies, clustering algorithms have been proved ...
ISBN: 9789462821859 (print)
In this paper, we present a clustering-based method for processing 3D seismic data and automatically mapping seismic horizons in the presence of discontinuities. Our approach uses the cosine of instantaneous phase attributes and applies Principal Component Analysis (PCA) to the original datasets of trace shapes to improve the quality of the original samples. We also propose a measure for assessing the quality of the clusters used to map the seismic horizons. Based on this measure, we show that using the cosine of instantaneous phase attributes together with PCA greatly improves the mapping of seismic horizons.
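The cosine of instantaneous phase mentioned above can be illustrated with a minimal sketch. It assumes the standard definition (the real part of the analytic signal divided by its envelope) and uses a naive O(n^2) DFT so that it needs only the standard library. The function names `analytic_signal` and `cosine_of_phase` are hypothetical, and this is not the authors' implementation; the PCA and clustering steps are not shown.

```python
import cmath
import math


def analytic_signal(trace):
    """Analytic signal of a real-valued trace, computed with a naive DFT."""
    n = len(trace)
    # Forward DFT of the trace.
    spectrum = [
        sum(trace[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist for even n), double the positive frequencies,
    # and zero the negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        positive = range(1, n // 2)
    else:
        positive = range(1, (n + 1) // 2)
    for k in positive:
        weights[k] = 2.0
    # Inverse DFT of the one-sided spectrum.
    return [
        sum(weights[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)) / n
        for t in range(n)
    ]


def cosine_of_phase(trace):
    """Cosine of instantaneous phase: real part divided by the envelope."""
    result = []
    for z in analytic_signal(trace):
        envelope = abs(z)
        # Guard against division by zero where the envelope vanishes.
        result.append(z.real / envelope if envelope > 1e-12 else 0.0)
    return result
```

Because the attribute divides out the envelope, it is insensitive to trace amplitude: scaling a trace by a positive constant leaves the result unchanged, which is one plausible reason it helps when comparing trace shapes.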
In some application contexts, data are better described by a matrix of pairwise dissimilarities than by a vector representation. Clustering and topographic mapping algorithms have been adapted to this type of data, either via the generalized median principle or, more recently, with the so-called relational approach, in which prototypes are represented by virtual linear combinations of the original observations. One drawback of those methods is their complexity, which scales as the square of the number of observations, mainly because they use dense prototype representations: each prototype is obtained as a virtual combination of all the elements of its cluster (at least). In this paper, we propose using a sparse representation of the prototypes to obtain relational algorithms with sub-quadratic complexity.
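To make the relational approach concrete, the sketch below evaluates the squared distance between an observation and a prototype that exists only as a linear combination of observations. It assumes the usual relational identity for squared Euclidean dissimilarities, d^2(x_j, m) = (D a)_j - 0.5 * a^T D a with coefficients summing to 1. The function name and the dictionary-based sparse storage are illustrative assumptions, not the paper's algorithm.

```python
def relational_sq_distance(D, alpha, j):
    """Squared distance from observation j to the prototype sum_i alpha[i] * x_i.

    D     -- matrix (list of lists) of squared pairwise dissimilarities
    alpha -- dict mapping observation index -> coefficient; coefficients are
             assumed to sum to 1 and only the nonzero ones are stored
    """
    # Linear term: costs one pass over the nonzero coefficients.
    linear = sum(a * D[j][i] for i, a in alpha.items())
    # Quadratic term: costs the square of the number of nonzero coefficients.
    quadratic = sum(
        a * b * D[i][k]
        for i, a in alpha.items()
        for k, b in alpha.items()
    )
    return linear - 0.5 * quadratic
```

With a dense prototype every observation in the cluster carries a coefficient, so both terms grow with cluster size; storing only a few nonzero coefficients bounds the work per distance evaluation, which is the intuition behind the sparse representation.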
For massive data, traditional clustering methods often require repeated iterations and calculations, which consume a lot of time and resources. Therefore, this article chooses to use big data clustering algorithms to ...
Peer-to-Peer (p2p) networks are used by millions for searching content. Recently, clustering algorithms were shown to be useful for helping users find content in such networks. However, p2p networks often exhibit powe...
Semi-supervised clustering uses a small amount of supervised data in the form of pairwise constraints to improve the clustering performance. However, most current methods are passive in the sense that the pairwise con...
ISBN: 9781604234961 (print)
A symmetric dataset is defined as an n x n dataset that is equal to its own transpose. In data mining algorithms that employ a vertical data structure, symmetric datasets are used, for example, in clustering gene expression patterns and in density-based clustering algorithms. In the former, the symmetric dataset is the pre-computed pairwise similarity of genes, while in the latter it is the pre-computed pairwise Euclidean distance of objects. We observed that when n is large, arranging a symmetric dataset as n x n is impractical for vertical algorithms because a large number of vertical bit sequences must be generated. In this paper, we propose an alternative arrangement of symmetric datasets for vertical clustering algorithms. The dataset is arranged in n' x m, where n' m, instead of n x n, but the number of elements remains the same. In other words, the cardinality of the dataset is extended, but the dimension is narrowed. More importantly, the results should be similar to those obtained when the symmetric dataset is arranged as n x n. The experimental results show that the proposed arrangement greatly reduces the time needed to generate and load the vertical bit vectors.
The paper considers the Gaussian mixtures model and the possibilities of its application for solving clustering tasks. First, the case is considered when the Gaussian mixtures model is formed in such a way that all th...
Among the power system corrective controls, defensive islanding is considered as the last resort to secure the system from severe cascading contingencies. The primary motive of defensive islanding is to limit the affe...
ISBN: 0819440620 (print)
Clustering algorithms are useful whenever one needs to classify an excessive amount of information into a set of manageable and meaningful subsets. Using an analogy from vector analysis, a clustering algorithm can be said to divide state space into discrete "chunks" such that each vector lies within one chunk. These vectors are best thought of as sets of features. A canonical vector is chosen for each region of state space to represent all vectors located within that region. This paper presents a survey of clustering algorithms. It pays particular attention to those algorithms that require the least a priori knowledge about the domain being clustered. In the current work, an algorithm is compelling to the extent that it minimizes assumptions about the distribution of the vectors being classified.
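The "chunk" picture above can be sketched as a nearest-canonical-vector assignment. The example assumes squared Euclidean distance and that the canonical vectors are already given; both are illustrative assumptions, and the names `squared_distance` and `assign_to_chunks` are hypothetical. The survey itself covers many ways of choosing the canonical vectors, which this sketch does not attempt.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def assign_to_chunks(vectors, canonical_vectors):
    """Return, for each vector, the index of its nearest canonical vector."""
    return [
        min(
            range(len(canonical_vectors)),
            key=lambda c: squared_distance(v, canonical_vectors[c]),
        )
        for v in vectors
    ]
```

Each index in the returned list identifies the region of state space that the corresponding vector falls into; ties go to the canonical vector with the lowest index.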