In this paper, we analyze some clustering algorithms that have been widely employed in the past to support the comprehension of Web applications. To this end, we have defined an approach to identify static pages that are duplicated or cloned at the content level. This approach is based on a process that first computes the dissimilarity between Web pages using latent semantic indexing, a well-known information retrieval technique, and then groups similar pages using clustering algorithms. We consider five instances of this process, each based on three variants of the agglomerative hierarchical clustering algorithm, a divisive clustering algorithm, the k-means partitional clustering algorithm, and a widely employed partitional competitive clustering algorithm, namely Winner Takes All. To assess the proposed approach, we have used the static pages of three Web applications and one static Web site.
The complexity and size of digital circuits have grown exponentially, and today's circuits can contain millions of logic elements. Clustering algorithms have become popular due to their ability to reduce circuit sizes. Clustering enables circuit layout design problems, such as partitioning and placement, to be solved faster and with higher quality. In this paper, current clustering algorithms and their effect on industry test benchmarks are studied. The results show that score-based clustering algorithms are the most successful clustering techniques for circuit layout design and deserve further research.
ISBN: 9781728196565 (digital)
ISBN: 9781728196572 (print)
This paper investigates how clustering algorithms and Recency, Frequency, and Monetary value (RFM) analysis can be applied to online transactions to derive strategies from customer purchasing behavior. Along with performing RFM analysis on the retail dataset, clustering algorithms such as Mean-shift, Density-Based Spatial Clustering of Applications with Noise (DBSCAN), Agglomerative clustering, and K-Means were utilized. By comparing these clustering algorithms, we have found valuable customer groups based on RFM values.
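As an illustration of the RFM step described above, the following is a minimal sketch, not the authors' implementation. The transaction tuples, the `rfm_table` helper, and the reference date are all hypothetical; recency is taken here as days since the last purchase, frequency as the number of purchases, and monetary value as total spend, which is one common convention.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 40.0),
    ("c1", date(2024, 3, 1), 60.0),
    ("c2", date(2023, 11, 20), 15.0),
    ("c3", date(2024, 2, 10), 200.0),
    ("c3", date(2024, 2, 25), 120.0),
    ("c3", date(2024, 3, 9), 80.0),
]

def rfm_table(rows, today):
    """Return {customer: (recency_days, frequency, monetary)}."""
    table = {}
    for customer, day, amount in rows:
        # Track the latest purchase date, purchase count, and total spend.
        last, freq, total = table.get(customer, (day, 0, 0.0))
        table[customer] = (max(last, day), freq + 1, total + amount)
    return {c: ((today - last).days, freq, total)
            for c, (last, freq, total) in table.items()}

# The reference date is an assumption chosen for this example.
rfm = rfm_table(transactions, today=date(2024, 3, 10))
```

The resulting three-value vectors would typically be scaled before being passed to a clustering algorithm such as K-Means, since the monetary value otherwise dominates the distance computation.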
A perceptual image hash function maps an image to a short binary string, based on an image's appearance to the human eye. Perceptual image hashing is useful in image databases, watermarking, and authentication. In this paper, we decouple image hashing into feature extraction (intermediate hash) followed by data clustering (final hash). For any perceptually significant feature extractor, we propose a polynomial-time heuristic clustering algorithm that automatically determines the final hash length needed to satisfy a specified distortion. We prove that the decision version of our clustering problem is NP-complete. Based on the proposed algorithm, we develop two variations to facilitate perceptual robustness vs. fragility trade-offs. We test the proposed algorithms against Stirmark attacks.
A Mobile Ad hoc Network (MANET) is a multihop wireless network in which the mobile nodes are dynamic in nature and have limited bandwidth and minimal battery power. Because of this challenging environment, the mobile nodes can be grouped into clusters to achieve better stability and scalability. Grouping the mobile nodes is called clustering, in which a leader node is elected to manage the entire network. In this paper, we consider various approaches to clustering that focus on different performance metrics. Each cluster contains a particular node, called the cluster head, which is elected according to a specific metric or a combination of metrics such as mobility, energy, degree, and weight. In this survey paper, we study clustering schemes such as mobility-based clustering, energy-efficient clustering, connectivity-based clustering, and weight-based clustering, and discuss their advantages and disadvantages.
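To make the idea of electing a cluster head from a combination of metrics concrete, here is a minimal sketch. It is an illustrative assumption, not a scheme taken from the survey: the node names, metric values, `WEIGHTS`, and the `score` and `elect_cluster_head` helpers are all hypothetical, and mobility and energy are assumed to be normalized to the range 0 to 1.

```python
# Hypothetical per-node metrics. Lower mobility is better;
# higher residual energy and higher degree (neighbor count) are better.
nodes = {
    "n1": {"mobility": 0.9, "energy": 0.4, "degree": 3},
    "n2": {"mobility": 0.2, "energy": 0.8, "degree": 4},
    "n3": {"mobility": 0.5, "energy": 0.9, "degree": 2},
}

# Assumed weights for the combined metric; they sum to 1.
WEIGHTS = {"mobility": 0.4, "energy": 0.4, "degree": 0.2}

def score(metrics, max_degree):
    """Combined weight of a node: higher means a better cluster head."""
    return (WEIGHTS["mobility"] * (1.0 - metrics["mobility"])
            + WEIGHTS["energy"] * metrics["energy"]
            + WEIGHTS["degree"] * metrics["degree"] / max_degree)

def elect_cluster_head(cluster):
    """Return the id of the node with the highest combined score."""
    max_degree = max(m["degree"] for m in cluster.values()) or 1
    # Iterating over sorted ids makes ties resolve to the smallest id.
    return max(sorted(cluster), key=lambda n: score(cluster[n], max_degree))

head = elect_cluster_head(nodes)
```

With these example values the stable, well-connected node `n2` wins. Changing `WEIGHTS` shifts the election toward a mobility-based, energy-efficient, or connectivity-based policy.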
The authors introduce a concept for a global classification of remote sensing images in large archives, for example archives covering the whole globe. Such an archive will be created, for example, after the Shuttle Radar Topography Mission in 1999. The classification is realized as a two-step procedure: unsupervised clustering followed by supervised hierarchical classification. Features derived from different and non-commensurable models are combined using an extended k-means clustering algorithm and supervised hierarchical Bayesian networks that incorporate any available prior information about the domain.
ISBN: 9798331520861 (digital)
ISBN: 9798331520878 (print)
With the rapid growth of geographic data generated by various sensors and end equipment, new opportunities arise for research and practical applications. However, effective utilization of this data often requires dividing geospatial space into smaller, manageable regions. An important challenge is to ensure that these regions, each grouping nearby data points, are balanced in terms of data size distribution (e.g., population density, resource allocation), which creates a dual optimization problem. The contributions of this paper are twofold. First, we propose a balance-driven partitioning algorithm, which is a coordinate-descent based algorithm using a dynamic programming technique. Second, we present a clustering-centric algorithm that extends the classic k-means algorithm with an imbalance-penalized function, so that geographic data is clustered not only by geographic location but also with balanced per-cluster total sizes. Finally, to evaluate the efficiency of the proposed algorithms, we conducted experiments on a trace geographic dataset and compared the results with those of existing clustering algorithms. Our results demonstrate that the proposed algorithms not only achieve competitive clustering results but also exhibit better performance in terms of data-size balance.
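The abstract does not give the form of the imbalance-penalized function, so the following is only a sketch of the general idea under stated assumptions. In this hypothetical `penalized_kmeans`, the assignment cost of a point is its squared distance to a centroid plus `lam` times the data size already assigned to that cluster in the current pass. The greedy, order-dependent assignment and the naive initialization are simplifications for illustration, not the paper's algorithm.

```python
def penalized_kmeans(points, sizes, k, lam, iters=20):
    """k-means variant with an assumed linear load penalty.

    points: list of (x, y); sizes: data size carried by each point.
    Returns (labels, centroids).
    """
    centroids = [points[i] for i in range(k)]  # naive init: first k points
    labels = [0] * len(points)
    for _ in range(iters):
        loads = [0.0] * k  # total size assigned to each cluster so far
        for i, (x, y) in enumerate(points):
            def cost(c):
                cx, cy = centroids[c]
                # Squared distance plus penalty on the cluster's current load.
                return (x - cx) ** 2 + (y - cy) ** 2 + lam * loads[c]
            labels[i] = min(range(k), key=cost)
            loads[labels[i]] += sizes[i]
        for c in range(k):
            members = [points[i] for i in range(len(points)) if labels[i] == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    return labels, centroids

pts = [(0.0, 0.0), (10.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
plain, _ = penalized_kmeans(pts, [1, 1, 1, 1], k=2, lam=0.0)
balanced, _ = penalized_kmeans(pts, [1, 1, 1, 1], k=2, lam=100.0)
```

With `lam=0.0` the sketch reduces to plain k-means and groups three points together. With a large `lam` it trades geographic compactness for equal per-cluster size, which is the tension the abstract describes.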
ISBN: 9781424458721; 9781424458745 (print)
Topology control is one of the most important aspects of Wireless Sensor Networks (WSNs), which are a current hotspot of research and application. This paper first summarizes the characteristics of typical WSNs in comparison with traditional wireless networks. It then studies recent representative clustering algorithms in this area, summarizing their characteristics and application areas, noting their limitations, and pointing out future trends in clustering algorithms for WSNs.
ISBN: 0769512186 (print)
Clustering aims at discovering groups and identifying interesting distributions and patterns in data sets. Researchers have extensively studied clustering since it arises in many application domains in engineering and the social sciences. In recent years, the availability of huge transactional and experimental data sets and the emerging requirements of data mining have created a need for clustering algorithms that scale and can be applied in diverse domains. The paper surveys clustering methods and approaches available in the literature in a comparative way. It also presents the basic concepts, principles, and assumptions upon which the clustering algorithms are based. Another important issue is the validity of the clustering schemes that result from applying these algorithms. This is also related to the inherent features of the data set under consideration. We review and compare clustering validity measures available in the literature. Furthermore, we illustrate the issues that are under-addressed by recent algorithms and outline new research directions.
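As one concrete example of a clustering validity measure, the sketch below computes the mean silhouette coefficient. The abstract does not say which measures the survey covers, so the choice of the silhouette is an assumption made for illustration. The `silhouette` helper and the sample points are hypothetical. The function assumes at least two clusters and Euclidean distance.

```python
from math import dist  # Euclidean distance, available since Python 3.8

def silhouette(points, labels):
    """Mean silhouette coefficient; assumes at least two distinct labels."""
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    total = 0.0
    for p, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # a singleton contributes 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != label)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(pts, [0, 0, 1, 1])  # groups nearby points together
bad = silhouette(pts, [0, 1, 0, 1])   # splits nearby points apart
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to another cluster than to their own. Here the natural grouping scores much higher than the scrambled one.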
Wireless sensor networks (WSNs) have many applications in military services, health centers, and industry, as well as in home surveillance. In such networks, the energy efficiency of nodes and the lifetime of the network are the main concerns. Different clustering approaches are used to optimize the energy consumption of sensor nodes. Clustering also improves the scalability of sensor networks. We review the centralized, distributed, and hybrid clustering approaches used in sensor networks. Recently, there has been much research on developing algorithms that use equal and unequal clustering techniques. These techniques use the residual energy of nodes and the distance to the base station as parameters for selecting cluster heads. This paper examines various distributed and hybrid clustering algorithms reported to date by authors actively working in this area. We also briefly discuss the operation of these algorithms and compare them on the basis of various clustering attributes.