ISBN: (print) 9781479954865
Recent developments in the Semantic Web have opened new research into the design of search engines that organize and manage semantic data. The core of a search engine is its indexing system, which consists of two main parts: data storage and data retrieval. With the increasing amount of semantic data, the most important goal for an indexing system is the ability to store large amounts of data and retrieve them as fast as possible. In other words, building a scalable indexing system is one of the major challenges in semantic search engines. In this paper, a scalable method for indexing RDF data is presented that uses HBase, a NoSQL database management system, as its underlying data store. HBase provides random access to massive data on the distributed Hadoop framework and is therefore a suitable option for managing such data. Further, because entity-based queries are important and popular, a new schema based on a clustering algorithm is designed to answer this type of query effectively. The experimental evaluation shows that the proposed indexing system is effective in improving the scalability and retrieval of RDF data.
ISBN: (print) 9781424453849
With existing telephone networks nearing saturation and demand for wired and wireless services continuing to grow, telecommunication engineers are looking at technologies that will deliver sites that satisfy the required demand and grade-of-service constraints at the minimum possible cost. The city data is given as a map of streets, the coordinates of the intersection nodes, the distribution of the subscribers' loads within the city, and the locations of the mobile network base stations in the city. Also given are the available cable sizes, the cost per unit for each size, and the maximum wire distance that satisfies the allowed grade of service. Net Plan (Network Planning package) is developed in the spirit of the DBSCAN and agglomerative clustering algorithms. In this paper we study the problem of congestion in the Multi Service Access Node (MSAN) caused by the increasing number of subscribers, which degrades the grade of service and at times makes it impossible to add new subscribers. The Net Plan algorithm is introduced to solve this problem. It is a density-based clustering algorithm that uses the physical shortest paths over the available routes and the subscriber loads. Reducing cost is also a goal of this paper, so in the second phase of the clustering process we modify the agglomerative algorithm to merge neighboring clusters that satisfy a certain condition. Experimental results and analysis indicate that the combination of the two algorithms is effective, leading to minimum network construction cost and the best grade of service.
Partitioning objects into closely related groups that have different states makes it possible to understand the underlying structure of the data set under study. Different kinds of similarity measures are commonly used with clustering algorithms to find an optimal clustering, or one closely akin to the original clustering. Using shrinkage-based and rank-based correlation coefficients, which are known to be robust, the recovery level of six chosen clustering algorithms is evaluated using Rand's C values. The recovery levels obtained using the weighted likelihood estimate of the correlation coefficient are compared with the results obtained using those correlation coefficients in agglomerative clustering algorithms.
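As an illustration of the recovery measure mentioned above, the following is a minimal sketch of a pair-counting agreement score between two clusterings, assuming that "Rand's C" refers to the statistic of Rand (1971): the fraction of object pairs on which two partitions agree (both place the pair together, or both place it apart). The function name `rand_index` and the label-list representation are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of object pairs on which two clusterings agree.

    labels_a[i] and labels_b[i] are the cluster labels of object i
    in the two clusterings being compared (hypothetical representation).
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same objects")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two objects")
    # A pair agrees when both clusterings put it together or both split it.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


if __name__ == "__main__":
    original = [0, 0, 1, 1]
    recovered = [1, 1, 0, 0]  # same partition, different label names
    print(rand_index(original, recovered))
```

The score depends only on the partitions and not on the label names, so a clustering that recovers the original grouping exactly scores 1 even when its cluster identifiers differ.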
Distributional and asymptotic results on the moment of Rand's C-k statistic were derived by DuBien and Warde [1981. Some distributional results concerning a comparative statistic used in cluster analysis. ASA Proceedings of the Social Statistics Section, 309-313.]. Based on those results, a method for predicting the number of clusters is suggested that applies various agglomerative clustering algorithms. In the procedure, methods using different indexes are examined and compared based on the concept of agreement (or disagreement) between the clusterings generated by different clustering algorithms on the same set of data. Our method, which has practical generality, works better than the other methods and assigns statistical meaning to the C-k values used to determine the number of clusters from the comparison. (C) 2005 Elsevier B.V. All rights reserved.
Classified vector quantisation (CVQ) of images is a vector quantisation-based coding method for preserving perceptual features while retaining simple vector quantiser distortion measures during the codebook design and encoding processes. In the paper, a new algorithm for CVQ codebook design, the 'classified nearest neighbour clustering' (CNNC) algorithm, is presented. The CNNC algorithm is based on a classification process for small image blocks and on an agglomerative clustering algorithm, and is used to design simultaneously M codebooks for M different classes defined for a CVQ system. The CNNC algorithm can be used with squared error and weighted squared error distortion measures, employing one of two optimisation criteria which are presented and tested in the paper. In addition, a fast search algorithm is presented, aimed at reducing the computational effort encountered during codebook design. The CNNC algorithm is shown to provide a systematic and effective method for CVQ codebook design, making CVQ more feasible and easier to implement.
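To make the idea of agglomerative codebook design under a squared error measure concrete, here is a minimal sketch for a single class. It is not the CNNC algorithm itself, whose classification step, optimisation criteria, and fast search are not described in the abstract. It assumes a generic nearest-neighbour merging rule: every training vector starts as its own cluster, and the pair whose merge adds the least squared error is merged repeatedly until the requested codebook size is reached. The names `merge_cost` and `agglomerative_codebook` are hypothetical.

```python
def merge_cost(a, b):
    """Increase in total squared error caused by merging clusters a and b.

    Each cluster is a (centroid, count) pair (an assumed representation).
    """
    (ca, na), (cb, nb) = a, b
    dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * dist2


def agglomerative_codebook(vectors, size):
    """Merge the cheapest pair of clusters until `size` codewords remain."""
    if size < 1:
        raise ValueError("codebook size must be at least 1")
    clusters = [(tuple(float(x) for x in v), 1) for v in vectors]
    while len(clusters) > size:
        # Exhaustive search over all pairs; the paper's fast search is not modelled.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda p: merge_cost(clusters[p[0]], clusters[p[1]]),
        )
        (ci, ni), (cj, nj) = clusters[i], clusters[j]
        # The merged centroid is the count-weighted mean of the two centroids.
        merged = (
            tuple((ni * x + nj * y) / (ni + nj) for x, y in zip(ci, cj)),
            ni + nj,
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [centroid for centroid, _ in clusters]


if __name__ == "__main__":
    blocks = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(sorted(agglomerative_codebook(blocks, 2)))
```

In a classified setting, one would presumably run such a procedure separately on the training blocks of each of the M classes; how CNNC couples the M designs is left open here because the abstract does not say.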