A Cloud may be seen as a type of flexible computing infrastructure consisting of many compute nodes, where resizable computing capacities can be provided to different customers. To fully harness the power of the Cloud...
ISBN: 9783037851555 (print)
Data distribution is the basic operation of P2P applications (file sharing and streaming services) and a key factor in the performance of P2P systems. However, few research works examine data distribution in P2P applications from the perspective of the whole system. In this paper we study data distribution in P2P applications with the aim of decreasing the system distribution load. We formally define the distribution load of P2P systems and analyze mathematically how to decrease the system load quickly. Moreover, we give a feasible fast distribution algorithm based on our theoretical conclusions. The experimental results show that our algorithm significantly improves data distribution speed and load balance.
Mining repeated patterns from HTML documents is a key step towards Web-based data mining and knowledge extraction. Many web crawling applications need efficient repeated-pattern mining techniques to generate the...
ISBN: 9781577355120 (print)
There are two key issues for information diffusion in the blogosphere: (1) blog posts are usually short, noisy, and contain multiple themes; (2) information diffusion through the blogosphere is primarily driven by the "word-of-mouth" effect, which makes topics evolve very fast. This paper presents a novel topic tracking approach that deals with these issues by modeling a topic as a semantic graph, in which the semantic relatedness between terms is learned from Wikipedia. For a given topic or post, the named entities, Wikipedia concepts, and semantic relatedness are extracted to generate the graph model. Noise is filtered out by a graph clustering algorithm. To handle topic evolution, the topic model is enriched using Wikipedia as background knowledge. Furthermore, graph edit distance is used to measure the similarity between a topic and its posts. The proposed method is tested on real-world blog data. Experimental results show the advantage of the proposed method in tracking topics in short, noisy texts.
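As a rough illustration of comparing a topic graph with a post graph by edit distance, the sketch below makes a strong simplifying assumption that the abstract does not state: every node carries a unique term label, and only unit-cost insertions and deletions of nodes and edges are allowed. Under that assumption the edit distance reduces to counting the nodes and edges that appear in only one of the two graphs. The function names, the normalization into a similarity score, and the example terms are all hypothetical and are not taken from the paper.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = FrozenSet[str]
Graph = Tuple[Set[str], Set[Edge]]


def make_graph(terms: Iterable[str], related_pairs: Iterable[Tuple[str, str]]) -> Graph:
    """Build an undirected graph; edges between unknown terms and self-loops are dropped."""
    nodes = set(terms)
    edges = {
        frozenset((a, b))
        for a, b in related_pairs
        if a != b and a in nodes and b in nodes
    }
    return nodes, edges


def edit_distance(g1: Graph, g2: Graph) -> int:
    """Unit-cost edit distance for uniquely labeled graphs (assumption, see lead-in).

    Each node or edge present in exactly one graph costs one insertion or deletion.
    """
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    return len(nodes1 ^ nodes2) + len(edges1 ^ edges2)


def similarity(g1: Graph, g2: Graph) -> float:
    """Hypothetical normalization: 1.0 for identical graphs, 0.0 for disjoint ones."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    size = len(nodes1 | nodes2) + len(edges1 | edges2)
    if size == 0:
        return 1.0  # two empty graphs are treated as identical
    return 1.0 - edit_distance(g1, g2) / size


# Made-up example terms, not data from the paper.
topic = make_graph(
    ["cloud", "storage", "pricing"],
    [("cloud", "storage"), ("cloud", "pricing")],
)
post = make_graph(
    ["cloud", "storage", "outage"],
    [("cloud", "storage"), ("storage", "outage")],
)
print(edit_distance(topic, post), round(similarity(topic, post), 3))
```

General graph edit distance, where nodes can be relabeled or matched approximately, is far more expensive to compute; the set-based shortcut here holds only because labels are assumed unique.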
Strongly promoted by leading industrial companies, cloud computing has become increasingly popular in recent years. Its growth rate has surpassed even the most optimistic predictions. A cloud applicati...
Applying graph clustering algorithms to real-world networks requires overcoming two main challenges: the lack of prior knowledge and the scalability issue. This paper proposes a novel method based on the topological fea...