The reliability issue of Exascale system is extremely serious. Traditional passive fault-tolerant methods, such as rollback-recovery, can not fully guarantee system reliability any more because of their large executin...
详细信息
As the energy consumption becomes increasingly prominent, how to meet time constraints while reducing the energy consumption as much as possible is a fundamental problem of the real-time scheduling in multi-core syste...
详细信息
A Cloud may be seen as a type of flexible computing infrastructure consisting of many compute nodes, where resizable computing capacities can be provided to different customers. To fully harness the power of the Cloud...
详细信息
Mining of repeated patterns from HTML documents is the key step towards Web-based data mining and knowledge extraction. Many web crawling applications need efficient repeated patterns mining techniques to generate the...
详细信息
There are two key issues for information diffusion in blogosphere: (1) blog posts are usually short, noisy and contain multiple themes, (2) information diffusion through blogosphere is primarily driven by the "wo...
详细信息
ISBN:
(纸本)9781577355120
There are two key issues for information diffusion in blogosphere: (1) blog posts are usually short, noisy and contain multiple themes, (2) information diffusion through blogosphere is primarily driven by the "word-of- mouth" effect, thus making topics evolve very fast. This paper presents a novel topic tracking approach to deal with these issues by modeling a topic as a semantic graph, in which the semantic relatedness between terms are learned from Wikipedia. For a given topic/post, the name entities, Wikipedia concepts, and the semantic relatedness are extracted to generate the graph model. Noises are filtered out through the graph clustering algorithm. To handle topic evolution, the topic model is enriched by using Wikipedia as background knowledge. Furthermore, graph edit distance is used to measure the similarity between a topic and its posts. The proposed method is tested by using the real-world blog data. Experimental results show the advantage of the proposed method on tracking the topic in short, noisy texts.
Strongly promoted by the leading industrial companies, cloud computing becomes increasingly popular in recent years. The growth rate of cloud computing surpasses even the most optimistic predictions. A cloud applicati...
详细信息
The reliability issue of Exascale system is extremely serious. Traditional passive fault-tolerant methods, such as rollback-recovery, can not fully guarantee system reliability any more because of their large executin...
详细信息
The reliability issue of Exascale system is extremely serious. Traditional passive fault-tolerant methods, such as rollback-recovery, can not fully guarantee system reliability any more because of their large executing overhead and long recovering duration. Active fault tolerance is expected to become another important fault-tolerant approach for Exascale system. Focusing on system failure prediction, which is one key step of active fault tolerance, we construct online failure prediction model and research on the effective method of system status pretreatment. In order to improve the accuracy and real-time feature of current methods, the proposed Improved Adaptive Semantic Filter (IASF) method processes the latest system logs regularly, filtering useless information out of them according to their semantics. Adopting the main idea of Vector Space Model (VSM), IASF method creates Event Vector corresponding to each log record. By calculating the cosine of vectorial angle, it evaluates the semantics correlation between different log records, and then executes temporal and spatial redundant filter considering the burst feature of log records. IASF method is insensitive to the type of system log and does not introduce any expert system or domain knowledge. The experiment result shows that system can eliminate about 99.6% of useless log records after executing IASF method.
When a large-scale distributed interactive simulation system is running on WAN, the sites usually disperse over a wide area in geography, which results in the simulation clock of each site is hardly to be accurately s...
详细信息
暂无评论