The Internet has become a vital information infrastructure for modern society. However, the concurrent nature of networks introduces a wide range of difficulties for traditional programming methodology in developing hig...
Many challenges in multi-agent coordination can be modeled as Distributed Constraint Optimization Problems (DCOPs). Targeting DCOPs with low constraint density, this paper proposes a distributed algorithm based on the...
Detecting performance problems online in large-scale cloud computing systems is a persistent headache for developers. The behavior of, and the hidden connections among, the huge number of runtime request execution paths in cloud computing systems usually contain useful information for performance problem detection. In this paper, we propose an approach to rapidly diagnose the source of performance degradation in large-scale non-stop cloud computing systems. The approach first groups user requests into categories with a fast clustering algorithm, then applies principal component analysis to extract the primary methods, and finally compares the normal and abnormal behaviors of the primary methods to localize the main cause of performance problems. We conduct extensive experiments on a real-world enterprise system that provides services to the public. The results show that our approach can locate the prime causes of performance problems accurately and efficiently.
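The last two steps of the approach can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the requests have already been grouped into one category (the abstract does not name the clustering algorithm), that each request is a row of per-method latencies, that "primary methods" are those with the largest loadings on the first principal component, and that the comparison is a simple ratio of mean latencies. All method names and numbers are hypothetical.

```python
import math

def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the sample covariance matrix, via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d  # assumes this start is not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    return v

def primary_methods(rows, names, k=1):
    """Pick the k methods with the largest absolute loading (assumed criterion)."""
    pc = first_principal_component(rows)
    ranked = sorted(zip(names, pc), key=lambda item: abs(item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

def most_degraded(normal, abnormal, names, candidates):
    """Among the candidates, return (method, abnormal mean / normal mean) with the largest ratio."""
    def mean(rows, j):
        return sum(r[j] for r in rows) / len(rows)
    best = None
    for name in candidates:
        j = names.index(name)
        base = mean(normal, j)
        ratio = mean(abnormal, j) / base if base else float("inf")
        if best is None or ratio > best[1]:
            best = (name, ratio)
    return best

# Hypothetical per-request latencies (ms) for three methods in one request category.
names = ["parse", "query_db", "render"]
normal = [[1.0, 10.0, 2.0], [1.1, 12.0, 2.1], [0.9, 8.0, 1.9], [1.0, 11.0, 2.0]]
abnormal = [[1.0, 30.0, 2.0], [1.1, 34.0, 2.1], [0.9, 28.0, 2.0]]

primary = primary_methods(normal + abnormal, names, k=1)
suspect = most_degraded(normal, abnormal, names, primary)
```

Restricting the comparison to the primary methods keeps the final step cheap, which matters when a request category touches many methods.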
Extracting fault features from the error logs of fault injection tests has been widely studied in the area of large-scale distributed systems for decades. However, the feature extraction process is severely affected by a large number of noisy logs. While existing work tries to solve the problem by compressing logs in temporal and spatial views or by removing the semantic redundancy between logs, it fails to consider the coexistence of noisy faults, other than the injected faults, that also generate error logs: for example, random hardware faults, unexpected software bugs, system configuration faults, or errors in the severity ranking of a log. During fault feature extraction, those noisy faults generate error logs that are not related to the target fault and strongly mislead the resulting fault features. We call an error log that is not related to a target fault a noisy error log. To filter out noisy error logs, we present a similarity-based error log filtering method, SBF, which consists of three integrated steps: (1) model error logs as time series and use the Haar wavelet transform to obtain approximate time series; (2) divide the approximate time series into sub time series at valleys; (3) identify noisy error logs by comparing the similarity between the sub time series of target error logs and the template of noisy error logs. We apply our log filtering method in an enterprise cloud system and show its effectiveness. Compared with existing work, we successfully filter out noisy error logs and increase the precision and recall of fault feature extraction.
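The three SBF steps can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the time series is taken to be error-log counts per time bucket, a valley is taken to be a local minimum, and cosine similarity with a fixed threshold stands in for the similarity measure, which the abstract does not specify. The function names and the threshold value are hypothetical.

```python
import math

def haar_approximation(series, levels=1):
    """Step 1: keep only the Haar approximation coefficients.
    Each level replaces adjacent pairs (a, b) with (a + b) / sqrt(2)."""
    approx = list(series)
    for _ in range(levels):
        if len(approx) < 2:
            break
        if len(approx) % 2:  # pad odd-length input by repeating the last value
            approx.append(approx[-1])
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2)
                  for i in range(0, len(approx), 2)]
    return approx

def split_at_valleys(series):
    """Step 2: cut the series at local minima; each valley closes one
    sub series and opens the next."""
    cuts = [i for i in range(1, len(series) - 1)
            if series[i] < series[i - 1] and series[i] <= series[i + 1]]
    segments, start = [], 0
    for cut in cuts:
        segments.append(series[start:cut + 1])
        start = cut
    segments.append(series[start:])
    return segments

def cosine_similarity(a, b):
    """Cosine similarity over the common prefix (a simplification; real
    sub series would need alignment or resampling)."""
    n = min(len(a), len(b))
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(y * y for y in b[:n]))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def is_noisy(segment, noisy_template, threshold=0.95):
    """Step 3: flag a sub series that closely matches the noisy template."""
    return cosine_similarity(segment, noisy_template) >= threshold
```

A sub series flagged by `is_noisy` would be dropped before fault features are extracted from the remaining logs.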
The development of multi-core processors makes the parallelization of traditional sequential algorithms increasingly important. Meanwhile, transactional memory serves as a good parallel programming model. This paper takes advantage of software transactional memory to parallelize the Multi-Exit Asymmetric Adaboost algorithm for face detection. The parallel version is evaluated on three different implementations of software transactional memory. The experimental results show that the transactional-memory-based parallelization outperforms the traditional lock-based approach. A speedup of nearly seven is achieved on an eight-core machine.
A hierarchical diagnosis approach named Magnifier is proposed. It models the execution path graph of a user request as a component layer, a module layer, and a function layer, and detects anomalies in each layer separately, proceeding from the higher layers to the lower ones. Extensive experiments were conducted on the Alibaba cloud computing platform. The results indicate that, even with large data volumes and highly complex execution paths, Magnifier can accurately and efficiently locate the prime causes of performance degradation.
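The top-down idea can be sketched as a drill-down over a three-layer tree. This is only an illustration: the abstract does not describe Magnifier's anomaly detector, so a latency-versus-baseline threshold is assumed here, and the tree layout, names, and numbers are all hypothetical.

```python
def find_anomalous_paths(node, baseline, threshold=1.5, prefix=()):
    """Descend component -> module -> function, visiting only anomalous nodes.

    node:     {"name": str, "latency": float, "children": [node, ...]}
    baseline: maps a path tuple such as ("storage", "io") to its normal latency
    Returns the deepest anomalous paths found under this node.
    """
    path = prefix + (node["name"],)
    normal = baseline.get(path)
    if normal is None or node["latency"] <= threshold * normal:
        return []  # this layer looks normal (or is unknown), so do not descend
    deeper = [found
              for child in node.get("children", [])
              for found in find_anomalous_paths(child, baseline, threshold, path)]
    return deeper or [path]  # no anomalous child: this node is the deepest culprit

# Hypothetical observed latencies (ms) for one request.
request = [
    {"name": "frontend", "latency": 10.0, "children": []},
    {"name": "storage", "latency": 90.0, "children": [
        {"name": "index", "latency": 10.0, "children": []},
        {"name": "io", "latency": 80.0, "children": [
            {"name": "read_block", "latency": 75.0},
            {"name": "checksum", "latency": 5.0},
        ]},
    ]},
]
baseline = {
    ("frontend",): 10.0,
    ("storage",): 30.0,
    ("storage", "index"): 10.0,
    ("storage", "io"): 20.0,
    ("storage", "io", "read_block"): 15.0,
    ("storage", "io", "checksum"): 5.0,
}

culprits = [p for component in request
            for p in find_anomalous_paths(component, baseline)]
```

Because normal components are pruned at the top layer, the lower layers are examined only where an anomaly has already been observed, which is what keeps this style of search cheap on complex execution paths.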