A common way to construct a fault model is to inject the fault into the system and observe the subsequent symptoms, e.g. event logs. However, fault features vary during the propagation period and present different symptoms at different stages of the fault propagation process. Existing feature extraction methods based on detection windows can identify only the early symptoms of a fault; they fail to detect the later symptoms and cause false alarms. To solve this problem, we present a fault feature extraction method called Companion State Tracer (CSTracer), which consists of three integrated steps: (1) pre-process the logs to remove unrelated entries; (2) construct a general identifier for the early symptoms of a fault; (3) construct a finite state machine model of the fault to trace the later symptoms. CSTracer can continue to monitor a fault after the fault has been identified. We have evaluated the effectiveness of CSTracer in an enterprise cloud system. The results show that CSTracer achieves better detection accuracy than existing methods.
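The third step, tracing later symptoms with a finite state machine, can be illustrated with a minimal sketch. It is not CSTracer's actual model: the event names, the state names, and the `trace_fault` helper are hypothetical and chosen only to show how tracking continues after the early symptom has been identified.

```python
def trace_fault(events, early_symptom, transitions, start="detected"):
    """Return the states visited once the early symptom has been seen.

    events        -- iterable of log event names (assumed already pre-processed)
    early_symptom -- event that identifies the fault (hypothetical identifier)
    transitions   -- dict: state -> {event: next_state}
    """
    state = None
    visited = []
    for event in events:
        if state is None:
            # Not yet identified: wait for the early symptom.
            if event == early_symptom:
                state = start
                visited.append(state)
            continue
        # Already identified: follow the state machine, ignore unrelated events.
        next_state = transitions.get(state, {}).get(event)
        if next_state is not None:
            state = next_state
            visited.append(state)
    return visited


# Hypothetical fault model: a disk error that later causes timeouts and an outage.
TRANSITIONS = {
    "detected": {"conn_timeout": "propagating"},
    "propagating": {"service_down": "failed"},
}
events = ["heartbeat", "disk_io_error", "heartbeat", "conn_timeout", "service_down"]
print(trace_fault(events, "disk_io_error", TRANSITIONS))
# ['detected', 'propagating', 'failed']
```

Because the tracer keeps its state after identification, the later `conn_timeout` and `service_down` events are attributed to the same fault instead of being reported as new alarms.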
Recently adopted deep sequencing techniques pose a new data processing challenge: mapping short fragment reads to open-access eukaryotic genomes at the scale of several hundred thousand. This problem can be solved with BLAST, BWA and similar sequence alignment tools. BLAST is one of the most frequently used tools in bioinformatics, and BWA is a relatively new, fast, lightweight tool that aligns short sequences effectively. Local installations of these algorithms typically cannot handle large problem sizes, so the sequence alignment process runs slowly, while web-based implementations cannot accept a high number of queries. The HP-SEE infrastructure provides access to massively parallel supercomputing resources. With gUSE/WS-PGRADE we have successfully created an online Bioinformatics eScience Gateway, which is capable of serving the short fragment sequence alignment demand of the regional bioinformatics communities within the SEE region. Using workflows, we have ported the BLAST and BWA algorithms to the massively parallel HP-SEE infrastructure. In this paper we describe the Bioinformatics eScience Gateway and show, as a case study, how we implemented the ported BLAST workflow using a parameter study. With our online service, researchers can run high-throughput sequence alignments against eukaryotic genomes on HP-SEE's supercomputing infrastructure to search for regulatory mechanisms controlled by short fragments.
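A parameter study runs the same job over many pieces of the input. The sketch below shows only the splitting stage, under the assumption that the query reads arrive as FASTA text; the `split_fasta` helper and the chunk size are hypothetical and do not reproduce the gateway's workflow.

```python
def split_fasta(text, records_per_chunk):
    """Split FASTA text into chunks holding at most records_per_chunk records.

    Assumes '>' appears only at the start of a header line.
    """
    records = [">" + body for body in text.split(">") if body.strip()]
    return [
        "".join(records[i:i + records_per_chunk])
        for i in range(0, len(records), records_per_chunk)
    ]


reads = ">r1\nACGT\n>r2\nGGCA\n>r3\nTTAG\n"
chunks = split_fasta(reads, records_per_chunk=2)
print(len(chunks))  # 2
```

Each chunk could then be submitted as an independent alignment job, with the per-chunk results concatenated afterwards.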
We study the problem of scheduling in parallel systems with many users. We analyze scenarios with many submissions issued over time by several users. Each submission contains one or more jobs, and the set of submissions is organized in successive campaigns. Jobs belonging to a single campaign are sequential and independent, but no job from a campaign can start until all the jobs from the previous campaign are completed. Each user's goal is to minimize the sum of flow times of his campaigns. We define a theoretical model for Campaign scheduling and show that, in the general case, it is NP-hard. For the single-user case, we show that a ρ-approximation scheduling algorithm for the (classic) parallel job scheduling problem is also a ρ-approximation for the Campaign scheduling problem. For the general case with k users, we establish a fairness criterion inspired by time sharing. We propose FAIRCAMP, a scheduling algorithm that uses campaign deadlines to achieve fairness among users between consecutive campaigns. We prove that FAIRCAMP increases the flow time of each user by a factor of at most kρ compared with a machine dedicated to the user. We also prove that FAIRCAMP is a ρ-approximation algorithm for the maximum stretch. By simulation, we compare FAIRCAMP to First-Come-First-Served (FCFS). We show that, compared with FCFS, FAIRCAMP reduces the maximum stretch by a factor of up to 3.4. The difference is significant in systems used by many (k > 5) users. Our results show that the scheduler can handle campaigns of jobs, rather than just individual, independent jobs, efficiently and fairly.
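The campaign constraint can be made concrete with a small single-user sketch. It is not FAIRCAMP: it greedily list-schedules each campaign's jobs on identical machines, and it assumes that a campaign's flow time is its completion time minus its submission time. The function name and the input layout are hypothetical.

```python
import heapq


def campaign_flow_times(campaigns, machines):
    """Flow time of each campaign for one user on `machines` identical machines.

    campaigns -- list of (submit_time, [job_durations]) in submission order
    A campaign starts only after the previous campaign has fully completed.
    """
    flows = []
    prev_done = 0
    for submit, jobs in campaigns:
        start = max(submit, prev_done)   # wait for the previous campaign
        free_at = [start] * machines     # time at which each machine becomes free
        heapq.heapify(free_at)
        done = start
        for duration in jobs:
            # Greedy list scheduling: next job goes to the earliest free machine.
            end = heapq.heappop(free_at) + duration
            heapq.heappush(free_at, end)
            done = max(done, end)
        flows.append(done - submit)
        prev_done = done
    return flows


flows = campaign_flow_times([(0, [3, 2, 2]), (1, [4])], machines=2)
print(flows, sum(flows))  # [4, 7] 11
```

The second campaign is submitted at time 1 but cannot start before time 4, when the first campaign finishes, so its flow time includes three units of waiting.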
The processing of microscopic tissue images, and especially the detection of cell nuclei, is increasingly done using digital imagery and special immunodiagnostic software products. Since several methods (and applications) have been developed for the same purpose, it is important to have a measure that determines which one is more efficient than the others. The purpose of the article is to develop a generally usable measure that is based on the “gold standard” tests used in the field of medicine and that can be used to evaluate any image segmentation algorithm. Since interpreting the results themselves can be a fairly time-consuming task, the article also contains a recommendation for an efficient implementation and a simple example that compares three algorithms used for cell nuclei detection.
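The abstract does not define the measure, so the sketch below only illustrates the general idea of scoring a segmentation against a reference. It assumes binary masks compared pixel by pixel and uses sensitivity and specificity, the quantities usually reported for gold standard tests; the function names are hypothetical.

```python
def confusion_counts(predicted, reference):
    """Count true/false positives and negatives between two binary masks."""
    tp = fp = fn = tn = 0
    for p_row, r_row in zip(predicted, reference):
        for p, r in zip(p_row, r_row):
            if p and r:
                tp += 1
            elif p:
                fp += 1
            elif r:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(predicted, reference):
    """Return (sensitivity, specificity); 0.0 when a ratio is undefined."""
    tp, fp, fn, tn = confusion_counts(predicted, reference)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return sensitivity, specificity


reference = [[1, 1, 0], [0, 0, 0]]   # mask taken as the gold standard
predicted = [[1, 0, 0], [0, 1, 0]]   # output of a segmentation algorithm
print(sensitivity_specificity(predicted, reference))  # (0.5, 0.75)
```

Running the same function on the output of each candidate algorithm gives directly comparable numbers.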
The processing of microscopic tissue images is increasingly done using special immunodiagnostic evaluation software products. Often the first step in evaluating the samples is determining the number and location of cell nuclei. One of the most promising methods for this is region growing, but the algorithm is very sensitive to the setting of its parameters. Because of the large number of parameters and the large set of possible values, setting those parameters manually is quite a hard task, so we developed a genetic algorithm to optimize these values. The first step of the development is the statistical analysis of the parameters and the determination of the important features, in order to extract valuable information for a to-be-implemented genetic algorithm that will perform the optimization.
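The parameter sensitivity of region growing can be seen in a minimal sketch. It assumes a grayscale image stored as a list of rows, 4-connectivity, and a single `tolerance` parameter measured against the seed value; the region growing variant in the paper has more parameters, and this helper is hypothetical.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region from `seed` over 4-connected pixels close to the seed value."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and (nr, nc) not in region:
                # Accept the neighbour if its intensity is within the tolerance.
                if abs(image[nr][nc] - seed_value) <= tolerance:
                    region.add((nr, nc))
                    queue.append((nr, nc))
    return region


image = [
    [10, 12, 50],
    [11, 13, 52],
    [48, 49, 51],
]
print(len(region_grow(image, (0, 0), tolerance=5)))   # 4
print(len(region_grow(image, (0, 0), tolerance=40)))  # 7
```

Changing one parameter changes the segmented region from four pixels to seven, which is why an automated search over parameter values, such as a genetic algorithm, is attractive.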
The project MoSGrid (Molecular Simulation Grid) has been developing a web-based science gateway supporting the community with various services for quantum chemistry, molecular modeling, and docking. Users gain access ...
It is quite a headache for developers to detect performance problems online in large-scale cloud computing systems. The behavior and the hidden connections among the huge number of runtime request execution paths in c...
Extracting fault features with the error logs of fault injection tests has been widely studied in the area of large scale distributed systems for decades. However, the process of extracting features is severely affect...
We have developed a combined network and service management and diagnostics solution for our in-house developed remote patient monitoring system. The developed system has been included in the ALPHA eHealth/remote patient...
Nowadays microscopic analysis of tissue samples is done more and more by using digital imagery and special immunodiagnostic software. These are typically specific applications developed for one distinct field, but som...