For performance analysis, optimization and anomaly detection, there is a strong need to monitor industrial systems, which, along with modern datacenter infrastructures, feature a high level of decentralization. Continuous Di...
ISBN: 9781450367493 (print)
Cloud storage is highly available, scalable, and cost-efficient. Yet, many cannot store data in the cloud due to security concerns and legacy infrastructure such as network-attached storage (NAS). We describe Kurma, a cloud storage gateway system that allows NAS-based programs to seamlessly and securely access cloud storage. To share files among distant clients, Kurma maintains a unified file-system namespace by replicating metadata across geo-distributed gateways. Kurma stores only encrypted data blocks in clouds, keeps file-system and security metadata on-premises, and can verify data integrity and freshness without any trusted third party. Kurma uses multiple clouds to guard against cloud outages and vendor lock-in. Kurma's performance is 52-91% of that of a local NFS server while providing geo-replication, confidentiality, integrity, and high availability.
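The abstract does not say how Kurma verifies integrity and freshness, so the following is only a minimal sketch of one common approach, under stated assumptions: the gateway keeps a per-block version number and an HMAC tag on-premises and rechecks them on every read. The `Gateway` class, its method names, and the in-memory `cloud` dictionary are hypothetical and are not Kurma's API. Encryption is omitted; blocks are treated as already-encrypted bytes.

```python
import hashlib
import hmac


class Gateway:
    """Hypothetical gateway: metadata stays local, blocks go to an untrusted store."""

    def __init__(self, key: bytes):
        self._key = key
        self._meta = {}   # block_id -> (version, tag), kept on-premises (assumption)
        self.cloud = {}   # stand-in for an untrusted cloud object store

    def _tag(self, block_id: str, version: int, data: bytes) -> bytes:
        # Bind the tag to the block's identity and version, not only its bytes,
        # so that an older copy of the same block no longer verifies.
        msg = block_id.encode() + b"|" + str(version).encode() + b"|" + data
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def write(self, block_id: str, ciphertext: bytes) -> None:
        version = self._meta.get(block_id, (0, b""))[0] + 1
        self._meta[block_id] = (version, self._tag(block_id, version, ciphertext))
        self.cloud[block_id] = ciphertext

    def read(self, block_id: str) -> bytes:
        version, expected = self._meta[block_id]
        data = self.cloud[block_id]
        actual = self._tag(block_id, version, data)
        if not hmac.compare_digest(expected, actual):
            raise ValueError("block failed integrity/freshness check")
        return data
```

Because the expected tag lives with the gateway, a cloud provider that returns modified or stale bytes is detected without consulting any third party; in this sketch a replayed old version fails the same check as a corrupted block.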
In complex event processing (CEP), simple derived event tuples are combined in pattern matching procedures to derive complex events (CEs) of interest. Big Data applications analyze event streams online and extract CEs...
ISBN: 9781450384506 (print)
Making large-scale Mass Spectrometry (MS) data FAIR (Findable, Accessible, Interoperable, Reusable) and democratizing access for the omics research community require advanced access and reuse mechanisms. In this work, we proposed a novel distributed data access infrastructure and developed a simulation test-bed to show the feasibility of this solution. In contrast to existing centralized approaches, participating nodes are relied upon to execute the search algorithm, and search based on the comparison of raw spectra is supported, as opposed to simple meta-data based searches. Simulation results using networking, stochastic modelling, and queuing theory illustrated that search times were reduced by up to 600 times for up to a total of fifty billion spectra. Proteomics is vital because of the importance of proteins to life and their role in state-of-the-art medicine such as custom drug delivery and cancer treatment. MS-based proteomics involves the fragmentation of proteins into peptide ions to generate raw MS spectra. Traditionally, scientists have relied on meta-data based searches of centralized repositories followed by complex database searches and protein sequencing. Though useful, this technique may result in missed datasets because of poor meta-data or the sheer amount of effort and computational time needed. Recently, direct raw spectra search has been proposed with the development of centralized tools such as PeptideAtlas. However, PeptideAtlas hosts 13,000 spectra whereas systems supporting billions of spectra are needed. Let us assume users can submit one or more query spectra for search to a central controller. In the proposed novel distributed paradigm, the controller will forward the queries to several nodes that together host multiple MS/MS datasets, where each of the nodes will run the search algorithm against each spectrum in its local MS/MS dataset and send the results as URLs/pointers and associated scores back to the controller. The controller w
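The query flow described in this abstract (controller forwards a query, each node scores its local spectra, results return as pointers with scores) can be illustrated with a small scatter-gather sketch. Several parts are assumptions rather than the authors' design: cosine similarity stands in for the unspecified search algorithm, spectra are plain intensity vectors, the pointer strings are made up, and the top-k merge at the controller is a guess, since the abstract is cut off before it describes that step.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def cosine(a, b):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class Node:
    """Hypothetical node holding a local dataset: pointer -> spectrum vector."""

    def __init__(self, spectra):
        self.spectra = spectra

    def search(self, query):
        # Score the query against every local spectrum; return pointers and scores only.
        return [(url, cosine(query, spec)) for url, spec in self.spectra.items()]


def controller_search(nodes, query, top_k=3):
    """Forward the query to all nodes in parallel and merge the best hits (assumed step)."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(pool.map(lambda node: node.search(query), nodes))
    hits = [hit for part in partials for hit in part]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)[:top_k]


if __name__ == "__main__":
    nodes = [
        Node({"node-a/1": [1.0, 0.0, 0.0], "node-a/2": [0.0, 1.0, 0.0]}),
        Node({"node-b/1": [1.0, 1.0, 0.0]}),
    ]
    for url, score in controller_search(nodes, [1.0, 0.0, 0.0], top_k=2):
        print(url, round(score, 3))
```

Only pointers and scores travel back to the controller, so the raw spectra never leave the hosting node; that is the property the abstract contrasts with centralized repositories.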
The rapid development of information technology has made the protection of digital products a pressing concern. To address the problem that traditional anti-collusion fingerprint codes have difficulty detecting multi-user...
ISBN: 9781450367493 (print)
We describe means to run eBPF in a production environment for systems inspection. We examine the inspected system's outputs in order to train and generate a model for the host. We model the specific application and network traffic usage on the site based on the data collected by eBPF. Our system generates alerts when an anomaly in performance is detected on a specific host. These warnings can be used to discover the root cause of performance problems and cyber-security issues, and to warn in advance about potential performance peaks.
An ever increasing number of services requires real-time analysis of collected data streams. Emerging Fog/Edge computing platforms are appealing for such latency-sensitive applications, encouraging the deployment of D...
Integrated access and backhaul (IAB) has been introduced in 3GPP Release 16 as a promising enabling technology for densely deployed 5G small cell networks at millimeter wave (mmWave) band. In this paper, we study MIMO...
ISBN: 9781450367943 (print)
Temporally evolving graphs are an indispensable requisite of modern-day big data processing pipelines. Existing graph processing systems mostly focus on static graphs and lack the essential support for pattern detection and event processing in graph-shaped data. On the other hand, stream processing systems support event and pattern detection, but they are inadequate for graph processing. This work lies at the intersection of the graph and stream processing domains with the following objectives: (i) It introduces the syntax of a language for the detection of temporal patterns in large-scale graphs. (ii) It presents a novel data structure called distributed label store (DLS) to efficiently store graph computation results and discover temporal patterns within them. The proposed system, called FlowGraph, unifies graph-shaped data with stream processing by observing graph changes as a stream flowing into the system. It provides an API to handle temporal patterns that predicate on the results of graph computations with traditional graph computations.
ISBN: 9781450367943 (print)
Motivated by the growth of Internet of Things (IoT) technologies and the volumes and velocity of data that they can and will produce, we investigate automated data repair for event-driven IoT applications. IoT devices are heterogeneous in their hardware architectures, software, size, cost, capacity, network capabilities, power requirements, etc. They must execute in a wide range of operating environments where failures and degradations of service due to hardware malfunction, software bugs, network partitions, etc. cannot be immediately remediated. Further, many of these failure modes cause corruption in the data that these devices produce and in the computations "downstream" that depend on this data. To "repair" corrupted data from its origin through its computational dependencies in a distributed IoT setting, we explore SANS-SOUCI, a system for automatically tracking causal data dependencies and re-initiating dependent computations in event-driven IoT deployment frameworks. SANS-SOUCI presupposes an event-driven programming model based on cloud functions, which we extend for portable execution across IoT tiers (device, edge, cloud). We add fast, persistent, append-only storage and versioning for efficient data robustness and durability. SANS-SOUCI records events and their causal dependencies using a distributed event log and repairs applications dynamically, across tiers, via replay. We evaluate SANS-SOUCI using a portable, open source, distributed IoT platform, example applications, and microbenchmarks. We find that SANS-SOUCI is able to perform repair for both software (function) and sensor-produced data corruption with very low overhead.
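The repair-by-replay idea in this abstract can be sketched as a toy, single-process event log. This is an illustration under assumptions, not SANS-SOUCI's implementation: the `Event` and `EventLog` names are hypothetical, the log is an in-memory list rather than a distributed one, and each event keeps an append-only list of versions to stand in for the versioned storage the abstract mentions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Event:
    name: str
    deps: List[int]                  # indices of causally preceding events
    func: Optional[Callable]         # None for raw sensor readings
    versions: List[object] = field(default_factory=list)  # append-only history

    @property
    def value(self):
        return self.versions[-1]     # latest version is the current value


class EventLog:
    """Hypothetical in-memory log; events may only depend on earlier entries."""

    def __init__(self):
        self.events: List[Event] = []

    def record_reading(self, name, value) -> int:
        self.events.append(Event(name, [], None, [value]))
        return len(self.events) - 1

    def record_computation(self, name, func, deps) -> int:
        inputs = [self.events[d].value for d in deps]
        self.events.append(Event(name, list(deps), func, [func(*inputs)]))
        return len(self.events) - 1

    def repair(self, index, corrected_value) -> List[int]:
        """Append a corrected version, then replay every transitive dependent."""
        self.events[index].versions.append(corrected_value)
        dirty = {index}
        replayed = []
        # Log order is a valid replay order because deps always point backwards.
        for i in range(index + 1, len(self.events)):
            ev = self.events[i]
            if ev.func is not None and dirty.intersection(ev.deps):
                ev.versions.append(ev.func(*[self.events[d].value for d in ev.deps]))
                dirty.add(i)
                replayed.append(i)
        return replayed
```

For example, if a corrupted sensor reading feeds an average that in turn feeds an alarm, `repair` on the reading recomputes the average and the alarm but leaves unrelated computations untouched, while the old values remain available in each event's `versions` history.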