Erasure-coded cross-data center storage systems can achieve high disaster tolerance with low redundancy, but their large cross-data center update traffic makes data updates slow. In erasure-coded storage systems, each data object is sequentially divided into several stripes, and each stripe consists of several data packets. When erasure-coded stripes do not undergo insertion or deletion, existing work can effectively reduce cross-data center update traffic by performing delta updates: a delta update method updates the old stripe without transferring matched packets, that is, duplicate packets that sit at the same offset-within-stripe in the old and new stripes. However, because existing delta update methods use a fixed stripe size, when a stripe undergoes insertion or deletion, the offset-within-stripe of the duplicate packets in its subsequent stripes changes. In this scenario, few packets match, resulting in large cross-data center update traffic. This paper proposes an elastic stripe-based delta update method for erasure-coded cross-data center storage systems (ESDU). Under insertion or deletion, ESDU tries to keep duplicate packets' offset-within-stripe unchanged (i.e., to maximize matched packets) by flexibly adjusting stripe sizes according to the result of locating duplicate packets, which reduces cross-data center traffic. Moreover, ESDU optimizes the stripes' update topology based on the location information of storage nodes to reduce cross-data center update traffic further. In addition, we implement an erasure-coded cross-data center storage system adopting ESDU, called ECESD. Experiments with workloads derived from EduCoder's real-world trace show that, compared with the existing erasure-coded cross-data center storage system adopting the fixed stripe-based delta update method, ECESD reduces average update time by 89.6%. Moreover, compared with a replication-based storage system with a delta update method (HadoopRsync), ECESD achieves an 8.3%
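The effect of a fixed stripe size on matched packets can be illustrated with a toy example. The sketch below is not the ESDU algorithm; it only counts packets that coincide at the same offset-within-stripe after a single insertion, first with fixed-size stripes and then with the first stripe allowed to grow by one packet. The packet letters, the stripe size `k`, and the helper names `split_fixed` and `matched` are all assumptions made for illustration.

```python
def split_fixed(packets, k):
    """Split a packet list into stripes of k packets (the last may be shorter)."""
    return [packets[i:i + k] for i in range(0, len(packets), k)]


def matched(old_stripes, new_stripes):
    """Count packets that are identical at the same stripe index and offset."""
    count = 0
    for old, new in zip(old_stripes, new_stripes):
        count += sum(1 for a, b in zip(old, new) if a == b)
    return count


k = 4
old = list("ABCDEFGHIJKL")        # 12 packets -> 3 stripes of 4
new = old[:1] + ["X"] + old[1:]   # one packet inserted into the first stripe

old_stripes = split_fixed(old, k)

# Fixed stripe size: every packet after the insertion shifts by one offset.
fixed_new = split_fixed(new, k)
fixed_matches = matched(old_stripes, fixed_new)

# Elastic first stripe (hypothetical policy): let it hold k + 1 packets so the
# following stripes keep their original contents and offsets.
elastic_new = [new[:k + 1]] + split_fixed(new[k + 1:], k)
elastic_matches = matched(old_stripes, elastic_new)

print(fixed_matches, elastic_matches)  # 1 9
```

In this toy case only one packet matches under fixed stripes, while nine match once the modified stripe absorbs the insertion, which is the intuition behind adjusting stripe sizes.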
Heterogeneous hardware platforms comprised of CPUs, GPUs, and other accelerators offer the opportunity to choose the best-suited device for executing a given scientific simulation in order to minimize execution time a...
The rise of deep learning methods has ignited interest in efficient hardware and software systems for tensor-based computing. A question worth investigating is whether other areas of computing can also benefit from the power and increasing availability of such systems. To this end, we study the integration of tensor-based techniques in agent-based simulators. In particular, we describe methods for representing agent and edge attributes in agent networks as vectors, and for compiling the rules that govern agent behavior and edge-based interactions into functions that process such attribute vectors in parallel. We describe a proof-of-concept implementation of this idea in Politika, our Elixir-based web framework for concurrent, agent-based simulation. We discuss various simulation scenarios for our implementation and provide directions for future optimizations and integration pathways with concurrent simulation environments.
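As a rough illustration of storing attributes as vectors and expressing a rule as a bulk function over them, the sketch below uses plain Python lists in place of tensors. It is not Politika's implementation: the `opinion` attribute, the edge layout (`src`, `dst`, `weight`), and the averaging rule in `step` are hypothetical. In a tensor library, each of the three stages would map to a single gather, scatter-add, or element-wise operation.

```python
# Agent attributes stored column-wise: one list per attribute, index = agent id.
opinion = [0.0, 1.0, 0.5, 0.2]

# Edges stored as parallel index lists (hypothetical layout): src[i] -> dst[i].
src = [0, 1, 2, 3]
dst = [1, 2, 3, 0]
weight = [0.5, 0.5, 0.5, 0.5]


def step(opinion, src, dst, weight, rate=0.1):
    """One synchronous update: each agent moves toward its in-neighbours."""
    # Gather: compute one message per edge from source and target attributes.
    msg = [w * (opinion[s] - opinion[d]) for s, d, w in zip(src, dst, weight)]
    # Scatter-add: sum the messages arriving at each target agent.
    incoming = [0.0] * len(opinion)
    for d, m in zip(dst, msg):
        incoming[d] += m
    # Element-wise update applied to every agent at once.
    return [o + rate * inc for o, inc in zip(opinion, incoming)]


updated = step(opinion, src, dst, weight)
```

Because the rule reads only the old attribute vector and writes a new one, all agents are updated from a consistent snapshot, which is what makes the computation easy to parallelize.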
Modern distributed systems generate interleaved logs when performing parallel operations, and these logs become an important basis for anomaly detection and localization. To achieve more robust and accurate log anomal...
In the transition to a society with net-zero carbon emissions, high penetration of renewable energy sources and novel energy vectors are driving traditional operational systems, models, and processes towards emerging ...
ISBN: 9781665410816 (print)
We propose a circulating-current control scheme for a two-module parallel three-phase voltage-source inverter (VSI) that can be applied in general to any space-vector modulation (SVM) scheme, and we experimentally demonstrate its effectiveness for two commonly used SVM schemes (symmetrical SVM and bus-clamped SVM). The proposed control strategy relies on radio-frequency (RF) based wireless communication to exchange circulating-current information among the parallel VSIs. We also investigate the impact of the time delay due to wireless communication on the stability of the parallel VSI. By transmitting the circulating-current information over a wireless medium, we attain improved VSI performance while ensuring that the survivability, robustness, self-configuration, and reconfigurability of the system are not compromised. In general, such a control scheme may lead to a more redundant control implementation for distributed power systems and could also be used as a back-up for wire-based control schemes to provide fault tolerance.
Processing high-throughput data streams has become a major challenge in areas such as real-time event monitoring, complex dataflow processing, and big data analytics. While there has been tremendous progress in distributed stream processing systems in the past few years, the high-throughput and low-latency (a.k.a. high sustainable-throughput) requirement of modern applications is pushing the limits of traditional data processing infrastructures. This paper introduces a new distributed stream processing engine (DSPE), called Asynchronous Iterative Routing (or simply "AIR"), which implements a light-weight, dynamic sharding protocol. AIR expedites direct and asynchronous communication among all the worker nodes via a channel-like communication protocol on top of the Message Passing Interface (MPI), thereby completely avoiding the need for a dedicated driver node. The system adopts a new progress-tracking protocol, called hew-meld, which has been experimentally observed to show a low processing latency on our asynchronous master-less architecture when compared to the conventional low-watermark technique. The current version of AIR is also equipped with two fault-tolerance and recovery strategies, namely checkpointing and rollback, and replication. With its unique design, AIR scales out particularly well to multi-core HPC architectures; specifically, we deployed it on clusters with up to 16 nodes and 448 cores (thus reaching a peak of 435.3 million events and 55.14 GB of data processed per second), which we found to significantly outperform existing DSPEs. (c) 2022 Elsevier Inc. All rights reserved.
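For readers unfamiliar with the conventional low-watermark technique that the abstract uses as its baseline, a minimal sketch follows. It is not AIR's protocol, and the abstract does not describe how AIR's own progress tracking works; the class and function names are hypothetical, and the sketch assumes each input channel delivers event timestamps in non-decreasing order.

```python
class LowWatermark:
    """Tracks progress as the minimum of the latest timestamps per channel."""

    def __init__(self, channels):
        # No channel has reported yet, so progress starts at minus infinity.
        self.latest = {c: float("-inf") for c in channels}

    def observe(self, channel, timestamp):
        # Remember the largest timestamp seen on this channel.
        self.latest[channel] = max(self.latest[channel], timestamp)

    def value(self):
        # No future event can be older than the slowest channel's timestamp.
        return min(self.latest.values())


def closable(window_ends, watermark):
    """Windows whose end time is at or below the watermark can be emitted."""
    return [end for end in window_ends if end <= watermark]


wm = LowWatermark(["a", "b"])
wm.observe("a", 10)
before = wm.value()          # channel "b" has not reported yet
wm.observe("b", 7)
after = wm.value()
ready = closable([5, 7, 9], after)
```

The slowest channel bounds overall progress, so a single lagging input delays every window, which is one reason alternative progress-tracking schemes are of interest.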
ISBN: 9798350369199 (digital)
ISBN: 9798350369205 (print)
This paper explores the integration of distributed event processing in the context of real-time routing within a parallel transportation simulation framework for the Multi-Agent Transport Simulation (MATSim). A novel simulation prototype, utilizing Rust and MPI, demonstrates significant reductions in computation time by applying domain decomposition to the network and assigning each part to a separate process. However, sharing and processing events within this setup remains challenging. We present a proof of concept for integrating distributed event processing. Our evaluation shows that real-time routing significantly increases simulation runtime because an uneven distribution of routing requests increases load imbalances and makes speedups less efficient. Sharing event data between processes benefits from smaller subdomains. However, global synchronization for sharing data leads to waiting times introduced by load imbalances.
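The link between domain decomposition and load imbalance can be sketched with a small example. This is not the prototype's partitioning scheme (the abstract does not specify one); the slab partitioning by x-coordinate, the node coordinates, and the list of routing requests are all hypothetical.

```python
def partition_by_x(nodes, parts):
    """Assign nodes to `parts` slabs of roughly equal node count, ordered by x.

    nodes: dict mapping node id -> (x, y). Returns dict node id -> part index.
    """
    ordered = sorted(nodes, key=lambda n: nodes[n][0])
    size = -(-len(ordered) // parts)  # ceiling division
    return {n: i // size for i, n in enumerate(ordered)}


def imbalance(loads):
    """Ratio of the maximum load to the mean load; 1.0 is perfectly balanced."""
    return max(loads) / (sum(loads) / len(loads))


# Six network nodes on a line, split across three processes.
nodes = {i: (float(i), 0.0) for i in range(6)}
part_of = partition_by_x(nodes, parts=3)

# Routing requests listed by the node where they originate (made-up data):
# most of them fall into the first subdomain.
requests = [0, 0, 0, 1, 4]
loads = [0, 0, 0]
for node in requests:
    loads[part_of[node]] += 1

ratio = imbalance(loads)
```

With a global synchronization point per step, every process waits for the most loaded one, so a ratio well above 1.0 translates directly into idle time on the other processes.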
This paper addresses the challenges of optimizing task scheduling for a distributed, task-based execution model in OpenMP for cluster computing environments. Traditional OpenMP implementations are primarily designed f...
Classically, rasterization techniques are performed for real-time rendering to meet the constraint of interactive frame rates. However, such techniques do not produce realistic results as compared to ray tracing appro...