In modern software systems, log anomaly detection is a key technology for ensuring system stability and reliability. Existing log-based anomaly detection methods have made significant progress, but they fail to pr...
The Internet of Things (IoT) is increasingly vulnerable to security risks due to new network attacks. Deep learning-based intrusion detection systems (DL-IDS) have emerged as a key solution, but they face challenges l...
Modern software engineering is getting increasingly complicated. Especially in the HPC field, we are dealing with cutting-edge infrastructure and a novel problem of unprecedented scale. The ability to monitor and an...
In the domain of Vehicular Edge Computing (VEC), this paper addresses the complex problem of Service Function Chain (SFC) placement, which is crucial for the efficient deployment of cloud applications in vehicular env...
Vulnerabilities in software and distributed systems are increasing, and system security is becoming a challenge for developers. Providing a quick classification for newly discovered vulnerabilities is helpful f...
In large-scale distributed systems, computational nodes often experience random slowdowns, which can significantly degrade the performance of timely computation tasks. Recently, coded computing has emerged as a promisin...
The end of Moore's law has placed a two-fold demand on hardware simulation. Firstly, efficient co-design requires fast simulation of hardware systems in order to vet proposed designs. Secondly, modern simulator pl...
ISBN: (print) 9781665481069
Real-world large-scale applications put increasing pressure on the storage services of modern supercomputers. Supercomputers have been introducing new storage devices and technologies to meet the performance requirements of various applications, leading to more complicated architectures. The high I/O demand of applications, combined with complicated and shared storage architectures, makes issues such as unbalanced load, I/O interference, system parameter configuration errors, and node performance degradation more frequent. It is challenging both to achieve high I/O performance at the application level and to utilize scarce storage resources efficiently. We propose AIOT, an end-to-end and adaptive I/O optimization tool for HPC storage systems, which introduces effective I/O performance modeling and several active tuning strategies to improve both the I/O performance of applications and the utilization of storage resources. AIOT provides a global view of the whole storage system and searches for the optimal end-to-end I/O path through flow network modeling. Moreover, AIOT tunes system parameters across multiple layers of the storage system using automatically identified application I/O behaviors and the instantaneous workload status of the storage system. We verified the effectiveness of AIOT in balancing I/O load, resolving I/O interference, improving I/O performance by configuring appropriate system parameters, and avoiding I/O performance degradation caused by abnormal nodes through a number of real-world cases. AIOT has helped save over ten million core-hours during its deployment on Sunway TaihuLight since July 2021. Notably, AIOT is capable of managing other I/O optimization methods across various storage platforms.
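The abstract does not say how AIOT formulates its flow network, so the following is only a minimal sketch of the general idea: treat each stage of the I/O path as a node, treat link bandwidth as edge capacity, and compute the maximum flow from the application to the storage backend. The Edmonds-Karp algorithm is used here as a stand-in, and the node names (`app`, `fwd1`, `ost1`, `storage`) and capacities are hypothetical, not taken from the paper.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Build the residual graph, adding zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for the shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical I/O path: application -> forwarding nodes -> storage targets.
# Capacities are illustrative bandwidth units, not measurements.
io_paths = {
    "app": {"fwd1": 4, "fwd2": 3},
    "fwd1": {"ost1": 2, "ost2": 3},
    "fwd2": {"ost2": 2},
    "ost1": {"storage": 3},
    "ost2": {"storage": 4},
}

print(max_flow(io_paths, "app", "storage"))
```

In this toy graph the achievable end-to-end bandwidth is limited by the links into `fwd1` and out of `fwd2`, not by the storage targets themselves, which is the kind of global bottleneck a per-node view would miss.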
ISBN: (digital) 9781665488020
ISBN: (print) 9781665488020
One of the major bottlenecks of traditional Blockchain is its low throughput, which results in poor scalability. One way to increase throughput is to shard the network nodes into smaller groups (shards). There are a number of sharding schemes in the literature with a common goal: nodes are split into groups to concurrently process different sets of transactions. Parallelism is used to enhance scalability, but with a trade-off in fault tolerance; i.e., the smaller the shard size, the better the performance but the higher the fault probability. Contemporary sharding schemes use variants of the Byzantine Fault Tolerance (BFT) protocol as their intra-shard consensus algorithms. BFT gives good performance when shard sizes are kept relatively small and the maximum number of allowable faults is below some threshold. However, all these systems make rigid assumptions about their shard sizes and maximum allowable faults, which may not be practical at times. In recent years, more practical hybrid fault models that are better suited to Blockchain have appeared in the literature (e.g., a hybrid of Byzantine and alive-but-corrupt (abc) faults, where the latter only compromise safety), along with corresponding consensus protocols that offer flexibility in the choice of fault types and quorum sizes, e.g., Flexible Byzantine Fault Tolerance (Flexible BFT). In this paper, we present a new sharding scheme, FlexiShard, that uses Flexible BFT as its intra-shard consensus algorithm. FlexiShard leverages the notion of flexible Byzantine quorums and the hybrid fault model introduced in Flexible BFT, which comprises Byzantine and abc faults. Using Flexible BFT allows flexibility in the choice of fault types and in choosing shard sizes based on a range of allowable fault thresholds. Additionally, it allows the formation of shards that can tolerate more total faults than traditional BFT shards of similar size, and hence can deliver similar performance with more fault tolerance. To the best of our knowledge, FlexiSh
Regenerating codes are new network codes proposed to reduce the data required for fault repair, which can improve the recovery efficiency of faulty nodes in data storage systems. However, unlike Reed-Solomon code, whi...