ISBN (Print): 9798350381993; 9798350382006
Field Programmable Gate Arrays (FPGAs) have long been utilized in systems benefiting from hardware acceleration of processes unsuitable for execution on a traditional processor. Accordingly, as much of the world pivots from on-site datacenters and computing resources to hybrid or cloud-based platforms, Multi-processing System-on-Chip (MPSoC) FPGAs are increasingly being employed in cloud computing systems to speed up many computation-intensive applications. In cloud computing, multi-tenant FPGAs are constantly space- and time-shared among multiple tenants dynamically by leveraging the partial reconfiguration property of FPGAs. With the increasing security and privacy concerns introduced by these memory-based volatile devices, most countermeasures rely on encryption and decryption engines such as AES (Advanced Encryption Standard) cores for user data protection. However, their high resource requirements and long latency limit the number of such engines that can be implemented in hardware, and they often become a performance bottleneck at peak times. In this paper, we propose a scheduling algorithm for aperiodic tasks to dynamically share multiple AES cores and hence improve their utilization and overall system performance. Extensive experimental measurements on an FPGA development board featuring a Xilinx UltraScale+ FPGA demonstrate the efficacy of our mechanism.
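As a hedged illustration of how a pool of shared AES cores might be time-shared among aperiodic requests, the Python sketch below dispatches each request to the core that becomes free earliest; the names (Request, schedule) are hypothetical and this is not the paper's scheduling algorithm.

    # Hypothetical sketch (not the paper's algorithm): greedily dispatch aperiodic
    # encryption requests onto a small pool of shared AES cores, always picking the
    # core that becomes free earliest.
    import heapq
    from dataclasses import dataclass

    @dataclass
    class Request:
        tenant: str
        arrival: float    # time the request arrives
        duration: float   # estimated encryption time on one core

    def schedule(requests, num_cores):
        """Return (request, core_id, start_time) triples."""
        cores = [(0.0, c) for c in range(num_cores)]   # (time core is free, core id)
        heapq.heapify(cores)
        plan = []
        for req in sorted(requests, key=lambda r: r.arrival):
            free_at, core = heapq.heappop(cores)
            start = max(free_at, req.arrival)
            plan.append((req, core, start))
            heapq.heappush(cores, (start + req.duration, core))
        return plan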
Distributed systems based on the Internet allow resources widely dispersed throughout the network to be utilized cooperatively and shared. This allows the end-user to obtain massive computational power to perform a co...
ISBN (Print): 9798350368543; 9798350368536
Federated learning (FL) is an appealing model training technique that utilizes heterogeneous datasets and user devices while ensuring user data privacy. Existing FL research has proposed device selection schemes to balance the computing speeds of devices. However, we observe that these schemes compromise prediction accuracy by ~57.7%. To solve this problem, we present Harmonia, which enhances prediction accuracy while also balancing the diverse computing speeds of devices. Our evaluation shows that Harmonia improves prediction accuracy by ~1.7x over existing schemes.
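For context, the sketch below shows the kind of naive speed-based device selection that existing FL schemes apply (and that the paper argues compromises accuracy); it is a generic illustration under assumed names, not Harmonia's selection logic.

    # Hypothetical sketch of naive speed-based selection (a generic baseline,
    # not Harmonia): pick the k fastest devices so stragglers do not stall a round.
    def select_devices(devices, k):
        """devices: list of (device_id, estimated_seconds_per_round)."""
        ranked = sorted(devices, key=lambda d: d[1])   # fastest first
        return [device_id for device_id, _ in ranked[:k]]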
ISBN (Print): 9798331541378
Transient stability assessment (TSA) is an indispensable routine in power system operation and control. The increasing integration of distributed energy resources highlights the necessity of distributed transient stability assessment, which can effectively capture the complicated stability characteristics of the entire power system without compromising the data privacy of individual local subsystems. This paper devises a quantum-enabled distributed transient stability assessment (Q-dTSA) method to enable data-driven transient stability prediction of power grids in a distributed, expressive and privacy-preserving manner. Our contributions include: 1) a quantum federated learning (QFL) architecture, which enables local power grids to jointly realize data-driven TSA for the entire system using shallow-depth quantum circuits; 2) a distributed quantum gradient descent (d-QGD) algorithm, which supports effective coordination between local subsystems to perform distributed training of the quantum neural networks (QNNs) without leaking local power system information; and 3) extensive experiments on real-scale power grids, obtained from both noise-free simulators and noisy IBM quantum computers, which validate the accuracy, fidelity, and noise-resilience of Q-dTSA, as well as its superiority over centralized quantum computing algorithms.
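To illustrate the general idea of distributed, privacy-preserving gradient descent (only gradients leave each subsystem, never raw measurements), here is a minimal sketch under assumed names; it is not the paper's d-QGD protocol.

    # Hypothetical sketch (not the paper's d-QGD protocol): one coordination round in
    # which each local subsystem contributes only a gradient over the shared
    # variational-circuit parameters, so raw grid data never leaves the subsystem.
    import numpy as np

    def federated_step(theta, local_gradients, lr=0.05):
        """theta: shared parameter vector; local_gradients: list of same-shape arrays."""
        avg_grad = np.mean(local_gradients, axis=0)   # coordinator aggregates gradients
        return theta - lr * avg_grad                  # updated parameters broadcast back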
ISBN (Print): 9798350376975; 9798350376968
Graphs are often used for data analysis. Distributed graph processing is gaining traction as it becomes more difficult for a single machine to store and process a complete graph due to the growing volume of data. We investigated 26 popular distributed graph processing systems and the graph algorithms and datasets provided by these systems. The computational logic of these graph algorithms does not distinguish between the types of vertices and edges, so distributed graph processing systems treat all vertices and edges in an undifferentiated way. However, exploiting the hidden data connections between different types of vertices in multi-networks can greatly improve the accuracy of community search algorithms. We therefore describe the challenges existing distributed graph processing systems face in handling different types of vertices and edges in multi-networks, and propose an index-based multi-network storage abstraction to store the various vertices and edges, along with a heuristic greedy algorithm to partition the different vertices and edges. Based on these two components, we realize efficient community search for distributed graph processing in multi-networks, paving the way for future research on more algorithms for distributed graph processing in multi-networks.
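As a rough illustration of greedy, type-aware partitioning of vertices across workers, the sketch below assigns each vertex to the partition already holding the most of its same-type neighbours; the function and parameter names are assumptions, not the paper's heuristic.

    # Hypothetical sketch of a greedy, type-aware partitioner (not the paper's
    # heuristic): each vertex goes to the partition that already holds the most of
    # its same-type neighbours, subject to a per-partition capacity.
    from collections import defaultdict

    def greedy_partition(vertices, edges, vtype, num_parts, capacity):
        """vertices: list of ids; edges: (u, v) pairs; vtype: id -> type label.
        capacity must be at least ceil(len(vertices) / num_parts)."""
        adj = defaultdict(set)
        for u, v in edges:
            adj[u].add(v)
            adj[v].add(u)
        assign, load = {}, defaultdict(int)
        for v in vertices:
            scores = [0] * num_parts
            for n in adj[v]:
                if n in assign and vtype[n] == vtype[v]:
                    scores[assign[n]] += 1
            order = sorted(range(num_parts), key=lambda p: -scores[p])
            part = next(p for p in order if load[p] < capacity)
            assign[v] = part
            load[part] += 1
        return assign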
ISBN (Digital): 9781665471770
ISBN (Print): 9781665471770
High-performance computing is a prime area for many applications. In particular, weather and climate forecast applications use HPC systems because they need to deliver good results with low latency. In recent years, machine learning and deep learning models have been widely used to forecast the weather. However, to the best of the authors' knowledge, many applications do not effectively utilise the HPC system for training, testing, validation, and inference on weather data. In this work, we conduct performance modeling and benchmark analysis of machine learning models for weather and climate forecasting and characterize the relationship between the application, the model, and the underlying HPC system. Our results will help researchers improve and optimise weather forecast systems and use HPC systems efficiently.
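A minimal sketch of the kind of per-phase timing harness such a benchmarking study could use is shown below; the phase names and the commented model/loader calls are placeholders, not the authors' actual setup.

    # Hypothetical sketch of a per-phase timing harness for profiling a forecasting
    # model on an HPC node; the phase names and the commented calls are placeholders.
    import time
    from contextlib import contextmanager

    timings = {}

    @contextmanager
    def phase(name):
        start = time.perf_counter()
        yield
        timings[name] = timings.get(name, 0.0) + time.perf_counter() - start

    # usage (placeholder calls):
    # with phase("train"):    train_one_epoch(model, train_loader)
    # with phase("validate"): evaluate(model, val_loader)
    # with phase("infer"):    predict(model, test_loader)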
ISBN (Print): 9798350395679; 9798350395662
Communication overhead is a major obstacle to scaling distributed training systems. Gradient sparsification is a potential optimization approach to reduce the communication volume without significant loss of model fidelity. However, existing gradient sparsification methods have low scalability owing to the inefficient design of their algorithms, which raises the communication overhead significantly. In particular, gradient build-up and inadequate sparsity control methods degrade the sparsification performance considerably. Moreover, communication traffic increases drastically owing to the workload imbalance of gradient selection between workers. To address these challenges, we propose a novel gradient sparsification scheme called ExDyna. In ExDyna, the gradient tensor of the model comprises fine-grained blocks, and contiguous blocks are grouped into non-overlapping partitions. Each worker selects gradients in its exclusively allocated partition so that gradient build-up never occurs. To balance the workload of gradient selection between workers, ExDyna adjusts the topology of partitions by comparing the workloads of adjacent partitions. In addition, ExDyna supports online threshold scaling, which estimates an accurate threshold for gradient selection on the fly. Accordingly, ExDyna can satisfy the user-required sparsity level during a training period regardless of models and datasets. Therefore, ExDyna can enhance the scalability of distributed training systems by preserving near-optimal gradient sparsification cost. In experiments, ExDyna outperformed state-of-the-art sparsifiers in terms of training speed and sparsification performance while achieving high accuracy.
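To make the partitioned-selection idea concrete, the sketch below restricts threshold-based gradient selection to a worker's exclusive, contiguous slice of the flattened gradient so no element can be picked twice; the simple per-step top-k threshold used here stands in for, and is not, ExDyna's online threshold scaling.

    # Hypothetical sketch (not ExDyna's online threshold scaling): each worker selects
    # large-magnitude gradients only from its exclusive, contiguous slice of the
    # flattened gradient, so two workers can never pick the same element.
    import numpy as np

    def select_sparse(grad, worker_id, num_workers, density=0.01):
        """grad: 1-D array; returns (global_indices, values) chosen by this worker."""
        n = grad.size
        lo = worker_id * n // num_workers          # start of this worker's partition
        hi = (worker_id + 1) * n // num_workers
        local = grad[lo:hi]
        k = max(1, int(density * local.size))
        thresh = np.partition(np.abs(local), -k)[-k]   # k-th largest magnitude
        idx = np.nonzero(np.abs(local) >= thresh)[0] + lo
        return idx, grad[idx]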
Time series anomaly detection is a critical time series analysis task that is widely applied in various real-world applications. Recently, diffusion models have shown promising performance on time s...
ISBN (Digital): 9781665488792
ISBN (Print): 9781665488792
Distributed and central control are two complementary paradigms to establish self-adaptation in software systems. Both approaches have their individual benefits and drawbacks, which lead to significant trade-offs regarding certain software qualities when designing such systems. The significance of these trade-offs only increases as the target system becomes more complex. In this paper, we present our work in progress towards an integrated control approach, which aims to provide the best of both control paradigms. We present the basic concepts of this multi-paradigm approach and outline its inherent support for complex system hierarchies. Further, we illustrate the vision of our approach using application scenarios from the smart energy grid as an example of self-adaptive systems of systems.
ISBN (Print): 9798350383782; 9798350383799
At present, there is a notable focus on Parallel and Distributed Computing (PDC) initiatives within the realm of undergraduate engineering education in India. Owing to differences in education systems across borders, along with variations in university policies, these efforts must be curated to cater to specific stakeholders, ensuring the achievement of the desired outcomes. Understanding such scenarios is crucial for the landscape of Indian undergraduate PDC education. This paper unveils a success story of implementing PDC at the undergraduate level over the past decade and a half, offering valuable insights gathered along this extended journey. Reflecting the idea that "every master was once a beginner," the narrative unfolds to inspire and empower educators who are just starting out. Whether they are introducing PDC education into the curriculum or already incorporating it, this account is crafted to uplift and guide them. Amidst the ongoing initiatives across the country, the time has come to progress and elevate PDC education beyond its current status. This paper presents a summary of potential efforts that the PDC community in India can explore for such initiatives.