ISBN (Digital): 9798350352719
ISBN (Print): 9798350352726
Federated Learning (FL) has emerged as a privacy-preserving method for distributed clients to collaboratively train a model without exposing their raw data. However, the limited communication bandwidth and insufficient computing capacity of clients hinder the efficient deployment of FL applications. Moreover, the diverse data distributions among clients may also degrade model performance. To address these issues, we propose a novel framework called FLMP. Specifically, FLMP dynamically identifies and eliminates redundant components within models by analyzing representation similarity and filter importance based on clients' data. This process produces multiple optimal sub-models tailored to client capacities. Comprehensive experiments demonstrate that, compared with other baselines, our proposed framework achieves better performance in both i.i.d. and challenging non-i.i.d. data scenarios, with up to a 12.31% improvement in model accuracy while maintaining similar communication and computation costs.
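The filter-importance idea behind sub-model extraction can be illustrated with a common pruning heuristic. This is a minimal sketch, not FLMP's actual criterion: it scores each convolutional filter by its L1 norm (a standard proxy for importance) and keeps only the top fraction, producing a smaller layer for a capacity-limited client. The function names and the L1-norm choice are illustrative assumptions.

```python
import numpy as np

def filter_importance(conv_weights):
    # L1 norm of each output filter as a simple importance score
    # (conv_weights shape: out_channels x in_channels x kH x kW)
    return np.abs(conv_weights).reshape(conv_weights.shape[0], -1).sum(axis=1)

def prune_filters(conv_weights, keep_ratio):
    # keep the top keep_ratio fraction of filters by importance,
    # yielding a smaller sub-model layer for a weaker client
    scores = filter_importance(conv_weights)
    k = max(1, int(round(len(scores) * keep_ratio)))
    keep = np.sort(np.argsort(scores)[-k:])  # indices of retained filters
    return conv_weights[keep], keep
```

A real system would also prune the matching input channels of the next layer so the sub-model stays shape-consistent.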
ISBN (Print): 9781665458382
In this paper, we conduct a group of evaluations on a sequential implementation of the SpMV kernel to investigate cache performance on single-core platforms. We verify a similar access pattern across a suite of sparse matrices covering various domains, which makes the cache hit rate remarkably high in a sequential environment. This implicit regularity motivated us to propose a cache-space-splitting approach that improves locality in dense-vector accesses and exploits the large cache capacity of modern processors. Finally, based on our experimental results, we explore the cache design space of the Matrix 3000 GPDSP and propose a group of cache parameters.
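For reference, the sequential SpMV kernel under study typically looks like the following CSR (compressed sparse row) loop; this is the textbook formulation, not the paper's tuned code. Note that the accesses to the dense vector x follow col_idx, which is exactly where dense-vector locality, and hence cache behavior, is decided.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    # sequential CSR sparse matrix-vector multiply: y = A @ x
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        # inner loop walks row i's nonzeros; x is accessed via col_idx,
        # so the column pattern of A drives cache hits on x
        for j in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[j] * x[col_idx[j]]
    return y
```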
ISBN (Digital): 9798350387117
ISBN (Print): 9798350387124
The efficiency of concurrency control protocols plays a crucial role in transaction processing systems. However, when it comes to deterministic transactions (i.e., transactions with known read/write key sets), existing concurrency control protocols are not optimized to make the most of this determinism. They either force transactions to be aborted and retried, which hurts system throughput, or use a centralized scheduler to organize transactions so that aborts are avoided, but at the cost of limited scalability. In this paper, we present DecentSched, a highly efficient decentralized concurrency control protocol for deterministic transactions. DecentSched employs fine-grained queuing and a decentralized scheduling algorithm to enable serializable concurrent transaction execution with a high degree of parallelism. Extensive evaluation results show that DecentSched outperforms state-of-the-art concurrency control protocols on representative benchmarks.
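The fine-grained queuing idea can be sketched in miniature. This toy is an assumption-laden illustration, not DecentSched's actual algorithm: each key keeps a FIFO of the transactions that touch it, and a transaction runs once it is at the head of every queue it appears in, so conflicting transactions execute in one consistent order with no central scheduler and no aborts.

```python
from collections import deque

def per_key_schedule(transactions):
    # transactions: list of (txn_id, key_set) in arrival order
    # build one FIFO queue per key, in arrival order
    queues = {}
    for tid, keys in transactions:
        for k in keys:
            queues.setdefault(k, deque()).append(tid)
    done, order = set(), []
    while len(done) < len(transactions):
        for tid, keys in transactions:
            # a transaction is runnable when it heads all of its queues,
            # which yields a serializable execution order
            if tid not in done and all(queues[k][0] == tid for k in keys):
                order.append(tid)
                done.add(tid)
                for k in keys:
                    queues[k].popleft()
    return order
```

In a real protocol the head checks run concurrently on worker threads rather than in this sequential loop; the queues only serialize genuinely conflicting transactions.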
ISBN (Digital): 9798331509712
ISBN (Print): 9798331509729
This paper presents the design and implementation of a novel Network Function Virtualization (NFV) platform, NFVDC, built specifically for dynamic service chaining. Unlike traditional service chaining methods, NFVDC supports real-time, reconfigurable service chains in which service functions can dynamically join or leave the chain. By integrating an advanced traffic steering method, optimized CPU scheduling, and a causal message-passing mechanism, NFVDC addresses the key challenges of deploying dynamic service chaining: traffic steering, resource scheduling, and session state management. Our evaluation demonstrates that NFVDC's forwarding performance is nearly double that of existing solutions when service chains are extended to a length of six.
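The reconfigurable-chain concept reduces to a chain whose membership can change while packets keep flowing. The sketch below is a deliberately simplified model (plain callables stand in for virtual network functions; names are illustrative), omitting the traffic steering and state management that make the real problem hard.

```python
class ServiceChain:
    # toy dynamic service chain: functions join or leave at run time,
    # and each packet traverses whatever the current chain is
    def __init__(self):
        self._fns = []  # ordered list of (name, callable)

    def join(self, name, fn):
        self._fns.append((name, fn))

    def leave(self, name):
        self._fns = [(n, f) for n, f in self._fns if n != name]

    def process(self, pkt):
        for _, fn in self._fns:
            pkt = fn(pkt)
        return pkt
```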
ISBN (Digital): 9798350364606
ISBN (Print): 9798350364613
Coarse-grained reconfigurable arrays (CGRAs) belong to the family of configurable processing architectures that have recently attracted increasing interest for their adaptability and efficiency. Research on CGRA architectures and their associated CAD tools is typically conducted empirically, by modelling a CGRA fabric, mapping applications onto it, and then assessing performance, power, and/or area (PPA). In this paper, we describe an open-source framework for such CGRA research: CGRA-ME (CGRA Modelling and Exploration). We present the recently released version 2.0 of CGRA-ME, which incorporates a new mapping approach, support for elastic CGRAs, floating point, predication, hybrid RISC-V+CGRA systems, and more.
ISBN (Digital): 9798331509712
ISBN (Print): 9798331509729
Graph-based multi-view clustering methods have gained significant attention for their outstanding ability to represent clustering structure. Considering the influence of noise on the quality of pre-constructed graphs, in this paper we propose a novel method called ANGVR. Unlike existing methods that aim to learn a consensus graph directly from multi-view data, ANGVR rebuilds the graph constructed from the raw data to seek a consensus graph across views for clustering. Furthermore, to guide the construction of the graph, an embedding constraint based on neighboring group structures is introduced, which exploits the neighborhood structure information of the corresponding neighborhood sets. The experimental results show accuracy improvements of 5.56%, 6.77%, and 4.02% on the 3Sources dataset with 10%, 30%, and 50% missing views, respectively, compared to existing works.
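A "pre-constructed graph" in this setting is usually a k-nearest-neighbour affinity graph built per view from raw features; the sketch below shows that standard construction (not ANGVR's refinement step) and makes plain why noise in the features propagates directly into the graph that downstream clustering consumes.

```python
import numpy as np

def knn_graph(X, k=2):
    # symmetric kNN affinity graph from raw features X (n_samples x dim);
    # noise in X perturbs distances and hence the edges of this graph,
    # which is what graph-refinement methods then try to correct
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # exclude self-edges
    W = np.zeros_like(d)
    rows = np.arange(len(X))[:, None]
    W[rows, np.argsort(d, axis=1)[:, :k]] = 1.0
    return np.maximum(W, W.T)  # symmetrize
```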
Conventional power sharing strategies for parallel inverters fall into two main categories. One is based on interconnect lines (ILs) for exchanging power information, where the whole system can be paralyzed once a fault hits the ILs. The other is droop control, which requires no communication mechanism but suffers from uneven power sharing due to mismatched output impedances. To address this issue, a hybrid power sharing strategy based on adaptive virtual impedance is proposed for low-voltage parallel inverters, which reduces the number of ILs and achieves even power sharing. Moreover, since the mismatch in output impedance among inverters is compensated by the virtual impedance, the system can maintain normal operation even if the ILs fail, demonstrating superior fault ride-through capability. Simulation results verify the effectiveness of the proposed strategy.
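The two building blocks named above, droop control and virtual impedance, can be written down compactly. The gains and base values below are illustrative placeholders, not the paper's tuned parameters: conventional P-f / Q-V droop derates each inverter's setpoints with its measured power, and the virtual-impedance term subtracts a synthetic voltage drop so mismatched physical output impedances look identical.

```python
def droop_setpoints(P, Q, f0=50.0, V0=311.0, kp=1e-5, kq=1e-3):
    # P-f / Q-V droop: frequency and voltage references fall with
    # measured active/reactive power, sharing load without communication
    return f0 - kp * P, V0 - kq * Q

def virtual_impedance_drop(i, Rv=0.1, Xv=0.5):
    # voltage drop across a virtual impedance Zv = Rv + jXv for current
    # phasor i; subtracting this from the droop voltage reference
    # equalizes the effective output impedance across inverters
    return (Rv + 1j * Xv) * i
```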
ISBN (Digital): 9798331509712
ISBN (Print): 9798331509729
An increasing diversity of applications and services is being migrated to modern data center networks (DCNs), and these applications and services typically generate various combinations of long and short flows, with or without deadlines. However, most existing flow scheduling solutions for DCNs either adopt a single-queue strategy (e.g., D²TCP), which inevitably lets non-urgent flows block urgent flows, or a multi-queue mechanism (e.g., Aemon), which comes at the cost of packet reordering. In this paper, we present DPOT, a Dynamic Priority-based Ordered Transmission mechanism aimed at minimizing flow completion time and deadline miss rate. To prevent urgent flows from being blocked by non-urgent flows, the DPOT switch uses different priority queues to buffer packets for different types of flows. Moreover, when the switch detects that a data flow has promoted the priority of its packets, it applies a disorder-free flow scheduling mechanism based on dynamic prioritization to ensure that packets of the same flow arrive at the receiver without reordering. Through a series of experiments, we demonstrate that DPOT can decrease the deadline miss rate by up to 95% while reducing flow completion time by up to 45% compared with state-of-the-art flow scheduling approaches.
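The reordering hazard of priority promotion, and one way to avoid it, can be shown with a toy multi-queue switch. This sketch is an assumption, not DPOT's actual mechanism: when a flow is promoted, all of its buffered packets move to the new queue together, so intra-flow order is preserved at the receiver.

```python
from collections import deque

class PrioritySwitch:
    # toy multi-queue switch: queue 0 is most urgent and is drained first
    def __init__(self, levels=2):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, flow_id, pkt, level):
        self.queues[level].append((flow_id, pkt))

    def promote(self, flow_id, old_level, new_level):
        # move ALL buffered packets of the flow in one step, keeping their
        # relative order, so promotion never reorders the flow
        kept, moved = deque(), []
        for item in self.queues[old_level]:
            (moved if item[0] == flow_id else kept).append(item)
        self.queues[old_level] = kept
        self.queues[new_level].extend(moved)

    def drain(self):
        out = []
        for q in self.queues:  # serve strictly by priority level
            while q:
                out.append(q.popleft())
        return out
```

If promotion instead redirected only the flow's new packets to the high-priority queue, they would overtake its still-buffered older packets, which is exactly the reordering a disorder-free scheme must rule out.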
ISBN (Digital): 9798350362244
ISBN (Print): 9798350362251
The advent of distributed computing systems offers great flexibility for application workloads while also demanding more attention to security, where the future advent and adoption of quantum technology can introduce new security threats. For this reason, the Multi-access Edge Computing (MEC) working group at ETSI has recently started delving into security aspects, motivated especially by the upcoming reality of MEC federation, which involves services made of application instances belonging to different systems (and thus different trust domains). On the other hand, Quantum Key Distribution (QKD) can help strengthen the level of security by enabling the exchange of secret keys through an unconditionally secure protocol, e.g., to secure communication between REST clients and servers in distributed computing systems at the edge. In this paper, we propose a technical solution to achieve this goal, building on standard specifications, namely ETSI MEC and ETSI QKD, and discuss the gaps and limitations of current technology that hamper full-fledged in-field deployment and mass adoption. Furthermore, we provide our look-ahead view on the future of secure distributed computing through the enticing option of federating edge computing domains.
ISBN (Digital): 9798331509712
ISBN (Print): 9798331509729
Stateful serverless systems commonly adopt an architecture that separates compute and storage within cloud data centers. However, guaranteeing prompt responses for real-time tasks at the edge becomes challenging due to network overheads. This paper introduces LoLa, a low-latency state management framework for real-time tasks in edge serverless systems, which adaptively places states close to functions, thereby mitigating delays in accessing state. Our approach mitigates network latency and optimizes resource utilization by co-locating functions and states, enhancing the system's overall efficiency. We introduce an adaptive strategy to coordinate the migration of states, which dynamically adjusts the placement of states based on historical data and real-time feedback. Additionally, we design an in-memory state storage mechanism to facilitate low-latency access and implement lightweight, fine-grained state management to keep stored states consistent. Evaluation results showcase the efficacy of LoLa in reducing state read and write latency within edge serverless systems. Specifically, the average response latency decreases by 65.2% and 38.1% in the best- and worst-case scenarios, respectively.
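The state co-location idea can be sketched as a threshold-triggered migration from remote to node-local storage. This is a hedged illustration under simplifying assumptions (a dict stands in for remote storage, a fixed hit threshold replaces LoLa's adaptive, feedback-driven strategy), with write-through to keep the two copies consistent.

```python
class StateStore:
    # toy edge state store: states read repeatedly by a local function
    # are migrated node-locally, cutting the remote round trip
    def __init__(self, remote, migrate_threshold=3):
        self.remote = remote        # dict standing in for remote storage
        self.local = {}             # node-local copies near the function
        self.hits = {}
        self.threshold = migrate_threshold

    def read(self, key):
        if key in self.local:
            return self.local[key]  # local hit: no network round trip
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.threshold:
            self.local[key] = self.remote[key]  # migrate hot state
        return self.remote[key]

    def write(self, key, value):
        # write-through keeps the remote copy consistent with the local one
        if key in self.local:
            self.local[key] = value
        self.remote[key] = value
```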