In the training process of federated linear regression (FLR), the simplest form of federated learning, the computation at each company is slowed down either by large data volumes or by time-consuming homomorphic encryption. To accelerate FLR training, edge-computing-aided coded distributed computing (CDC) is incorporated into the computation-intensive step (matrix multiplication), yielding a novel coded FLR framework in which several edge nodes assist the computation of one company. Two schemes, linear combination (LC)-based vertical FLR and MatDot-based vertical FLR, are proposed; both perform computation and homomorphic encryption in parallel at the edge nodes. Because the workload at each edge node is much smaller, the training runtime of both schemes can be reduced substantially. Numerical studies show that the proposed coded schemes significantly outperform traditional uncoded schemes in terms of the overall runtime of the training process (the sum of the encoding, computing, and decoding phases). Moreover, the LC-based and MatDot-based schemes each have scenarios in which they are advantageous, consistent with the analysis.
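As a rough illustration of the MatDot idea named above, the sketch below codes a single matrix product across edge nodes so that any 2m-1 finished nodes suffice to recover it. It is a minimal sketch under stated assumptions, not the paper's scheme: it omits homomorphic encryption and the FLR training loop entirely, uses exact rational arithmetic from Python's `fractions` module, and all function names (`matdot_worker`, `matdot_decode`, and so on) are hypothetical.

```python
from fractions import Fraction

def matmul(A, B):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def mat_add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(c, A):
    return [[c * x for x in row] for row in A]

def split_cols(A, m):
    """Split A into m equal-width column blocks."""
    w = len(A[0]) // m
    return [[row[i * w:(i + 1) * w] for row in A] for i in range(m)]

def split_rows(B, m):
    """Split B into m equal-height row blocks."""
    h = len(B) // m
    return [B[i * h:(i + 1) * h] for i in range(m)]

def encode(blocks, powers, x):
    """Evaluate sum_i blocks[i] * x**powers[i] at the point x."""
    acc = mat_scale(Fraction(x) ** powers[0], blocks[0])
    for blk, p in zip(blocks[1:], powers[1:]):
        acc = mat_add(acc, mat_scale(Fraction(x) ** p, blk))
    return acc

def matdot_worker(A_blocks, B_blocks, x):
    """Work done by the edge node assigned evaluation point x."""
    m = len(A_blocks)
    pa = encode(A_blocks, list(range(m)), x)                 # sum_i A_i x^i
    pb = encode(B_blocks, [m - 1 - j for j in range(m)], x)  # sum_j B_j x^(m-1-j)
    return matmul(pa, pb)

def coeff_weights(xs, k):
    """Weights w_j with sum_j w_j * p(xs[j]) = coefficient of x**k in p,
    valid for any polynomial p of degree < len(xs)."""
    weights = []
    for j, xj in enumerate(xs):
        poly, denom = [Fraction(1)], Fraction(1)  # coefficients, lowest degree first
        for i, xi in enumerate(xs):
            if i == j:
                continue
            poly = [Fraction(0)] + poly           # multiply by x ...
            for d in range(len(poly) - 1):
                poly[d] -= xi * poly[d + 1]       # ... then subtract xi * poly
            denom *= (xj - xi)
        weights.append(poly[k] / denom)
    return weights

def matdot_decode(xs, results):
    """Recover A @ B from exactly 2m-1 worker results."""
    m = (len(xs) + 1) // 2
    w = coeff_weights(xs, m - 1)  # A @ B is the coefficient of x^(m-1)
    acc = mat_scale(w[0], results[0])
    for wj, R in zip(w[1:], results[1:]):
        acc = mat_add(acc, mat_scale(wj, R))
    return acc

A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[1, 0], [0, 1], [2, 1], [1, 3]]
m = 2
A_blocks, B_blocks = split_cols(A, m), split_rows(B, m)
points = [1, 2, 3, 4]                 # four edge nodes
results = {x: matdot_worker(A_blocks, B_blocks, x) for x in points}
fastest = [4, 1, 3]                   # any 2m-1 = 3 nodes are enough
C = matdot_decode(fastest, [results[x] for x in fastest])
```

Each node multiplies matrices of half the inner dimension, which is where the per-node workload reduction comes from; the price is the extra encoding and interpolation work at the company.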
Coded distributed computing (CDC) is a technique proposed to reduce the intense data exchange required to parallelize distributed computing systems. Under the well-known MapReduce paradigm, this coded approach has been shown to decrease the communication overhead by a factor that is linearly proportional to the overall computation load during the mapping phase. In this paper, we propose multi-access distributed computing (MADC) as a generalization of the original CDC model, in which mappers (nodes in charge of the map functions) and reducers (nodes in charge of the reduce functions) are distinct computing nodes connected through a multi-access network topology. Focusing on the MADC setting with combinatorial topology, which implies Λ mappers and K reducers such that there is a unique reducer connected to any α mappers, we propose a coded scheme and an information-theoretic converse, which jointly identify the optimal inter-reducer communication load, as a function of the computation load, to within a constant gap of 1.5. Additionally, a modified coded scheme and converse identify the optimal max-link communication load across all existing links to within a gap of 4.
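To make the "factor linearly proportional to the computation load" statement concrete, the sketch below evaluates the normalized communication load of the original CDC model (not the MADC generalization proposed here), assuming the well-known tradeoff for K nodes with computation load r: an uncoded shuffle costs 1 - r/K, while the coded shuffle costs (1/r)(1 - r/K). The function names are hypothetical.

```python
from fractions import Fraction

def uncoded_load(r, K):
    """Normalized shuffle load without coding: each node already holds
    a fraction r/K of the intermediate values it needs."""
    return 1 - Fraction(r, K)

def coded_load(r, K):
    """Normalized shuffle load with CDC in the original (non-multi-access)
    model: coded multicasting serves r nodes per transmission."""
    return Fraction(1, r) * (1 - Fraction(r, K))

K = 4
gains = {r: uncoded_load(r, K) / coded_load(r, K) for r in range(1, K)}
```

For every r strictly below K the ratio of the two loads equals r, which is the linear gain the abstract refers to.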
We formulate a novel framework to improve the security and utility of the coded multi-access edge computing (MEC) network for Internet of Things (IoT) applications, in which multiple edge servers (ESs) jointly process raw IoT data to obtain the final network output. To correctly recover the final output even when some processing outputs produced by malicious or malfunctioning ESs are erroneous, the network uses coded distributed computing (CDC), which enhances security by adding computational redundancy to the data processed by the ESs. Within the framework, we propose an advanced approach to address two limitations of contemporary CDC-based systems: their inability to guarantee security when the number of malicious ESs is large, and their reduced network utility due to redundant computations. In this approach, processing loads are allocated to ESs by deep learning (DL) algorithms that identify the unknown types of the ESs (faithful or malicious) and minimize the load of malicious ESs, thereby optimizing security and utility. The proposed DL algorithms adopt the message passing neural network (NN), a generalized graph NN with lower complexity and faster convergence than conventional NNs. We prove that our framework yields the optimal security and utility, and verify its superior performance compared with state-of-the-art schemes.
Federated Learning (FL) is a privacy-preserving collaborative learning approach that trains artificial intelligence (AI) models without revealing the local datasets of the FL workers. While FL ensures the privacy of the FL workers, its performance is limited by several bottlenecks, which become significant as the amount of generated data and the size of the FL network grow. One of the main challenges is the straggler effect, where slow FL workers cause significant computation delays. Coded Federated Learning (CFL), which leverages coding techniques to introduce redundant computations at the FL server, has therefore been proposed to reduce the computation latency. In CFL, the FL server helps to compute a subset of the partial gradients based on the composite parity data and aggregates the computed partial gradients with those received from the FL workers. To implement the coding schemes over the FL network, incentive mechanisms are needed to allocate the resources of the FL workers and data owners efficiently so that the CFL training tasks can be completed. In this paper, we consider a two-level incentive mechanism design problem. In the lower level, the data owners are allowed to support the FL training tasks of the FL workers by contributing their data. To model the dynamics of the selection of FL workers by the data owners, an evolutionary game is adopted to achieve an equilibrium solution. In the upper level, a deep-learning-based auction is proposed to model the competition among the model owners.
This paper proposes a novel framework based on Lagrange coded computing (LCC) for fast and secure offloading of computing tasks in the mobile edge computing (MEC) network. The network is formed by multiple base stations (BSs) acting as "masters", which offload their computations to edge devices acting as "workers". The framework aims to ensure efficient allocation of computing loads and bandwidths to workers, and to provide them with proper incentives to finish their tasks by the specified deadlines. Thus, each master must decide on the amounts of allocated load and bandwidth, and on a service fee paid to each worker, given that: i) other masters, i.e., BSs, can be privately owned or controlled by different operators, i.e., they do not communicate or coordinate their decisions with the master; ii) workers are heterogeneous non-dedicated edge devices with constrained and nondeterministic computing resources. As such, masters compete for the best workers in a stochastic and partially observable environment. To describe interactions between masters and workers, we formulate a new stochastic auction model with contingent values of bidders, i.e., masters, and contingent payments to auctioneers, i.e., workers. To solve the auction, we represent it as a stochastic Bayesian game and develop machine learning algorithms to improve the auction solution.
We present two novel federated learning (FL) schemes that mitigate the effect of straggling devices by introducing redundancy on the devices' data across the network. Compared to other schemes in the literature, which deal with stragglers or device dropouts by ignoring their contribution, the proposed schemes do not suffer from the client drift problem. The first scheme, CodedPaddedFL, mitigates the effect of stragglers while retaining the privacy level of conventional FL. It combines one-time padding for user data privacy with gradient codes to yield straggler resiliency. The second scheme, CodedSecAgg, provides straggler resiliency and robustness against model inversion attacks and is based on Shamir's secret sharing. We apply CodedPaddedFL and CodedSecAgg to a classification problem. For a scenario with 120 devices, CodedPaddedFL achieves a speed-up factor of 18 for an accuracy of 95% on the MNIST dataset compared to conventional FL. Furthermore, it yields similar performance in terms of latency compared to a recently proposed scheme by Prakash et al., without the shortcoming of additional leakage of private data. CodedSecAgg outperforms the state-of-the-art secure aggregation scheme LightSecAgg by a speed-up factor of 6.6-18.7 for the MNIST dataset for an accuracy of 95%.
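The second scheme is said to build on Shamir's secret sharing. The sketch below shows only that primitive and its additive property, which is what makes it attractive for aggregation; it is not the scheme described in the abstract. The field size (the Mersenne prime 2^61 - 1), the function names, and the toy integer "updates" are all assumptions made for illustration.

```python
import secrets

PRIME = 2**61 - 1  # assumed field size; any sufficiently large prime works

def share(secret, threshold, n):
    """Split `secret` into n shares so that any `threshold` of them recover it."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):      # Horner evaluation of the random polynomial
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange-interpolate the sharing polynomial at x = 0."""
    secret = 0
    for j, (xj, yj) in enumerate(shares):
        num, den = 1, 1
        for i, (xi, _) in enumerate(shares):
            if i != j:
                num = (num * (-xi)) % PRIME
                den = (den * (xj - xi)) % PRIME
        # Modular inverse via Fermat's little theorem (PRIME is prime).
        secret = (secret + yj * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret

# Two devices share their (toy, integer-valued) updates among 5 parties.
update_a, update_b = 123456, 654321
shares_a = share(update_a, threshold=3, n=5)
shares_b = share(update_b, threshold=3, n=5)

# Each party adds the shares it holds; no party sees an individual update.
summed = [(xa, (ya + yb) % PRIME) for (xa, ya), (_, yb) in zip(shares_a, shares_b)]
aggregate = reconstruct(summed[1:4])    # any 3 of the 5 summed shares suffice
```

Because reconstruction needs only a threshold number of shares, parties that straggle or drop out do not block recovery of the aggregate.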
We design a novel encoding model based on Lagrange coded computing (LCC) for private, secure, and resilient distributed mobile edge computing (MEC) systems, where multiple base stations (BSs) act as "masters" offloading their computations to edge nodes acting as "workers". The scheme has a two-fold objective: i) efficient allocation of computing tasks to the workers; ii) providing the workers with appropriate incentives to complete their tasks. As such, each master must decide on its offloading requests to the workers, including the allocated tasks and the service fees to be paid. This problem is complex for the following reasons: i) masters can be privately owned or managed by different operators, i.e., there is no communication and no coordination among them; ii) workers are heterogeneous non-dedicated nodes with limited and nondeterministic transmission and computing resources. As a result, the masters must compete for the constrained resources of workers in a stochastic, partially observable environment. To address this problem, we define the interactions between masters and workers as a direct stochastic first-price sealed-bid (FPSB) auction. To analyze the auction, we represent it as a stochastic Bayesian game and develop a Bayesian learning framework to refine the auction solution.
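For readers unfamiliar with LCC, the following minimal sketch shows the basic encode, compute, and decode cycle for scalar data and the polynomial function f(x) = x^2 over a prime field. It illustrates generic LCC only, not the encoding model or auction proposed in this work; the field size, the choice of interpolation points, and all function names are assumptions.

```python
import secrets

P = 2**61 - 1  # assumed prime field size

def lagrange_eval(xs, ys, z):
    """Evaluate at z the unique polynomial through the points (xs[i], ys[i]) mod P."""
    total = 0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        num, den = 1, 1
        for i, xi in enumerate(xs):
            if i != j:
                num = num * (z - xi) % P
                den = den * (xj - xi) % P
        total = (total + yj * num * pow(den, P - 2, P)) % P
    return total

def lcc_encode(data, T, alphas):
    """Master side: pad K data symbols with T random masks, interpolate a
    polynomial u through them at the points betas, and give worker j u(alphas[j])."""
    K = len(data)
    betas = list(range(1, K + T + 1))
    padded = [d % P for d in data] + [secrets.randbelow(P) for _ in range(T)]
    return betas, [lagrange_eval(betas, padded, a) for a in alphas]

def lcc_decode(alphas, results, betas, K):
    """Master side: interpolate f(u(z)) from worker results and read off
    f(data[i]) at the first K interpolation points."""
    return [lagrange_eval(alphas, results, b) for b in betas[:K]]

data, T, deg_f = [3, 5, 7], 1, 2
K = len(data)
threshold = (K + T - 1) * deg_f + 1             # 7 results determine f(u(z))
alphas = list(range(K + T + 1, K + T + 1 + 9))  # 9 workers, points disjoint from betas
betas, encoded = lcc_encode(data, T, alphas)
worker_out = [pow(u, 2, P) for u in encoded]    # each worker applies f to its share
keep = [8, 0, 3, 5, 6, 1, 7]                    # indices of the 7 fastest workers
decoded = lcc_decode([alphas[i] for i in keep],
                     [worker_out[i] for i in keep], betas, K)
```

With 9 workers and a threshold of 7, the master tolerates two stragglers, and the single random mask (T = 1) hides the data from any one worker.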
We consider the problems of Private and Secure Matrix Multiplication (PSMM) and Fully Private Matrix Multiplication (FPMM), in which matrices privately selected by a master node are multiplied at distributed worker nodes without revealing the indices of the selected matrices, even when a certain number of workers collude with each other. We propose a novel systematic approach to solve PSMM and FPMM with colluding workers, which leverages solutions to a related Secure Matrix Multiplication (SMM) problem where the data (rather than the indices) of the multiplied matrices are kept private from colluding workers. Specifically, given an SMM strategy based on polynomial codes or Lagrange codes, one can exploit the special structure inspired by the matrix encoding function to design private coded queries for PSMM/FPMM, such that the algebraic structure of the computation result at each worker resembles that of the underlying SMM strategy. Adopting this systematic approach provides novel insights into private query design for private matrix multiplication and substantially simplifies the design of PSMM and FPMM strategies. Furthermore, the PSMM and FPMM strategies constructed with the proposed approach outperform the state-of-the-art strategies in one or more performance metrics, including recovery threshold (the minimum number of workers the master needs to wait for before correctly recovering the multiplication result), communication cost, and computation complexity, demonstrating a more flexible tradeoff in optimizing system efficiency.
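The underlying SMM problem can be illustrated with a deliberately simple polynomial masking scheme that protects the matrix data against a single curious worker. This is a toy sketch, not one of the strategies discussed in this work, and it does not address index privacy (the PSMM/FPMM part) at all. The field size and all names are assumptions.

```python
import secrets

P = 2**61 - 1  # assumed prime field size

def matmul_mod(A, B):
    return [[sum(a * b for a, b in zip(row, col)) % P for col in zip(*B)] for row in A]

def rand_matrix(rows, cols):
    return [[secrets.randbelow(P) for _ in range(cols)] for _ in range(rows)]

def mask(M, R, x):
    """Share sent to the worker at point x: M + x * R (mod P)."""
    return [[(m + x * r) % P for m, r in zip(rm, rr)] for rm, rr in zip(M, R)]

def interpolate_at_zero(xs, mats):
    """Entrywise Lagrange interpolation of the worker results at x = 0."""
    rows, cols = len(mats[0]), len(mats[0][0])
    out = [[0] * cols for _ in range(rows)]
    for j, xj in enumerate(xs):
        num, den = 1, 1
        for i, xi in enumerate(xs):
            if i != j:
                num = num * (-xi) % P
                den = den * (xj - xi) % P
        w = num * pow(den, P - 2, P) % P
        for r in range(rows):
            for c in range(cols):
                out[r][c] = (out[r][c] + w * mats[j][r][c]) % P
    return out

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
R, S = rand_matrix(2, 2), rand_matrix(2, 2)   # one-time random masks
points = [1, 2, 3, 4]                         # four workers, nonzero points
# Worker x computes (A + xR)(B + xS) = AB + x(AS + RB) + x^2 RS.
results = {x: matmul_mod(mask(A, R, x), mask(B, S, x)) for x in points}
fastest = [4, 2, 1]                           # recovery threshold is 3
C = interpolate_at_zero(fastest, [results[x] for x in fastest])
```

The product polynomial has degree 2, so the recovery threshold of this toy scheme is 3; reducing that threshold, along with communication and computation cost, for larger collusion sets is the kind of tradeoff the abstract refers to.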
This paper considers the design of heterogeneous multi-cloud systems for big data storage and computing in the presence of cloud collusion and failures. A fundamental concept of such a system is the secrecy capacity, which represents the maximum amount of information that can be stored per unit of storage space under the requirements of secure distributed computing. A capacity-achieving code is designed for matrix multiplication, a computing subroutine widely used in machine learning applications. The code allows fast parallel decoding and unequal data allocation across the clouds. Such flexibility leads naturally to the idea of optimizing data allocation to minimize the computing time. Given any feasible storage budget, the optimal solution is derived, explicitly characterizing the fundamental tradeoff between storage and computing. Furthermore, it is shown via majorization theory that the whole tradeoff curve improves if the cloud computing rates are more even. Experiments on Amazon EC2 clusters corroborate our theoretical observations and show that the decoding overhead is negligible.
Distributed computing systems have been widely used in recent years to handle the massive computations required by newly emerged machine learning algorithms and signal processing problems. In practice, a distributed computing system often receives multiple tasks, each of which must be finished by a specific deadline. This necessitates a task scheduler that orders and prioritizes task executions. In this work, we consider task scheduling for a homogeneous distributed computing system with multiple matrix-vector multiplication jobs, and try to maximize the number of tasks completed before their deadlines. The main challenges in such a system are random task arrivals and random execution times due to the straggling effect. To address these challenges, we propose two task scheduling algorithms, "simple greedy" and "farsighted greedy", and compare their performance with the ultimate upper bound, i.e., a genie-aided algorithm that knows the exact arrival and execution times of all tasks. Our simulation results demonstrate that the proposed algorithms can approach the performance of the genie-aided algorithm.