This paper explores constrained non-convex personalized federated learning (PFL), in which a group of workers trains local models and a global model under the coordination of a server. To address the challenges of efficient information exchange and robustness against so-called Byzantine workers, we propose a projected stochastic gradient descent algorithm for PFL that simultaneously ensures Byzantine-robustness and communication efficiency. We implement personalized learning at the workers, aided by the global model, and employ a Huber function-based robust aggregation with an adaptive threshold-selection strategy at the server to reduce the effects of Byzantine attacks. To improve communication efficiency, we incorporate random communication that allows multiple local updates per communication round. We establish the convergence of our algorithm, characterizing the effects of Byzantine attacks, random communication, and stochastic gradients on the learning error. Numerical experiments demonstrate the superiority of our algorithm over existing ones in neural network training.
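The abstract does not include an implementation, so the following is only a rough sketch of the kind of Huber function-based robust aggregation it mentions, not the paper's algorithm. It solves the standard Huber-loss location problem over the workers' updates by iteratively reweighted averaging; the function name `huber_aggregate` and the fixed `delta` threshold are assumptions (the paper's adaptive threshold-selection strategy is not reproduced here).

```python
import numpy as np

def huber_aggregate(updates, delta, n_iters=50, tol=1e-8):
    """Aggregate worker updates by minimizing the sum of Huber losses
    of the distances to each update (an M-estimator of location).

    updates: array of shape (num_workers, dim), one row per worker.
    delta:   Huber threshold (assumed fixed here); updates farther
             than delta from the current estimate are down-weighted
             as in a geometric median, nearby ones as in a plain mean.
    """
    x = np.mean(updates, axis=0)  # initialize at the plain average
    for _ in range(n_iters):
        dists = np.linalg.norm(updates - x, axis=1)
        # Huber weights: 1 inside the threshold, delta/dist outside.
        w = np.where(dists <= delta, 1.0, delta / np.maximum(dists, 1e-12))
        x_new = (w[:, None] * updates).sum(axis=0) / w.sum()
        if np.linalg.norm(x_new - x) < tol:
            break
        x = x_new
    return x
```

Down-weighting rather than discarding distant updates is what distinguishes this Huber-style aggregation from trimmed-mean or median rules: honest workers with atypical data are shrunk toward the consensus instead of being dropped outright.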
Multiplex networks represent real-world complex systems as a set of entities (i.e., nodes) connected via different types of connections (i.e., layers). The observed connections in these networks may be incomplete, and the link prediction task is to locate the missing links across layers. The main challenge here is collecting relevant evidence from different layers to assist the link prediction task. It is known that co-membership in communities increases the likelihood of connectivity between nodes. We argue that co-membership in the communities of similar layers augments the chance of connectivity, where layers are considered similar if they show significant inter-layer community overlap. Moreover, we find that although the presence of links is correlated across layers, the extent of this correlation is not the same across different communities. Our proposed method, ML-BNMTF, a link prediction method for multiplex networks, is devised based on these findings. ML-BNMTF outperforms baseline methods, specifically when the global link overlap is low.
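The abstract does not specify how inter-layer community overlap is measured. As a simple stand-in (not the paper's definition), one could compare the community partitions of two layers with the Rand index, which scores the fraction of node pairs on which the partitions agree; the name `partition_overlap` is hypothetical.

```python
from itertools import combinations

def partition_overlap(labels_a, labels_b):
    """Rand index between two community partitions of the same node
    set: the fraction of node pairs that both partitions treat the
    same way (both co-assign, or both separate, the pair).  Used here
    only as an illustrative proxy for inter-layer community overlap.

    labels_a, labels_b: sequences where labels_x[i] is the community
    of node i in layer x.
    """
    n = len(labels_a)
    agree = 0
    total = n * (n - 1) // 2
    for i, j in combinations(range(n), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += int(same_a == same_b)
    return agree / total
```

Under a measure of this kind, two layers count as "similar" when the score is high, and evidence from similar layers would then be weighted more heavily when predicting a missing link.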
The PAC-Bayesian setup involves a stochastic classifier characterized by a posterior distribution on a classifier set; it offers a high-probability bound on the averaged true risk and is robust to the training sample used. For a given posterior, this bound captures the trade-off between the averaged empirical risk and a KL-divergence-based model complexity term. Our goal is to identify an optimal posterior with the least PAC-Bayesian bound. We consider a finite classifier set and five distance functions: KL-divergence, its Pinsker's and sixth-degree polynomial approximations, and the linear and squared distances. The linear-distance-based model results in a convex optimization problem, and we obtain a closed-form expression for its optimal posterior. For a uniform prior, this posterior has full support, with weights negative-exponentially proportional to the number of misclassifications. The squared distance and Pinsker's approximation bounds are possibly quasi-convex and are observed to have a single local minimum. We derive fixed point equations (FPEs) using a partial KKT system with strict positivity constraints; this obviates the combinatorial search for the subset support of the optimal posterior. For a uniform prior, the exponential search on a full-dimensional simplex can be limited to an ordered subset of classifiers with increasing empirical risk values. These FPEs converge rapidly to a stationary point, even for a large classifier set when a solver fails. We apply these approaches to SVMs generated using a finite set of SVM regularization parameter values on 9 UCI datasets. The resulting optimal posteriors (on the set of regularization parameters) yield stochastic SVM classifiers with tight bounds. The KL-divergence-based bound is the tightest, but it is computationally expensive due to its non-convex nature and multiple calls to a root-finding algorithm. The optimal posteriors for all five distance functions attain the lowest 10% of test error values on most datasets, with those of the linear distance being the easiest to obtain.
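The abstract reports that, for a uniform prior, the linear-distance optimal posterior has full support with weights negative-exponentially proportional to the number of misclassifications. A minimal sketch of such a posterior follows; the bound-dependent proportionality constant (here `scale`) and the function name `exp_posterior` are assumptions, since the abstract does not give the closed-form expression itself.

```python
import numpy as np

def exp_posterior(misclassifications, scale):
    """Posterior over a finite classifier set with weights
    negative-exponentially proportional to each classifier's number
    of training misclassifications, matching the qualitative form the
    abstract describes for the linear distance with a uniform prior.

    misclassifications: per-classifier misclassification counts.
    scale: assumed bound-dependent constant (not in the abstract).
    """
    m = np.asarray(misclassifications, dtype=float)
    logits = -scale * m
    logits -= logits.max()   # shift for numerical stability
    q = np.exp(logits)
    return q / q.sum()       # normalize to a distribution

# Usage sketch: five classifiers with 2, 5, 1, 9, and 3 errors.
print(exp_posterior([2, 5, 1, 9, 3], scale=0.5))
```

Every classifier receives strictly positive weight, consistent with the full-support property stated in the abstract, and the stochastic classifier would draw from this distribution at prediction time.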