Gradient codes use data replication to mitigate the effect of straggling machines in distributed machine learning. Approximate gradient codes consider the regime where the data replication factor is too low to recover the full gradient exactly. Our work is motivated by the challenge of designing approximate gradient codes that simultaneously work well in both the adversarial and the random straggler models. We introduce novel approximate gradient codes based on expander graphs. We analyze the decoding error under both random and adversarial stragglers when optimal decoding coefficients are used. With random stragglers, our codes achieve a gradient approximation error that decays exponentially in the replication factor. With adversarial stragglers, the error is smaller than that of any existing code with comparable performance in the random setting. We prove convergence bounds for coded gradient descent in both settings under standard assumptions. With random stragglers, our convergence rate improves upon rates obtained via black-box approaches. With adversarial stragglers, we show that gradient descent converges down to a noise floor that scales linearly with the adversarial gradient error. We demonstrate empirically that our codes achieve near-optimal error with random stragglers and converge faster than algorithms that do not use optimal decoding coefficients.
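As a minimal illustration of the "optimal decoding coefficients" step (not the expander-graph construction itself), the sketch below assumes a hypothetical 0/1 assignment matrix `B` (workers x data parts) and computes the least-squares decoding weights that best approximate the sum of all partial gradients from the surviving workers; the residual is the kind of decoding error discussed above.

```python
import numpy as np

# Hypothetical setup: n workers, k data parts, each part stored on d workers.
# B[i, j] = 1 means worker i holds data part j (B is an illustrative random
# placement, not the expander-graph construction from the paper).
rng = np.random.default_rng(0)
n, k, d = 12, 12, 3
B = np.zeros((n, k))
for j in range(k):                         # place each part on d random workers
    B[rng.choice(n, size=d, replace=False), j] = 1.0

survivors = rng.choice(n, size=8, replace=False)   # non-straggling workers
B_S = B[survivors]                                 # rows the master receives

# Optimal decoding coefficients w minimize ||B_S^T w - 1||_2: the weighted sum
# of the received coded gradients should be as close to the full sum as possible.
w, *_ = np.linalg.lstsq(B_S.T, np.ones(k), rcond=None)

err = np.linalg.norm(B_S.T @ w - np.ones(k)) ** 2   # squared decoding error
print("decoding error:", err)
```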
Gradient coding allows a master node to derive the aggregate of the partial gradients, computed by worker nodes over their local data sets, with minimum communication cost and in the presence of stragglers. In this paper, for gradient coding with linear encoding, we characterize the optimum communication cost for heterogeneous distributed systems with arbitrary data placement, with $s \in \mathbb{N}$ stragglers and $a \in \mathbb{N}$ adversarial nodes. In particular, we show that the optimum communication cost, normalized by the size of the gradient vectors, is equal to $(r - s - 2a)^{-1}$, where $r \in \mathbb{N}$ is the minimum number of times any data partition is replicated. In other words, the communication cost is determined by the data partition with the minimum replication, irrespective of the structure of the placement. The proposed achievable scheme also allows us to target the computation of a polynomial function of the aggregated gradient matrix. It further allows us to borrow ideas from approximate computing and propose an approximate gradient coding scheme for the cases where the replication in the data placement is smaller than what is needed to meet the restriction imposed on the communication cost, or where the number of stragglers turns out to be larger than the value presumed in the system design.
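For a quick, purely illustrative check of the stated optimum (parameter values are made up, not from the paper): with minimum replication $r = 5$, $s = 1$ straggler, and $a = 1$ adversarial node, the normalized communication cost is $(5 - 1 - 2)^{-1} = 1/2$, i.e., each worker transmits a vector half the length of the full gradient.

```python
# Minimal sketch of the normalized communication cost (r - s - 2a)^{-1}
# stated in the abstract; the values below are illustrative.
def normalized_comm_cost(r: int, s: int, a: int) -> float:
    """Per-worker communication cost, normalized by the gradient length."""
    assert r > s + 2 * a, "need minimum replication r > s + 2a for exact recovery"
    return 1.0 / (r - s - 2 * a)

print(normalized_comm_cost(r=5, s=1, a=1))   # -> 0.5
```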
We consider distributed computation of a sequence of J gradients {g(0), . . . , g(J - 1)}. Each worker node computes a fraction of g(t) in round-t and attempts to communicate the result to a master. Master is required to obtain the full gradient g(t) by the end of round-(t+T). The goal here is to finish all the J gradient computations, keeping the cumulative processing time as short as possible. Delayed availability of results from individual workers causes bottlenecks in this setting. These delays can be due to factors such as processing delay of workers and packet losses. gradient coding (GC) framework introduced by Tandon et al. uses coding theoretic techniques to mitigate the effect of delayed responses from workers. In this paper, we primarily target mitigating communication-level delays. In contrast to the classical GC approach which performs coding only across workers (T = 0), the proposed sequential gradient coding framework is more general, as it allows for coding across workers as well as time. We present a new sequential gradient coding scheme which offers improved resiliency against communication-level delays compared to the GC scheme, without increasing computational load. Our experimental results establish performance improvement offered by the new coding scheme.
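For context, the sketch below illustrates the baseline GC idea of coding only across workers ($T = 0$), using the simple fractional-repetition placement attributed to Tandon et al.; the sizes are made up, and this is not the sequential (across-time) scheme proposed in the paper.

```python
import numpy as np

# Baseline GC with fractional repetition: workers are split into groups of
# s+1, every worker in a group holds the same block of data parts, and any
# s stragglers leave at least one survivor per group. Sizes are illustrative.
n, s = 6, 2                              # 6 workers, tolerate any 2 stragglers
n_groups = n // (s + 1)                  # groups of s+1 workers share one block
worker_group = np.repeat(np.arange(n_groups), s + 1)   # worker -> block id

rng = np.random.default_rng(1)
k, dim = 8, 4                            # 8 data parts, gradients of dimension 4
g = rng.normal(size=(k, dim))            # partial gradient of each data part
blocks = np.array_split(np.arange(k), n_groups)        # block id -> data parts

def worker_msg(w):
    """Each worker sends the summed partial gradient of its group's block."""
    return g[blocks[worker_group[w]]].sum(axis=0)

stragglers = {1, 4}                      # any s workers may fail to respond
recovered = np.zeros(dim)
for b in range(n_groups):
    alive = [w for w in range(n) if worker_group[w] == b and w not in stragglers]
    recovered += worker_msg(alive[0])    # one survivor per group is enough
assert np.allclose(recovered, g.sum(axis=0))
```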
ISBN (Print): 9781665403122
In distributed machine learning (DML), the training data is distributed across multiple worker nodes to perform the underlying training in parallel. One major problem affecting the performance of DML algorithms is the presence of stragglers: nodes that are very slow in performing their task, which results in under-utilization of the training data stored on them. Gradient coding mitigates the impact of stragglers by adding sufficient redundancy to the data. Gradient coding and other straggler mitigation schemes assume that the straggler behavior of the worker nodes is identical. Our experiments on an Amazon AWS cluster, however, suggest otherwise: we see that there is correlation in the straggler behavior across iterations. To model this, we introduce a heterogeneous straggler model where nodes are categorized into two classes, slow and active. To better utilize the training data stored with slow nodes, we modify existing gradient coding schemes by shuffling the training data among workers. Our results (both simulations and cloud experiments) show a remarkable improvement of shuffling over existing schemes. We also provide a theoretical analysis of the proposed models that justifies their utility.
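The following is a hypothetical sketch of the shuffling idea under the slow/active model: partitions held by workers that were slow in the previous iteration are swapped with partitions held by active workers, so all of the data keeps contributing. The swap rule and the `reshuffle` helper are illustrative assumptions, not the paper's actual scheme.

```python
import random

def reshuffle(assignment, slow, active, rng=random):
    """Swap one data partition from each slow worker with a random active worker.

    assignment: dict worker_id -> list of partition ids (illustrative layout).
    """
    new_assignment = {w: list(parts) for w, parts in assignment.items()}
    for s_w in slow:
        if not new_assignment[s_w] or not active:
            continue
        a_w = rng.choice(list(active))
        # move one partition off the slow worker and take one from the active worker
        p_slow = new_assignment[s_w].pop()
        p_active = new_assignment[a_w].pop()
        new_assignment[s_w].append(p_active)
        new_assignment[a_w].append(p_slow)
    return new_assignment

assignment = {0: [0, 1], 1: [2, 3], 2: [4, 5], 3: [6, 7]}
print(reshuffle(assignment, slow={3}, active={0, 1, 2}))
```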
ISBN (Print): 9781728125190
Gradient descent algorithms are widely used in machine learning. To deal with huge volumes of data, we consider the implementation of gradient descent algorithms in a distributed computing setting where multiple workers compute the gradient over partial data and the master node aggregates their results to obtain the gradient over the whole data. However, performance can be severely affected by straggler workers. Recently, coding-based approaches have been introduced to mitigate the straggler problem, but they are efficient only when the workers are homogeneous, i.e., have the same computation capabilities. In this paper, we consider heterogeneous workers, which are common in modern distributed systems. We propose a novel heterogeneity-aware gradient coding scheme that can not only tolerate a predetermined number of stragglers but also fully utilize the computation capabilities of heterogeneous workers. We show that this scheme is optimal when the computation capabilities of the workers are estimated accurately. A variant of this scheme is further proposed to improve performance when the estimates of the computation capabilities are less accurate. We run our schemes for gradient-descent-based image classification on QingCloud clusters. Evaluation results show that our schemes can reduce the overall computation time by up to 3x compared with a state-of-the-art coding scheme.
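As a hedged sketch of the heterogeneity-aware idea (assigning work in proportion to estimated worker speeds), the `allocate` helper below is a hypothetical illustration and not the paper's coding scheme.

```python
import numpy as np

def allocate(num_parts: int, speeds: np.ndarray) -> np.ndarray:
    """Assign data partitions to workers in proportion to their estimated speeds."""
    raw = num_parts * speeds / speeds.sum()
    alloc = np.floor(raw).astype(int)
    # hand leftover partitions to the workers with the largest fractional remainder
    remainder = raw - alloc
    for i in np.argsort(-remainder)[: num_parts - alloc.sum()]:
        alloc[i] += 1
    return alloc

speeds = np.array([1.0, 1.0, 2.0, 4.0])       # estimated samples/second per worker
print(allocate(num_parts=16, speeds=speeds))  # -> [2 2 4 8]
```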
Linear regression is a fundamental primitive in supervised machine learning, with applications ranging from epidemiology to finance. In this work, we propose methods for speeding up distributed linear regression. We do so by leveraging randomized techniques, while also ensuring security and straggler resiliency in asynchronous distributed computing systems. Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem. In our setup, the basis rotation corresponds to an encoded encryption in an approximate gradient coding scheme, and the subsampling corresponds to the responses of the non-straggling servers in the centralized coded computing framework. This results in a distributed, iterative, stochastic approach for matrix compression and steepest descent.
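A minimal sketch of the rotate-then-subsample idea, assuming a Gaussian-QR orthogonal rotation and uniform block subsampling (both illustrative stand-ins for the paper's specific encoding/encryption and server model):

```python
import numpy as np

# Rotate the basis of the least-squares system by a random orthogonal matrix,
# then solve using only a random subset of blocks (the "non-straggler" rows).
rng = np.random.default_rng(0)
m, d = 400, 10
A = rng.normal(size=(m, d))
x_true = rng.normal(size=d)
b = A @ x_true + 0.01 * rng.normal(size=m)

Q, _ = np.linalg.qr(rng.normal(size=(m, m)))   # random orthogonal rotation
A_rot, b_rot = Q @ A, Q @ b                    # rotated (masked) system

blocks = np.array_split(np.arange(m), 20)      # 20 blocks of rows
received = [blocks[i] for i in rng.choice(20, size=12, replace=False)]
rows = np.concatenate(received)                # responses that actually arrived

x_hat = np.linalg.lstsq(A_rot[rows], b_rot[rows], rcond=None)[0]
print("estimation error:", np.linalg.norm(x_hat - x_true))
```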
Gradient coding is a coding-theoretic framework for providing robustness against slow or unresponsive machines, known as stragglers, in distributed machine learning applications. Recently, Kadhe et al. (2019) proposed a gradient code based on a combinatorial design called a balanced incomplete block design (BIBD), which is shown to outperform many existing gradient codes in worst-case adversarial straggling scenarios. However, the parameters for which such BIBD constructions exist are very limited (Colbourn and Dinitz, 2006). In this paper, we aim to overcome these limitations and construct gradient codes that exist for a wide range of system parameters while retaining the superior performance of BIBD gradient codes. Two such constructions are proposed: one based on a probabilistic construction that relaxes the stringent BIBD constraints, and the other based on taking the Kronecker product of existing gradient codes. The proposed gradient codes allow flexible choices of system parameters while retaining comparable error performance.
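The Kronecker-product construction can be illustrated on toy assignment matrices; the 3x3 cyclic base codes below are placeholders, and the point is only that worker counts, data parts, and replication factors multiply under `np.kron`.

```python
import numpy as np

# Two small base gradient-code assignment matrices (workers x data parts,
# 1 = "worker stores this part"); cyclic placements used purely as placeholders.
B1 = np.array([[1, 1, 0],
               [0, 1, 1],
               [1, 0, 1]])
B2 = B1.copy()

B = np.kron(B1, B2)          # combined code: 9 workers x 9 data parts
print(B.shape)               # (9, 9)
print(B.sum(axis=0))         # each part now replicated 2 * 2 = 4 times
```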
ISBN (Print): 9781538665961
Distributed gradient descent is an optimization algorithm used to solve a minimization problem distributed over a network by minimizing local functions that sum up to form the overall objective function. These local functions $f_i(\cdot)$ contribute local gradients that add up incrementally to form the overall gradient. Recently, the gradient coding paradigm was introduced for networks with a centralized fusion center to resolve the problem of straggler nodes. By introducing some redundancy on each node, such coding schemes form new coded local functions $\bar{g}_i$ from the original local functions $f_i$. In this work, we consider a distributed network with a defined network topology and no fusion center. At each node, linear combinations of the coded local gradients $\nabla \bar{g}_i$ can be constructed to form the overall gradient. Our iterative method, referred to as Code-Based Distributed Gradient Descent (CDGD), updates each node's local estimate by applying a suitable weighting scheme that combines the coded local gradient descent step with the local estimates of neighboring nodes. We provide a convergence analysis for CDGD and analytically show that it enhances the convergence rate by a scaling factor over conventional incremental methods, without any predefined tuning. Furthermore, numerical results demonstrate significant improvements in convergence rate.
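A hypothetical sketch of a CDGD-style update, assuming a ring topology, a doubly stochastic weight matrix `W`, and simple quadratic local functions (all illustrative choices, not the paper's exact method): each node mixes its neighbors' estimates and then steps along its coded local gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
n, dim, alpha = 5, 3, 0.1
targets = rng.normal(size=(n, dim))        # g_i(x) = 0.5 * ||x - targets[i]||^2

# ring topology: each node averages itself and its two neighbors
W = np.zeros((n, n))
for i in range(n):
    W[i, [i, (i - 1) % n, (i + 1) % n]] = 1.0 / 3.0

x = np.zeros((n, dim))                     # local estimates, one row per node
for _ in range(200):
    grads = x - targets                    # gradient of each local function
    x = W @ x - alpha * grads              # consensus step + local gradient step

# With a constant step size, each node settles in a neighborhood of the global
# minimizer (the mean of the targets) whose size shrinks with the step size.
print(np.abs(x - targets.mean(axis=0)).max())
```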