Author affiliations: Qassim University, College of Engineering, Department of Electrical Engineering, Buraydah 51452, Saudi Arabia; George Washington University, Department of Electrical & Computer Engineering, Washington, DC 20052, USA
Publication: JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING
Year/Volume: 2023, Vol. 171
Pages: 14-23
Subject classification: 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees awardable in Engineering or Science)]
Keywords: Distributed computing; Scheduling
Abstract: Data-intensive computing frameworks typically split job workload into fixed-size chunks, allowing them to be processed as parallel tasks on distributed machines. Ideally, when the machines are homogeneous and have identical speed, chunks of equal size would finish processing at the same time. However, such determinism in processing time cannot be guaranteed in practice. Diverging processing times can result from various sources such as system dynamics, machine heterogeneity, and variable network conditions. Such variation, together with dynamics and uncertainty during task processing, can lead to significant performance degradation at the job level, due to long tails in job completion time resulting from residual chunk workload and stragglers. In this paper, we propose Forseti, a novel processing scheme that is able to reshape data chunk size on the fly with respect to heterogeneous machines and a dynamic execution environment. Forseti mitigates residual workload and stragglers to achieve significant improvement in performance. We note that Forseti is a fully online scheme and does not require any a priori knowledge of the machine configuration or job statistics. Instead, it infers such information and adjusts data chunk sizes at runtime, making the solution robust even in environments with high volatility. In its implementation, Forseti also exploits a virtual machine reuse feature to avoid the task start-up and initialization cost associated with launching new tasks. We prototype Forseti on a real-world cluster and evaluate its performance using several realistic benchmarks. The results show that Forseti outperforms a number of baselines, including default Hadoop by up to 68% and SkewTune by up to 50% in terms of average job completion time. (c) 2022 Elsevier Inc. All rights reserved.
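The abstract's central idea of inferring machine speed at runtime and reshaping chunk sizes accordingly can be illustrated with a minimal sketch. The class below is not the authors' Forseti algorithm; it is a hypothetical scheduler that tracks an exponentially weighted estimate of each machine's throughput from completed chunks and sizes the next chunk proportionally, so faster machines receive larger chunks and all machines tend to finish together. All names, parameters, and the EWMA update are assumptions for illustration only.

```python
import random


class AdaptiveChunkScheduler:
    """Sketch of online chunk-size reshaping (not the Forseti implementation):
    size each machine's next chunk in proportion to its estimated throughput,
    learned purely from observed completions, with no prior machine knowledge."""

    def __init__(self, machines, base_chunk_mb=64.0, alpha=0.5):
        self.base_chunk_mb = base_chunk_mb           # initial chunk size for every machine
        self.alpha = alpha                           # EWMA smoothing factor
        self.rate_est = {m: None for m in machines}  # estimated MB/s per machine

    def next_chunk_size(self, machine):
        """Scale the next chunk by this machine's estimated rate relative to the mean."""
        known = [r for r in self.rate_est.values() if r is not None]
        est = self.rate_est[machine]
        if est is None or not known:
            return self.base_chunk_mb                # no history yet: fall back to the default
        mean_rate = sum(known) / len(known)
        return self.base_chunk_mb * est / mean_rate  # faster machine -> larger chunk

    def report_completion(self, machine, chunk_mb, seconds):
        """Fold the observed throughput of a finished chunk into the running estimate."""
        observed = chunk_mb / seconds
        prev = self.rate_est[machine]
        self.rate_est[machine] = observed if prev is None else (
            self.alpha * observed + (1 - self.alpha) * prev)


if __name__ == "__main__":
    # Two hypothetical machines, one twice as fast as the other (rates unknown to the scheduler).
    true_rates = {"m1": 100.0, "m2": 50.0}  # MB/s
    sched = AdaptiveChunkScheduler(true_rates.keys())
    for _ in range(5):
        for m, rate in true_rates.items():
            size = sched.next_chunk_size(m)
            runtime = size / rate * random.uniform(0.9, 1.1)  # noisy execution time
            sched.report_completion(m, size, runtime)
    print({m: round(r, 1) for m, r in sched.rate_est.items()})
```

Under these assumptions the slower machine's chunks shrink toward half the faster machine's after a few completions, which is the intuition behind mitigating residual workload and stragglers; the paper's actual scheme additionally handles dynamics, volatility, and virtual machine reuse that this sketch omits.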