Author Affiliations: East China Normal Univ, Minist Educ Engn Res Ctr Software Hardware Codesign Technol & Shanghai Key Lab Trustworthy Comp, Software Engn Inst, Shanghai 200062, Peoples R China; Dalian Maritime Univ, Dept Houston Int Inst, Dalian 116026, Liaoning, Peoples R China
Publication: IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS (IEEE Trans Parallel Distrib Syst)
Year/Volume/Issue: 2023, Vol. 34, No. 12
Pages: 3280-3293
Subject Classification: 0808 [Engineering - Electrical Engineering]; 08 [Engineering]; 0812 [Engineering - Computer Science and Technology (degrees awardable in Engineering or Science)]
Funding: National Key Research and Development Program of China [2021ZD0114600, 2020AAA0107401]
Keywords: Deep forest; distributed computing; Big Data; bootstrap; distributed AI
Abstract: As an alternative to deep learning models, deep forest outperforms deep neural networks in many aspects, with fewer hyperparameters and better robustness. To improve the computing performance of deep forest, ForestLayer proposes an efficient task-parallel algorithm, S-FTA, at a fine sub-forest granularity, but the granularity of the sub-forest cannot be adaptively adjusted. BLB-gcForest further proposes an adaptive sub-forest splitting algorithm to dynamically adjust the sub-forest granularity. However, with distributed storage, its BLB method needs to scan the whole dataset when sampling, which generates considerable communication overhead. Moreover, BLB-gcForest's tree-based vector aggregation produces extensive redundant transfers and significantly degrades system performance in the vector aggregation stage. To address these issues and further improve the computing efficiency and scalability of the distributed deep forest, in this paper we propose a novel Computing-Efficient and RobusT distributed Deep Forest framework, named CERT-DF. CERT-DF integrates three customized schemes: block-level pre-sampling, two-stage pre-aggregation, and system-level backup. Specifically, CERT-DF adopts block-level pre-sampling to perform local sampling on data blocks, eliminating frequent remote data access and maximizing parallel efficiency; applies two-stage pre-aggregation to adjust the class-vector aggregation granularity and greatly reduce communication overhead; and leverages system-level backup to enhance the system's disaster tolerance and greatly accelerate task recovery with minimal system resource overhead. Comprehensive experimental evaluations on multiple datasets show that CERT-DF significantly outperforms state-of-the-art approaches, with higher computing efficiency, lower system resource overhead, and better system robustness, while maintaining good accuracy.
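Illustrative example: the block-level pre-sampling and two-stage pre-aggregation ideas described in the abstract can be pictured with a minimal Python sketch. This is only an assumed illustration, not the authors' implementation; the names local_bootstrap, pre_aggregate, and global_aggregate are hypothetical. Each worker bootstraps samples from its own data block (so sampling never reads remote blocks), averages its trees' class-probability vectors locally, and sends only one partial vector to the driver for a final weighted merge.

import numpy as np

# Illustrative sketch only (not the paper's code); all names below are hypothetical.

def local_bootstrap(block, n_samples, rng):
    # Block-level pre-sampling: bootstrap within the worker's own data block,
    # so sampling never touches remotely stored blocks.
    idx = rng.integers(0, len(block), size=n_samples)
    return block[idx]

def pre_aggregate(tree_vectors):
    # Stage 1: a worker averages its trees' class-probability vectors locally
    # and reports a single partial vector plus its tree count.
    return np.mean(tree_vectors, axis=0), len(tree_vectors)

def global_aggregate(partials):
    # Stage 2: the driver merges partial vectors, weighted by tree count,
    # instead of receiving one vector per tree.
    total = sum(n for _, n in partials)
    return sum(v * n for v, n in partials) / total

# Toy usage: one worker's block of 1000 samples with 8 features, 5 trees, 4 classes.
rng = np.random.default_rng(0)
block = rng.random((1000, 8))
sample = local_bootstrap(block, 500, rng)
worker_votes = [rng.dirichlet(np.ones(4)) for _ in range(5)]
partial = pre_aggregate(worker_votes)
final_vector = global_aggregate([partial, partial])  # merge two (identical) workers

The coarser, per-worker granularity in stage 1 is what reduces the number of vectors shipped across the network, which is the communication saving the abstract attributes to two-stage pre-aggregation.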