Author affiliations: School of Software Engineering, Chongqing University, Chongqing, People's Republic of China; Department of Electrical Engineering and Computer Science, University of Cincinnati, Cincinnati, OH, USA
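Publication: Software Quality Journal
Year, volume, issue: 2021, Vol. 29, No. 2
Pages: 405-430
Core indexing:
Subject classification: 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degrees may be awarded)]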
Funding: National Key Research and Development Project [2019YFB1706101]; Science-Technology Foundation of Chongqing, China [cstc2019jscx-mbdx0083]
Keywords: Autoencoder; Heterogeneous cross-project defect prediction; Multi-source transfer learning; Modified autoencoder
Abstract: Heterogeneous cross-project defect prediction (HCPDP) aims to build a defect prediction model for a target project by reusing datasets from source projects whose features differ from those of the target project. Most existing HCPDP methods only remove redundant or unrelated features without exploring the underlying features of cross-project datasets. Moreover, when transfer learning is used in HCPDP, these methods ignore its negative effects. In this paper, we propose a novel HCPDP method called multi-source heterogeneous cross-project defect prediction (MHCPDP). To reduce the gap between the target and source datasets, MHCPDP uses an autoencoder to extract intermediate features from the original datasets instead of simply removing redundant and unrelated features, and adopts a modified autoencoder algorithm to perform instance selection, eliminating irrelevant instances from the source domain datasets. Furthermore, by incorporating multiple source projects to increase the number of source datasets, MHCPDP develops a multi-source transfer learning algorithm to reduce the impact of negative transfer and improve the performance of the classifier. We comprehensively evaluate MHCPDP on five open source datasets; our experimental results show that MHCPDP not only achieves significant improvement in two performance metrics but also overcomes the shortcomings of conventional HCPDP methods.
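The feature-extraction idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the network architecture or how the autoencoder is modified for instance selection. The sketch assumes a one-hidden-layer autoencoder trained with plain SGD, gives the source and target datasets (which have different feature counts) the same hidden size so that both map into intermediate features of equal dimension, and uses reconstruction-error ranking as a stand-in for instance selection. All function names and the toy data are hypothetical.

```python
import math
import random


def encode(model, x):
    """Map an input row to its hidden (intermediate) features via a sigmoid layer."""
    w1, b1, _, _ = model
    return [1.0 / (1.0 + math.exp(-(sum(w * v for w, v in zip(row, x)) + b)))
            for row, b in zip(w1, b1)]


def decode(model, h):
    """Reconstruct the input from hidden features via a linear layer."""
    _, _, w2, b2 = model
    return [sum(w * v for w, v in zip(row, h)) + b for row, b in zip(w2, b2)]


def train_autoencoder(rows, hidden, epochs=200, lr=0.1, seed=0):
    """Train a one-hidden-layer autoencoder with per-sample SGD (illustrative only)."""
    rng = random.Random(seed)
    n_in = len(rows[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n_in)]
    b2 = [0.0] * n_in
    model = (w1, b1, w2, b2)  # the lists are updated in place below
    for _ in range(epochs):
        for x in rows:
            h = encode(model, x)
            y = decode(model, h)
            # Gradient of 0.5 * ||y - x||^2 with respect to the output y.
            err = [y[i] - x[i] for i in range(n_in)]
            # Backpropagate to the hidden layer before the decoder is updated.
            grad_h = [sum(err[i] * w2[i][j] for i in range(n_in)) * h[j] * (1 - h[j])
                      for j in range(hidden)]
            for i in range(n_in):
                for j in range(hidden):
                    w2[i][j] -= lr * err[i] * h[j]
                b2[i] -= lr * err[i]
            for j in range(hidden):
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h[j] * x[i]
                b1[j] -= lr * grad_h[j]
    return model


def reconstruction_error(model, x):
    """Squared error between a row and its reconstruction."""
    y = decode(model, encode(model, x))
    return sum((a - b) ** 2 for a, b in zip(y, x))


def select_instances(model, rows, keep_ratio=0.8):
    """ASSUMPTION: keep the rows the autoencoder reconstructs best.

    This is a stand-in; the paper's modified autoencoder may work differently.
    """
    ranked = sorted(rows, key=lambda x: reconstruction_error(model, x))
    return ranked[:max(1, int(len(rows) * keep_ratio))]


# Made-up, already normalized toy data: the source project has 4 metrics,
# the target project has 3, mimicking the heterogeneous setting.
source = [[0.1, 0.2, 0.1, 0.3], [0.2, 0.2, 0.1, 0.4], [0.1, 0.3, 0.2, 0.3],
          [0.2, 0.1, 0.2, 0.4], [0.1, 0.2, 0.2, 0.3], [0.9, 0.0, 1.0, 0.0]]
target = [[0.3, 0.5, 0.2], [0.4, 0.5, 0.1], [0.3, 0.6, 0.2], [0.4, 0.4, 0.1]]

HIDDEN = 2  # shared intermediate dimension (an arbitrary choice for this sketch)
source_model = train_autoencoder(source, HIDDEN, seed=1)
target_model = train_autoencoder(target, HIDDEN, seed=2)

kept_source = select_instances(source_model, source, keep_ratio=0.8)
source_features = [encode(source_model, x) for x in kept_source]
target_features = [encode(target_model, x) for x in target]
```

After this step both `source_features` and `target_features` hold rows of length `HIDDEN`, so a single classifier could in principle be trained on one and applied to the other. Repeating the procedure for several source projects and combining the resulting classifiers would be one way to approach the multi-source setting, though the combination rule is not given in the abstract.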