Search results - Inner Mongolia University Library

MASTER: Multi-source Transfer Weighted Ensemble Learning for multiple sources Cross-Project Defect Prediction

IEEE TRANSACTIONS ON SOFTWARE ENGINEERING, 2024, Vol. 50, No. 5, pp. 1281-1305

Authors: Tong, Haonan; Zhang, Dalin; Liu, Jiqiang; Xing, Weiwei; Lu, Lingyun; Lu, Wei; Wu, Yumei. Affiliations: Beijing Jiaotong Univ, Sch Software Engn, Beijing 100044, Peoples R China; Beihang Univ, Sch Reliabil & Syst Engn, Beijing 100191, Peoples R China

Multi-source cross-project defect prediction (MSCPDP) attempts to transfer defect knowledge learned from multiple source projects to the target project. MSCPDP has drawn increasing attention from academic and industry communities owing to its advantages compared with single-source cross-project defect prediction (SSCPDP). However, two main problems seriously restrict the performance of existing MSCPDP models: how to effectively extract the transferable knowledge from each source dataset, and how to measure the amount of knowledge transferred from each source dataset to the target dataset. In this paper, we propose a novel multi-source transfer weighted ensemble learning (MASTER) method for MSCPDP. MASTER measures the weight of each source dataset based on feature importance and distribution difference and then extracts the transferable knowledge based on the proposed feature-weighted transfer learning algorithm. Experiments are performed on 30 software projects. We compare MASTER with the latest state-of-the-art MSCPDP methods, using statistical tests, in terms of four well-known effort-unaware measures (i.e., PD, PF, AUC, and MCC) and two widely used effort-aware measures (P-opt 20% and IFA). The experiment results show that: 1) MASTER can substantially improve the prediction performance compared with the baselines, e.g., an improvement of at least 49.1% in MCC and 48.1% in IFA; 2) MASTER significantly outperforms each baseline on most datasets in terms of AUC, MCC, P-opt 20% and IFA; 3) the MSCPDP model performs significantly better than the mean case of the SSCPDP model on most datasets and even outperforms the best case of SSCPDP on some datasets. It can be concluded that 1) it is very necessary to conduct MSCPDP, and 2) the proposed MASTER is a more promising alternative for MSCPDP.
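
The abstract describes weighting each source dataset and combining the per-source predictions, but it does not give the formulas. The following is only a minimal sketch of the general idea of a distribution-aware weighted ensemble, not the MASTER algorithm itself. The function names (`distribution_gap`, `source_weights`, `weighted_vote`), the mean-difference gap measure, and the exponential weighting are all assumptions made for illustration; the feature-importance component mentioned in the abstract is omitted.

```python
from math import exp
from statistics import mean


def distribution_gap(source_rows, target_rows):
    """Hypothetical gap measure: mean absolute difference of per-feature means."""
    n_features = len(target_rows[0])
    gaps = []
    for j in range(n_features):
        source_mean = mean(row[j] for row in source_rows)
        target_mean = mean(row[j] for row in target_rows)
        gaps.append(abs(source_mean - target_mean))
    return mean(gaps)


def source_weights(sources, target_rows):
    """Normalized weights: a source closer to the target gets a larger weight."""
    raw = [exp(-distribution_gap(rows, target_rows)) for rows in sources]
    total = sum(raw)
    return [value / total for value in raw]


def weighted_vote(probabilities, weights):
    """Combine the defect probabilities predicted by each source model."""
    return sum(p * w for p, w in zip(probabilities, weights))


# Two toy source projects and one target project, two features each.
sources = [
    [[1.0, 2.0], [3.0, 4.0]],      # feature means match the target
    [[10.0, 20.0], [30.0, 40.0]],  # feature means are far from the target
]
target = [[2.0, 3.0], [2.0, 3.0]]
weights = source_weights(sources, target)
score = weighted_vote([0.9, 0.1], weights)
```

In this toy run the first source dominates the vote because its feature means coincide with the target's. A real model would train one classifier per source and would need a more robust distribution measure than a difference of means.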

Keywords: multiple source datasets; cross-project defect prediction; software defect proneness; feature weighting; transfer learning

An Empirical Study on Multi-source Cross-Project Defect Prediction Models

29th Asia-Pacific Software Engineering Conference (APSEC)

Authors: Liu, Xuanying; Li, Zonghao; Zou, Jiaqi; Tong, Haonan. Affiliation: Beijing Jiaotong Univ, Sch Software Engn, Beijing 100044, Peoples R China

ISBN: (print) 9781665455374

Multi-source cross-project defect prediction (MSCPDP) refers to transferring defect knowledge from multiple source projects to the target project. MSCPDP has drawn increasing attention from academic and industry communities owing to its advantages compared with single-source cross-project defect prediction (SSCPDP), and some MSCPDP models have been proposed. However, to the best of our knowledge, there are no empirical studies investigating the effect of different MSCPDP models on the performance of MSCPDP. To comprehensively investigate the performance of different MSCPDP models, we first conduct a literature review of MSCPDP studies, and then identify and compare 7 state-of-the-art MSCPDP models in terms of multiple performance measures including PD, PF, area under ROC curve (AUC), F1, precision, Matthews correlation coefficient (MCC), and Popt20% on 20 publicly available defect datasets. Furthermore, a robust multiple comparison method, i.e., the Scott-Knott effect-size difference (ESD) test, is used for statistical testing. The experiment results show that 1) Burak's Filter always performs best in terms of precision, AUC, MCC, and Popt20%, but not in terms of F1; 2) MSCPDP models outperform the mean performance of SSCPDP models on most datasets; 3) the performance of MSCPDP models still needs to be further improved. We suggest that software engineers use MSCPDP models rather than SSCPDP models for CPDP and pay more attention to both the distribution difference of different datasets and the problems of sample similarity and weight when building MSCPDP models.
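
Several of the measures named in the abstract can be computed directly from a confusion matrix. The sketch below uses the standard definitions (PD is the recall of the defective class, PF is the false-alarm rate); the function name `confusion_measures` and the choice to return 0.0 when a denominator is zero are assumptions made for illustration, not something the paper specifies. AUC and Popt20% are left out because they need ranked predictions rather than counts.

```python
from math import sqrt


def confusion_measures(tp, fp, tn, fn):
    """Compute PD, PF, precision, F1 and MCC from confusion-matrix counts."""
    pd = tp / (tp + fn) if tp + fn else 0.0          # probability of detection (recall)
    pf = fp / (fp + tn) if fp + tn else 0.0          # probability of false alarm
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * pd / (precision + pd) if precision + pd else 0.0
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"PD": pd, "PF": pf, "precision": precision, "F1": f1, "MCC": mcc}


# Toy counts: 10 defective modules (8 found), 20 clean modules (2 false alarms).
measures = confusion_measures(tp=8, fp=2, tn=18, fn=2)
```

A higher PD, precision, F1 and MCC is better, while a lower PF is better, which is why PD and PF are usually reported together.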

Keywords: Empirical Software Engineering; Cross-Project Defect Prediction; multiple source datasets; Software Defect Proneness; Mining Software Repository

MSCPDPLab: A MATLAB toolbox for transfer learning based multi-source cross-project defect prediction

SOFTWAREX, 2023, Vol. 21

Authors: Zou, Jiaqi; Li, Zonghao; Liu, Xuanying; Tong, Haonan. Affiliation: Beijing Jiaotong Univ, Sch Software Engn, Beijing 100044, Peoples R China

Software defect prediction (SDP) plays an important role in allocating testing resources and improving testing efficiency. Multi-source cross-project defect prediction (MSCPDP) based on transfer learning refers to transferring defect knowledge from multiple source projects to the target project. MSCPDP has drawn increasing attention from academic and industry communities, and some MSCPDP methods have been proposed. However, most existing MSCPDP models are not open-source. MSCPDPLab replicates nine state-of-the-art MSCPDP models with a unified interface and integrates the processes of data loading, model training and testing, and performance evaluation (including 13 performance measures). This paper describes the toolbox's functionalities and presents its ease of use. (c) 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).

Keywords: multiple source datasets; Cross-project defect prediction; Mining software repository; Transfer learning

Enhanced multi-dataset transfer learning method for unsupervised person re-identification using co-training strategy

IET COMPUTER VISION, 2018, Vol. 12, No. 8, pp. 1219-1227

Authors: Xian, Yuqiao; Hu, Haifeng. Affiliation: Sun Yat Sen Univ, Sch Elect & Informat Technol, Higher Educ Mega Ctr, Waihuan East Rd, Guangzhou, Guangdong, Peoples R China

This study proposes progressive unsupervised co-learning for unsupervised person re-identification by introducing a co-training strategy in an iterative training process. The authors' method adopts an iterative training process to improve transferred models by iterating among clustering, selection, exchange, and fine-tuning. To solve the problem of transferring representations learned from multiple source datasets, their method utilises multiple convolutional neural network (CNN) models trained on different labelled source datasets, which feed each other the soft labels obtained by clustering on the target dataset. The enhanced model can learn more discriminative person representations than a single model trained on multiple datasets. Experimental results on two large-scale benchmark datasets (i.e. DukeMTMC-reID and Market-1501) demonstrate that their method can enhance transferred CNN models by using more source datasets and is competitive with the state-of-the-art methods.

Keywords: neural nets; unsupervised learning; image recognition; image representation; iterative methods; pattern clustering; enhanced multidataset transfer learning method; unsupervised person re-identification; co-training strategy; progressive unsupervised co-learning; iterative training process; transferred models; multiple source datasets; discriminative person representations; single model; large-scale benchmark datasets; CNN models; labelled source datasets; multiple convolutional neural network models; soft labels; target dataset; clustering
