Cross-project defect prediction (CPDP) uses existing labeled data from a source project to help predict defects in unlabeled target projects, which effectively improves prediction performance and has become a research hotspot in software engineering. At present, CPDP can be categorized into homogeneous cross-project defect prediction and heterogeneous cross-project defect prediction (HDP). HDP does not require the source project and the target project to share the same feature space, so it is more widely used in practice. Most current HDP methods map the original features to a latent feature space and reduce inter-project variation by transferring domain-independent features, but the transfer process ignores domain-related features, which hurts the prediction performance of the model. Moreover, the mapped latent features are not conducive to the model's interpretability. To address these issues, this paper proposes a heterogeneous defect prediction method based on feature disentanglement (FD-HDP). We disentangle the features using separate domain-related and domain-independent feature extractors. To improve the interpretability of the model, we maximize the domain adversarial loss during training, which guides the feature extractors to produce accurate domain-related and domain-independent features. During prediction, the weighted sum of the results from the domain-related and domain-independent predictors is used as the final prediction for the project, which combines domain-independent and domain-related features and effectively improves prediction performance. We conducted experiments using four publicly available defect datasets to construct heterogeneous scenarios. The results demonstrate that the FD-HDP model shows significant advantages over state-of-the-art methods on six metrics.
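The prediction-time fusion described above can be illustrated with a minimal sketch. The abstract only says that a weighted sum of the two predictors' outputs is used, so everything else here is an assumption for illustration: the function names `fuse_predictions` and `label_defective`, the convex weighting (`weight` and `1 - weight`), and the 0.5 decision threshold are hypothetical and are not taken from FD-HDP itself.

```python
def fuse_predictions(p_related, p_independent, weight=0.5):
    """Combine two lists of defect probabilities with a weighted sum.

    `weight` is applied to the domain-related predictor and `1 - weight`
    to the domain-independent one (an assumed convex combination).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    if len(p_related) != len(p_independent):
        raise ValueError("both predictors must score the same modules")
    return [
        weight * related + (1.0 - weight) * independent
        for related, independent in zip(p_related, p_independent)
    ]


def label_defective(probabilities, threshold=0.5):
    """Turn fused probabilities into defective / clean labels."""
    return [p >= threshold for p in probabilities]


if __name__ == "__main__":
    fused = fuse_predictions([0.9, 0.2], [0.5, 0.4], weight=0.75)
    print(fused, label_defective(fused))
```

In practice the weight would presumably be tuned on validation data; the abstract does not say how it is chosen.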
As an advanced carrier of on-board sensors, a connected autonomous vehicle (CAV) can be viewed as an aggregation of self-adaptive systems with monitor-analyze-plan-execute (MAPE) for vehicle-related services. Meanwhile, machine learning (ML) has been applied to enhance the analysis and planning functions of MAPE so that self-adaptive systems adapt optimally to changing conditions. However, most ML-based approaches do not use CAVs' connectivity to collaboratively generate an optimal learner for MAPE, because sensor data is threatened by gradient leakage attacks (GLA). In this article, we first design an intelligent architecture for MAPE-based self-adaptive systems on Web 3.0-based CAVs, in which a collaborative machine learner supports the capabilities of the managing systems. Then, we observe through practical experiments that importance-sampling sparse vector technique (SVT) approaches cannot defend against GLA well. Next, we propose a fine-grained SVT approach to secure the learner in MAPE-based self-adaptive systems, which uses layer and gradient sampling to select uniform and important gradients. Finally, extensive experiments show that our private learner incurs a slight utility cost for MAPE (e.g., a \(0.77\%\) decrease in accuracy) while defending against GLA, and outperforms typical SVT approaches in terms of defense (increased by \(10\%\sim 14\%\) attack success rate) and utility (decreased by \(1.29\%\) accuracy loss).
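For readers unfamiliar with SVT, the following is a minimal sketch of the generic sparse vector technique applied to a list of gradient magnitudes. It is not the fine-grained layer-and-gradient sampling scheme proposed in the article, whose details the abstract does not give. The function names, the `sensitivity` parameter, and the noise scales (which follow a commonly cited textbook formulation) are assumptions for illustration only, and the code should not be treated as a vetted privacy implementation.

```python
import random


def laplace(rng, scale):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def svt_select(magnitudes, threshold, epsilon, max_selected,
               sensitivity=1.0, seed=None):
    """Return indices whose noisy magnitude reaches a noisy threshold.

    Stops after `max_selected` positives; the threshold noise is redrawn
    after each positive answer (assumed textbook variant).
    """
    rng = random.Random(seed)
    sigma = 2.0 * max_selected * sensitivity / epsilon
    noisy_threshold = threshold + laplace(rng, sigma)
    selected = []
    for index, value in enumerate(magnitudes):
        if value + laplace(rng, 2.0 * sigma) >= noisy_threshold:
            selected.append(index)
            if len(selected) >= max_selected:
                break
            noisy_threshold = threshold + laplace(rng, sigma)
    return selected


if __name__ == "__main__":
    grads = [0.1, 0.9, 0.2, 0.8, 0.95]
    print(svt_select(grads, threshold=0.5, epsilon=1.0, max_selected=2, seed=0))
```

Smaller values of `epsilon` add more noise, so the selected indices increasingly depart from the truly largest gradients; that trade-off between defense and utility is the one the abstract reports on.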
Knowledge graphs (KGs) often suffer from incompleteness, which motivates the task of knowledge graph completion (KGC). Traditional KGC models mainly concentrate on static KGs with a fixed set of entities and relations, or on dynamic KGs with temporal characteristics, and they generalize poorly to constantly evolving KGs with possibly irregular entity drift. In this paper, we therefore propose a novel embedding-based link prediction model, termed DCEL, to handle the incompleteness of KGs with entity drift. Unlike traditional link prediction, DCEL can generate precise embeddings for a drifted entity without imposing any regular temporal characteristic. The drifted entity is added to the KG, and its links to existing entities are predicted incrementally, so the whole KG does not need to be retrained, which improves computational efficiency. DCEL takes full advantage of unstructured textual descriptions and is composed of four modules: MRC (Machine Reading Comprehension), RCAA (Relation Constraint Attentive Aggregator), RSA (Relation Specific Alignment) and RCEO (Relation Constraint Embedding Optimization). Specifically, the MRC module is first employed to extract short texts from long and redundant descriptions. Then, RCAA aggregates the embeddings of the drifted entity's textual description and the pre-trained word embeddings learned from a corpus into a single text-based entity embedding, while shielding it from noise and irrelevant information. After that, RSA aligns the text-based entity embedding to the graph-based space to obtain the corresponding graph-based entity embedding, and the learned embeddings are then fed into a gate structure and optimized by RCEO to improve the accuracy of representation learning. Finally, the graph-based model TransE is used to perform link prediction for the drifted entity. Extensive experiments conducted on benchmark datasets in terms of evaluat
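The final link-prediction step relies on TransE, which models a true triple (head, relation, tail) as head + relation ≈ tail in embedding space. The sketch below shows only that standard scoring idea on toy vectors. The function names, the choice of L2 distance, and the hand-written embeddings are assumptions for illustration, and the code does not reproduce how DCEL obtains the drifted entity's embedding.

```python
import math


def transe_distance(head, relation, tail):
    """L2 distance ||head + relation - tail||; lower means more plausible."""
    return math.sqrt(
        sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))
    )


def rank_tails(head, relation, candidates):
    """Sort candidate tail names from most to least plausible."""
    return sorted(
        candidates,
        key=lambda name: transe_distance(head, relation, candidates[name]),
    )


if __name__ == "__main__":
    drifted_entity = [1.0, 0.0]   # hypothetical embedding of a drifted entity
    relation = [0.0, 1.0]         # hypothetical relation embedding
    existing = {"a": [1.0, 1.0], "b": [3.0, 3.0], "c": [1.0, 2.0]}
    print(rank_tails(drifted_entity, relation, existing))
```

Because ranking only needs the new entity's embedding and the existing ones, links for a drifted entity can be scored without retraining the whole graph, which matches the incremental setting the abstract describes.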