ISBN (digital): 9798350381993
ISBN (print): 9798350382006
In the traditional centralized Android malware classification framework, privacy concerns arise because it requires directly collecting users’ app samples, which contain sensitive information. To address this problem, new classification frameworks based on Federated Learning (FL) have emerged for privacy preservation. However, research shows that these frameworks still face risks of indirect information leakage due to adversary inference. Existing research lacks an effective assessment of the extent and location of this leakage risk. To bridge the gap, we propose FedDLM, which provides a fine-grained assessment of the risk of sensitive information leakage in an FL-based Android malware classifier. FedDLM estimates attackers’ theoretical maximum inference ability from an information-theoretic perspective to gauge the degree of leakage risk in the classifier. It identifies the critical positions in the shared gradient where leakage risk exists by exploiting the class activation characteristics of classifiers. In extensive experiments on the Androzoo dataset, FedDLM is more effective and precise than baseline methods in evaluating the risk of sensitive information leakage. The evaluation results provide insights into information leakage problems in classifiers and into targeted privacy protection methods.
ISBN (digital): 9798350381993
ISBN (print): 9798350382006
Data-driven deep learning methods have brought significant progress and potential to intrusion detection. However, the characteristics of intrusion data cause two thorny problems: "multi-type features" and "data imbalance". The former means that forcefully and improperly transforming intrusion features from distinct metric spaces can result in semantic loss and noise. The latter means that intrusion data is imbalanced in quantity and quality due to its complex spatial distribution. We propose a Hybrid Framework for Multi-type and Imbalance Data (HF-Mid) to address these two problems. First, we divide the intrusion features into equivalent and non-equivalent groups and then embed them sequentially using Supervised Paragraph Vector-Distributed Memory (SPV-DM), which excels at modeling co-occurrence relationships, and a Deep Neural Network (DNN), which is suitable for modeling non-linear relationships, thereby solving the "multi-type features" problem. Second, we adopt a low-noise collective matrix factorization (CMF) model to fuse the two obtained features for dimensionality reduction. Finally, we employ a multi-class classifier to detect intrusions. During the classifier training stage, we design a genetic algorithm-based proportional sampling method to select high-quality samples in each training batch, thus addressing the "data imbalance" problem. Experimental results demonstrate that the proposed framework achieves an average overall improvement of 5.9% in accuracy and 1.5% in false positive rate.
ISBN (digital): 9798350349184
ISBN (print): 9798350349191
In the traditional centralized Android malware classification framework, privacy concerns exist because the collected users’ apps contain sensitive information. A new classification framework based on Federated Learning (FL) has emerged to protect privacy. However, the distribution of Android malware samples shows significant spatiotemporal heterogeneity across clients. This presents a huge challenge to existing FL schemes, as the trained local models differ significantly, resulting in slower model convergence and lower classification accuracy. To bridge this gap, we propose FedDRC, a robust FL-based Android malware classifier. First, we design a functional semantic embedding mechanism for API features, FSEM, which uses word embedding to improve the robustness of the model to the temporal heterogeneity of client samples. Second, we use the idea of the Information Bottleneck (IB) and transfer learning to design a robust local model, PAMIB, to deal with the model degradation caused by the spatial heterogeneity of the distribution of client samples. Extensive experiments on the Androzoo dataset show that FedDRC has the best robustness for Android malware classification tasks in various heterogeneous distribution settings, with the fastest convergence and the best classification accuracy.
With the increasing complexity of users' needs and the increasing uncertainty of a single web service in a big data environment, service composition becomes more and more difficult. To improve the solution accuracy and computing speed of the constrained optimization model, we propose several improvements to ant colony optimization (ACO) and its calculation strategy. We introduce the beetle antenna search (BAS) strategy to avoid falling into local optima, and propose a service composition method based on a fused beetle-ant colony optimization algorithm (Be-ACO). The model first generates a search subspace for the ant colony through the beetle antenna search strategy and then optimizes the service set by traversing that subspace with the ant colony algorithm. It repeatedly relies on the beetle antenna search strategy to generate the next search subspace in the global scope for the ant colony to traverse, and finally converges to the global optimal solution. The experimental results show that, compared with the traditional optimization method, the proposed method greatly improves the convergence performance and solution accuracy of combinatorial optimization.
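The beetle antenna search step mentioned in this abstract can be illustrated with a minimal sketch. This is a generic BAS loop on a toy continuous objective, not the Be-ACO method itself: the abstract does not say how a search subspace is derived from the beetle position or how the ant colony traverses it, so neither is modeled here. The function name `beetle_antenna_search`, the parameters (`step_size`, `antenna_len`, `decay`), and the `sphere` test objective are all illustrative assumptions.

```python
import math
import random


def beetle_antenna_search(objective, start, steps=200, step_size=1.0,
                          antenna_len=0.5, decay=0.95, seed=0):
    """Minimize `objective` with a basic beetle antenna search (BAS) loop.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = list(start)
    best_x, best_val = list(x), objective(x)
    for _ in range(steps):
        # Random unit direction along which the two antennae point.
        direction = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(c * c for c in direction)) or 1.0
        direction = [c / norm for c in direction]
        # Probe the objective at the tips of the two antennae.
        left = [xi + antenna_len * di for xi, di in zip(x, direction)]
        right = [xi - antenna_len * di for xi, di in zip(x, direction)]
        diff = objective(left) - objective(right)
        sign = (diff > 0) - (diff < 0)
        # Step toward the antenna tip with the lower objective value.
        x = [xi - step_size * sign * di for xi, di in zip(x, direction)]
        val = objective(x)
        if val < best_val:
            best_x, best_val = list(x), val
        # Shrink the step and the antenna length to refine the search.
        step_size *= decay
        antenna_len = antenna_len * decay + 0.01
    return best_x, best_val


def sphere(x):
    """Toy objective (assumption): sum of squares, minimized at the origin."""
    return sum(c * c for c in x)


best_x, best_val = beetle_antenna_search(sphere, [3.0, -2.0])
```

In a hybrid scheme of the kind the abstract describes, the position returned by each BAS step would presumably define the region that the ant colony explores next; that coupling is left out because the abstract gives no details.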
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
The federated Android malware classifier has attracted much attention owing to its advantages of privacy protection and multi-party joint modeling. However, research indicates that the gradient transmitted within the federated classifier still encodes the user's sensitive information, exposing it to indirect privacy inference threats from curious servers. Differential privacy is a recognized and effective way to address this threat: it adds noise to the user's model parameters to limit the attacker's inference of sensitive information. However, existing differential privacy methods achieve their protection at the cost of significantly reducing the model's classification accuracy, and the two cannot be reasonably balanced. To address this challenge, we propose a privacy protection method, FedDADP. FedDADP performs adaptive, lightweight privacy configuration along the training time dimension and the model space dimension, according to the distribution of privacy risk in the federated Android malware classifier, to protect users' privacy while maintaining the model's utility. Numerous experiments on the Androzoo dataset and multiple baseline classifiers show that FedDADP protects users' sensitive information better than baseline differential privacy methods (7% more effective against adversaries' inference) and achieves better model utility (classification accuracy improves by about 8%) with the same privacy budget.
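The noise-adding step that this abstract attributes to differential privacy can be sketched as follows. This is the generic clip-then-add-Gaussian-noise pattern applied to one client update, shown only to make the accuracy trade-off concrete; it is not FedDADP's adaptive configuration, which the abstract does not specify. The function name `clip_and_add_noise` and the parameters `clip_norm` and `noise_multiplier` are illustrative assumptions.

```python
import math
import random


def clip_and_add_noise(gradient, clip_norm=1.0, noise_multiplier=1.0, seed=None):
    """Clip a gradient to an L2 norm bound, then add Gaussian noise.

    Names and defaults are illustrative assumptions, not FedDADP's settings.
    """
    rng = random.Random(seed)
    norm = math.sqrt(sum(g * g for g in gradient))
    # Scale the gradient down only when its norm exceeds the bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [g * scale for g in gradient]
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    return [g + rng.gauss(0.0, sigma) for g in clipped]


# With the noise switched off, only the clipping step is visible.
clipped_only = clip_and_add_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=0.0)

# With noise enabled, the shared update is perturbed before leaving the client.
noisy = clip_and_add_noise([3.0, 4.0], clip_norm=1.0, noise_multiplier=1.0, seed=0)
```

A larger `noise_multiplier` hides more of the original update but distorts it more, which is the privacy-versus-accuracy tension the abstract describes. An adaptive scheme would vary such a setting across training rounds and model positions rather than fixing it globally.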
This study employs ground observation data and hourly ERA5 data from 1980 to 2020, utilizing methods such as trend analysis and M-K tests, to explore the spatiotemporal evolution characteristics and difference analysi...