Android's widespread adoption as the leading mobile operating system has made it a prominent target for malware attacks. Many of these attacks employ advanced obfuscation techniques, rendering traditional detection methods, such as static and dynamic analysis, less effective. Image-based approaches offer an alternative that addresses some limitations of conventional methods. This research introduces a novel image-based framework for Android malware detection. Using the CICMalDroid 2020 dataset, Dalvik Executable (DEX) files are extracted from Android Package (APK) files and converted into grayscale images, with dimensions scaled according to file size to preserve structural characteristics. Various Convolutional Neural Network (CNN) models are then employed to classify benign and malicious applications, with performance further enhanced through a weighted voting ensemble whose weights are tuned by Bayesian Optimization to balance the contribution of each model. An ablation study demonstrates the effectiveness of the six-model ensemble, showing consistent improvements in accuracy as models are added incrementally, culminating in the highest accuracy of 99.3%. This result surpasses previous research benchmarks in Android malware detection, validating the robustness and efficiency of the proposed methodology.
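The weighted voting step described in this abstract can be illustrated with a minimal sketch. It assumes each CNN outputs a malware probability per sample and that the weights have already been chosen. The function names `weighted_vote` and `classify`, the 0.5 threshold, and the example weights are all hypothetical, and the Bayesian Optimization search for the weights is not shown.

```python
from typing import List, Sequence


def weighted_vote(probs: Sequence[Sequence[float]],
                  weights: Sequence[float]) -> List[float]:
    """Blend per-model malware probabilities using normalized weights.

    probs[m][i] is model m's predicted malware probability for sample i.
    The weights are assumed to come from an external tuning procedure.
    """
    if len(probs) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_samples = len(probs[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probs)) / total
        for i in range(n_samples)
    ]


def classify(probs: Sequence[Sequence[float]],
             weights: Sequence[float],
             threshold: float = 0.5) -> List[int]:
    """Label a sample 1 (malicious) when the blended score reaches the threshold."""
    return [int(score >= threshold) for score in weighted_vote(probs, weights)]


# Three hypothetical models scoring two samples; the first model is trusted most.
example_probs = [[0.9, 0.2], [0.4, 0.6], [0.8, 0.1]]
example_weights = [2.0, 1.0, 1.0]
labels = classify(example_probs, example_weights)
```

In this example the second model alone would flag the second sample, but the blended score stays below the threshold because the other two models disagree and together carry more weight.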
The widespread adoption of the Android system has sharply increased the number of malicious Android applications, posing multifaceted threats to users. Given the problems of a single feature category and high computational overhead in current Android malware detection methods, this paper proposes a detection method based on feature fusion and an improved stacking ensemble model. First, seven types of numerical features and function call graphs (FCGs) are extracted from Android installation package files. Next, we apply five statistical methods to filter the numerical features, calculate four types of graph centrality indicators for risky API (application programming interface) nodes in the FCG, and concatenate them as the features of Android applications. These two categories of features correspond to the semantic and structural features of Android applications, respectively; both are static and do not require deep learning techniques for extraction. Finally, this approach improves the stacking algorithm, constructing an ensemble classification model that accounts for the performance differences among the base classifiers and improves detection accuracy by weighting their outputs. Test results on the public dataset CICMalDroid2020 show that the method achieves a detection accuracy of 98.16%, demonstrating the effectiveness of the feature fusion and the ensemble model.
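As an illustration of the centrality step, the sketch below computes normalized in-degree and out-degree centrality for a few risky API nodes in a toy call graph and concatenates the values into one structural feature vector. The abstract does not name its four centrality indicators, so degree centrality is an assumption here, and the node names and the helper `degree_centrality` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def degree_centrality(edges: Iterable[Tuple[str, str]],
                      risky_apis: List[str]) -> Dict[str, Tuple[float, float]]:
    """Return (in-degree, out-degree) centrality for each risky API node.

    edges are (caller, callee) pairs of a directed function call graph.
    Degrees are divided by (number of nodes - 1), the usual normalization.
    """
    nodes = set()
    callers = defaultdict(set)   # callee -> distinct callers
    callees = defaultdict(set)   # caller -> distinct callees
    for caller, callee in edges:
        nodes.update((caller, callee))
        callers[callee].add(caller)
        callees[caller].add(callee)
    denom = max(len(nodes) - 1, 1)
    return {
        api: (len(callers.get(api, ())) / denom,
              len(callees.get(api, ())) / denom)
        for api in risky_apis
    }


# Toy call graph with two hypothetical risky API nodes.
fcg_edges = [
    ("main", "sendSms"),
    ("onClick", "sendSms"),
    ("main", "getDeviceId"),
    ("sendSms", "log"),
]
centrality = degree_centrality(fcg_edges, ["sendSms", "getDeviceId"])

# Concatenate per-API values in a fixed order to form the structural features.
structural_features = [v for api in ("sendSms", "getDeviceId") for v in centrality[api]]
```

Keeping the API order fixed ensures that the same vector position always refers to the same API across applications, which matters when this vector is concatenated with the numerical features.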
Due to the vast array of Android applications, their multifarious functions, and their intricate behavioral semantics, attackers can adopt various tactics to conceal their genuine attack intentions within legitimate functions. However, numerous learning-based methods are limited in their ability to mine behavioral semantic information, which impedes the accuracy and efficiency of Android malware detection. Moreover, the majority of existing learning-based methods are weakly interpretable and fail to furnish researchers with effective and readable detection reports. Inspired by the success of Large Language Models (LLMs) in natural language understanding, we propose AppPoet, an LLM-assisted multi-view system for Android malware detection. First, AppPoet employs a static method to comprehensively collect application features and formulate various observation views. Then, using our carefully crafted multi-view prompt templates, it guides the LLM to generate function descriptions and behavioral summaries for each view, enabling deep semantic analysis of the views. Finally, we collaboratively fuse the multi-view information to efficiently and accurately detect malware through a deep neural network (DNN) classifier and then generate human-readable diagnostic reports. Experimental results demonstrate that our method achieves a detection accuracy of 97.15% and an F1 score of 97.21%, which is superior to the baseline methods. Furthermore, a case study evaluates the effectiveness of the generated diagnostic reports.
The Android operating system, renowned for its open-source nature and flexibility, holds the largest global market share, yet faces significant security challenges, particularly from malware threats. Existing studies often rely on complex feature engineering for malware detection, leading to cumbersome methods that are prone to noise and lack effective feature selection mechanisms. Some deep learning approaches also suffer from low efficiency. This paper introduces a lightweight and interpretable Android malware detection system called "FEdroid." By focusing on code segments that use sensitive APIs, the system simplifies the analysis process and extracts key information, employing XGBoost for cross-feature selection to concentrate on a minimal yet crucial feature set. This approach enhances detection accuracy while reducing device resource usage. Experimental results demonstrate that the system achieved an accuracy of 98.26% and a false negative rate of only 1.86% across 18,653 APK samples, significantly improving detection efficiency and accuracy while minimizing deployment resource dependency. Furthermore, the application of Shapley values for interpretive analysis greatly enhances the transparency and understandability of the classifier model, thereby improving the overall interpretability of the system.
The exponential growth of Android applications has resulted in a surge of malware threats, posing severe risks to user privacy and data security. To address these challenges, this study introduces a novel malware detection approach that uses an ensemble of Convolutional Neural Networks (CNNs) for enhanced classification accuracy. The methodology incorporates a multi-phase process, starting with the extraction and preprocessing of APK (Android app) files. The preprocessing phase involves decompressing, decompiling, and transforming the APK files into bytecode and DEX files. The extracted byte data is converted into 1D vectors and reshaped into 2D grayscale images, enabling efficient feature learning through CNNs. The proposed ensemble of CNN-based models undergoes comprehensive training, validation, and evaluation, demonstrating superior performance compared to existing approaches. We used two popular Android datasets to evaluate the performance of our proposed model. Specifically, the model achieves an accuracy of 98.65% and an F1-score of 96.43% on the Drebin dataset, and an accuracy of 97.91% and an F1-score of 96.73% on the AMD dataset. These results confirm the model's ability to identify Android malware with high precision and reliability, outperforming traditional techniques. This research not only underscores the potential of the proposed approach in malware detection but also lays a foundation for future advancements. Future efforts will focus on real-time malware detection, integration with mobile security frameworks, and evaluation across diverse datasets to ensure adaptability to emerging malware threats.
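The byte-to-image step can be sketched as follows. This is a minimal illustration, assuming a roughly square layout (width equal to the ceiling of the square root of the byte count) with zero padding in the last row. The abstract does not state how the image dimensions are chosen, so both choices and the helper name `bytes_to_grayscale` are assumptions.

```python
import math
from typing import List, Optional


def bytes_to_grayscale(data: bytes, width: Optional[int] = None) -> List[List[int]]:
    """Reshape a 1D byte vector into a 2D grid of pixel intensities (0-255).

    Each byte becomes one grayscale pixel. If no width is given, a roughly
    square image is produced. The final row is zero-padded (an assumption).
    """
    if not data:
        raise ValueError("empty input")
    if width is None:
        width = math.isqrt(len(data) - 1) + 1   # ceil(sqrt(n)) for n >= 1
    height = -(-len(data) // width)             # ceiling division
    padded = data + bytes(width * height - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Ten bytes become a 3 x 4 image whose last two pixels are padding.
image = bytes_to_grayscale(bytes(range(10)))
```

In practice `data` would be the raw contents of a `classes.dex` file read in binary mode, and the nested list would be converted to an array or image type before being passed to a CNN.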
The dynamic and evolving nature of malware applications can lead to deteriorating performance in malware detection models, a phenomenon known as the model aging problem. This issue compromises the model's effectiveness in maintaining mobile security. Model retraining has proven effective in enhancing performance on previously unseen applications. However, the substantial need for annotated data remains a significant challenge in acquiring accurate ground truth for model retraining. Therefore, this paper introduces a new method to address the model aging problem in Android malware detection (AMD). To alleviate the burden of manual annotation, our approach incorporates pseudo-labeled data into the retraining process. Specifically, we introduce a novel method for evaluating the data drift scores of newly emerged samples by learning their data drift characteristics. These scores guide the use of pseudo-labeled and true-labeled data for retraining the model. Our method significantly reduces the resources required for annotation while maintaining the efficacy of malware detection. We demonstrate the efficacy of our models through a series of experiments on long-term datasets. Results indicate that our method enhances the F-score by approximately 26% in predicting unseen malware over a span of nine years.
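One possible reading of how drift scores could guide the mix of pseudo-labeled and true-labeled data is sketched below. The abstract does not describe the actual rule, so the threshold-based split, the `Sample` type, the function name `split_for_retraining`, and the 0.5 cutoff are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    app_id: str
    predicted_label: int   # current model's prediction: 1 = malware, 0 = benign
    drift_score: float     # higher means further from the training distribution


def split_for_retraining(samples: List[Sample],
                         drift_threshold: float = 0.5
                         ) -> Tuple[List[Tuple[str, int]], List[str]]:
    """Split new samples into pseudo-labeled data and an annotation queue.

    Assumed rule: low-drift samples keep the model's own prediction as a
    pseudo-label, while high-drift samples are sent for manual annotation.
    """
    pseudo_labeled: List[Tuple[str, int]] = []
    needs_annotation: List[str] = []
    for s in samples:
        if s.drift_score < drift_threshold:
            pseudo_labeled.append((s.app_id, s.predicted_label))
        else:
            needs_annotation.append(s.app_id)
    return pseudo_labeled, needs_annotation


new_samples = [
    Sample("app_a", 1, 0.10),
    Sample("app_b", 0, 0.80),
    Sample("app_c", 0, 0.30),
]
pseudo, queue = split_for_retraining(new_samples)
```

Under this rule, annotation effort is spent only on the samples whose predictions are least likely to be trustworthy, which is the kind of saving the abstract reports.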
With the development of the mobile internet, the open Android operating system has become the most widely used mobile platform globally, leading to a surge in malware that poses serious threats to user device security. Current Android malware detection methods mainly rely on a single feature set, making it difficult to comprehensively represent the characteristics of Android applications. To address this limitation, this paper proposes an Android malware detection method called GBADroid. GBADroid comprehensively characterizes Android software by considering multi-view features. Specifically, it first matches against a list of dangerous permissions to identify potential risks and then employs an information gain algorithm and a Bidirectional Gated Recurrent Unit (BiGRU) to extract opcode features. It also constructs a function call graph (FCG) and extracts graph features using the Graph Sample and Aggregate (GraphSAGE) algorithm. Experimental results show that GBADroid achieves a detection accuracy of 98.73%, demonstrating superior performance compared to existing methods.
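The information gain criterion mentioned in this abstract is the standard quantity IG(Y; X) = H(Y) - H(Y | X). The sketch below computes it for a discrete feature such as the presence of an opcode in an application. The function names and the toy data are illustrative, and how GBADroid applies the scores to opcode sequences is not specified in the abstract.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy H(Y) in bits of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature: Sequence[Hashable],
                     labels: Sequence[Hashable]) -> float:
    """IG(Y; X) = H(Y) - sum over v of P(X = v) * H(Y | X = v)."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += (len(subset) / n) * entropy(subset)
    return entropy(labels) - conditional


# Toy data: four apps, label 1 = malware. The first opcode flag separates the
# classes perfectly, the second carries no information about the label.
app_labels = [1, 1, 0, 0]
opcode_present_a = [1, 1, 0, 0]
opcode_present_b = [1, 0, 1, 0]
ig_a = information_gain(opcode_present_a, app_labels)
ig_b = information_gain(opcode_present_b, app_labels)
```

Ranking opcodes by this score and keeping the top ones is a common way to shrink the vocabulary before feeding sequences to a recurrent model.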
Android, the world's most widely used mobile operating system, is increasingly targeted by malware due to its open-source nature, high customizability, and integration with Google services. The increasing reliance on mobile devices significantly raises the risk of malware attacks, especially for non-technical users who often grant permissions without thorough evaluation, leading to potentially devastating effects. This paper introduces PermGuard, a scalable framework for Android malware detection that maps permissions to exploitation techniques and employs incremental learning to detect malicious apps. It presents a novel technique for constructing the PermGuard dataset by mapping Android permissions to exploitation techniques, providing a comprehensive understanding of how permissions can be misused by malware. The dataset consists of 55,911 benign and 55,911 malware apps, providing a balanced foundation for analysis. Additionally, a new strategy using similarity-based selective training reduces the amount of data required to train the incremental learning-based model by focusing on the most relevant data. To ensure robustness and accuracy, the model adopts a test-then-train approach, first testing on application data to identify weaknesses and then refining the training process. The framework's resilience is tested against adversarial attacks, demonstrating its ability to withstand attempts to bypass or deceive detection mechanisms. Designed for scalability, PermGuard can handle large and continuously growing datasets, making it suitable for real-world applications. Empirical results indicate that the model achieved an accuracy of 0.9933 on real datasets and 0.9828 on synthetic datasets, demonstrating strong resilience against both real and adversarial attacks.
Machine Learning (ML) promises to enhance the efficacy of Android malware detection (AMD); however, ML models are vulnerable to realistic evasion attacks, that is, the crafting of realizable Adversarial Examples (AEs) that satisfy Android malware domain constraints. To eliminate ML vulnerabilities, defenders aim to identify susceptible regions in the feature space where ML models are prone to deception. The primary approach to identifying vulnerable regions involves investigating realizable AEs, but generating these feasible apps poses a challenge. For instance, previous work has relied on generating either feature-space norm-bounded AEs or problem-space realizable AEs in adversarial hardening. The former is efficient but lacks full coverage of vulnerable regions, whereas the latter can uncover these regions by satisfying domain constraints but is known to be time-consuming. To address these limitations, we propose an approach to facilitate the identification of vulnerable regions. Specifically, we introduce a new interpretation of Android domain constraints in the feature space, followed by a novel technique that learns them. Our empirical evaluations across various evasion attacks indicate effective detection of AEs using learned domain constraints, with an average of 89.6%. Furthermore, extensive experiments on different Android malware detectors demonstrate that using our learned domain constraints in adversarial training outperforms other adversarial-training-based defenses that rely on norm-bounded AEs or state-of-the-art non-uniform perturbations. Finally, we show that retraining a malware detector with a wide variety of feature-space realizable AEs results in a 77.9% robustness improvement against realizable AEs generated by unknown problem-space transformations, with up to 70x faster training than using problem-space realizable AEs.
The sophistication of Android malware poses significant threats to user security and privacy. Traditional detection methods struggle with rapid malware evolution and benign application diversity, leading to high false positive rates and limited adaptability. This paper introduces a hybrid methodology that leverages advanced machine learning techniques to enhance accuracy and adaptability in Android malware detection. It begins with collecting and preprocessing a comprehensive dataset of benign and malicious applications. An efficient Generative Adversarial Network (GAN) is employed to generate synthetic malware samples, augmenting the dataset and enhancing the diversity of the malware samples under study. To model the intricate relationships between applications, an efficient Graph Neural Network (GNN) is used. Transformers are incorporated to analyze sequences of system and API calls, harnessing their ability to discern patterns indicative of malicious activity. Additionally, a one-shot learning model tailored to detecting new malware variants from minimal examples is introduced, enabling rapid adaptation to emerging threats. Federated learning preserves user privacy by training the model across a distributed network. A reinforcement learning model initiates proactive defenses, identifying optimal actions against malware threats. This methodology advances Android malware detection, showing over 5.9% improvement in detection accuracy, a 4.5% reduction in false positives, and enhanced adaptability to new malware variants. It ensures enhanced security for Android users while preserving privacy. Evaluation results highlight its practical applicability in real-time scenarios.