Bytecode is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecode is even harder to understand for programmers and researchers. Bytecode has ...
详细信息
Bytecode is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable source code, bytecode is even harder to understand for programmers and researchers. Bytecode has been widely used in various software tasks such as malware detection and clone detection. In order to understand the meaning of the bytecode more quickly and accurately and further help programmers in more software activities, we propose a bytecode comment generation method (called BCGen) using neural language model. Specifically, to get the structured information of the bytecode, we first generate the control flow graph (CFG) of the bytecode, and serialize the CFG with bytecode semantic information. Then a transformer model combining gate recurrent unit is proposed to learn the features of bytecode to generate comments. We obtain the bytecode by building the Jar packages of the well-known open-source projects in the Maven repository and construct a bytecode dataset to train and evaluate our model. Experimental results show that the BLEU of BCGen can reach 0.26, which outperforms several baselines and proves the effectiveness and practicability of our method. It is concluded that it is possible to generate natural language comments directly from the bytecode. Meanwhile, it is important to take structured and semantic information into account in generating bytecode comments.
The protection and privacy of the 5G-IoT framework is a major challenge due to the vast number of mobile devices. Specialized applications running these 5G-IoT systems may be vulnerable to clone attacks. Cloning appli...
详细信息
The protection and privacy of the 5G-IoT framework is a major challenge due to the vast number of mobile devices. Specialized applications running these 5G-IoT systems may be vulnerable to clone attacks. Cloning applications can be achieved by stealing or distributing commercial Android apps to harm the advanced services of the 5G-IoT framework. Meanwhile, most Android app stores run and manage Android apps that developers have submitted separately without any central verification systems. Android scammers sell pirated versions of commercial software to other app stores under different names. Android applications are typically stored on cloud servers, while API access services may be used to detect and prevent cloned applications from being released. In this paper, we proposed a hybrid approach to the control flow graph (CFG) and a deep learning model to secure the smart services of the 5G-IoT framework. First, the newly submitted APK file is extracted and the JDEX decompiler is used to retrieve Java source files from possibly original and cloned applications. Second, the source files are broken down into various android-based components. After generating control-flowgraphs (CFGs), the weighted features are stripped from each component. Finally, the Recurrent Neural Network (RNN) is designed to predict potential cloned applications by training features from different components of android applications. Experimental results have shown that the proposed approach can achieve an average accuracy of 96.24% for cloned applications selected from different android application stores.
For the automatic test generation in conformance testing, the first step is to derive a single FSM (finite state machine) from the formal specification, which does not loose any semantics of the formal language. The n...
详细信息
For the automatic test generation in conformance testing, the first step is to derive a single FSM (finite state machine) from the formal specification, which does not loose any semantics of the formal language. The nondeterminism, parallelism, and modular structure characteristics of the formal specification language make it difficult to derive such a basis directly from the formal specification. In this paper, characteristics of Estelle specification language are analyzed and a "well-defined specification," a restricted form of an Estelle specification, is proposed based on the analyzed results to enable a direct derivation of a single reduced FSM-which is called a CFG (control flow graph)-from the specification written in Estelle. The derived CFG provides a basis for the automatic test case generation. Algorithms to test whether the specification written in Estelle is well-defined or not, and to generate the CFG from the well-defined specification, are developed. Finally, as an example, the proposed technique is applied to TP0 (Transport Protocol class 0) specification written in Estelle. In applying these algorithms to the Estelle specification, some guidelines are also suggested for the specification which is not "well defined."
An important application of the dynamic program slicing technique is program debugging. In applications such as interactive debugging, the dynamic slicing algorithm needs to be efficient. In this context, we propose a...
详细信息
An important application of the dynamic program slicing technique is program debugging. In applications such as interactive debugging, the dynamic slicing algorithm needs to be efficient. In this context, we propose a new dynamic program slicing technique that is more efficient than the related algorithms reported in the literature. We use the program dependence graph as an intermediate program representation, and modify it by introducing the concepts of stable and unstable edges. Our algorithm is based on marking and unmarking the unstable edges as and when the dependences arise and cease during run-time. We call this algorithm edge-marking algorithm. After an execution of a node x, an unstable edge (x, y) is marked if the node x uses the value of the variable defined at node y. A marked unstable edge (x, y) is unmarked after an execution of a node z if the nodes y and, define the same variable var, and the value of var computed at the node y does not affect the present value of var defined at the node z. We show that our algorithm is more time and space efficient than the existing ones. The worst case space complexity of our algorithm is O(n(2)), where n is the number of statements in the program. We also briefly discuss an implementation of our algorithm. (C) 2002 Elsevier Science B.V. All rights reserved.
Logic vulnerabilities are due to defects in the application logic implementation such that the application logic is not the logic that was expected. Indeed, such vulnerabilities pattern depends on the design and busin...
详细信息
Logic vulnerabilities are due to defects in the application logic implementation such that the application logic is not the logic that was expected. Indeed, such vulnerabilities pattern depends on the design and business logic of the application. There are no specific and common patterns for application logic vulnerabilities in commercial applications. In this study, a method named FINAD is introduced to detect application logic vulnerabilities using an activity flowgraph (AFG) to find the incompatibilities of an implemented application with its design. In this work, the AFG, consisting of the activity diagram (AD) and control flow graph (CFG), is presented for the first time. Investigation of different common types of application logic vulnerabilities indicated that the majority of such vulnerabilities could be detected through conducting a static analysis on an AFG. The FINAD method is independent of the language and can be used for vulnerability detection for any programming language, provided that the AD is available, and the CFG of the program can be created. Implementation of FINAD for PHP language showed its effectiveness in detecting known logic vulnerabilities in CVE vulnerability database.
There are many different types of models to describe various aspects of software from inception to completion. Some of these models are the basis for de ning software metrics. In this paper, we propose a generalized f...
详细信息
There are many different types of models to describe various aspects of software from inception to completion. Some of these models are the basis for de ning software metrics. In this paper, we propose a generalized formal structural model for traditional structured programs such as 3/4 GLs. We provide a formal definition using conventional notations for our proposed model covering information flow, controlflow and combination of information and controlflows.
.Abstract syntax trees (ASTs), control flow graphs (CFGs), and data flow analysis (DFA) are prerequisites for static and dynamic analysis and vulnerability detection for programs;thus, obtaining them is significant. R...
详细信息
.Abstract syntax trees (ASTs), control flow graphs (CFGs), and data flow analysis (DFA) are prerequisites for static and dynamic analysis and vulnerability detection for programs;thus, obtaining them is significant. Recently, many tools related to generating ASTs, CFGs, and DFA have been proposed. However, most tools can only construct ASTs, very few can construct ASTs and CFGs, and almost none can construct all three. The vast majority of AST, CFG, and DFA tools are for other languages (e.g., Java and Python), and while a few are for C/C++, they are implemented in other languages, creating complex working environments, and overreliance on other language-related libraries. To address these shortcomings, we present a DFA tool, Dflow, for C/C++. First, a lexical/grammatical analyzer generated by Flex and Bison is used to analyze the program. Second, an AST is constructed from the results;then, a CFG is obtained from the analysis results and the information from the AST. Finally, based on the AST and CFG, DFA is performed, and the vulnerabilities of simple programs are determined. We test some common vulnerable code and common weakness enumeration slicing code, which show the effectiveness of Dflow in program data flow analysis and vulnerability checking. The results show that our tool can implement ASTs, CFGs, and DFA, and we add some rules to the tool for vulnerability detection. (c) 2021 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.
We present an efficient interprocedural dynamic slicing algorithm for structured programs. We first propose an intraprocedural dynamic slicing algorithm, and subsequently extend it to handle interprocedural calls. Our...
详细信息
We present an efficient interprocedural dynamic slicing algorithm for structured programs. We first propose an intraprocedural dynamic slicing algorithm, and subsequently extend it to handle interprocedural calls. Our intraprocedural dynamic slicing algorithm uses control dependence graph as the intermediate program representation, and computes precise dynamic slices. The interprocedural dynamic slicing algorithm uses a collection of control dependence graphs (one for each procedure) as the intermediate program representation. and computes precise dynamic slices. We show that our proposed interprocedural dynamic slicing algorithm is more efficient than the existing dynamic slicing algorithms. We also discuss how our algorithm can be extended to efficiently handle recursion, composite data structures and pointers. (C) 2005 Elsevier Inc. All rights reserved.
This article proposes an approach to identifying integer overflow vulnerabilities in software represented by the executable code of x86 architecture. The approach is based on symbolic code execution and initially twof...
详细信息
This article proposes an approach to identifying integer overflow vulnerabilities in software represented by the executable code of x86 architecture. The approach is based on symbolic code execution and initially twofold representation of memory cells. A truncated control transfer graph is constructed from the machine code of the program, the paths in which are layer-by-layer checked for the feasibility of the vulnerability conditions. The proposed methods were implemented in practice and experimentally tested on the various code samples.
Android is the most widely used mobile platform, making it a prime target for malicious attacks. Therefore, it is imperative to effectively circumvent these attacks. Recently, machine learning has been a promising sol...
详细信息
Android is the most widely used mobile platform, making it a prime target for malicious attacks. Therefore, it is imperative to effectively circumvent these attacks. Recently, machine learning has been a promising solution for malware detection, which relies on distinguishing features. While machine learning-based malware scanners have a large number of features, adversaries can avoid detection by using feature-related expertise. Therefore, one of the main tasks of the Android security industry is to consistently propose cutting-edge features that can detect suspicious activity. This study presents a novel feature representation approach for malware detection that combines API-Call graphs (ACGs) with byte-level image representation. First, the reverse engineering procedure is used to obtain the Java programming codes and Dalvik Executable (DEX) file from Android Package Kit (APK). Second, to depict Android apps with high-level features, we develop ACGs by mining API-Calls and API sequences from control flow graph (CFG). The ACGs can act as a digital fingerprint of the actions taken by Android apps. Next, the multi-head attention-based transfer learning method is used to extract trained features vector from ACGs. Third, the DEX file is converted to a malware image, and the texture features are extracted and highlighted using a combination of FAST (Features from Accelerated Segment Test) and BRIEF (Binary Robust Independent Elementary Features). Finally, the ACGs and texture features are combined for effective malware detection and classification. The proposed method uses a customized dataset prepared from the CIC-InvesAndMal2019 dataset and outperforms state-of-the-art methods with 99.27% accuracy.
暂无评论