Search results - Inner Mongolia University Library

Accurate and Scalable Cross-Architecture Cross-OS Binary Code Search with Emulation

IEEE Transactions on Software Engineering, 2019, vol. 45, no. 11, pp. 1125-1149

Authors: Xue, Yinxing; Xu, Zhengzi; Chandramohan, Mahinthan; Liu, Yang. Affiliations: Univ Sci Technol China, Hefei 230026, Anhui, Peoples R China; Nanyang Technol Univ, Singapore 637553, Singapore

Unlike source code clone detection, clone detection (similar code search) in binary executables faces major challenges because the syntax and structure of binary code differ greatly across compiler configurations, architectures and OSs. Existing studies have proposed different categories of features for detecting binary code clones, including CFG structures, n-grams in the CFG, input/output values, etc. In our previous study and the tool BinGo, to mitigate the large gaps in CFG structures caused by different compilation scenarios, we proposed a selective inlining technique that captures the complete function semantics by inlining relevant library and user-defined functions. However, BinGo considers only input/output value features. In this study, we propose to incorporate features from different categories (e.g., structural features and high-level semantic features) to improve accuracy, and emulation to improve efficiency. We empirically compare our tool, BinGo-E, with the previous tool BinGo and the available state-of-the-art binary code search tools in terms of search accuracy and performance. Results show that BinGo-E achieves significantly better accuracy than BinGo for cross-architecture matching, cross-OS matching, cross-compiler matching and intra-compiler matching. Additionally, in the new task of matching binaries of forked projects, BinGo-E is also more accurate than the existing benchmark tool. Meanwhile, BinGo-E takes less time than BinGo during matching.

Keywords: binary codes; semantics; tools; feature extraction; cloning; syntactics; emulation; binary code search; binary clone detection; vulnerability matching; 3D-CFG

FlowEmbed: Binary Function Embedding Model Based on Relational Control Flow Graph and Byte Sequence

29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023

Authors: Wang, Yongpan; Dong, Chaopeng; Li, Siyuan; Luo, Fucai; Su, Renjie; Song, Zhanwei; Li, Hong. Affiliations: Chinese Academy of Sciences, Institute of Information Engineering, China; University of Chinese Academy of Sciences, School of Cyber Security, China; State Grid Fujian Electric Power Company, China

ISBN: (print) 9798350330717

Binary function embedding models are applicable to various downstream tasks within IoT device software systems and have demonstrated advantages in numerous binary analysis tasks, such as vulnerability (homologous) function search and compilation optimization option identification. However, current binary function embedding methods either learn embeddings from code sequences, which lack the program semantics of functions (e.g., control flow), or from program structure graphs, which omit global sequential information. As a result, these methods fall short in enabling models to learn the complete semantics of a function. In this paper, we introduce FlowEmbed, a novel approach that integrates control flow and global semantic learning to support thorough code comprehension. First, FlowEmbed uses a relational control flow graph combined with BERT and RGCN models to capture control flow semantics. Second, by applying the DPCNN model to a byte sequence constructed from the function's machine code, FlowEmbed captures the global sequential semantics of binary functions. In evaluations spanning three IoT-related tasks, FlowEmbed shows notable improvements: a 20.6% improvement in compilation optimization option identification, a 1.8% improvement in binary function similarity analysis, and an 11.9% improvement in homologous function search. Collectively, these results underscore FlowEmbed's capability, positioning it as an invaluable asset in binary analysis applications. © 2023 IEEE.

Keywords: binary code search; binary code similarity detection; binary function embedding; deep learning; static analysis

44th Annual IEEE-Computer-Society International Conference on Computers, Software, and Applications (COMPSAC)

Authors: Tai, Zeming; Washizaki, Hironori; Fukazawa, Yoshiaki; Fujimatsu, Yurie; Kanai, Jun. Affiliations: Waseda Univ, Dept Comp Sci & Engn, Tokyo, Japan; Toshiba Co Ltd, Corp Res & Dev Ctr, Tokyo, Japan

ISBN: (print) 9781728173030

Binary similarity has been widely used in function recognition and vulnerability detection. Defining a proper similarity measure is the key to implementing a fast detection method. We propose a scalable method to detect binary vulnerabilities based on similarity. Procedures lifted from binaries are divided into several comparable strands by data dependency, and those strands are transformed into a normalized form by our tool named Vulnera Bin, so that similarity between two procedures can be determined through a hash value comparison. The low computational complexity allows semantically equivalent code to be identified quickly and accurately in binaries compiled from millions of lines of source code.

Keywords: binary analysis; static analysis; binary code search; binary similarity
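
The strand-and-hash comparison described in this abstract can be illustrated with a minimal sketch. It is not the tool's actual implementation: the register pattern, the renaming scheme, and the names `normalize_strand`, `strand_hash`, and `procedure_similarity` are assumptions made for illustration, and strands are supplied by hand rather than derived from data dependencies.

```python
import hashlib
import re

# Hypothetical, deliberately small x86-style register pattern (an assumption).
REGISTER = re.compile(r"\b(?:r\d+|e?[abcd]x|e?[sd]i|e?[sb]p)\b")


def normalize_strand(strand):
    """Rename registers to v0, v1, ... in order of first appearance."""
    names = {}

    def rename(match):
        # Reuse the existing placeholder, or allocate the next one.
        return names.setdefault(match.group(0), f"v{len(names)}")

    return [REGISTER.sub(rename, ins.lower()) for ins in strand]


def strand_hash(strand):
    """Hash the normalized text of one strand."""
    text = "\n".join(normalize_strand(strand))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def procedure_similarity(query, target):
    """Fraction of distinct query strand hashes that also occur in the target."""
    q = {strand_hash(s) for s in query}
    t = {strand_hash(s) for s in target}
    return len(q & t) / len(q) if q else 0.0


query = [["mov eax, [ebp+8]", "add eax, 1"], ["xor ecx, ecx"]]
target = [["mov edx, [ebp+8]", "add edx, 1"], ["mov esi, 5"]]
print(procedure_similarity(query, target))  # one of two query strands matches
```

Because each strand reduces to a single hash, comparing two procedures becomes a set intersection, which is what keeps the cost low in this sketch.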

Scalable and Accurate Binary Search Method based on Simhash and Partial Trace

19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications (IEEE TrustCom)

Authors: Zhang, Yunan; Xu, Aidong Xu; Jiang, Yixin. Affiliation: CSG, Guangdong Prov Key Lab Power Syst Network Secur, Elect Power Res Inst, Guangzhou, Peoples R China

ISBN: (print) 9781665403924

Binary code search has received much attention recently due to its impactful applications, e.g., plagiarism detection, malware detection and software vulnerability auditing. However, developing an effective binary code search tool is challenging because of the large syntactic and structural differences in binaries resulting from different compilers, compiler options and malware families. In this paper, we propose a scalable and accurate binary search engine which performs syntactic matching by combining a set of key techniques to address the challenges above. The key contribution is a binary code search technique that combines function filtering with a partial trace method to match function code relatively quickly and accurately. A function filter based on simhash and basic function information is proposed to dramatically reduce the number of irrelevant target functions. In addition, we introduce a partial trace method for matching the shortlisted functions accurately. The experimental results show that our method can find similar functions in a scalable manner, even in the presence of program structure distortion.

Keywords: binary code search; malware homology analysis; partial trace; simhash
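
The simhash-based filtering step mentioned in this abstract can be sketched as follows. This is a generic simhash over instruction tokens, not the authors' implementation: the token choice, the 64-bit width, the MD5-derived token hash, the `max_distance` threshold, and the function names are all assumptions for illustration.

```python
import hashlib


def simhash(tokens, bits=64):
    """Generic simhash: per-bit vote over the hashes of all tokens."""
    weights = [0] * bits
    for tok in tokens:
        digest = hashlib.md5(tok.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")  # 64 bits of the token hash
        for i in range(bits):
            weights[i] += 1 if (h >> i) & 1 else -1
    # A bit is set in the fingerprint when more tokens voted 1 than 0.
    return sum(1 << i for i, w in enumerate(weights) if w > 0)


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


def shortlist(query_tokens, candidates, max_distance=16):
    """Keep candidate functions whose fingerprint is close to the query's."""
    q = simhash(query_tokens)
    return [
        name
        for name, tokens in candidates.items()
        if hamming(q, simhash(tokens)) <= max_distance
    ]


query = ["push", "mov", "sub", "call", "add", "pop", "ret"]
candidates = {
    "same_function": ["push", "mov", "sub", "call", "add", "pop", "ret"],
    "other_function": ["xor", "cmp", "jne", "lea", "imul", "jmp"],
}
print(shortlist(query, candidates))
```

Only the shortlisted functions would then be passed to a more expensive matcher, which is where a filter of this kind saves time.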

38th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI)

Authors: David, Yaniv; Partush, Nimrod; Yahav, Eran. Affiliation: Technion, Haifa, Israel

ISBN: (print) 9781450349888

We present a scalable approach for establishing similarity between stripped binaries (with no debug information). The main challenge in binary similarity is to establish similarity even when the code has been compiled using different compilers, with different optimization levels, or targeting different architectures. Overcoming this challenge, while avoiding false positives, is invaluable to the process of reverse engineering and the process of locating vulnerable code. We present a technique that is scalable and precise, as it alleviates the need for heavyweight semantic comparison by performing out-of-context re-optimization of procedure fragments. It works by decomposing binary procedures into comparable fragments and transforming them to a canonical, normalized form using the compiler optimizer, which enables finding equivalent fragments through simple syntactic comparison. We use a statistical framework built by analyzing samples collected "in the wild" to generate a global context that quantifies the significance of each pair of fragments, and use it to lift pairwise fragment equivalence to whole-procedure similarity. We have implemented our technique in a tool called GitZ and performed an extensive evaluation. We show that GitZ is able to perform millions of comparisons efficiently, and find similarity with high accuracy.

Keywords: static binary analysis; statistical similarity; binary code search
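
The idea of weighting fragment matches by their significance in a global context can be sketched with a simple rarity score. This is not GitZ's statistical framework: the smoothed inverse-document-frequency weight, the hand-made corpus, and the names `fragment_weight` and `weighted_similarity` are assumptions chosen only to show why a match on a common fragment should count for less than a match on a rare one.

```python
import math
from collections import Counter


def document_frequency(corpus):
    """Count in how many corpus procedures each fragment appears."""
    return Counter(frag for proc in corpus for frag in set(proc))


def fragment_weight(frag, df, n):
    """Smoothed inverse document frequency: rare fragments weigh more."""
    return math.log((1 + n) / (1 + df.get(frag, 0))) + 1.0


def weighted_similarity(query, target, df, n):
    """Weight of shared fragments divided by the weight of all query fragments."""
    q = set(query)
    total = sum(fragment_weight(f, df, n) for f in q)
    if total == 0:
        return 0.0
    shared = q & set(target)
    return sum(fragment_weight(f, df, n) for f in shared) / total


# Hypothetical corpus: every procedure shares a common "prologue" fragment.
corpus = [["prologue", "a"], ["prologue", "b"], ["prologue", "c"]]
df = document_frequency(corpus)
n = len(corpus)

query = ["prologue", "a"]
print(weighted_similarity(query, ["prologue", "x"], df, n))  # common match
print(weighted_similarity(query, ["a", "y"], df, n))         # rare match
```

In this sketch, sharing only the ubiquitous prologue yields a lower score than sharing the rarer fragment, which mirrors the role the abstract assigns to the global context.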