ISBN: (print) 9789819712731; 9789819712748
Anti-malware research often employs binary code similarity techniques for attack attribution and for inferring the malware family of a sample. Similar malware samples share common code, which was probably written by the same malicious actor unless that code belongs to a public software library. Library code therefore distorts the analysis and needs to be identified and excluded from the similarity computation in order to get reliable results. Besides code from third-party software libraries, compiler-specific code, runtime packers, and installers have the same effect and should be handled in the same way. This paper presents methods for detecting library code within an existing data collection and for improving the detection as new data is added. The proposed approach is compared with the state of the art, highlighting existing problems and proposing solutions for overcoming them. Eliminating library code from the similarity computation brings both improvements and some drawbacks, which are analyzed in terms of performance and result quality.
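The abstract does not describe the detection method itself, so the following is only a minimal sketch of the general idea, under stated assumptions: each sample is represented as a set of function hashes, a function seen in at least `min_families` distinct families is treated as library code, and similarity is plain Jaccard overlap. The names `find_library_functions`, `similarity`, `min_families`, and the toy data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def find_library_functions(samples, min_families=3):
    """Return function hashes that occur in at least `min_families` families.

    samples: dict mapping sample id -> (family label, set of function hashes).
    Assumption: code shared across many unrelated families is library code.
    """
    families_per_func = defaultdict(set)
    for family, funcs in samples.values():
        for func in funcs:
            families_per_func[func].add(family)
    return {f for f, fams in families_per_func.items() if len(fams) >= min_families}


def jaccard(a, b):
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity(samples, id_a, id_b, library=frozenset()):
    """Similarity of two samples after removing the given library functions."""
    funcs_a = samples[id_a][1] - library
    funcs_b = samples[id_b][1] - library
    return jaccard(funcs_a, funcs_b)


# Toy collection (hypothetical): "lib1" and "lib2" appear in every family.
samples = {
    "s1": ("famA", {"lib1", "lib2", "a1", "a2"}),
    "s2": ("famA", {"lib1", "lib2", "a1", "a3"}),
    "s3": ("famB", {"lib1", "lib2", "b1"}),
    "s4": ("famC", {"lib1", "lib2", "c1"}),
}

library = find_library_functions(samples, min_families=3)

# Unrelated samples look similar until the shared library code is removed.
raw_score = similarity(samples, "s1", "s3")
filtered_score = similarity(samples, "s1", "s3", library)
```

In this toy data, the two samples from different families share only the library functions, so their score drops to zero after filtering, while samples from the same family keep a nonzero score. Re-running `find_library_functions` as new samples arrive is one simple way to refresh the library set, though the paper may update its detection differently.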