With the wide application of new power services, the continuous strengthening of plant network coordination and interaction, resulting in a large extension of data network, network security protection is more difficul...
详细信息
The growing interconnection between software systems increases the need for security already at design time. Security-related properties like confidentiality are often analyzed based on dataflow diagrams (DFDs). Howe...
详细信息
Web applications encompass various aspects of daily life, including online shopping, e-learning, and internet banking. Once there is a vulnerability, it can cause severe societal and economic damage. Due to its ease o...
详细信息
Web applications encompass various aspects of daily life, including online shopping, e-learning, and internet banking. Once there is a vulnerability, it can cause severe societal and economic damage. Due to its ease of use, PHP has become the preferred server-side programming language for web applications, making PHP applications a primary target for attackers. data flow analysis is widely used for vulnerability detection before deploying web applications because of its efficiency. However, the high complexity of the PHP language makes it difficult to achieve precise data flow analysis, resulting in higher rates of false positives and false negatives in vulnerability detection. In this paper, we present Yama, a context-sensitive and path-sensitive interprocedural data flow analysis method for PHP, designed to detect taint-style vulnerabilities in PHP applications. We have found that the precise semantics and clear control flow of PHP opcodes enable data flow analysis to be more precise and efficient. Leveraging this observation, we established parsing rules for PHP opcodes and implemented a precise understanding of PHP program semantics in Yama. This enables Yama to precisely address the high complexity of the PHP language, including type inference, dynamic features, and built-in functions. We evaluated Yama from three dimensions: basic data flow analysis capabilities, complex semantic analysis capabilities, and the ability to discover vulnerabilities in real-world applications, demonstrating Yama’s advancement in vulnerability detection. Specifically, Yama possesses context-sensitive and path-sensitive interprocedural analysis capabilities, achieving a 99.1% true positive rate in complex semantic analysis experiments related to type inference, dynamic features, and built-in functions. It discovered and reported 38 zero-day vulnerabilities across 24 projects on GitHub with over 1,000 stars each, assigning 34 new CVE IDs. We have released the source code of the proto
Machine learning (ML) is increasingly seen as a viable approach for building compiler optimization heuristics, but many ML methods cannot replicate even the simplest of the dataflow analyses that are critical to maki...
详细信息
ISBN:
(纸本)9781713845065
Machine learning (ML) is increasingly seen as a viable approach for building compiler optimization heuristics, but many ML methods cannot replicate even the simplest of the dataflow analyses that are critical to making good optimization decisions. We posit that if ML cannot do that, then it is insufficiently able to reason about programs. We formulate dataflow analyses as supervised learning tasks and introduce a large open dataset of programs and their corresponding labels from several analyses. We use this dataset to benchmark ML methods and show that they struggle on these fundamental program reasoning tasks. We propose PROGRAML - Program Graphs for Machine Learning - a language-independent, portable representation of program semantics. PROGRAML overcomes the limitations of prior works and yields improved performance on downstream optimization tasks.
The ever increasing need to ensure that code is reliably, efficiently and safely constructed has fueled the evolution of popular static binary code analysis tools. In identifying potential coding flaws in binaries, to...
详细信息
Indirect function calls are widely used in building system software like OS kernels for their high flexibility and performance. Statically resolving indirect-call targets has been known to be a hard problem, which is ...
详细信息
ISBN:
(纸本)9781939133441
Indirect function calls are widely used in building system software like OS kernels for their high flexibility and performance. Statically resolving indirect-call targets has been known to be a hard problem, which is a fundamental requirement for various program analysis and protection tasks. The state-of-the-art techniques, which use type analysis, are still imprecise. In this paper, we present a new approach, TFA, that precisely identifies indirect-call targets. The intuition behind TFA is that type-based analysis and data-flowanalysis are inherently complementary in resolving indirect-call targets. TFA incorporates a co-analysis system that makes the best use of both type information and data-flow information. The co-analysis keeps refining the global call graph iteratively, allowing us to achieve an optimal indirect call analysis. We have implemented TFA in LLVM and evaluated it against five famous large-scale programs. The experimental results show that TFA eliminates additional 24% to 59% of indirect-call targets compared with the state-of-the-art approaches, without introducing new false negatives. With the precise indirect-call analysis, we further developed a strengthened fine-grained forward-edge control-flow integrity scheme and applied it to the Linux kernel. We have also used the refined indirect-call analysis results in bug detection, where we found 8 deep bugs in the Linux kernel. As a generic technique, the precise indirect-call analysis of TFA can also benefit other applications such as compiler optimization and software debloating.
Asynchronous message-passing systems are employed frequently to implement distributed mechanisms, protocols, and processes. This paper addresses the problem of precise data flow analysis for such systems. To obtain go...
详细信息
ReScript is a strongly typed language that targets JavaScript, as an alternative to gradually typed languages, such as TypeScript. In this paper, we present a sound type system for data-flowanalysis for a subset of t...
详细信息
Rogue updates, an important type of software supply-chain attack in which attackers conceal malicious code inside updates to benign software, are a growing problem due to their stealth and effective-ness. We design an...
详细信息
Bottom-up program analysis has been traditionally easy to parallelize because functions without caller-callee relations can be analyzed independently. However, such function-level parallelism is significantly limited ...
详细信息
ISBN:
(纸本)9781450371216
Bottom-up program analysis has been traditionally easy to parallelize because functions without caller-callee relations can be analyzed independently. However, such function-level parallelism is significantly limited by the calling dependence - functions with caller-callee relations have to be analyzed sequentially because the analysis of a function depends on the analysis results, a.k.a., function summaries, of its callees. We observe that the calling dependence can be relaxed in many cases and, as a result, the parallelism can be improved. In this paper, we present Coyote, a framework of bottom-up data flow analysis, in which the analysis task of each function is elaborately partitioned into multiple sub-tasks to generate pipelineable function summaries. These sub-tasks are pipelined and run in parallel, even though the calling dependence exists. We formalize our idea under the IFDS/IDE framework and have implemented an application to checking null-dereference bugs and taint issues in C/C++ programs. We evaluate Coyote on a series of standard benchmark programs and open-source software systems, which demonstrates significant speedup over a conventional parallel design.
暂无评论