Web applications are becoming increasingly prevalent. When a web application does not validate its input appropriately, it becomes susceptible to cross-site scripting (XSS), which poses serious security problems both for Internet users and for the websites to which the trusted pages belong. Taint inference is an information flow analysis technique that is useful for detecting XSS on the client side. However, existing techniques do not properly handle two practical issues. One is URL rewriting, which transforms a standard URL into a clearer and more manageable form. The other is HTML sanitization, which filters an input against blacklists or whitelists of HTML tags or attributes. In this paper, we draw an analogy between the taint inference problem and the molecular sequence alignment problem in bioinformatics, and transfer two techniques from the latter to the former to address both issues. In particular, our method handles URL rewriting with local sequence alignment and models HTML sanitization by introducing a removal gap penalty. Empirical results demonstrate the effectiveness and efficiency of our method.
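The alignment idea can be illustrated with a minimal sketch. It uses a standard Smith-Waterman local alignment over characters, with a separate penalty for characters that appear in the input but are missing from the output (standing in for the "removal gap penalty" the abstract mentions). The function names, scoring values, and threshold are illustrative assumptions, not the scheme used in the paper.

```python
def local_alignment_score(source, sink, match=2, mismatch=-1,
                          removal_gap=-1, insert_gap=-2):
    """Best Smith-Waterman local alignment score of `source` against `sink`.

    All scoring values are assumed for illustration. `removal_gap` is charged
    when a source character has no counterpart in the sink (e.g. a sanitizer
    stripped it); `insert_gap` is charged for an extra sink character.
    """
    prev = [0] * (len(sink) + 1)
    best = 0
    for i in range(1, len(source) + 1):
        curr = [0] * (len(sink) + 1)
        for j in range(1, len(sink) + 1):
            same = source[i - 1] == sink[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)
            removed = prev[j] + removal_gap      # source char absent from sink
            inserted = curr[j - 1] + insert_gap  # sink char absent from source
            # Clamping at 0 is what makes the alignment local: a match may
            # start anywhere, so text surrounding the input in the sink
            # (such as the rest of a rewritten URL) costs nothing.
            curr[j] = max(0, diag, removed, inserted)
            best = max(best, curr[j])
        prev = curr
    return best


def is_tainted(source, sink, threshold=0.8, match=2):
    """Hypothetical decision rule: flag the sink when the best local score
    reaches a fraction of the score of a perfect copy of the source."""
    if not source:
        return False
    score = local_alignment_score(source, sink, match=match)
    return score >= threshold * match * len(source)


if __name__ == "__main__":
    print(local_alignment_score("abc", "xxabcxx"))
    print(local_alignment_score("ab<i>cd", "xabcdx"))
    print(local_alignment_score("ab<i>cd", "xabcdx", removal_gap=0))
```

Lowering `removal_gap` toward zero makes the score tolerant of characters a sanitizer removed, at the cost of more false positives; where to set it is a tuning decision this sketch does not settle.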
ISBN: (print) 9798350331318; 9798350331301
Websites rely on server-side HTML sanitization to defend against the ever-present threat of cross-site scripting attacks. Parsing arbitrary pieces of markup to assess whether they contain an exploit payload is far from trivial. This complexity leads to divergences between the parsing results of the sanitizer and those of the user's browser. These so-called parsing differentials open the door to the unexplored category of mutation-based attacks, in which an attacker abuses the sanitizer's incorrect HTML parser either to bypass it directly or to coerce it into transforming benign markup into a dangerous exploit payload. In this work, we study the prevalence of such parsing differentials and their security impact. To this end, we built a generator for HTML fragments that are difficult to parse and evaluated how 11 sanitizers across five programming languages deal with such inputs. We found that parsing differentials are commonplace: each assessed sanitizer has at least several functional deficiencies that lead to overzealous removal of benign input. Even worse, we were able to automatically bypass all but two of the 11 sanitizers, painting a dire picture of the state of server-side HTML sanitization.
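As a toy illustration of a sanitizer turning its input into a payload, the sketch below pairs a deliberately naive, regex-based tag stripper with Python's standard `html.parser`. The sanitizer `naive_sanitize` and the sample input are hypothetical and far simpler than the 11 sanitizers the study evaluated; the point is only that a filter whose view of the markup differs from a real parser's can assemble a `<script>` element out of fragments.

```python
import re
from html.parser import HTMLParser


def naive_sanitize(markup):
    """Hypothetical sanitizer: a single regex pass that deletes anything
    that looks like an opening or closing script tag."""
    return re.sub(r"</?script[^>]*>", "", markup, flags=re.IGNORECASE)


class StartTagCollector(HTMLParser):
    """Records the names of the start tags that html.parser recognizes."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def start_tags(markup):
    collector = StartTagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


if __name__ == "__main__":
    payload = "<scr<script>ipt>alert(1)</scr</script>ipt>"
    cleaned = naive_sanitize(payload)
    # Deleting the inner tags splices the outer fragments together.
    print("sanitizer output:", cleaned)
    print("start tags before:", start_tags(payload))
    print("start tags after: ", start_tags(cleaned))
```

Comparing the start tags a parser reports before and after sanitization is one crude way to spot this kind of mutation; it is not the detection method described in the abstract.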