Is there any way that my HTML securer could be exploited?

Question

I've finally managed to make a function which does the following: Takes a string as input. This can be either an entire HTML document or an HTML "snippet" (even a broken one). Creates a DOMDocument from this and loops through all nodes. Whenever it encounters any node whose element is outside of a whitelist of basic structural elements, it "marks it for

Accepted Answer

If you want to prevent XSS, all of the on* attributes are candidates for removal. The style attribute can also carry JavaScript in various ways in some browsers, as can href (via javascript: URLs). SVG can, I think, include scripts as well, and so on.

Look here for a non-comprehensive list of how these sanitizers can be bypassed, and why it's very hard to build a sanitizer yourself.

Why not just use a known-good sanitizer like Google Caja instead of reinventing one? It's a lot harder than you seem to think.
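To make the attribute concerns above concrete, here is a minimal sketch in Python (chosen only for illustration; the same checks would apply inside a DOMDocument walk). The names `SAFE_SCHEMES`, `URL_ATTRIBUTES`, `has_safe_scheme`, and `is_attribute_allowed` are hypothetical, and the scheme list is an assumption, not a vetted policy. It assumes the attribute values have already been entity-decoded by the parser. It covers only the cases named in this answer, so it should not be taken as a complete sanitizer.

```python
import re

# Assumption: a small allowlist of URL schemes; adjust to your own policy.
SAFE_SCHEMES = {"http", "https", "mailto"}

# Assumption: attributes whose values are treated as URLs (not exhaustive).
URL_ATTRIBUTES = {"href", "src", "action", "formaction", "xlink:href"}


def has_safe_scheme(value):
    """Return True if the URL is relative or uses an allowlisted scheme."""
    # Browsers ignore tabs/newlines inside URLs and strip leading control
    # characters, so "java\tscript:" still runs. Dropping every character
    # in the range 0x00-0x20 before checking is deliberately over-strict.
    cleaned = re.sub(r"[\x00-\x20]+", "", value)
    match = re.match(r"([a-zA-Z][a-zA-Z0-9+.\-]*):", cleaned)
    if match is None:
        return True  # no scheme present: a relative URL
    return match.group(1).lower() in SAFE_SCHEMES


def is_attribute_allowed(name, value):
    """Decide whether one attribute may stay on a whitelisted element."""
    name = name.lower()
    if name.startswith("on"):
        return False  # event handlers: onclick, onload, onerror, ...
    if name == "style":
        return False  # CSS has executed script in some browsers
    if name in URL_ATTRIBUTES:
        return has_safe_scheme(value)
    return True


if __name__ == "__main__":
    for attr, val in [("onclick", "alert(1)"),
                      ("href", " JaVa\tScript:alert(1)"),
                      ("href", "https://example.com/")]:
        print(attr, repr(val), is_attribute_allowed(attr, val))
```

Note that this is still a denylist for attribute names, which is exactly the kind of thing that tends to get bypassed. An allowlist of attributes per element would be stricter, and a maintained sanitizer is safer than either.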

