Regex to isolate JavaScript using preg_replace_callback

My regex skills are weak (the regex runs in PHP), and I’m struggling to isolate JavaScript within blocks of HTML. The following regex partially works, but it breaks when the word “on” appears in the text rather than inside a < tag >.

$regex = "/<script.*?>.*?<\/script.*?>(*SKIP)(*F)|((\\bon(.*?=)(.*?))('|\")(.*?)(\\5))/ism";

$html = preg_replace_callback($regex,
           function ($matches) {
               $mJS = $matches[2] . $matches[5] . myFunction($matches[6]) . $matches[5];
               return $mJS;
           },
           $html);

I think the issue is that the \bon… part should only be considered when it sits inside a < tag >, but I don’t know how to express that.

Running the following test…

$html= "<div id='content' onClick='abc()'>Lorem On='abc' ipsum on to</div>
<input id='a' type='range'>
<input id='b' type='range'>
<script>abc();</script>";

Returns…

<div id='content' onClick='****abc()****'>Lorem On='****abc****' ipsum on to</div>
<input id='****a****' type='range'>
<input id='b' type='range'>
<script>abc();</script>

but I wanted…

<div id='content' onClick='****abc()****'>Lorem On='abc' ipsum on to</div>
<input id='a' type='range'>
<input id='b' type='range'>
<script>****abc();****</script>

I have a sandbox running this if you want to have a play: https://onlinephp.io/c/a43b1

Does anyone have any suggestions?

Answer

With help from Bobble Bubble, I’ve been able to get this working…

<?php

function myFunction($tx) {
    return "****$tx****";
}

$html= "<div id='content' onClick='abc()'>Lorem On='abc' ipsum on to</div>
<input id='a' type='range'>
<input id='b' type='range'>
<script>abc();</script>";

$regex = "/(<script\b[^><]*>)(.*?)(<\/script>)|\bon\w+\s*=\s*\K(?|(\")([^\"]+)\"|(')([^']+)')/ism";

$result = preg_replace_callback($regex,
        function ($matches)  {
            // Groups that did not take part in the match may be unset; default them to "".
            $m1 = $matches[1] ?? "";
            $m2 = $matches[2] ?? "";
            $m3 = $matches[3] ?? "";
            $m4 = $matches[4] ?? "";
            $m5 = $matches[5] ?? "";
            $mJS = $m1.$m4 . myFunction($m2.$m5) .$m3.$m4;
            return $mJS;
        },$html);


echo "Result=$result";
echo "\n\n";
?>
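
The attribute half of that pattern leans on two PCRE features: \K, which drops everything matched so far from the reported match, and the branch-reset group (?|…), which makes both quote alternatives share the same group numbers. Here is a minimal sketch of just that part, on a made-up tag that is not from the question, with strtoupper() standing in for myFunction():

```php
<?php
// Hypothetical demo input, not taken from the question.
$tag = "<a onclick=\"go()\" onhover='stop()'>";

// \K discards the attribute name and "=" from the match, so only the
// quoted value is replaced and the name survives untouched.
// (?| ... ) is a branch-reset group: both alternatives number their
// captures from 1, so the quote is always $m[1] and the value $m[2].
$regex = '/\bon\w+\s*=\s*\K(?|(")([^"]+)"|(\')([^\']+)\')/i';

$out = preg_replace_callback(
    $regex,
    function (array $m): string {
        // Re-emit the original quote character around the new value.
        return $m[1] . strtoupper($m[2]) . $m[1];
    },
    $tag
);

echo $out, "\n"; // prints: <a onclick="GO()" onhover='STOP()'>
```

Because of \K, the callback never has to re-emit the attribute name, which is why the callback in the answer only concatenates the quote, the wrapped value, and the quote again.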

See https://onlinephp.io/ for a running executable.
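
One caveat: that pattern does not actually check that the on… attribute sits inside a tag, so body text such as onfoo='abc' would still be rewritten. If that matters, a two-pass variant along the lines the question suggested might look like the sketch below. The helper name wrapInlineJs and both patterns are my own assumptions, not part of the original answer, and this is still a regex approximation rather than a real HTML parser.

```php
<?php
// Stand-in for the question's myFunction().
function myFunction(string $tx): string {
    return "****$tx****";
}

// Hypothetical helper: wrap <script> bodies and on* attribute values,
// looking for attributes only inside opening tags.
function wrapInlineJs(string $html): string {
    // Alternative 1: a whole <script> element (open tag, body, close tag).
    // Alternative 2: any other opening tag; quoted attribute values are
    // consumed as units so a ">" inside a value does not end the tag.
    $outer = '~(<script\b[^>]*>)(.*?)(</script\s*>)'
           . '|<[a-z][^>"\']*(?:(?:"[^"]*"|\'[^\']*\')[^>"\']*)*>~is';

    // Applied only to the text of one tag, never to body text.
    $attr = '~\bon\w+\s*=\s*\K(?|(")([^"]+)"|(\')([^\']+)\')~i';

    return preg_replace_callback($outer, function (array $m) use ($attr): string {
        if (!empty($m[1])) {
            // Script element: wrap only the body.
            return $m[1] . myFunction($m[2]) . $m[3];
        }
        // Ordinary tag: rewrite event-handler values inside it.
        return preg_replace_callback($attr, function (array $a): string {
            return $a[1] . myFunction($a[2]) . $a[1];
        }, $m[0]);
    }, $html);
}

$html = "<div id='content' onClick='abc()'>Lorem On='abc' ipsum on to</div>
<input id='a' type='range'>
<input id='b' type='range'>
<script>abc();</script>";

echo wrapInlineJs($html), "\n";
```

On the test input from the question this produces the output the question asked for, and text outside tags is left alone. Attribute values that themselves contain something shaped like onfoo='x' would still be matched, so treat it as a starting point only.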

User contributions licensed under: CC BY-SA