
Scraping tag with certain keyword using Simple HTML Dom Parser

I’m attempting to scrape a <script> tag from a set of webpages using Simple HTML Dom. At first, I was scraping it by providing the numerical order of the tag I needed:

$script = $html->find('script', 17); //The tag I need is typically the 18th <script> tag on the page

I’ve come to realize that the order differs depending on the page (and it’s just not a scalable way of doing this since it could change at any time). How can I instead search for a keyword within the tag that I need and then pull back the full tag? For example, the tag I need always contains the string “PRODUCT_METADATA”.

Thanks in advance for any ideas!


Answer

I ended up using the code below to search every script tag for my keyword:

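$script = null; // stays null if no <script> tag contains the keyword
foreach ($html->find('script') as $s) {
    // strpos() returns false only when the keyword is absent
    if (strpos($s->innertext, 'PRODUCT_METADATA') !== false) {
        $script = $s;
        break; // stop at the first matching tag
    }
}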
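
For comparison, here is a minimal sketch of the same keyword search using PHP's built-in DOMDocument class instead of Simple HTML DOM. This is a swapped-in technique, not what I used above. The sample markup in $page and the helper name findScriptByKeyword are hypothetical, made up for illustration, since the real pages aren't shown here.

```php
<?php
// Hypothetical sample page; the real pages are not shown in the question.
$page = <<<HTML
<html><head>
<script>var a = 1;</script>
<script>var PRODUCT_METADATA = {"id": 42};</script>
</head><body><p>Hello</p></body></html>
HTML;

/**
 * Return the contents of the first <script> element whose text
 * contains $keyword, or null when no script matches.
 * (Hypothetical helper, not part of any library.)
 */
function findScriptByKeyword(string $html, string $keyword): ?string
{
    $doc = new DOMDocument();

    // Real-world HTML is rarely valid, so collect libxml warnings
    // internally instead of letting them be printed.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($doc->getElementsByTagName('script') as $node) {
        if (strpos($node->textContent, $keyword) !== false) {
            return $node->textContent; // first match wins
        }
    }

    return null;
}

$script = findScriptByKeyword($page, 'PRODUCT_METADATA');
```

This returns only the text inside the tag. If you need the surrounding <script> markup as well, the helper could return $doc->saveHTML($node) instead. Either way, check the result for null before using it, since a page may not contain the keyword at all.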
User contributions licensed under: CC BY-SA