
Scraping using PHP Simple HTML DOM Parser

I want to use PHP Simple HTML DOM Parser to scrape a website. The source markup is irregular, like this:

JavaScript

Instead of putting “Details. (Lob., Co v.)” directly inside <p></p>, the page marks up the text with <font> and <i>. When I use this code:

JavaScript

I only get “Details. (Lob.,”; the result stops at the first <i> or <font>. How can I extract the whole line “Details. (Lob., Co v.)”?

Thank you for your answer


Answer

You can use the strip_tags() function to remove the unnecessary tags. After removing them, you can use the DOM parser.

The strip_tags() function strips HTML, XML, and PHP tags from a string.

string strip_tags ( string $str [, string $allowable_tags ] )

You can read more about strip_tags() function on php.net
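As a rough sketch of this approach: the original page's HTML is not shown here, so the `$html` string below is a hypothetical stand-in for a paragraph whose text is split up by <i> and <font>. The variable names are illustrative only.

```php
<?php
// Hypothetical markup resembling the paragraph described in the question;
// the real page's HTML is unknown, so this string is an assumption.
$html = '<p>Details. (Lob., <i>Co</i> <font color="red">v.</font>)</p>';

// Strip every tag and keep only the text content.
$text = strip_tags($html);
echo $text, PHP_EOL; // Details. (Lob., Co v.)

// The optional second argument lists tags to keep, here <i>.
$kept = strip_tags($html, '<i>');
echo $kept, PHP_EOL; // Details. (Lob., <i>Co</i> v.)
```

Keeping a tag through the second argument only helps if the later parsing step can cope with it; to get the whole line as plain text, stripping everything is the simpler option.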

Example:

JavaScript

Result:

JavaScript
User contributions licensed under: CC BY-SA