
PHP: Recursive regex to get a complete div class with its inner content

I have searched but cannot find a solution that works. I have tried using DOM, but the result is not identical to the source (different spacing and tag elements; minor differences, but I need an identical copy for further pattern searches on the source), hence I would like to try regex. Is this possible? I know it isn't the best solution, but I would like to try it. For example, is it possible to return the entire div with class "want-this-entire-div-class", including its inner content?


The following stops after the first </div>:

preg_match('/<div class="want-this-entire-div-class"(.*?)<\/div>/s', $html, $match);

Thanks


Answer

One way to tackle this is with a state machine. You enumerate all the possible states, then take action depending on which state you are in. In this case the states are:

  1. line to ignore
  2. target open div
  3. line to add
  4. extra open div
  5. extra close div
  6. target close div

I don't expect this to be robust, but it does work for the given example:
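A minimal sketch of such a line-based state machine, not the original listing. It assumes the HTML has at most one `<div ...>` or `</div>` tag per line (or a balanced pair on the same line); the function name `extractDiv` and the sample markup are hypothetical, chosen only for illustration. The six states from the list above are marked in the comments.

```php
<?php
// Hypothetical helper: collect the lines of the first div with the given
// class, tracking nesting depth line by line.
function extractDiv(string $html, string $class): string
{
    $depth = 0;   // 0 means we are still outside the target div
    $kept  = [];

    foreach (explode("\n", $html) as $line) {
        $opensTarget = strpos($line, '<div class="' . $class . '"') !== false;
        $opensDiv    = strpos($line, '<div') !== false;
        $closesDiv   = strpos($line, '</div>') !== false;

        if ($depth === 0) {
            if (!$opensTarget) {
                continue;              // state 1: line to ignore
            }
            $depth  = 1;               // state 2: target open div
            $kept[] = $line;
            if ($closesDiv) {
                break;                 // target opened and closed on one line
            }
            continue;
        }

        $kept[] = $line;               // state 3: line to add
        if ($opensDiv && !$closesDiv) {
            $depth++;                  // state 4: extra open div
        } elseif ($closesDiv && !$opensDiv) {
            $depth--;                  // state 5: extra close div
            if ($depth === 0) {
                break;                 // state 6: target close div
            }
        }
    }

    return implode("\n", $kept);
}

// Illustrative input, not the markup from the question.
$html = <<<HTML
<div class="header">
</div>
<div class="want-this-entire-div-class">
<div class="inner">
<p>text</p>
</div>
</div>
<div class="footer">
</div>
HTML;

echo extractDiv($html, 'want-this-entire-div-class');
```

Because the lines are copied verbatim, the extracted block keeps the source's spacing, which is the property the DOM approach lost.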

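For comparison, the recursive-regex route the question asks about can be sketched with PCRE subpattern recursion, which `preg_match` supports. This is only a sketch: it assumes the class attribute is written exactly as `class="want-this-entire-div-class"` and that no stray `<div` or `</div>` appears inside comments, scripts, or attribute values. The group name `body` and the sample markup are illustrative.

```php
<?php
// The "body" group matches balanced content: runs of text that contain no
// div tag, or a nested div whose own content is matched by recursing into
// "body" again via (?&body).
$pattern = '~
    <div\b[^>]*\bclass="want-this-entire-div-class"[^>]*>   # target opening tag
    (?<body>
        (?:
            (?: (?!</?div\b) . )++                           # text with no div tag
          | <div\b[^>]*> (?&body) </div>                     # nested, balanced div
        )*
    )
    </div>                                                   # matching close
~sx';

// Illustrative input, not the markup from the question.
$sample = <<<HTML
<div class="outer">
<div class="want-this-entire-div-class">
  <div class="inner"><p>one</p></div>
  <div class="inner"><div>two</div></div>
</div>
<div class="after">x</div>
</div>
HTML;

$found = preg_match($pattern, $sample, $m);
if ($found === 1) {
    echo $m[0];   // the target div with all of its nested content
}
```

Unlike the lazy `(.*?)` version, the match only ends at the `</div>` that balances the target's opening tag. The `s` flag lets `.` cross newlines and `x` allows the whitespace and comments inside the pattern.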
User contributions licensed under: CC BY-SA