
Use Regex to Split Paragraphs that are not wrapped in div or Table

I am trying to insert some text after every paragraph in my content.

I split my content on </p> using the following code:

    $Paragraphs = explode( '</p>', $Content);
    foreach($Paragraphs as $Paragraph){
        // Some code
    }

Now my $Content looks like:

<p></p>
<p></p>
<p></p>
<div><p></p></div>
<p></p>
<p></p>
<div><p></p></div>

I want to split only when the <p> isn't wrapped inside a <div>, a <table>, or anything else.

You can say that the </p> should have a <p> after it.

I read that regex can be helpful in achieving this.

Here’s the basic regex I built:

$Pattern = '/<p(|\s+[^>]*)>(.*?)<\/p\s*>/';

if(preg_match_all($Pattern, $Content, $keywords)){

}

This regex currently removes the <p> itself from the array: it keeps the content inside the p but not the <p> tags, and it doesn't check whether the paragraph has a <div> before it or a </div> after it.


Answer

If I understood your problem correctly, you have a string with tags such as:

$string = "
<p> Sometext 1 </p>
<p> Sometext 2 </p>
<p> Sometext 3 </p>
<div><p> Sometext Inside A Div </p> </div>
";

And you want to add another element right after each p that is not contained in any other element, purely through PHP, correct?

In my opinion, your best option is to use DOMDocument.

Take a look at the solution below:

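$doc = new DOMDocument();
// loadHTML() wraps the fragment in <html><body>, so top-level elements become children of <body>.
$doc->loadHTML($string);
foreach ($doc->getElementsByTagName('p') as $item) {
    // Only handle paragraphs directly under <body>, i.e. not nested in a div, table, etc.
    if ($item->parentNode->nodeName === 'body') {
        // Build the nested markup from a raw XML string instead of element by element.
        $fragment = $doc->createDocumentFragment();
        $fragment->appendXML('<div> <div> <img src="image.jpg"/> </div> </div>');
        // Insert the fragment right after the current paragraph.
        $item->parentNode->insertBefore($fragment, $item->nextSibling);
    }
}

echo $doc->saveHTML();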

Basically, I take your string and convert it into an HTML DOM, then iterate through all the p elements. If an element's parent is body, I create a document fragment and append raw XML to it, which builds your deeply nested structure without creating each element individually. Finally, I insert the newly created fragment after the current p element.
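
The parent check can also be pushed into the query itself. Here is a minimal sketch, assuming the same kind of input as above: DOMXPath selects only the p elements that are direct children of body, so no test is needed inside the loop. The sample markup, the <hr/> snippet and the variable names are placeholders chosen for illustration.

```php
<?php
// Placeholder input: two top-level paragraphs and one nested inside a div.
$string = '<p>One</p><div><p>Nested</p></div><p>Two</p>';

$doc = new DOMDocument();
$doc->loadHTML($string);

$xpath = new DOMXPath($doc);
// /html/body/p matches only <p> elements whose parent is <body>.
// The query result is a static list, so inserting nodes while looping is safe.
foreach ($xpath->query('/html/body/p') as $p) {
    $fragment = $doc->createDocumentFragment();
    $fragment->appendXML('<hr/>'); // placeholder for the markup to insert
    // insertBefore() with a null reference node appends at the end of the parent.
    $p->parentNode->insertBefore($fragment, $p->nextSibling);
}

$html = $doc->saveHTML();
```

To treat other wrappers differently, only the XPath expression should need to change; the insertion code stays the same.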

The output will look something like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
    <body>
        <p> Sometext 1 </p>
        <div> 
            <div> 
                <img src="image.jpg"> 
            </div> 
        </div>
        <p> Sometext 2 </p>
        <div> 
            <div> 
                <img src="image.jpg"> 
            </div> 
        </div>
        <p> Sometext 3 </p>
        <div> 
            <div> 
                <img src="image.jpg"> 
            </div> 
        </div>
        <div>
            <p> Sometext Inside A Div </p> 
        </div>
    </body>
</html>
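
As the output shows, saveHTML() with no argument serializes the whole document, including the DOCTYPE and the html/body wrapper that loadHTML() added. If only the modified fragment is needed, one option is to serialize the children of body one by one. This is a sketch only; bodyInnerHtml is a hypothetical helper name, not part of DOMDocument, and the sample markup is a placeholder.

```php
<?php
// Hypothetical helper: returns the markup inside <body>, without DOCTYPE/html/body.
function bodyInnerHtml(DOMDocument $doc): string
{
    $body = $doc->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return '';
    }
    $html = '';
    foreach ($body->childNodes as $child) {
        // saveHTML($node) serializes just that node and its descendants.
        $html .= $doc->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
$doc->loadHTML('<p>One</p><div><p>Nested</p></div>');
$fragment = bodyInnerHtml($doc);
```

Calling bodyInnerHtml($doc) in place of $doc->saveHTML() in the solution above should give back just the paragraphs and the inserted blocks.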
User contributions licensed under: CC BY-SA