
php DOMDocument preg_replace fail detect

Basically, I want to replace a matching keyword in the content with a hyperlink. The replacement must happen outside any existing caption/image/figure/figcaption/iframe/a element, because putting a hyperlink inside these breaks the formatting.

My PHP:

JavaScript

Now I am facing 2 issues:

  1. I want to exclude the hyperlink replacement inside …
    but my regex fails to do so.

Currently it displays like this…

  1. The DOMDocument loadHTML method adds extra paragraph tags at seemingly random places. I could post-process the output by removing ALL the paragraph tags, but then the final content is no longer the original: some input content already contains paragraph tags, so this would remove the existing p tags too.

  2. (solved) I want the preg_replace result to display as a clickable hyperlink in the browser, but echo $output shows the raw hyperlink markup as plain text, which cannot be clicked.

  • Update on issue 2: the value saved into $node->nodeValue is escaped, which makes it render as plain text. I added echo html_entity_decode($output); to unescape it, and it now displays correctly.
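To illustrate the escaping behaviour described in the update, here is a minimal sketch. The input string, the link markup, and the variable names are made up for illustration and are not taken from the original code. It also shows one possible alternative, building real nodes with DOMDocumentFragment::appendXML, which avoids decoding the whole output (html_entity_decode would also decode entities that were escaped on purpose). Note that appendXML expects well-formed XML.

```php
<?php
// Hypothetical replacement markup, for illustration only.
$linked = 'visit <a href="https://example.com">example</a> today';

$dom = new DOMDocument();
$dom->loadHTML('<div>visit example today</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$div = $dom->getElementsByTagName('div')->item(0);

// Assigning markup to nodeValue stores it as text, so it is escaped on output.
$div->nodeValue = $linked;
$escaped = $dom->saveHTML($div);

// The workaround from the question: decode the entities after serialising.
$decoded = html_entity_decode($escaped);

// Alternative: turn the markup into real nodes with a document fragment.
$div->nodeValue = '';
$fragment = $dom->createDocumentFragment();
$fragment->appendXML($linked);
$div->appendChild($fragment);
$asNodes = $dom->saveHTML($div);
```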

Desired output

JavaScript


Answer

I tried very, VERY hard to implement a DOMDocument+Xpath solution, but I came unstuck while trying to disqualify the text node within the square-tagged caption block. I couldn’t manage to isolate the whole caption block to be able to exclude it. In the end, here is a caveman’s regex approach to serve as a band-aid until someone smarter can solve this problem properly.

The regex matches the blacklisted tags in the text and discards them; it only replaces text that is not disqualified.

Code: (Demo)

JavaScript
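A minimal sketch of the match-and-discard idea, assuming PCRE's (*SKIP)(*FAIL) verbs as the discarding mechanism; the original snippet may have been written differently. The function name linkKeyword, the sample keyword, and the URL are hypothetical. The blacklist here covers the square-tagged [caption] block, the paired a/figure/figcaption/iframe tags, and any single tag (so attributes such as an img alt text are never touched).

```php
<?php
// Hypothetical helper: wrap $keyword in a link unless it sits in a blacklisted span.
function linkKeyword(string $html, string $keyword, string $url): string
{
    $pattern = '~(?:'
        . '\[caption\b[^\]]*\].*?\[/caption\]'              // square-tagged caption block
        . '|<(a|figure|figcaption|iframe)\b[^>]*>.*?</\1>'  // paired blacklisted tags
        . '|<[^>]+>'                                        // any single tag, e.g. <img ...>
        . ')(*SKIP)(*FAIL)'                                 // discard whatever matched above
        . '|\b' . preg_quote($keyword, '~') . '\b'          // the keyword we actually want
        . '~is';

    return preg_replace_callback(
        $pattern,
        function (array $m) use ($url): string {
            // $m[0] keeps the keyword exactly as it was written in the text.
            return '<a href="' . htmlspecialchars($url) . '">' . $m[0] . '</a>';
        },
        $html
    );
}

$html = '<p>Visit PHP today.</p><a href="/x">PHP docs</a>'
      . '[caption id="1"]PHP logo[/caption]<img alt="PHP" src="php.png">';
$result = linkKeyword($html, 'PHP', 'https://www.php.net/');
echo $result, PHP_EOL;
```

Only the keyword in the first paragraph should be linked. Nested tags of the same name (a figure inside a figure, for example) are not handled by this pattern, which is one reason a DOM-based solution would still be preferable.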
User contributions licensed under: CC BY-SA