I’m trying to find a certain string that can occur inside a comment block. That string can be a whole word, but it can also be part of a word. For instance, if I’m looking for “codex”, it should be replaced with “bindex”, even when it’s part of a word: “codexing” should become “bindexing”.
The catch is that this should only happen when the word is inside a comment block.
/* Lorem ipsum dolor sit amet, codex consectetur adipiscing elit. */

This word --> codex should not be replaced

/* Lorem ipsum dolor sit
 * amet, codex consectetur
 * adipiscing elit. */

/** Lorem ipsum dolor sit
 * amet, codex consectetur
 * adipiscing elit. */

// Lorem ipsum dolor sit amet, codex consectetur adipiscing elit.

# Lorem ipsum dolor sit amet, codex consectetur adipiscing elit.

-------------------
Below "codex" is part of a word
-------------------

/* Lorem ipsum dolor sit amet, somecodex consectetur adipiscing elit. */

/* Lorem ipsum dolor sit
 * amet, codexing consectetur
 * adipiscing elit. */

And here also, this word --> codex should not be replaced

/** Lorem ipsum dolor sit
 * amet, testcodexing consectetur
 * adipiscing elit. */

// Lorem ipsum dolor sit amet, __codex consectetur adipiscing elit.

# Lorem ipsum dolor sit amet, codex__ consectetur adipiscing elit.
What I have so far is this code:
$text = preg_replace('~(\/\/|#|\/\*).*?(codex).*?~', '$1 bindex', $text);
As you can see in this example, this isn’t really working the way I’d like. It doesn’t replace the word when it’s inside a multiline /* */ comment block, and sometimes it also removes all the text that came before the word “codex”.
How can I improve my regex so that it meets my requirements?
Answer
Since you’re dealing with multi-line text here, you should use the s modifier (DOTALL) so that . also matches newlines and the pattern can match across multiple lines. Also, because the pattern uses ~ as its delimiter, the forward slashes don’t need to be escaped.
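To illustrate what the s modifier changes, here is a small sketch. The $sample string and the variable names are made up for this demonstration and are not taken from the question:

```php
<?php
// Hypothetical sample: a block comment whose "codex" sits on the second line.
$sample = "/* first line\n * codex\n */";

// Without the s modifier, . stops at the newline, so the pattern cannot
// get from the opening /* to "codex" on the next line.
$withoutS = preg_match('~/\*.*?codex~', $sample);

// With the s modifier (DOTALL), . also matches newlines, so it can.
$withS = preg_match('~/\*.*?codex~s', $sample);

var_dump($withoutS); // int(0)
var_dump($withS);    // int(1)
```

Note that with ~ as the delimiter, only the * needs a backslash; the / is written as-is.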
Try this code:
$text = preg_replace('~(//|#|/\*).*?(codex).*?~s', '$1 bindex', $text);
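Be aware that a single preg_replace of this shape still replaces everything between the comment opener and “codex” with “ bindex”, only handles the first “codex” after each opener, and with the s modifier a // or # comment can run past the end of its line. One possible alternative, sketched here under some assumptions, is to match each whole comment first and then do a plain string replacement inside it. The function name replaceInComments is hypothetical, and the sketch assumes that //, # and /* never appear inside string literals (a URL such as http://... would be treated as a comment):

```php
<?php
// Hypothetical helper (not from the original answer): replaces $search with
// $replace only inside /* ... */, // ... and # ... comments.
function replaceInComments(string $text, string $search, string $replace): string
{
    // One alternative per comment style:
    //   /\*.*?\*/    block comment, including /** ... */; s lets . cross newlines
    //   //[^\r\n]*   line comment up to the end of the line
    //   #[^\r\n]*    hash comment up to the end of the line
    $commentPattern = '~/\*.*?\*/|//[^\r\n]*|#[^\r\n]*~s';

    $result = preg_replace_callback(
        $commentPattern,
        function (array $m) use ($search, $replace): string {
            // $m[0] is the whole comment; replace every occurrence in it,
            // whether it is a whole word or part of a word.
            return str_replace($search, $replace, $m[0]);
        },
        $text
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $text;
}
```

Calling replaceInComments($text, 'codex', 'bindex') leaves text outside comments untouched, keeps the rest of each comment intact, and also turns “codexing” into “bindexing”.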