
Make this regex code apply to words with an apostrophe

I use this regex to select up to two words on either side of a given word.

((\w+\W+){0,2}WORDHERE(\W+\w+){0,2})

But it treats apostrophe-separated words as “two words”.

For example, with the input text:

you're not WORDHERE is the best

the worst WORDHERE surely didn't win

The result is:

re not WORDHERE is the

the worst WORDHERE surely didn
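The behaviour can be reproduced with a short sketch. Python's re module is an assumption here (the question does not name a regex engine), and the helper name context is made up for illustration. Because \W+ counts the apostrophe as a separator, "you're" and "didn't" are each split in two:

```python
import re

# Pattern from the question: up to two "words" on each side of WORDHERE.
# \W+ treats the apostrophe as a separator, so "you're" counts as "you" + "re".
PATTERN = re.compile(r"((\w+\W+){0,2}WORDHERE(\W+\w+){0,2})")


def context(text):
    """Return the text matched around WORDHERE, or None if there is no match."""
    m = PATTERN.search(text)
    return m.group(1) if m else None


print(context("you're not WORDHERE is the best"))       # re not WORDHERE is the
print(context("the worst WORDHERE surely didn't win"))  # the worst WORDHERE surely didn
```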

How can I make this code understand that words with an apostrophe should be treated as a single word?

Answer

In the pattern that you use, [^\s\r\n]+ matches any character except whitespace (\r and \n are already covered by \s), so it could also match a run of apostrophes such as ''''
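As a quick check of that point, here is a minimal sketch, assuming Python's re module (the helper name tokens is hypothetical). The negated class accepts any run of non-whitespace characters, including one made only of apostrophes:

```python
import re

# One or more characters that are not whitespace.
# \r and \n are redundant inside the class because \s already covers them.
NON_SPACE = re.compile(r"[^\s\r\n]+")


def tokens(text):
    """Return every run of non-whitespace characters in text."""
    return NON_SPACE.findall(text)


print(tokens("'''' isn't a word"))  # ["''''", "isn't", 'a', 'word']
```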

If you want to match apostrophe-separated words where the apostrophe is not at the start or at the end, you might use:

(?:\w+(?:'\w+)? ){0,2}WORDHERE(?: \w+(?:'\w+)?){0,2}

Explanation

  • (?: Non-capturing group
    • \w+(?:'\w+)? Match 1+ word chars, optionally followed by a ' and 1+ word chars, then a space
  • ){0,2} Close the group and repeat it 0-2 times
  • WORDHERE Match literally
  • (?: Non-capturing group
    • \w+(?:'\w+)? Same as the previous pattern, only the space is now at the beginning
  • ){0,2} Close the group and repeat it 0-2 times
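The pattern explained above can be tried with a small sketch. Python's re module and the helper name context are assumptions for illustration. The second pattern, which also accepts a typographic apostrophe (’), is an extension that is not part of the answer:

```python
import re

# Pattern from the answer: a word may contain one inner straight apostrophe,
# and words are separated by exactly one space.
PATTERN = re.compile(r"(?:\w+(?:'\w+)? ){0,2}WORDHERE(?: \w+(?:'\w+)?){0,2}")

# Assumption, not from the answer: also accept a typographic apostrophe.
PATTERN_CURLY = re.compile(
    r"(?:\w+(?:['’]\w+)? ){0,2}WORDHERE(?: \w+(?:['’]\w+)?){0,2}"
)


def context(text, pattern=PATTERN):
    """Return up to two words on each side of WORDHERE, or None if absent."""
    m = pattern.search(text)
    return m.group(0) if m else None


print(context("you're not WORDHERE is the best"))
# you're not WORDHERE is the
print(context("the worst WORDHERE surely didn't win"))
# the worst WORDHERE surely didn't
print(context("you’re not WORDHERE is the best", PATTERN_CURLY))
# you’re not WORDHERE is the
```

Because the separator is a literal space, words separated by punctuation or several spaces are not picked up. Whether that matters depends on the input text.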

User contributions licensed under: CC BY-SA