How would I match all “quote blocks” in plaintext e-mail in PHP PCRE?

Question

I'm trying to match all the quotes in the following example e-mail message: That means I want to match these three strings: And: And: I don't understand how I can do this: if I use the s flag to span multiple lines, which seems to be required here, I cannot use ^ and $ to mean "beginning of line" and "end of line".

Accepted Answer

With preg_match_all:

preg_match_all('~^> .*(?:\R> .*)*~m', $txt, $matches);
$result = $matches[0];

(where \R is an alias for several newline sequences; the m modifier makes ^ match at the start of every line, so the s modifier is not needed)

With preg_split:

$result = preg_split('~^(?!> ).*\R?~m', $txt, -1, PREG_SPLIT_NO_EMPTY);

This splits the string on each line that doesn't start with "> ".

To trim the newline at the end of each block, you can start this pattern with an optional \R?, which gives ~\R?^(?!> ).*\R?~m, or use ~(?:\R?^(?!> ).*)+\R?~m to possibly grab several lines at a time.

About \R: by default, \R is an alias for (?>\r\n|\n|\x0b|\f|\r|\x85) (any newline sequence made of 8-bit, non-UTF-8 characters). In UTF-8 mode, enabled with the u modifier or by starting the pattern with (*UTF8)(*BSR_UNICODE), two other characters outside the ASCII range are added to the list: the line separator (U+2028) and the paragraph separator (U+2029).

\R is handy when you don't know which newline sequence is used in the string, but it is slower than writing the exact newline sequence if you know it. You can restrict \R to (?>\r\n|\n|\r) with the directive (*BSR_ANYCRLF) at the start of the pattern.
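As a rough illustration of the two approaches, here is a minimal sketch. The sample message in $txt is made up for this example (the message from the question is not reproduced here), and it assumes LF line endings and a final line that is not a quote. The split pattern used is the newline-trimming variant, written with \R as the newline alias.

```php
<?php
// Hypothetical sample message, invented for illustration only.
$txt = "Hi Bob,\n"
     . "> first quoted line\n"
     . "> second quoted line\n"
     . "Some reply text.\n"
     . "> another quote\n"
     . "More reply text.\n"
     . "> last quote, line one\n"
     . "> last quote, line two\n"
     . "Bye.\n";

// Approach 1: match each run of consecutive lines that start with "> ".
// The m modifier lets ^ match at the start of every line.
preg_match_all('~^> .*(?:\R> .*)*~m', $txt, $matches);
$blocks = $matches[0];

// Approach 2: split on every line that does NOT start with "> ".
// The leading \R? also consumes the newline that ends the preceding block.
$parts = preg_split('~\R?^(?!> ).*\R?~m', $txt, -1, PREG_SPLIT_NO_EMPTY);

print_r($blocks);
print_r($parts);
```

With this sample, both arrays should hold the same three quote blocks. If the message ends with a quoted line followed by a newline, the split version may keep that last newline, so the match-based version is probably the simpler one to reason about.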
