I have a string like aa | bb | "cc | dd" | 'ee | ff' and I’m looking for a way to split it on the | character, with the exception of any | contained in a quoted string.
The idea is to get something like this: [aa, bb, "cc | dd", 'ee | ff']
I’ve already found an answer to a similar question here: https://stackoverflow.com/a/11457952/11260467
However, I can’t find a way to adapt it for a case with multiple separator characters. Is there someone out there who is less dumb than me when it comes to regular expressions?
Answer
This is easily done with the (*SKIP)(*FAIL) functionality that PCRE offers:
(['"]).*?\1(*SKIP)(*FAIL)|\s*\|\s*
In PHP this could be:
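<?php
// The input contains both double- and single-quoted segments.
$string = 'aa | bb | "cc | dd" | \'ee | ff\'';
// Skip quoted segments, then split on | with optional surrounding whitespace.
$pattern = '~([\'"]).*?\1(*SKIP)(*FAIL)|\s*\|\s*~';
$splitted = preg_split($pattern, $string);
print_r($splitted);
?>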
This would yield:
Array
(
[0] => aa
[1] => bb
[2] => "cc | dd"
[3] => 'ee | ff'
)
See a demo on regex101.com and on ideone.com.
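To see why the pattern works, it may help to spell it out in extended mode. The following is only a sketch and not part of the original answer: the function name splitOutsideQuotes is hypothetical, and the x flag is added purely so the pattern can carry comments. The first alternative matches a whole quoted segment and then throws it away with (*SKIP)(*FAIL), so the engine resumes after the closing quote. That leaves the second alternative, a | with optional surrounding whitespace, as the only thing that can actually match, and it can only do so outside quotes.

```php
<?php
// Hypothetical helper (name chosen for illustration): split on | outside quotes.
function splitOutsideQuotes(string $input): array
{
    // Same pattern as above, written with the x (extended) flag so it can be commented.
    // A nowdoc is used so that no backslash or quote needs PHP-level escaping.
    $pattern = <<<'REGEX'
~
  (['"])          # group 1: an opening quote, single or double
  .*?             # the quoted content, as short as possible
  \1              # the same quote character that opened the segment
  (*SKIP)(*FAIL)  # discard this match and do not retry inside it
|                 # otherwise
  \s*\|\s*        # a literal pipe with optional surrounding whitespace
~x
REGEX;

    return preg_split($pattern, $input);
}

$parts = splitOutsideQuotes('aa | bb | "cc | dd" | \'ee | ff\'');
print_r($parts);
```

Because the backreference \1 requires the closing quote to be the same character as the opening one, a single quote inside a double-quoted segment (or the reverse) does not end the segment early. Escaped quotes inside a segment are not handled by this sketch.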