Regex split string on a char with exception for inner-string

I have a string like aa | bb | "cc | dd" | 'ee | ff' and I'm looking for a way to split it on the | character to get all the values, with an exception for any | contained in a quoted string.

The idea is to get something like this: [aa, bb, "cc | dd", 'ee | ff']

I've already found an answer to a similar question here: https://stackoverflow.com/a/11457952/11260467

However, I can't find a way to adapt it to a case with multiple separator characters. Is there someone out there who is less dumb than me when it comes to regular expressions?

Answer

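This is easily done with the (*SKIP)(*FAIL) verbs that PCRE offers: match each quoted substring first and discard it, so that only the | characters outside quotes are left as split points:

(['"]).*?\1(*SKIP)(*FAIL)|\s*\|\s*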

In PHP this could be:

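<?php

// The inner double quotes must be escaped inside a double-quoted PHP string.
$string = "aa | bb | \"cc | dd\" | 'ee | ff'";

// Quoted substrings are matched and thrown away via (*SKIP)(*FAIL);
// the remaining alternative matches a pipe with optional whitespace around it.
$pattern = '~([\'"]).*?\1(*SKIP)(*FAIL)|\s*\|\s*~';

$splitted = preg_split($pattern, $string);
print_r($splitted);
?>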

And would yield

Array
(
    [0] => aa
    [1] => bb
    [2] => "cc | dd"
    [3] => 'ee | ff'
)
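
To make the individual parts of the pattern easier to follow, here is a sketch of the same expression written with the x (extended) flag, which lets the pattern carry whitespace and # comments. It assumes the pattern uses the backreference \1 and the escapes \s and \|; the variable names are only illustrative.

```php
<?php
// Same pattern, spread over several lines with the x (extended) flag.
$pattern = '~
    ([\'"])        # group 1: an opening single or double quote
    .*?            # as few characters as possible ...
    \1             # ... up to the same kind of quote again
    (*SKIP)(*FAIL) # throw this match away and do not retry inside it
    |              # otherwise
    \s*\|\s*       # a pipe with optional surrounding whitespace
~x';

$parts = preg_split($pattern, "aa | bb | \"cc | dd\" | 'ee | ff'");
print_r($parts);
```

This should print the same four elements as the output above. A quote without a matching partner never completes the first alternative, so any | after it is still treated as a separator.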

See a demo on regex101.com and on ideone.com.
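
The question also mentions multiple separator characters. If that means splitting on more than one character (say | and ;), one possible adaptation is to replace \| with a character class. The helper below is a hypothetical sketch, not part of the original answer; the function name split_outside_quotes and its parameters are assumptions made for illustration, and it expects at least one separator character.

```php
<?php
// Hypothetical helper: split on any of the given separator characters,
// ignoring separators that sit inside single- or double-quoted substrings.
function split_outside_quotes(string $input, string $separators = '|'): array
{
    // preg_quote escapes regex metacharacters (and the ~ delimiter)
    // so the characters can be placed safely inside a character class.
    $class = preg_quote($separators, '~');

    // Quoted substrings are skipped via (*SKIP)(*FAIL); what remains is
    // any one separator with optional whitespace on either side.
    $pattern = '~([\'"]).*?\1(*SKIP)(*FAIL)|\s*[' . $class . ']\s*~';

    return preg_split($pattern, $input);
}

$result = split_outside_quotes("aa | bb ; \"cc | dd\" ; 'ee ; ff'", '|;');
print_r($result);
```

With the sample input above, the separators inside the two quoted parts are left alone and the string is split into aa, bb, "cc | dd" and 'ee ; ff'.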

User contributions licensed under: CC BY-SA