
PHP: Explode comma outside of brackets

Below is a string I’ve tried to explode only on the commas that lie outside any brackets or parentheses.

Wheat Flour (2%) [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid], Water, Yeast, Salt, Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed)), Soya Flour

1st Attempt

preg_split("/[\[\]|()]+/", "Wheat Flour (2%) [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid], Water, Yeast, Salt, Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed)), Soya Flour", -1, PREG_SPLIT_NO_EMPTY);

Which returns:

[0] => Wheat Flour 
[1] => 2%
[2] => Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin 
[3] => B3
[4] => , Thiamin 
[5] => B1
[6] => , Ascorbic Acid
[7] => , Water, Yeast, Salt, Vegetable Oils 
[8] => Palm, Rapeseed
[9] => , Soya Flour

2nd Attempt

preg_split('/\|(?![^(]*\))/', "Wheat Flour (2%) [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid], Water, Yeast, Salt, Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed)), Soya Flour");

Returns:

[0] => Wheat Flour (2%) [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid], Water, Yeast, Salt, Vegetable Oils (Palm, Rapeseed), Soya Flour

The first attempt is the closest I’ve come to the output I’m after:

[0] => "Wheat Flour (2%) [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid]"
[1] => "Water"
[2] => "Yeast"
[3] => "Salt"
[4] => "Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed))"
[5] => "Soya Flour"


Answer

You may use this PCRE regex for splitting:

(?:(\((?:[^()]*|(?-1))*\))|(\[(?:[^][]*|(?-1))*]))(*SKIP)(*F)|\h*,\h*

RegEx Demo

Code:

$s = 'Wheat Flour [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid], Water, Yeast, Salt, Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed)), Soya Flour';
$re = '~(?:(\((?:[^()]*|(?-1))*\))|(\[(?:[^][]*|(?-1))*]))(*SKIP)(*F)|\h*,\h*~';

print_r(preg_split($re, $s));

Output:

Array
(
    [0] => Wheat Flour [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid]
    [1] => Water
    [2] => Yeast
    [3] => Salt
    [4] => Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed))
    [5] => Soya Flour
)
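
The code above drops the "(2%)" part of the string from the question. As a minimal sketch, the same pattern can be wrapped in a function and run against the question's full string. The function name splitTopLevelCommas is hypothetical, and the PREG_SPLIT_NO_EMPTY flag is an assumption (it drops empty items such as those produced by a trailing comma); neither comes from the original answer.

```php
<?php
// Hypothetical helper (not part of the original answer): split on commas
// that sit outside any (...) or [...] group, however deeply nested.
function splitTopLevelCommas(string $s): array
{
    // Group 1 matches a balanced (...) run, group 2 a balanced [...] run;
    // (?-1) recurses into the enclosing group to handle nesting.
    $re = '~(?:(\((?:[^()]*|(?-1))*\))|(\[(?:[^][]*|(?-1))*]))(*SKIP)(*F)|\h*,\h*~';

    // Assumption: empty items are not wanted, hence PREG_SPLIT_NO_EMPTY.
    $parts = preg_split($re, $s, -1, PREG_SPLIT_NO_EMPTY);

    // preg_split() returns false if the pattern fails to run.
    return $parts === false ? [] : $parts;
}

// The string from the question, including "(2%)".
$s = 'Wheat Flour (2%) [Wheat Flour, Wheat Gluten, Calcium Carbonate, Iron, Niacin (B3), Thiamin (B1), Ascorbic Acid], Water, Yeast, Salt, Vegetable Oils (Palm, Rapeseed, oils (sunflower, rapeseed)), Soya Flour';

print_r(splitTopLevelCommas($s));
```

With this input the call should return the six items the question asks for, with "Wheat Flour (2%) [...]" kept as one element.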

RegEx Explained:

  • (?:: Start non-capture group
    • (\((?:[^()]*|(?-1))*\)): Recursive pattern to match a possibly nested (...) substring
    • |: OR
    • (\[(?:[^][]*|(?-1))*]): Recursive pattern to match a possibly nested [...] substring
  • ): End non-capture group
  • (*SKIP)(*F): Skip and fail this match, i.e. retain this data in the split result
  • |: OR
  • \h*,\h*: Match a comma surrounded by 0 or more horizontal whitespace characters on either side
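
To see what the (*SKIP)(*F) step contributes on its own, here is a reduced sketch that leaves out recursion and handles only a single, non-nested pair of parentheses. The sample string and variable names are made up for illustration.

```php
<?php
// Illustrative input, not from the question.
$s = 'a (b, c), d, e';

// Plain split: every comma is a delimiter, including the one inside (...).
$plain = preg_split('~\h*,\h*~', $s);

// Guarded split: a flat (non-nested) parenthesised run is matched first,
// then (*SKIP)(*F) discards that match and resumes after it, so the comma
// inside the parentheses is never tried as a delimiter.
$guarded = preg_split('~\([^()]*\)(*SKIP)(*F)|\h*,\h*~', $s);

print_r($plain);   // a (b | c) | d | e
print_r($guarded); // a (b, c) | d | e
```

The full pattern follows the same idea and only swaps the flat \([^()]*\) for the two recursive groups, so nested (...) and [...] runs are skipped as a whole.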
User contributions licensed under: CC BY-SA