I have the following input (just an example; the real input contains much messier data):
$values = [
    '32/34, 36/38, 40/42, 44/46',
    '40/42/44/46/48',
    '58/60',
    '39-42',
    '40-50-60',
    '24-25,26,28,30',
    '36 40,5 44',
];
and want to split it by separators such as / or , while keeping pairs of values together. A pair should be kept together only if its separator does not occur several times in a chain (as in 40/42/44/46/48), so the result should look like this:
'32/34, 36/38, 40/42, 44/46' => [ '32/34', '36/38', '40/42', '44/46' ]
'40/42/44/46/48' => [ '40', '42', '44', '46', '48' ]
'58/60' => [ '58/60' ]
'39-42' => [ '39-42' ]
'40-50-60' => [ '40', '50', '60' ]
'24-25,26,28,30' => [ '24-25', '26', '28', '30' ]
'36 40,5 44' => [ '36', '40,5', '44' ]
What I have so far is:
$separator = '^|$|[\s,\/-]';
$decimals = '\d+(?:[,.][05])?';

foreach ($values as $value) {
    preg_match_all(
        '/'
        . '(?<=' . $separator . ')'
        . '(?:'
        . '(?P<var1>(' . $decimals . ')[\/-](?-1)|(?-1))'
        . ')(?=' . $separator . ')'
        . '/ui',
        $value,
        $matches
    );
    print_r($matches);
}
But this fails for 40/42/44/46/48, which returns
[var1] => Array
(
    [0] => 40/42
    [1] => 44/46
    [2] => 48
)
But each number should be returned separately. Changing the named group to
'(?P<var1>(' . $decimals . ')([\/-])(?-2)|(?-2))(?!\3)'
is better, but still returns the wrong result:
[var1] => Array
(
    [0] => 40
    [1] => 42
    [2] => 44
    [3] => 46/48
)
What should the correct regex look like?
Answer
As stated in the comments above, I know that a 100% match is not possible because this is user input. But I've found a regex that fits most of my use cases:
(?<=^|$|[\s,\/-])(?:(?P<var1>(?<![\/-])(?!(?:(\d+(?:[,.][05])?)[\/-]){2}(?-1))(\d+(?:[,.][05])?)[\/-](?-1)|(?-1)))(?=^|$|[\s,\/-])
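To make the idea behind this pattern easier to follow, here is a minimal sketch that builds a variant of it piece by piece. It rests on a few assumptions: the helper name splitSizes() is hypothetical, the delimiter is ~ so that / needs no escaping, and the subroutine calls such as (?-1) are replaced by repeating the number subpattern. That last change may behave slightly differently from the original on other input, depending on the PCRE version.

```php
<?php
// Sketch only: splitSizes() is a hypothetical helper, not part of the original post.
function splitSizes(string $value): array
{
    // A number, optionally followed by ",0", ",5", ".0" or ".5".
    $number = '\d+(?:[,.][05])?';
    // Start of string, end of string, or a single separator character.
    $edge = '^|$|[\s,/-]';

    $pattern = '~(?<=' . $edge . ')'
        . '(?:'
        // A pair must not start directly after "/" or "-" ...
        . '(?<![/-])'
        // ... and must not be the start of a chain of three or more numbers.
        . '(?!(?:' . $number . '[/-]){2}' . $number . ')'
        // The pair itself, e.g. 32/34 or 39-42.
        . $number . '[/-]' . $number
        . '|'
        // Otherwise fall back to a single number.
        . $number
        . ')'
        . '(?=' . $edge . ')~u';

    preg_match_all($pattern, $value, $matches);

    return $matches[0];
}

$values = [
    '32/34, 36/38, 40/42, 44/46',
    '40/42/44/46/48',
    '58/60',
    '39-42',
    '40-50-60',
    '24-25,26,28,30',
    '36 40,5 44',
];

foreach ($values as $value) {
    echo $value, ' => ', json_encode(splitSizes($value), JSON_UNESCAPED_SLASHES), PHP_EOL;
}
```

For the seven sample inputs from the question this should reproduce the expected results listed there. The two guards before the pair alternative do the real work. The negative lookbehind stops a pair from starting in the middle of a chain. The negative lookahead stops one from starting at the head of a chain.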