I am trying to extract [[String]] with a regular expression. Notice how each bracket that opens with [ needs to close with ]. So you would receive the following matches:
[[String]]
[String]
String
If I use \[[^]]+] it will just find the first closing bracket it comes across, without taking into account that a new bracket has opened in between and a second closing bracket is needed. Is this at all possible with a regular expression?
Note: The type can be String, [String] or [[String]], so you don’t know up front how many brackets there will be.
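To make the problem concrete, here is a minimal sketch of the behaviour described above. It assumes PHP's PCRE functions, and the variable names are illustrative only. The negated class stops at the first ], so the single match is unbalanced:

```php
<?php
$s = "[[String]]";

// \[     a literal opening bracket
// [^]]+  one or more characters that are not ']'
// ]      the first closing bracket encountered
preg_match_all('~\[[^]]+]~', $s, $m);

print_r($m[0]); // one match only: [[String]
```

The match ends at the first closing bracket, which leaves the outer opening bracket without its partner.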
Answer
You can use the following PCRE-compliant regex:
(?=((\[(?:\w++|(?2))*])|\b\w+))
See the regex demo. Details:
(?= - start of a positive lookahead (necessary to match overlapping strings):
    ( - start of Capturing group 1 (it will hold the "matches"):
        (\[(?:\w++|(?2))*]) - Group 2 (technical, used for recursing): a [ char, then zero or more occurrences of either one or more word chars (matched possessively) or the whole Group 2 pattern recursed, and then a ] char
        | - or
        \b\w+ - a word boundary (necessary since all overlapping matches are being searched for) and one or more word chars
    ) - end of Group 1
) - end of the lookahead.
See the PHP demo:
$s = "[[String]]";
if (preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $s, $m)) {
    // $m[1] holds the Group 1 captures; the overall matches in $m[0] are empty strings
    print_r($m[1]);
}
Output:
Array
(
    [0] => [[String]]
    [1] => [String]
    [2] => String
)
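To see why the lookahead wrapper matters, the following sketch runs the same recursive pattern with and without it. The variable names $consuming and $overlapping are illustrative, and the group number in the recursion call changes from (?2) to (?1) once the outer capturing group is dropped. Without the lookahead, the first match consumes the whole input, so the nested values are never reported:

```php
<?php
$s = "[[String]]";

// Recursive pattern without the lookahead: each match consumes its text,
// so the outermost bracket pair swallows everything nested inside it.
preg_match_all('~(\[(?:\w++|(?1))*])|\b\w+~', $s, $consuming);

// Same pattern inside a lookahead: every match is zero-width, so the engine
// retries at each following position and Group 1 captures the nested values.
preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $s, $overlapping);

print_r($consuming[0]);   // [[String]]
print_r($overlapping[1]); // [[String]], [String], String
```

The \b in the second alternative keeps the lookahead from also reporting tring, ring and so on as the engine advances one character at a time.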