I am trying to extract [[String]] with a regular expression. Notice that every bracket that opens with [ needs to close with ]. So for this input you would receive the following matches:
[[String]]
[String]
String
If I use \[[^]]+] it just stops at the first closing bracket it comes across, without taking into account that a new bracket has opened in between and a second closing bracket is needed. Is this possible at all with a regular expression?
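To make the problem concrete, here is a minimal PHP check of that behaviour. The variable names are only illustrative, and the pattern is the naive one from the paragraph above with its escaping written out.

```php
<?php
$s = "[[String]]";

// Naive pattern: a "[", then one or more chars that are not "]", then the first "]".
preg_match_all('~\[[^]]+]~', $s, $naive);

// The match ends at the first closing bracket, so the result is unbalanced.
print_r($naive[0]);
```

This returns a single match, [[String], which is missing its second closing bracket.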
Note: This type can either be String, [String] or [[String]] so you don’t know upfront how many brackets there will be.
Answer
You can use the following PCRE-compliant regex:
(?=((\[(?:\w++|(?2))*])|\b\w+))
See the regex demo. Details:
(?= - start of a positive lookahead (necessary to match overlapping strings):
( - start of Capturing group 1 (it will hold the "matches"):
(\[(?:\w++|(?2))*]) - Group 2 (technical, used for recursing): a [ char, then zero or more occurrences of either one or more word chars (matched possessively) or the whole Group 2 pattern recursed, and then a ] char
| - or
\b\w+ - a word boundary and one or more word chars (the boundary is necessary since all overlapping matches are being searched for; without it, tring, ring and so on would match too)
) - end of Group 1
) - end of the lookahead.
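As a rough illustration of why the lookahead matters, the sketch below compares the recursive bracket pattern on its own with the full lookahead version. The variable names $plain and $overlap are made up for this example, and the first pattern is a simplified variant (recursing into Group 1 with (?1)), not part of the answer's regex.

```php
<?php
$s = "[[String]]";

// Recursion only: Group 1 matches a balanced bracket pair and re-enters itself via (?1).
// The match consumes the whole string, so the inner pairs are never reported.
preg_match_all('~(\[(?:\w++|(?1))*])~', $s, $plain);
print_r($plain[0]);

// Same recursion wrapped in a lookahead: nothing is consumed, so the engine
// retries at every position and Group 1 captures each overlapping match.
preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $s, $overlap);
print_r($overlap[1]);
```

The first call yields only [[String]], while the second yields [[String]], [String] and String.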
See the PHP demo:
$s = "[[String]]";
// Group 1 sits inside the lookahead, so the overlapping matches end up in $m[1]
if (preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $s, $m)){
    print_r($m[1]);
}
Output:
Array
(
[0] => [[String]]
[1] => [String]
[2] => String
)
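Since the question says the type can be String, [String] or [[String]], here is a small sketch that wraps the same regex in a function and runs it on all three forms. The function name nestedTypes is hypothetical and chosen only for this example.

```php
<?php
// Hypothetical helper: returns every nesting level of a bracketed type, outermost first.
function nestedTypes(string $type): array
{
    preg_match_all('~(?=((\[(?:\w++|(?2))*])|\b\w+))~', $type, $m);
    return $m[1]; // Group 1 holds the overlapping matches
}

foreach (['String', '[String]', '[[String]]'] as $type) {
    echo $type, ' => ', implode(', ', nestedTypes($type)), PHP_EOL;
}
```

Each input produces one match per nesting level, so the number of brackets does not need to be known upfront.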