I’m cleaning up some WordPress shortcodes in my code and I’m looking for a solution that extracts the right values regardless of the order of the attributes.
Example:
[Links label="my_label" url="my_url" external="other_value"]
If I want to extract my_label, my_url and other_value, I would use the following structure:
preg_match_all('/\[Links label="(.*?)" url="(.*?)" external="(.*?)"\]/', $content, $output_array);
The problem is that I sometimes have a different order like this:
[Links url="my_url" external="other_value" label="my_label"]
My previous preg_match_all doesn’t work with this. I have tried to put each pattern between (…) or use | but I don’t get the expected result. I have seen solutions here to identify strings but I need more than identifying strings, I need to extract values.
It’s probably something trivial for a regex expert.
Thanks
Answer
If the number of properties can also vary, in any order, and the match should start with [Links, you can make use of the \G anchor. The key is in capture group 1, the value in capture group 2.
(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"
Explanation
(?:            Non-capture group
  \[Links      Match [Links
  |            Or
  \G(?!^)      Assert the position at the end of the previous match, not at the start
)              Close non-capture group
(?=[^][]*])    Positive lookahead, assert a ] to the right
\h+            Match 1+ horizontal whitespace chars
(              Capture group 1
  [^\s=]+      Match 1+ times any char except = or a whitespace char
)              Close group 1
="             Match literally
(              Capture group 2
  [^\s"]+      Match 1+ times any char except " or a whitespace char
)"             Close group 2 and match "
Example
$re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/m';
$str = '[Links label="my_label" url="my_url" external="other_value"]';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
print_r($matches);
Output
Array
(
    [0] => Array
        (
            [0] => [Links label="my_label"
            [1] => label
            [2] => my_label
        )
    [1] => Array
        (
            [0] => url="my_url"
            [1] => url
            [2] => my_url
        )
    [2] => Array
        (
            [0] => external="other_value"
            [1] => external
            [2] => other_value
        )
)
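Since the goal is to read the values by name regardless of order, the flat list of matches above can be folded into one associative array per shortcode. The following is only a minimal sketch under some assumptions: the helper name extract_links_attributes is hypothetical (it is not part of WordPress or of the answer), and it relies on the fact that the first attribute of each shortcode is the only match whose full text starts with [Links. As with the pattern itself, values that are empty or contain whitespace are not matched.

```php
<?php
// Hypothetical helper (illustration only): groups the key/value pairs
// found by the \G-based pattern into one associative array per shortcode.
function extract_links_attributes(string $content): array
{
    $re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/m';
    preg_match_all($re, $content, $matches, PREG_SET_ORDER);

    $shortcodes = [];
    foreach ($matches as $m) {
        // A full match beginning with "[Links" is the first attribute
        // of a new shortcode, so start a new group there.
        if (strncmp($m[0], '[Links', 6) === 0) {
            $shortcodes[] = [];
        }
        // $m[1] is the attribute name, $m[2] its value.
        $shortcodes[count($shortcodes) - 1][$m[1]] = $m[2];
    }
    return $shortcodes;
}

// The reordered shortcode from the question:
$content = '[Links url="my_url" external="other_value" label="my_label"]';
$result = extract_links_attributes($content);
echo $result[0]['label'], "\n"; // attribute looked up by name, not by position
```

Looking attributes up by name this way means the calling code does not need to know the order in which they were written in the shortcode.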