I’m cleaning up some WordPress shortcodes in my code and I’m looking for a solution that extracts the right values regardless of the order of the attributes.
Example:
[Links label="my_label" url="my_url" external="other_value"]
If I want to extract my_label, my_url and other_value, I would use the following structure:
preg_match_all('/\[Links label="(.*?)" url="(.*?)" external="(.*?)"\]/', $content, $output_array);
The problem is that I sometimes have a different order like this:
[Links url="my_url" external="other_value" label="my_label"]
My previous preg_match_all doesn’t work with this. I have tried to put each pattern between (…) or use | but I don’t get the expected result. I have seen solutions here to identify strings but I need more than identifying strings, I need to extract values.
It’s probably something trivial for a regex expert.
Thanks
Answer
If the number of attributes can vary, they can appear in any order, and the shortcode must start with [Links, you can make use of the \G anchor. The key is in capture group 1, the value in capture group 2.
(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"
Explanation
(?:   Non-capturing group
  \[Links   Match [Links
  |   Or
  \G(?!^)   Assert the position at the end of the previous match, not at the start
)   Close the non-capturing group
(?=[^][]*])   Positive lookahead, assert a ] to the right with no other [ or ] in between
\h+   Match 1+ horizontal whitespace chars
(   Capture group 1
  [^\s=]+   Match 1+ times any char except = or a whitespace char
)   Close group 1
="   Match literally
(   Capture group 2
  [^\s"]+   Match 1+ times any char except " or a whitespace char
)"   Close group 2 and match "
Example
$re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/m';
$str = '[Links label="my_label" url="my_url" external="other_value"]';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
print_r($matches);
Output
Array
(
    [0] => Array
        (
            [0] => [Links label="my_label"
            [1] => label
            [2] => my_label
        )

    [1] => Array
        (
            [0] =>  url="my_url"
            [1] => url
            [2] => my_url
        )

    [2] => Array
        (
            [0] =>  external="other_value"
            [1] => external
            [2] => other_value
        )

)
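Since the goal is to read label, url and external by name whatever their order, the key/value matches can be folded into one associative array per shortcode. The following is only a minimal sketch: the function name extract_links_attributes is hypothetical, the /m flag is dropped because it is not needed here, and it assumes that a full match beginning with [Links marks the start of a new shortcode.

```php
<?php
// Hypothetical helper (not part of WordPress): groups key/value pairs per [Links ...] shortcode.
function extract_links_attributes(string $content): array
{
    $re = '/(?:\[Links|\G(?!^))(?=[^][]*])\h+([^\s=]+)="([^\s"]+)"/';
    preg_match_all($re, $content, $matches, PREG_SET_ORDER);

    $shortcodes = [];
    foreach ($matches as $m) {
        // The first attribute of each shortcode is matched together with "[Links";
        // the following ones are reached through \G and start with whitespace.
        if (strncmp($m[0], '[Links', 6) === 0) {
            $shortcodes[] = [];
        }
        // Group 1 is the attribute name, group 2 its value.
        $shortcodes[count($shortcodes) - 1][$m[1]] = $m[2];
    }

    return $shortcodes;
}

$content = '[Links label="my_label" url="my_url" external="other_value"] '
         . 'and [Links url="my_url" external="other_value" label="my_label"]';

foreach (extract_links_attributes($content) as $attrs) {
    // Values are looked up by name, so attribute order no longer matters.
    echo $attrs['label'], ' ', $attrs['url'], ' ', $attrs['external'], PHP_EOL;
}
```

Both shortcodes in this sample print my_label my_url other_value. Note that the pattern only accepts values without whitespace, so a value such as label="my label" would need group 2 changed to something like [^"]+.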