How can I explode the following string:
+test +word any -sample (+toto +titi "generic test") -column:"test this" (+data id:1234)
into
Array('+test', '+word', 'any', '-sample', '(', '+toto', '+titi', '"generic test"', ')', '-column:"test this"', '(', '+data', 'id:1234', ')')
I would like to extend the boolean fulltext search SQL query, adding the feature to specify specific columns using the notation column:value
or column:"valueA value B"
.
How can I do this using preg_match_all($regexp, $query, $result)
, i.e., what is the correct regular expression to use?
Or more generally, what would be the most appropriate regular expression to decompose a string into words not containing spaces, where spaces within text between quotes is not considered spaces, for the sake of defining a word, and (
and )
are considered words, independent of being surrounded by spaces. For example xxx"yyy zzz"
should be considered a single world. And (aaa)
should be three words (
, aaa
and )
.
I have tried something like /"(?:\\.|[^\\"])*"|S+/
, but with limited/no success.
Can anybody help?
Advertisement
Answer
I think PCRE verbs can be used to achieve your goal:
preg_split('/".*?"(*SKIP)(*FAIL)|((|))| /', '+test +word any -sampe (+toto +titi "generic test") -column:"test this" (+data id:1234)',-1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY)
https://3v4l.org/QnpB9
https://regex101.com/r/pw1mEd/1
https://3v4l.org/dNMkf (with test data)