
Capturing groups in string using preg_match

I'm having trouble parsing a text file in CodeIgniter. For each line in the file I need to capture groups of data. The fields are: progressive number, operator, manufacturer, model, registration, type.

Here is an example of the file lines:

JavaScript

To parse each line I’m using the following code:

JavaScript

The code above doesn’t work. I think it's because some groups of data are composed of more than one word, for example “SIRIO S.P.A.”. Any hints on how to fix this? Thanks a lot for any help.


Answer

You should not use \w to capture the data, because some of the characters in your text, like &, ., - and /, are not word characters. Moreover, some fields contain several words separated by a space, so you should replace \w{1,} with \S+(?: \S+)*, which will capture your text properly into the groups you have defined.
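To see the difference on a single field, here is a minimal sketch that runs both sub-patterns against the “SIRIO S.P.A.” value from the question. The variable names are only illustrative.

```php
<?php
// Field value quoted in the question.
$field = 'SIRIO S.P.A.';

// \w+ stops at the first non-word character (the space after SIRIO).
preg_match('/^\w+/', $field, $wordMatch);

// \S+(?: \S+)* continues across single spaces and punctuation.
preg_match('/^\S+(?: \S+)*/', $field, $fieldMatch);

echo $wordMatch[0], PHP_EOL;   // SIRIO
echo $fieldMatch[0], PHP_EOL;  // SIRIO S.P.A.
```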

Try changing your regex to this and it should work:

JavaScript

Check this demo

Explanation of what \S+(?: \S+)* does in the above regex:

  • \S+ – \S is the opposite of \s, meaning it matches any non-whitespace character (it won’t match a space, tab, newline, or any other vertical or horizontal whitespace). Hence \S+ matches one or more non-whitespace characters.
  • (?: \S+)* – Here ?: only turns the group into a non-capturing group. Inside it there is a single space followed by \S+, and the whole group is enclosed in parentheses with a * quantifier. So this means: match a space followed by one or more non-whitespace characters, and repeat all of that zero or more times.

So \S+(?: \S+)* will match abc or abc xyz or abc pqr xyz and so on, but the moment more than one space appears in a row, the match stops, because the regex allows only a single space before each \S+.
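As a rough sketch of how this plays out on a whole line: the example below assumes the six fields are separated by runs of two or more spaces, which is my assumption about the file layout, not something taken from your data. The sample line is made up for illustration (only “SIRIO S.P.A.” comes from the question), and the pattern here is not necessarily the exact regex given above.

```php
<?php
// One field: words separated by single spaces (the sub-pattern explained above).
$field = '(\S+(?: \S+)*)';
// Assumed separator between fields: two or more whitespace characters.
$gap = '\s{2,}';
// Progressive number followed by five text fields.
$pattern = '/^(\d+)' . str_repeat($gap . $field, 5) . '$/';

// Invented sample line, for illustration only.
$line = '12  SIRIO S.P.A.  ACME AIRCRAFT  X-100 TURBO  I-ABCD  L2T';

if (preg_match($pattern, $line, $m)) {
    // $m[0] is the whole match; $m[1]..$m[6] are the captured fields.
    [, $number, $operator, $manufacturer, $model, $registration, $type] = $m;
    echo implode(' | ', array_slice($m, 1)), PHP_EOL;
    // 12 | SIRIO S.P.A. | ACME AIRCRAFT | X-100 TURBO | I-ABCD | L2T
}
```

Because each field match stops at the first double space, the separator \s{2,} is what moves the regex on to the next group. If your real lines use a different separator (tabs, fixed columns), the $gap part would need to change accordingly.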

Hope my explanation is clear. If you still have any doubts, please feel free to ask.

User contributions licensed under: CC BY-SA