PHP (regex): Extract column names from SQL-Query

I am struggling with a regular expression in PHP:

I have a number of SQL queries and would like to extract the variable names (column names) used in them. The result should be an array, as returned by preg_match_all().

As the queries are rather simple, no perfect solution is needed. The only conditions I would like to check are the following:

  • return all strings (a-zA-Z0-9) of any length that
  • start right after one of the following expressions: “WHERE ”, “AND ”, “OR ”, “WHERE (”, “AND (”, “OR (” and
  • end right before one of the following expressions: “=”, “>”, “<”, “LIKE”, “NOT”
  • all these expressions should be case-insensitive.

I was playing around with the following online regex tester: https://www.phpliveregex.com/#tab-preg-match-all. However, the closest I got was the pattern below, which returns only a single match; I don't know how to get multiple matches:

preg_match_all('/^[where|and|or|(]* ([a-zA-Z0-1]*)[ |=|<|<=|>|>=|or|in]*.*$/', $input_lines, $output_array);

Some example SQL-queries:

                                                       SHOULD RETURN
WHERE var1=1 AND var2=1                                => var1; var2
WHERE var2 IN (1, 2, 3) OR var3=2                      => var2; var3
WHERE (var4 like '%test%' and var5 NOT LIKE '%test%')  => var4; var5
WHERE VAR6 = 'test' AnD var7='test2'                   => VAR6; var7

Answer

Here we go (of course, it is not 100% perfect):

(?:(?i)WHERE|AND|OR) \(?([a-zA-Z0-9]+)

See a demo on regex101.com.
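
For completeness, here is a minimal sketch of how a pattern along these lines might be used with preg_match_all() on the example queries from the question. The function name extractColumns() is hypothetical, and the pattern is a slight variation on the one above, not the exact original: it assumes a \b word boundary before the keyword, \s+ instead of a single space, and the /i flag in place of the inline (?i).

```php
<?php
// Sketch only: collects the identifier that follows WHERE / AND / OR,
// optionally preceded by an opening parenthesis.
// extractColumns() is a hypothetical helper name, not a PHP built-in.
function extractColumns(string $sql): array
{
    // \b  : the keyword must start at a word boundary
    // \(? : optional opening parenthesis after the keyword
    // /i  : case-insensitive, so "and", "AnD", "WHERE" all match
    $pattern = '/\b(?:WHERE|AND|OR)\s+\(?\s*([a-zA-Z0-9]+)/i';
    preg_match_all($pattern, $sql, $matches);

    // $matches[1] holds the contents of capture group 1 for every match
    return $matches[1];
}

$queries = [
    "WHERE var1=1 AND var2=1",
    "WHERE var2 IN (1, 2, 3) OR var3=2",
    "WHERE (var4 like '%test%' and var5 NOT LIKE '%test%')",
    "WHERE VAR6 = 'test' AnD var7='test2'",
];

foreach ($queries as $query) {
    echo implode('; ', extractColumns($query)), PHP_EOL;
}
```

For the four example queries this prints "var1; var2", "var2; var3", "var4; var5" and "VAR6; var7". Note that this only looks at what follows the keyword and does not check the terminating "=", "LIKE", etc., so something like "x BETWEEN 1 AND 5" or a string literal containing " and foo" would produce false positives.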

User contributions licensed under: CC BY-SA