Skip to content
Advertisement

Is LEFT JOINING two tables with no reference to each other ON string conditions a bad idea?

I’m migrating database between 2 systems using PHP and MySQL.

In the old one I have 3 tables of interest:

t1
id (int)

t2
id (int)
t1_id (int)
d (string)

t3
id (int)
t1_id (int)
ds (string)
e (int)

In the new one I have only t1 and t2

t2.d can have e.g. “abc” or “def” t3.ds can have “abc” or “def” or “abc, def”

I have created the following query:

SELECT
    t2...,
    t3.e
FROM t2
LEFT JOIN t3
    ON t2.id = t3.id
    AND t3.ds LIKE CONCAT("%", t2.d, "%")
WHERE t2.id = ?

The query does work, but I am worried about the performance of this JOIN when I have lots of entries (to migrate I obviously iterate over each of the entries from t1, each having multiple entries in t2 and t3).

So at the bottom, I go back to the question – is it worth to join them like that or should I use a different approach, like separate query or data manipulation on PHP level?

Here’s MySQL EXPLAIN if that’s any relevant (unfortunately it doesn’t mean too much to me, so I appreciate any help):

id | select_type | table | type | possible_keys        | key           | key_len | ref                    | rows  | Extra
1  | SIMPLE      | modi  | ref  | PRIMARY,order_number | order_number  | 4       | const                  | 3     | Using temporary; Using filesort
1  | SIMPLE      | ai    | ref  | detail_number        | detail_number | 4       | max.modi.detail_number | 1     | NULL
1  | SIMPLE      | edi   | ALL  | NULL                 | NULL          | NULL    | NULL                   | 26389 | Using where; Using join buffer (Block Nested Loop)

Advertisement

Answer

If you are concerned about performance, then do not store lists in a string. You should have a junction table, with one row per element of the list. In other words, 'abc, def' is a no-no. Another table with two rows, one for 'abc' and another for 'def' is the way to go.

Because you have a data structure that is not optimized for SQL, there is little you can do from a performance perspective. The like is probably about as good as you can do.

User contributions licensed under: CC BY-SA
4 People found this is helpful
Advertisement