Skip to content
Advertisement

MySQL (RDS) queries are running in my process list long after my scripts have finished

First up I know something is wrong fundamentally with the way I am running my queries. I would like to ignore the root cause of this and in this question just ask the following:

I have a PHP script running on Lambda (Larval Vapor). The lambda function times out at 60 seconds while running a batch of insert, update and delete queries, however in the MySQL process list I see these queries running for a long long time afterwards (tens of minutes).

Where are these queries coming from if the PHP script has timed out? My guess is they are being stored in the QUERY CACHE but I am just guessing. I would love to know a little more about how this works.

The issue I am having at the moment is the php script times out and my application therefor thinks it has failed. On a later function we run the same queries again but of-course they are still running on the RDS instance. We therefore end up just layering the same queries over the top of each other – PHP has no idea that the SQL server has any queries running…

Advertisement

Answer

The SHOW PROCESSLIST does not show the query cache. The query cache doesn’t store queries, it stores query results. I.e., data.

What you see in SHOW PROCESSLIST are running queries. When the client connects, the MySQL Server creates a thread in the server. That’s what is executing the each query for that connection. When the client closes, the thread in the server is terminated.

If your Lambda script has terminated abnormally, for example by the AWS time limit, the MySQL server may not realize immediately that the client has gone away, so the server allows the query to keep running until it’s finished. But there will be no client to fetch the results. Eventually the server will time out that thread and terminate it.

If you use Lambdas, you have to take special care to make sure your queries are well-optimized and won’t run for more than the Lambda execution time limit, which if I remember is 5 minutes.

If you have a query that runs longer than 5 minutes, it’s not a good candidate for running in a Lambda.


Re your comment:

It’s quite easy for an SQL query to run for 30 minutes or more, if it is scanning a large amount of data in an inefficient manner. Sometimes SQL queries are so inefficient that it’s difficult to make them run in less than 30 minutes!

MySQL does have a feature to terminate a query after a time limit, but I think that’s the wrong solution. If you force the server to limit the time for a query, it doesn’t mean you get the results in less time, it means the query is cancelled, and you don’t get any results. Likewise if it’s an UPDATE, you aren’t waiting for a set of rows as the result, but it will still be cancelled. In other words, the UPDATE you wanted won’t happen.

The better solution is to optimize the query, so it normally runs quickly enough to give you the results before you would want it to time out.

User contributions licensed under: CC BY-SA
6 People found this is helpful
Advertisement