I am trying to debug some bizarre behaviour in my PHP application. It runs Laravel 6 with AWS SQS. The program downloads call recordings from a VoIP provider’s API using a queued job. The API has a strict rate limit of 10 requests per minute, so I’m throttling the requests on my side. The job is configured to keep retrying for up to 24 hours using the retryUntil method. However, the job disappears from the queue after 4 tries. It doesn’t fail: the job’s failed method never gets executed (I’ve put logging and Sentry::capture in there), and it’s not in the failed_jobs table. The last log entry says “Cannot complete job, retrying in … seconds”, which is written right before the release call. After that, the job simply disappears from the queue and never gets executed again.
I am logging the number of attempts, max tries, timeoutAt, etc. Everything seems to be configured properly. Here’s (the essence of) my code:
    public function handle()
    {
        /** @var Track $track */
        $track = Track::idOrUuId($this->trackId);

        $this->logger->info('Downloading track', [
            'trackId' => $track->getId(),
            'attempt' => $this->attempts(),
            'retryUntil' => $this->job->timeoutAt(),
            'maxTries' => $this->job->maxTries(),
        ]);

        $throttleKey = sprintf('track.download.%s', $track->getUser()->getTeamId());

        if (!$this->rateLimiter->tooManyAttempts($throttleKey, self::MAX_ALLOWED_JOBS)) {
            $this->downloadTrack($track);
            $this->rateLimiter->hit($throttleKey, 60);
        } else {
            // Over the limit: put the job back on the queue with a jittered delay.
            $delay = random_int(10, 100) + $this->rateLimiter->availableIn($throttleKey);

            $this->logger->info('Throttling track download.', [
                'trackId' => $track->getId(),
                'delay' => $delay,
            ]);

            $this->release($delay);
        }
    }

    public function retryUntil(): DateTimeInterface
    {
        return now()->addHours(24);
    }

    public function failed(Exception $exception)
    {
        $this->logger->info('Job failed', ['exception' => $exception->getMessage()]);
        Sentry::captureException($exception);
    }
Answer
I found the problem and I’m posting it here for anyone who might struggle with it in the future. It all came down to a simple configuration setting. In AWS SQS, the queue I am working with has a DLQ (Dead-Letter Queue) configured, with Maximum receives set to 4. According to the SQS docs:
The Maximum receives value determines when a message will be sent to the DLQ. If the ReceiveCount for a message exceeds the maximum receive count for the queue, Amazon SQS moves the message to the associated DLQ (with its original message ID).
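To make the quoted rule concrete, here is a minimal PHP sketch that models it in plain code, with no AWS or Laravel calls. The function name simulateReleases and its parameters are hypothetical. It assumes that every release leads to exactly one further receive of the same message, and that SQS redrives the message on the receive that would push the count past the limit.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical model of the SQS redrive rule (not an AWS or Laravel API).
 *
 * Assumption: every time the worker releases the job, SQS hands the same
 * message out once more, which increments its receive count by one.
 *
 * @return array{handled: int, movedToDlq: bool}
 */
function simulateReleases(int $maxReceiveCount, int $receivesNeeded): array
{
    $handled = 0;

    for ($receiveCount = 1; $receiveCount <= $receivesNeeded; $receiveCount++) {
        if ($receiveCount > $maxReceiveCount) {
            // The count now exceeds the limit: SQS moves the message to the
            // DLQ, so the worker never sees this delivery.
            return ['handled' => $handled, 'movedToDlq' => true];
        }

        // The worker receives the message and runs handle().
        $handled++;
    }

    return ['handled' => $handled, 'movedToDlq' => false];
}

// A job that would need 10 attempts on a queue with Maximum receives = 4.
$result = simulateReleases(4, 10);

// prints "handled 4 time(s), moved to DLQ: yes"
printf(
    "handled %d time(s), moved to DLQ: %s\n",
    $result['handled'],
    $result['movedToDlq'] ? 'yes' : 'no'
);
```

Under these assumptions the worker sees the job exactly 4 times, which matches the 4 tries observed in the question.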
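Since this is an infrastructure-level setting, it overrides any Laravel parameters you might pass to the job, including retryUntil. Each release call makes the message available again, and every subsequent receive increments its receive count, so once the count exceeds 4, SQS moves the message to the DLQ. And because the message is simply removed from the queue, the job never actually fails on the Laravel side, so the failed method is not executed.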
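One way to catch this earlier is to compare the queue’s redrive policy with what the job expects. SQS exposes the policy as a JSON string in the RedrivePolicy queue attribute, with deadLetterTargetArn and maxReceiveCount keys. The sketch below only parses such a string. The helper name maxReceiveCountFromRedrivePolicy and the sample ARN are made up for illustration, and in a real application the string would come from the queue’s attributes (for example through the AWS SDK’s getQueueAttributes call) rather than from a literal.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: extract maxReceiveCount from a RedrivePolicy JSON string.
 * Returns null when no usable policy is present (for example, no DLQ configured).
 */
function maxReceiveCountFromRedrivePolicy(?string $redrivePolicyJson): ?int
{
    if ($redrivePolicyJson === null || $redrivePolicyJson === '') {
        return null;
    }

    $policy = json_decode($redrivePolicyJson, true);

    if (!is_array($policy) || !isset($policy['maxReceiveCount'])) {
        return null;
    }

    // The value may arrive as a number or a numeric string, so cast it.
    return (int) $policy['maxReceiveCount'];
}

// Sample attribute value; the ARN is a placeholder, not a real queue.
$samplePolicy = '{"deadLetterTargetArn":"arn:aws:sqs:us-east-1:123456789012:example-dlq","maxReceiveCount":4}';

$limit = maxReceiveCountFromRedrivePolicy($samplePolicy);

if ($limit !== null) {
    echo "Jobs on this queue get at most {$limit} receives before SQS moves them to the DLQ.\n";
}
```

If that number is smaller than the number of times the job may be released before retryUntil expires, the queue setting wins, as it did here.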