Send TSTP Signal or set -t timeout parameter when killing process

I am seeing issues with our Sidekiq process where batch jobs are stuck in a pending state. According to the Sidekiq wiki (Batches · mperham/sidekiq Wiki · GitHub), this can happen when Sidekiq is not shut down correctly:

- If you find that batches are stuck with Pending jobs, especially right around a deployment, verify you are gracefully restarting Sidekiq as designed: send TSTP as early as possible, TERM as late as possible, and never use kill -9.

- Seeing "positive pending" batches but can't find those pending jobs? They are likely in a super_fetch private queue. This can happen if your deploys are misconfigured and creating orphaned jobs. Check your -t shutdown timeout value (default: 25) and make sure your deploy tool is giving Sidekiq at least N+5 (i.e. 30) seconds before killing the process.
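To make the first point concrete, the graceful sequence looks roughly like this (a sketch only; the pidfile path is illustrative, and newer Sidekiq versions don't write a pidfile by default, so you may need to find the PID another way):

```shell
# Early in the deploy: quiet Sidekiq so it stops picking up new jobs.
kill -TSTP "$(cat /var/run/sidekiq.pid)"

# ...let in-flight jobs drain while the rest of the deploy proceeds...

# As late as possible: ask Sidekiq to shut down. After its -t timeout
# (default 25s) it pushes any still-running jobs back to Redis.
kill -TERM "$(cat /var/run/sidekiq.pid)"

# Then allow at least timeout + 5 seconds (30s with the default) before
# concluding the process is stuck. Never use kill -9 (SIGKILL): that is
# what orphans jobs in the first place.
```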

Hey kcoleman_hb!

When currently running containers are shut down, we use the Docker defaults to stop the container. This means we run a docker stop command, which sends a SIGTERM to PID 1, followed by a SIGKILL to all remaining processes after 10 seconds if the container still has not stopped.
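In other words, the stop step behaves like this (container name is illustrative):

```shell
# docker stop sends SIGTERM to PID 1, waits --time seconds (default 10),
# then sends SIGKILL if the container is still running.
docker stop --time 10 my-sidekiq-container
```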

If your jobs take more than 10 seconds to stop and re-queue, we've seen another client successfully use the before_release commands in .aptible.yml to quiet their Sidekiq workers ahead of the step in a deploy where any container would be asked to stop. This means you can take up to 30 minutes (the timeout on before_release) to be sure your jobs have stopped properly, and then continue with the deploy. The one downside of this approach is that you have to be careful to un-quiet your workers if your before_release commands fail after quieting. If your jobs are being orphaned or you need more reliable execution, you may also want to take a look at Sidekiq Pro's super_fetch.
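For reference, here's a minimal sketch of that before_release approach, assuming the sidekiq gem is in your Gemfile and your Redis configuration (e.g. REDIS_URL) is available to before_release containers; the drain loop is illustrative, not an Aptible or Sidekiq convention:

```yaml
before_release:
  # Quiet every running Sidekiq process so it stops picking up new jobs.
  - bundle exec ruby -e "require 'sidekiq/api'; Sidekiq::ProcessSet.new.each(&:quiet!)"
  # Wait for in-flight jobs to drain; before_release itself enforces
  # the 30-minute ceiling.
  - bundle exec ruby -e "require 'sidekiq/api'; sleep 10 while Sidekiq::Workers.new.size > 0"
```

Keep in mind the caveat above: if a later before_release command fails, your workers stay quiet until you un-quiet or restart them.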

— Michael