The problem#
When writing a shell script that starts background jobs, sometimes running those jobs past the lifetime of the script doesn't make sense. (Of course, sometimes background jobs really should keep going after the script completes, but that's not the case this post is concerned with.) If the background jobs are doing some computation relevant to the script, or the script can conceptually be thought of as a collection of processes, it makes sense for killing the script to also kill any background jobs it started.
The solution#
At the start of the script, add
cleanup() {
    # kill all processes whose parent is this process
    pkill -P $$
}

for sig in INT QUIT HUP TERM; do
    trap "
        cleanup
        trap - $sig EXIT
        kill -s $sig "'"$$"' "$sig"
done
trap cleanup EXIT
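With that in place, any background jobs the script starts will not outlive it. For example (the worker commands here are hypothetical placeholders):

#!/bin/sh
# ... cleanup() and trap setup from above ...

some_worker --input=a.dat &    # hypothetical background job
some_worker --input=b.dat &    # hypothetical background job
wait    # if the script is killed while waiting here, the traps fire
        # and both workers are killed as well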
If you really want to kill only jobs and not all child processes, use the kill_child_jobs() function from all.sh or look at the other versions in the kill-child-jobs repository.
The details#
Running cleanup on exit#
This StackExchange answer gives example code to run a cleanup function on exit, even if exiting due to being killed by a signal that would normally halt the script immediately (obviously except for SIGKILL):
for sig in INT QUIT HUP TERM ALRM USR1; do
    # On each signal: run cleanup, clear the traps so they cannot fire a
    # second time, then re-send the signal so the script still appears to
    # have been killed by it. Note the quoting: $sig is expanded when the
    # trap is set, while "$$" (single-quoted here) is expanded when it fires.
    trap "
        cleanup
        trap - $sig EXIT
        kill -s $sig "'"$$"' "$sig"
done
trap cleanup EXIT
With how to run the cleanup function solved, we just have to figure out what to put in it.
Which processes to kill?#
On closer inspection, I realized the problem was underspecified.
The solution I ended up using kills all immediate child processes of the script by killing all processes whose parent is the script:
pkill -P $$
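Note that pkill -P only signals immediate children: processes started indirectly (grandchildren) survive. For example (a hypothetical interactive session):

$ sh -c 'sleep 300 &' &    # the sleep becomes an orphaned grandchild
$ sleep 100 &              # an immediate child of this shell
$ pkill -P $$              # kills sleep 100, but not sleep 300
$ pgrep -a sleep           # shows sleep 300 still running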
Another option is killing all descendants using rkill:
rkill $$
Alternatively, the OS provides various process grouping mechanisms, and you can kill all processes in the same group. These have various complications, including that if you don't make sure your script is the root of its own process group (or session, etc.), then that group may include parent processes of the script, not just child processes, as the sketch below has to work around.
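Here is a minimal sketch of that approach (not the solution this post uses; it assumes the setsid utility from util-linux is available):

#!/bin/sh
# Re-exec this script as its own process-group leader so the group
# contains only this script and its descendants, never its parents.
# (Note this also detaches from the controlling terminal, which is one
# of the complications mentioned above.)
if [ "$(ps -o pgid= -p $$ | tr -d ' ')" != "$$" ]; then
    exec setsid sh "$0" "$@"
fi

cleanup() {
    trap '' TERM    # ignore the signal we are about to send ourselves
    kill -- -$$     # signal every process in this script's process group
}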
Lastly, as a variant of the first option, we can kill all child jobs using the shell's concept of "jobs", which are not quite the same as child processes.
Background jobs#
Unix shells have a feature called job control for working with sub-processes they create. Normally a background job is created by running a command with & at the end:
/path/to/cmd some_arg &
The jobs builtin can be used to inspect the active jobs:
$ sleep 100 &
[1] 1730089
$ jobs
[1]+ Running sleep 100 &
As using & is the only way for a shell script to have an immediate child process (other than a foreground process the shell has started and is waiting on), the set of child processes and the set of jobs are generally the same, except for foreground jobs. Those are generally short-running tasks, so it's probably not a big deal to let them complete. (Actually, bash includes foreground jobs in the list of jobs, although other shells do not.)
But complicating matters, bash and zsh have a builtin disown which allows a job to be removed from the list of jobs without killing it, and therefore the set of jobs can be different from the set of background child processes. (And, to further add to the confusion, ksh has a builtin named disown, but it has different semantics which do not include actually removing the job from the jobs list.) See this StackExchange answer for a more in-depth explanation.
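For example, in bash (a hypothetical session):

$ sleep 100 &
[1] 1730090
$ disown %1    # remove job 1 from the jobs table without killing it
$ jobs         # no output: the sleep is still running but is no longer a job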
This suggests another possible interpretation of killing all child jobs: perhaps what we want is to kill all jobs which have not been disowned. Unfortunately, this quickly runs into differences between shells.
Killing all jobs in all shells#
There is a short, straightforward way to kill all jobs that works in both bash and zsh:
while kill %% 2>/dev/null; do sleep 0; done
(The original without the sleep 0 works in zsh but makes bash hang.)
Unfortunately, this doesn't actually kill the jobs in dash (the default shell for scripts in Debian and Ubuntu) and, worse, hangs in BusyBox's ash (the default shell on many embedded platforms).
Another alternative that almost works is
kill $(jobs -p)
It works in bash, but dash has a bug that requires the workaround of writing the output of jobs -p to a file and reading that file back. With that workaround it works in every shell I tested except zsh, where the jobs builtin does not have a -p option, requiring a further workaround of parsing the output of jobs instead, which in turn doesn't work in bash or dash.
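For concreteness, the file-based workaround for dash might look something like this sketch (my own approximation; the actual version in the repository may differ):

kill_child_jobs() {
    jobs_file=$(mktemp)
    # Redirecting the builtin keeps it in the current shell, avoiding the
    # dash bug where $(jobs -p) sees an empty job table in its subshell.
    jobs -p > "$jobs_file"
    while read -r pid; do
        kill "$pid" 2>/dev/null
    done < "$jobs_file"
    rm -f "$jobs_file"
}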
Combining those two solutions gives all.sh, which works in all of the shells I tested, but is much longer than the solutions that each work in only a subset of the shells.
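Roughly, such a combined implementation has to branch on which shell it is running in, something like this sketch (my own approximation, not the repository's actual all.sh):

kill_child_jobs() {
    if [ -n "${ZSH_VERSION:-}" ]; then
        # zsh: its jobs builtin has no -p, but repeatedly killing the
        # most recent job (%%) until none remain works
        while kill %% 2>/dev/null; do sleep 0; done
    else
        # bash/dash/ash/ksh: use jobs -p, going through a file to work
        # around the dash bug described above
        jobs_file=$(mktemp)
        jobs -p > "$jobs_file"
        while read -r pid; do
            kill "$pid" 2>/dev/null
        done < "$jobs_file"
        rm -f "$jobs_file"
    fi
}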
Testing scripts#
As figuring all this out involved repeatedly running a bunch of scripts in a bunch of shells and checking their output, I wrote a script to do so, which outputs the following table:
Legend:

⌛ = script does not halt (after 1 second timeout)
X = disown unsupported by shell
☠ = all children killed
🏃 = all children still running
✔️ = expected result (job killed, disowned child alive)
| | bash | sh | ash | dash | zsh | ksh |
|---|---|---|---|---|---|---|
| all.sh | ✔️ | X☠ | X☠ | X☠ | ✔️ | ☠ |
| bash.sh | ✔️ | X🏃 | ⌛X☠ | X🏃 | ✔️ | ☠ |
| dash.sh | ✔️ | X☠ | X☠ | X☠ | 🏃 | ☠ |
| noop.sh | 🏃 | X🏃 | X🏃 | X🏃 | 🏃 | 🏃 |
| pkill-P.sh | ☠ | X☠ | X☠ | X☠ | ☠ | ☠ |
| zsh.sh | ⌛✔️ | X🏃 | ⌛X☠ | X🏃 | ✔️ | ☠ |
test-all.sh prints the table: it loops over the kill_child_jobs implementations and shells and runs test-kill-child-jobs.sh for each combination. That script uses the specified shell to run make-and-kill-child-jobs.sh, which starts two jobs and disowns one of them before calling the specified kill_child_jobs implementation. The jobs it starts are wait_for_pid_exit.sh, which is just a simple loop that constantly checks if the specified PID is dead (and therefore shows whether it outlived that process). test-kill-child-jobs.sh interprets the output of the script to determine which jobs outlived the script and prints the summary string to go into the table.
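For illustration, a loop doing what wait_for_pid_exit.sh is described as doing might look like this sketch (not the repository's exact script):

#!/bin/sh
# Busy-wait until the given PID no longer exists, then report outliving it.
pid="$1"
while kill -0 "$pid" 2>/dev/null; do
    : # kill -0 sends no signal; it only tests whether the process exists
done
echo "outlived $pid"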