Process Management

Introduction to processes
Basic process control with your shell
More advanced process control

Introduction to processes

A process is basically an invocation of a program. Whenever you type `ls', a new process is created (and destroyed when it's finished). The shell you typed `ls' in is a process. The xterm the shell is running in is a process. Your window manager is one or more processes. X is a process.

Your window manager (usually) provides lots of different ways to create processes (run programs), plus some ways of managing them (you can close or destroy the window the process is using, which often results in the process terminating as well). Your shell provides more: it also allows processes to be suspended, restarted, put "in the background", etc.

But sometimes this isn't enough, which is where tools such as ps, top, nice, kill, etc. come in.

Basic process control with your shell

Back in the "good old days" when everybody used character-based terminals, it was convenient to be able to "drop out" of a running program and return to the shell, run a few other commands, and return to the program you were using. Thus most shells recognise a "suspend" character (typically Ctrl-Z) which allows one to temporarily stop a running program and return to the shell. When one is ready to return to the program, one simply types `fg' (foreground) and the last suspended program is resumed.

Since one can have multiple programs suspended, it is useful to be able to get a list of them. That's where the `jobs' command comes in:

wharvey@bowman% jobs
[1]  + Suspended           vi Mmakefile
[2]    Suspended           info make
[3]  - Exit 1              cvs diff -u |&
       Suspended           less
[4]    Suspended           info texinfo
[5]    Suspended           vi ~/TODO
[8]    Suspended           info -f /usr/info/gcc
[9]    Suspended           info -f ~/mercury/mercury/info/mercury

The first column gives the "job number". The `+' in the second column marks the most recently suspended job, while the `-' marks the next most recently suspended job. The third column gives the process status, while the fourth lists the command executed to create the job. Note that job three consists of two processes connected by a pipe, and the status of each is given independently.

Of course, a list of jobs isn't much good unless you can do more than just resume the last one, and indeed you can. `%n' resumes job number `n'. `%' and `%%' are synonyms for `fg' and resume the "current" job. `%-' resumes the "previous" job. You can also use `fg %n' (and `fg %', `fg %-', etc.) if you like.

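If you have no more use for a job and wish to terminate it, you can use the `kill' command: `kill %n' kills job `n'. This is similar to typing `Ctrl-C' when the program is in the foreground, except that `kill' sends the `TERM' signal rather than the `INT' signal that `Ctrl-C' generates, so it can work on a program that ignores `Ctrl-C'. (Such a program will usually still respond to `Ctrl-Z', which gets you back to the shell so you can kill it.)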

You can also tell the shell to make a process continue to run "in the background" --- that is, if a program does not need to interact with the user, it can be "detached" from the terminal and continue to run while you do other things (though if it produces lots of output and you haven't redirected it elsewhere, it still gets printed on the screen, which can make doing other things with that shell awkward). To make job `n' run in the background, type `bg %n'.

Note that if you wish to have the program run in the background from the beginning, you can just put an `&' at the end of the command and the shell will start it in the background for you. Still, I often forget, and it's convenient to be able to just hit `Ctrl-Z' and type `bg' to fix it.
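As a small sketch of the same idea in a script rather than at an interactive prompt, the following starts a command in the background with `&', carries on, and then collects the result. The `slow_task' function is a made-up stand-in for a long-running program, and the use of `$!' and `wait' is my addition rather than something covered above.

```shell
#!/bin/sh
# Hypothetical stand-in for a long-running program that needs no user input.
slow_task() {
    sleep 1
    echo "slow task finished"
}

out=$(mktemp)            # redirect output so it doesn't clutter the terminal
slow_task > "$out" &     # the trailing & starts the job in the background
bg_pid=$!                # $! holds the process ID of the last background job

echo "the shell is free to do other things while $bg_pid runs"

wait "$bg_pid"           # block until the background job has finished
result=$(cat "$out")
rm -f "$out"
echo "$result"
```

Redirecting the output to a file is what keeps the background job from scribbling over whatever else you are doing in that shell.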

One kind of program you'll often want to run in the background is an X program (assuming you don't start it directly from the window manager). For instance, I have an alias `xvi' which runs `xterm' with appropriate options so that it opens a new window and starts editing a file. If this `xterm' were simply run in the background (with `&'), each of these edit sessions would appear in my jobs list (as `Running'). I often start a dozen or so of these edit sessions from the one shell, and since I typically don't want to suspend or resume them (after all, they're "running" in a separate window), I'd prefer it if they didn't clutter up the job list. The trick here is to start such programs in a "sub shell", e.g. `(netscape &)' or `(exmh &)'.
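The difference is easy to see for yourself. This sketch uses `sleep' as a stand-in for an X program and writes the output of `jobs' to a temporary file so it can be inspected afterwards; the variable names are only illustrative, and the exact format of the job listing varies between shells.

```shell
#!/bin/sh
tmp=$(mktemp)

# Started inside a sub shell: the sub shell exits at once, so the program
# never becomes a job of the current shell.
(sleep 2 &)
jobs > "$tmp"
after_subshell=$(cat "$tmp")

# Started directly with &: the current shell tracks it as a job.
sleep 2 &
direct_pid=$!
jobs > "$tmp"
after_direct=$(cat "$tmp")

echo "jobs after '(sleep 2 &)': [$after_subshell]"
echo "jobs after 'sleep 2 &':   [$after_direct]"

kill "$direct_pid"       # tidy up the job we still control
rm -f "$tmp"
```

The price of the trick is that the shell can no longer manage the program for you: there is no job number to give to `fg', `bg' or `kill'.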

More advanced process control

Sometimes the process management facilities of the shell are not sufficient; for example, when the process you want to manage is not a job created by a shell, or the shell which created the job is no longer around or not accessible. This is where tools such as `ps' and `top' start becoming really useful.


`ps' is like `jobs', except that it can list more detailed information, and can give information about all processes running on the machine, not just those jobs being managed by the shell. `ps' has many options, and I encourage you to read the man page to find out about some of them. Here are the ones I use the most:

l       "long" format
u       "user" format
a       include processes owned by all users, not just your own
x       include processes without a controlling terminal

An example extract of the "long" format:

   100   776   325     1   0   0   1500     0 wait4       SW   1  0:00 (login)
    40   776  8386     1   0   0    928   168 wait_for_co S   p5  5:37 /home/wha
     0   776 15257     1   0   0 112680 17428 do_select   S   p2 28:02 /usr/lib/
     0   776 18531   325   0   0   1600     0 read_chan   SW   1  0:00 (tcsh)
   100   776   433   419   0   0   1704     0 read_chan   SW  p2  0:00 (tcsh)
   100   776   436   420  13   0   1704   580 sigsuspend  S   p5  0:01 (tcsh)
   100   776   435   424   0   0   1932     0 do_select   SW  p4  0:06 (ssh)

The same processes using the "user" format:

wharvey    325  0.0  0.0  1500     0   1 SW  Jul  6   0:00 (login)
wharvey    433  0.0  0.0  1704     0  p2 SW  Jul  6   0:00 (tcsh)
wharvey    435  0.0  0.0  1932     0  p4 SW  Jul  6   0:06 (ssh)
wharvey    436  0.0  0.1  1704   580  p5 S   Jul  6   0:01 (tcsh)
wharvey   8386  0.1  0.0   928   168  p5 S   Jul 30   5:37 /home/wharvey/goofey
wharvey  15257  0.0  4.5 112680 17448  p2 S   Jul  9  28:02 /usr/lib/netscape/net
wharvey  18531  0.0  0.0  1600     0   1 SW  Jul 12   0:00 (tcsh)

If you find your machine thrashing, often you can find out why by running a `ps aux' or `ps alx' and looking at the `RSS' fields to see which processes are consuming lots of memory. `ps' can also be extremely useful for diagnosing problems with system services: e.g. it could be that you're having trouble with NFS because the `rpc.mountd' daemon isn't running.
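Rather than scanning the `RSS' column by eye, you can let `sort' do the work. This is only a sketch: it assumes a `ps' that accepts the POSIX-style `-A' and `-o' options (as the procps version on Linux does), and `top_rss' is a name I made up.

```shell
#!/bin/sh
# Print the N processes with the largest resident set size (RSS).
# Assumes a ps that understands -A (all processes) and -o (choose columns).
top_rss() {
    # The trailing "=" on each column name suppresses the header line,
    # so that sort sees nothing but data.
    ps -A -o rss= -o pid= -o comm= | sort -rn | head -n "${1:-5}"
}

top_rss 5
```

Each output line gives the RSS, the process ID and the command name, largest first.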


`top' is useful for monitoring those processes consuming the most CPU at any given time, as well as various other system statistics:

11:27pm  up 27 days,  8:05, 47 users,  load average: 0.00, 0.00, 0.00
186 processes: 158 sleeping, 2 running, 0 zombie, 26 stopped
CPU states:  0.3% user,  5.1% system,  0.0% nice, 94.6% idle
Mem:  387688K av, 290820K used,  96868K free,  41808K shrd, 120544K buff
Swap: 521676K av, 267072K used, 254604K free                 69656K cached

19002 wharvey   12   0   776  776   564 R       0  4.3  0.2   0:00 top
  335 root       7   0 24688  15M  1288 R       0  1.1  3.9 611:30 X
    1 root       0   0   108   68    52 S       0  0.0  0.0   0:04 init
    2 root       0   0     0    0     0 SW      0  0.0  0.0   0:16 kflushd
    3 root       0   0     0    0     0 SW      0  0.0  0.0  30:07 kswapd
   65 root       0   0    84   48    36 S       0  0.0  0.0   0:03 kerneld
  188 bin        0   0    80    0     0 SW      0  0.0  0.0   0:00 portmap
  192 root       0   0     0    0     0 SW      0  0.0  0.0   0:05 rpciod


The `proc' filesystem can be a great source of information about processes (as well as many other system resources). All of the information available to `ps', `top', etc. can also be found here --- in its raw form. There's one subdirectory per process under `/proc', plus a whole bunch of other files and directories containing system-wide information.

wharvey@bowman% ls -Flag /proc/8386
total 0
dr-xr-xr-x   3 wharvey  staff   0 Aug  3 00:24 ./
dr-xr-xr-x 199 root     root    0 Jul  6 15:21 ../
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 cmdline
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 cpu
lrwx------   1 wharvey  staff   0 Aug  3 00:24 cwd -> /home/wharvey/src/
-r--------   1 wharvey  staff   0 Aug  3 00:24 environ
lrwx------   1 wharvey  staff   0 Aug  3 00:24 exe -> /home/wharvey/goofey*
dr-x------   2 wharvey  staff   0 Aug  3 00:24 fd/
pr--r--r--   1 wharvey  staff   0 Aug  3 00:24 maps|
-rw-------   1 wharvey  staff   0 Aug  3 00:24 mem
lrwx------   1 wharvey  staff   0 Aug  3 00:24 root -> //
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 stat
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 statm
-r--r--r--   1 wharvey  staff   0 Aug  3 00:24 status
wharvey@bowman% cat /proc/partitions
major minor  #blocks  name

   3     0    6353235 hda
   3     1    2048256 hda1
   3     2          1 hda2
   3     5    4160803 hda5
   3     6     128488 hda6
wharvey@bowman% ls -Flag /proc/net
total 0
dr-xr-xr-x   3 root     root    0 Aug  3 00:34 ./
dr-xr-xr-x 199 root     root    0 Jul  6 15:21 ../
-r--r--r--   1 root     root    0 Aug  3 00:34 arp
-r--r--r--   1 root     root    0 Aug  3 00:34 dev
-r--r--r--   1 root     root    0 Aug  3 00:34 dev_mcast
-r--r--r--   1 root     root    0 Aug  3 00:34 dev_stat
-r--r--r--   1 root     root    0 Aug  3 00:34 netlink
-r--r--r--   1 root     root    0 Aug  3 00:34 netstat
-r--r--r--   1 root     root    0 Aug  3 00:34 raw
-r--r--r--   1 root     root    0 Aug  3 00:34 route
dr-xr-xr-x   2 root     root    0 Aug  3 00:34 rpc/
-r--r--r--   1 root     root    0 Aug  3 00:34 rt_cache
-r--r--r--   1 root     root    0 Aug  3 00:34 snmp
-r--r--r--   1 root     root    0 Aug  3 00:34 sockstat
-r--r--r--   1 root     root    0 Aug  3 00:34 tcp
-r--r--r--   1 root     root    0 Aug  3 00:34 udp
-r--r--r--   1 root     root    0 Aug  3 00:34 unix
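
The per-process files listed above can be read like any other file. Here is a Linux-specific sketch that pulls a few details out of `/proc' for the current shell. The helper name `proc_field' is made up, and the field names (`Name', `State', `PPid', `VmRSS') are those found in `status' on current Linux kernels; other systems may differ or have no `/proc' at all.

```shell
#!/bin/sh
# Linux-specific sketch: read a few details of a process straight from /proc.

# Print the value of one "Key: value" line from /proc/PID/status.
proc_field() {   # usage: proc_field KEY PID
    sed -n "s/^$1:[[:space:]]*//p" "/proc/$2/status"
}

pid=$$   # inspect the current shell; any PID you are allowed to read will do

echo "name:   $(proc_field Name "$pid")"
echo "state:  $(proc_field State "$pid")"
echo "parent: $(proc_field PPid "$pid")"
echo "rss:    $(proc_field VmRSS "$pid")"

# cmdline separates the arguments with NUL bytes; turn them into spaces.
echo "cmdline: $(tr '\000' ' ' < "/proc/$pid/cmdline")"

# cwd is a symbolic link to the process's working directory.
ls -l "/proc/$pid/cwd"
```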


`kill' does more than kill processes or jobs. It is actually a generic tool for sending signals to processes. It's just that the default signal happens to be `TERM' (terminate)...

There are many signals one can send to a process. `INT', `QUIT', `TERM' and `KILL' are all different ways of terminating a process, with slightly different semantics. `HUP' also terminates a process by default, but many processes trap this signal and perform some special operation instead (e.g. re-read configuration files). `STOP' and `CONT' can be used to temporarily suspend and then resume a process (like `Ctrl-Z' and `fg'/`bg' in a shell).

More information about the various signals can be found in the man pages: `man 7 signal'.
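
To try some of these out safely, you can practise on a `sleep' process. The sketch below uses the portable `kill -s NAME' form; `kill -l' and signal `0' (which only checks that the process exists) are additions of mine that are not discussed above. The exit status reported at the end depends on the shell, though it is normally greater than 128 when a process is killed by a signal.

```shell
#!/bin/sh
# List the signal names this shell's kill knows about.
kill -l

# Start a harmless background process to practise on.
sleep 30 &
pid=$!

kill -s STOP "$pid"      # suspend it, much like Ctrl-Z
kill -s CONT "$pid"      # let it carry on, much like bg

# Signal 0 sends nothing; it only reports whether the process exists.
kill -s 0 "$pid" && echo "process $pid is still alive"

kill -s TERM "$pid"      # the same as a plain "kill $pid"
wait "$pid"              # collect the exit status of the dead process
status=$?
echo "exit status after TERM: $status"
```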


Sometimes you want processes to run with different priorities. For example, suppose you have a CPU-intensive program which is going to take several hours to complete its task, and you (or others) will be trying to get other work done in the meantime. Unless the long-running job is extremely urgent, it is better to let interactive jobs have a higher priority. The process scheduler will do this automatically to a limited extent, but a better response can be achieved if you designate the long-running jobs as having a lower priority. This can be done using the `nice' command when starting the job. E.g. `nice make' will execute `make' with a lower priority.

You can specify how nice to make the job with an appropriate command-line argument. Note that many shells implement `nice' as a built-in, and that the syntax can differ from the system-supplied executable. E.g. `nice -10' means opposite things under `tcsh' and `/bin/nice': `/bin/nice' lowers the job's priority by 10, while the `tcsh' built-in tries to *raise* it (which only works if you have root privileges).
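
One way to sidestep the ambiguity is the `-n' form of the system `nice'. The sketch below assumes a Bourne-style shell (where `nice' is not a built-in) and the GNU version of `nice', which prints the current niceness when run without a command; other versions may not do that.

```shell
#!/bin/sh
# Assumes GNU coreutils' nice, which prints the current niceness when it
# is run without a command.
base=$(nice)                 # niceness of this shell, normally 0
lowered=$(nice -n 5 nice)    # run "nice" itself 5 steps nicer and ask it

echo "current niceness:   $base"
echo "under 'nice -n 5':  $lowered"
```

A larger niceness means a lower priority, and the value cannot go above 19 on Linux.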

You may wish to modify a job's priority after it starts, if you forgot to `nice' it, or if you didn't realise how long it was going to run. You can use the `renice' command for this, but you need to know the process ID (`ps' and `top' are your friends). If the job contains multiple processes, you will have to renice them all individually (child processes normally inherit their parent's priority, but if the parent's priority is modified later, this does not affect the child).
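
Here is a Linux-specific sketch of `renice' in action, again using `sleep' as the victim. It reads the nice value back from field 19 of `/proc/PID/stat', and `niceness' is a made-up helper name. I chose 19 as the value because the result is then the same whether your `renice' treats the number after `-n' as an absolute value or as an increment.

```shell
#!/bin/sh
# Linux-specific sketch: lower the priority of a job that is already running.
sleep 30 &
pid=$!

# Field 19 of /proc/PID/stat is the nice value. Splitting on spaces is
# safe here only because the command name contains no spaces.
niceness() { cut -d ' ' -f 19 "/proc/$1/stat"; }

before=$(niceness "$pid")
renice -n 19 -p "$pid" > /dev/null   # ends at 19, the lowest priority
after=$(niceness "$pid")
echo "niceness of $pid: $before -> $after"

kill "$pid"                          # tidy up
```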

Note that once a process's priority is lowered, it cannot be raised again, except by the super user.


Some other tools useful for monitoring one's system are `uptime' and `xload'.