Processes in an Uninterruptible Sleep (D) State

  • Document ID: 7002725
  • Creation Date: 21-Feb-2009
  • Modified Date: 08-Nov-2012

Environment

SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Server 10
SUSE Linux Enterprise Server 9
Novell Open Enterprise Server 11 (OES 11) Linux
Novell Open Enterprise Server 2 (OES 2)
Novell Open Enterprise Server 1 (OES 1)

Situation

Slow application response time. Slow server performance. Processes stuck in a "D" state.

                                            vv
#==[ Checking Health of Processes ]=================#
# egrep " D| Z" /var/log/nts_x64-7_090210_1117/psout.vNO31896
root       707 23884  0.0  0.0      0     0 Z    00:00:00 [ncs-resourced.p]<defunct>
root      3816  2949  0.0  0.0   6012   656 D    00:02:52 hald-addon-storage
root     23830     1  0.0  0.0      0     0 D    00:01:03 [MPK Thread]
root     23831     1  0.0  0.0      0     0 D    00:00:09 [MPK Thread]
root     23839     1  0.0  0.0      0     0 D    00:00:07 [MPK Thread]
root     23844     1  0.0  0.0      0     0 D    00:00:00 [MPK Thread]
root     23884     1  0.0  0.1  32348  4240 D    00:00:00 /usr/bin/python /opt/novell/ncs/bin/ncs-resourced.py /etc/opt/novell/ncs
                                            ^^

Resolution

Processes in a "D" (uninterruptible sleep) state are usually waiting on I/O. The ps command shows a "D" in the state column for these processes. The vmstat command reports them as blocked, in the "b" column under "procs". The two commands sample the system at different moments, so their counts of "D" state processes will not always agree; a small difference is no cause for concern.

You cannot kill "D" state processes, even with SIGKILL (kill -9). As the name implies, they are uninterruptible. They clear only when the I/O they are waiting on completes or when the server is rebooted. It is normal to see processes in a "D" state while the server performs I/O-intensive operations.
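To compare the two views described above, the "D" state processes reported by ps can be counted and set against the "b" column of vmstat. The following is a minimal sketch, not a supported tool; the function name count_d_state is hypothetical and it assumes a procps-style ps that accepts -eo stat=.

```shell
#!/bin/sh
# Count processes whose state code begins with "D" (uninterruptible sleep).
# Reads one state code per line on stdin, as printed by `ps -eo stat=`.
# count_d_state is an illustrative name, not part of any package.
count_d_state() {
    awk '$1 ~ /^D/ { n++ } END { print n + 0 }'
}

# Usage:
#   ps -eo stat= | count_d_state    # number of D state processes right now
#   vmstat 1 5                      # compare with the "b" column under "procs"
```

A count that stays high over several samples is more significant than a single non-zero reading.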

If performance becomes an issue, you may need to check the health of your disks. Make sure your firmware and kernel disk drivers are up to date.

In the example above, the "io" columns of vmstat show heavy disk activity and the server is swapping to disk. The example therefore more likely represents a memory issue than a disk I/O issue.

There are two ways to find out more about the processes in a "D" state.

1. ps -eo ppid,pid,user,stat,pcpu,comm,wchan:32
This lists all processes. The last column (wchan) shows the name of the kernel function in which a sleeping process is waiting, or a '-' if the process is running. The list also includes processes in interruptible sleep; processes in uninterruptible sleep are the ones that show a "D" in the fourth column (stat).
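Because the ps listing above includes every process, it can help to keep only the header and the rows whose fourth column starts with "D". The following is a minimal sketch under that assumption; the function name filter_d_state is hypothetical.

```shell
#!/bin/sh
# Keep the header line plus every row whose fourth field (stat) starts
# with "D". Expects the column order ppid,pid,user,stat,... on stdin.
# filter_d_state is an illustrative name, not part of any package.
filter_d_state() {
    awk 'NR == 1 || $4 ~ /^D/'
}

# Usage:
#   ps -eo ppid,pid,user,stat,pcpu,comm,wchan:32 | filter_d_state
```

The filter depends on stat being the fourth column, so it needs adjusting if the -o field list is changed.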

2. echo w > /proc/sysrq-trigger
This command writes a report to /var/log/messages that lists all processes in a "D" state, with a full kernel stack trace for each. It provides much more information than the first option.
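The second option can be wrapped in a small helper that checks the trigger file is writable before using it. This is only a sketch: the function name dump_blocked_tasks is hypothetical, and the optional path argument exists solely so the helper can be tried without root. On a real server the default /proc/sysrq-trigger applies.

```shell
#!/bin/sh
# Ask the kernel to log all blocked (D state) tasks via SysRq "w".
# dump_blocked_tasks is an illustrative name; the optional argument
# overrides the trigger path for dry runs only.
dump_blocked_tasks() {
    trigger=${1:-/proc/sysrq-trigger}
    if [ ! -w "$trigger" ]; then
        echo "cannot write to $trigger (are you root?)" >&2
        return 1
    fi
    echo w > "$trigger"
}

# Usage (as root):
#   dump_blocked_tasks && tail -n 100 /var/log/messages
```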