Environment
Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 3
SUSE Linux Enterprise Server 10 Service Pack 3
SUSE Linux Enterprise Server 10 Service Pack 4
Situation
/var/log/messages contains entries such as this:
kernel: scsi(0:0:1) UNDERRUN status detected 0x15-0x800. resid=0x5c fw_resid=0x0 cdb=0xa3 os_underflow=0x0
These messages repeat and are causing concern. They seem to have started after applying the current patches.
Resolution
There are two types of UNDERRUN messages you will typically see. In the first, the sender and receiver agree on the residual amount, and the messages are perfectly valid:

Sep 29 23:32:40 hkda2ls0005 kernel: scsi(2:0:5) UNDERRUN status detected 0x15-0x800. resid=0x46 fw_resid=0x46 cdb=0x12 os_underflow=0x0
Sep 29 23:32:40 hkda2ls0005 kernel: scsi(2:0:5) UNDERRUN status detected 0x15-0x800. resid=0x2e fw_resid=0x2e cdb=0x12 os_underflow=0x0
Sep 29 23:32:40 hkda2ls0005 kernel: scsi(2:0:57) UNDERRUN status detected 0x15-0x800. resid=0x58 fw_resid=0x58 cdb=0x25 os_underflow=0x0
Sep 29 23:32:40 hkda2ls0005 kernel: scsi(2:0:57) UNDERRUN status detected 0x15-0x800. resid=0x56 fw_resid=0x56 cdb=0x12 os_underflow=0x0

In the second, the residuals do not match, and the command is returned with a 'failed, yet retry-able' status:

Sep 9 11:35:44 hkda2ls0005 kernel: scsi(3:0:93) UNDERRUN status detected 0x15-0x0. resid=0x0 fw_resid=0x73000 cdb=0x28 os_underflow=0x80000
Sep 9 11:35:44 hkda2ls0005 kernel: scsi(3:0:0:93) Dropped frame(s) detected (73000 of 80000 bytes)...retrying command.
Sep 9 11:35:44 hkda2ls0005 kernel: sd 3:0:0:93: SCSI error: return code = 0x00070000
Sep 9 11:35:44 hkda2ls0005 kernel: end_request: I/O error, dev sdci, sector 15659008
Sep 9 11:35:44 hkda2ls0005 kernel: device-mapper: multipath: Failing path 69:96.

Messages of the second type can happen for a variety of reasons, mostly related to hardware or cabling problems.
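To get a quick overview of which kind of UNDERRUN message dominates a log, the two cases described above could be separated with a small script. The following Python sketch is only a triage aid, not a vendor tool. It assumes the message layout shown in the examples in this document, and it assumes that "the residuals agree" can be approximated by comparing the resid and fw_resid fields. The function names (parse_underrun, classify, summarize) are hypothetical and chosen for illustration.

```python
import re
from collections import Counter

# Pattern based only on the message layout quoted in this document;
# other driver versions may format the line differently (assumption).
UNDERRUN_RE = re.compile(
    r"scsi\((?P<addr>[\d:]+)\) UNDERRUN status detected "
    r"(?P<comp>0x[0-9a-fA-F]+)-(?P<scsi>0x[0-9a-fA-F]+)\.\s+"
    r"resid=(?P<resid>0x[0-9a-fA-F]+) "
    r"fw_resid=(?P<fw_resid>0x[0-9a-fA-F]+) "
    r"cdb=(?P<cdb>0x[0-9a-fA-F]+) "
    r"os_underflow=(?P<os_underflow>0x[0-9a-fA-F]+)"
)

HEX_FIELDS = ("comp", "scsi", "resid", "fw_resid", "cdb", "os_underflow")


def parse_underrun(line):
    """Return the fields of an UNDERRUN log line as a dict, or None."""
    match = UNDERRUN_RE.search(line)
    if match is None:
        return None
    fields = {"addr": match.group("addr")}
    for name in HEX_FIELDS:
        fields[name] = int(match.group(name), 16)
    return fields


def classify(fields):
    """Heuristic (assumption): equal resid and fw_resid means the
    residuals agree; anything else is reported as a mismatch."""
    if fields["resid"] == fields["fw_resid"]:
        return "match"
    return "mismatch"


def summarize(lines):
    """Count UNDERRUN messages per (SCSI address, classification)."""
    counts = Counter()
    for line in lines:
        fields = parse_underrun(line)
        if fields is not None:
            counts[(fields["addr"], classify(fields))] += 1
    return counts
```

As a usage example, the log could be passed in line by line, for instance with summarize(open("/var/log/messages")), and the resulting counts printed per SCSI address. Addresses that accumulate "mismatch" entries, especially together with "Dropped frame(s) detected" or multipath path failures, would be the first candidates for a hardware or cabling check.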