Large file copy slow to iSCSI SAN.

  • 3699166
  • 28-Sep-2006
  • 27-Apr-2012

Environment

Novell NetWare 6.5 Support Pack 5
Novell Open Enterprise Server (OES) Support Pack 1 NetWare
Cisco 3560 gigabit switch
ISCSIHAM.HAM NetWare iSCSI HAM Driver - Version 1.05.00 December 15, 2005

Situation

The NetWare iSCSI initiator was connected to a third-party iSCSI SAN target.
Disk I/O performance seemed to be sluggish.
Copying a large file to the target device appeared to hang the server.
The volume behaved as if it were no longer accessible: users could not map drives or access the volume.
Current Disk Requests in Monitor showed over 1000.

Resolution

Applied new iSCSI code to address the problem with DeviceBlockSize and CHAP.
This requires ISCSI.HAM Version 1.05.03 (July 26, 2006) or newer.

Additional Information

Troubleshooting Steps
1. Obtained a core dump during the apparent hang condition. Pending I/Os were at 1000. Used the command NSS /ZLSSIOSTATUS (output shown below) to verify the state of I/O at the NSS layer.
2. Gathered the iSCSI report. At the server console, type ISCSI REPORT; this generates the log file SYS:\ISCSI.TXT.
3. Gathered a LAN trace between the server and the iSCSI SAN.
4. Applied updated Winsock code from the NW65SP5UPD1.EXE patch.
NOTES
ISCSI.TXT showed the following:
0x000010FB="[!] initiator_get_connection no connection available hacb=0xB2E31B
This error means that no connection was available at the time the HACB was sent down, so the driver queues the request and returns this error. It is not a critical error; it generally means the connection is busy with write requests.
[DeviceHandle=0x00000000]
VenderID="CYBERNET"
ProductID="iSAN Vault "
RevisionLevel="0214"
DeviceAddress=0x026500E0
DeviceLanHandle=33554434
DeviceType=0x0
DeviceLun=0
DeviceSCSIID=2
MaximumFragmentSize=8192
MaximumNumberOfFragments=64
MaximumTransferSize=8192
DeviceRequests=60963
DeviceRequestsQueued=30181
DeviceRequestsAborted=0
DeviceUnitSize=512
DevicePreferredUnitSize=512
DeviceBlockSize=16
DeviceCapacity=-100663296
TotalCapacitySize=""
After the updated iSCSI code was applied, the report showed DeviceBlockSize=128. The update addresses a CHAP authentication defect in which this value was negotiated down to 16, causing iSCSI to send small blocks of data and making transfers much less efficient.
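The check described above can be sketched as a small script run on a workstation against a copy of SYS:\ISCSI.TXT. This is only an illustration: the function names are hypothetical, the key=value layout is assumed from the excerpt above, and using 128 as the expected DeviceBlockSize is an assumption based on the value seen after the fix.

```python
def parse_iscsi_report(text):
    """Collect key=value lines from an ISCSI.TXT-style report into a dict.

    Assumes the layout shown in the excerpt: one key=value pair per line,
    with bracketed lines such as [DeviceHandle=...] acting as headers.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("[") or "=" not in line:
            continue  # skip section headers and free-form text
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip().strip('"')
    return fields


def summarize(fields, expected_block_size=128):
    """Flag a low DeviceBlockSize and compute the share of queued requests.

    expected_block_size=128 is an assumption taken from the post-fix report.
    """
    block_size = int(fields["DeviceBlockSize"])
    requests = int(fields["DeviceRequests"])
    queued = int(fields["DeviceRequestsQueued"])
    return {
        "block_size": block_size,
        "block_size_low": block_size < expected_block_size,
        "queued_ratio": queued / requests if requests else 0.0,
    }


# Values copied from the report excerpt in this document.
SAMPLE = """\
[DeviceHandle=0x00000000]
DeviceRequests=60963
DeviceRequestsQueued=30181
DeviceRequestsAborted=0
DeviceBlockSize=16
"""

print(summarize(parse_iscsi_report(SAMPLE)))
```

With the values from this report, roughly half of the device requests were queued while DeviceBlockSize was 16.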
NSS /ZLSSIOSTATUS
Async IO Information
Write count queue level = 1000
Pending Write IOs on queue = 49169 NOTE: Number of requests held up in NSS
Current Outstanding Write IOs = 1000
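The NSS /ZLSSIOSTATUS output above can be checked in the same way once it has been captured to a text file. The sketch below is hypothetical: the function names are invented for illustration, and treating "Write count queue level" as the ceiling for outstanding writes is an assumption drawn from the two values both reading 1000 during the hang.

```python
import re


def parse_io_status(text):
    """Extract 'name = number' pairs from captured NSS /ZLSSIOSTATUS output.

    Any trailing annotation after the number (such as a NOTE) is ignored.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)\s*=\s*(\d+)", line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


def writes_saturated(stats):
    """Return True when outstanding writes have reached the queue level.

    Assumption: the queue level acts as a limit on outstanding write I/Os.
    """
    level = stats["Write count queue level"]
    outstanding = stats["Current Outstanding Write IOs"]
    return outstanding >= level


# Values copied from the output shown in this document.
SAMPLE_STATUS = """\
Async IO Information
Write count queue level = 1000
Pending Write IOs on queue = 49169 NOTE: Number of requests held up in NSS
Current Outstanding Write IOs = 1000
"""

stats = parse_io_status(SAMPLE_STATUS)
print(stats, writes_saturated(stats))
```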