SAN-attached server takes a long time to boot after a move or physical modification

  • 7014073
  • 09-Nov-2013
  • 14-Nov-2013

Environment

Novell Open Enterprise Server 2 (OES 2) Linux
Novell Cluster Services

Situation

After the server(s) were moved to a new location and all of the cables were seated, the servers took much longer to start up and did not join the cluster. In addition, SAN storage usually could not be seen or accessed.

Errors observed in /var/log/boot.msg include:
<4>913 [RAIDarray.mpp]Waiting for 600 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 570 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 540 seconds to discover LUNS on Current Owning Path
<4>855 [RAIDarray.mpp]mppSys_ValidatePath : symbol_get did not return a function pointer
<4>855 [RAIDarray.mpp]mppSys_ValidatePath : symbol_get did not return a function pointer
...
<4>913 [RAIDarray.mpp]Waiting for 180 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 150 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 120 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 90 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 60 seconds to discover LUNS on Current Owning Path
<4>913 [RAIDarray.mpp]Waiting for 30 seconds to discover LUNS on Current Owning Path

Errors observed in /var/log/messages include:
kernel: 94 [RAIDarray.mpp]INFOX_DS43A:1:1:0 Selection Retry count exhausted
kernel: 495 [RAIDarray.mpp]INFOX_DS43A:1:1:0 Cmnd failed-retry on a new path. vcmnd SN 460396 pdev H2:C0:T0:L0 0x00/0x00/0x00 0x00010000 mpp_status:6
kernel: 494 [RAIDarray.mpp]INFOX_DS43A:1:0:0 Cmnd-failed try alt ctrl 0. vcmnd SN 460410 pdev H1:C0:T0:L0 0x05/0x94/0x01 0x08000002 mpp_status:1
kernel: 494 [RAIDarray.mpp]INFOX_DS43A:1:0:0 Cmnd-failed try alt ctrl 0. vcmnd SN 460408 pdev H1:C0:T0:L0 0x05/0x94/0x01 0x08000002 mpp_status:1
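
To check quickly whether a server shows this symptom, the boot log can be scanned for the LUN discovery countdown messages quoted above. The following is a minimal Python sketch, not a supported tool. The function names are hypothetical, and the pattern is assumed from the sample messages in this document, so it may need adjusting for other driver versions.

```python
import re

# Pattern based on the countdown message quoted in this document.
# The exact wording is an assumption taken from the sample log lines.
WAIT_RE = re.compile(
    r"\[RAIDarray\.mpp\]Waiting for (\d+) seconds to discover LUNS "
    r"on Current Owning Path"
)


def discovery_wait_seconds(log_text):
    """Return the countdown values (in seconds) in the order they appear."""
    return [int(m.group(1)) for m in WAIT_RE.finditer(log_text)]


def looks_like_path_problem(log_text):
    """Return True if the log contains at least one countdown message."""
    return len(discovery_wait_seconds(log_text)) > 0
```

On an affected server, passing the contents of /var/log/boot.msg to `discovery_wait_seconds` would be expected to return a descending list of values such as those in the excerpt above. An empty list suggests the multipath driver did not have to wait for the owning path.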


Resolution

Verify that the fiber cables from the server to the SAN are good and are plugged into the correct ports on the SAN controller(s).

Additional Information

In one case, both fiber cables from a server were plugged into SAN controller "A", while the server was configured to prefer SAN controller "B". As a result, the server repeatedly tried to use the preferred path during boot before finally giving up. After the cables were plugged into two different SAN controllers, the preferred path was available and the boot time decreased dramatically.