Environment
Novell Cluster Services 1.8.4
Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 2
Situation
The January 2011 Scheduled Maintenance for OES2 was applied to a three-node cluster. Two of the nodes restarted all services successfully, but the third node failed to load NSS. As a result, the Cluster Volume Broker (cvb) module failed to load, which caused any resource that tried to load on the node to go comatose. The following messages were seen in /var/log/messages:
Mar 3 11:09:26 slnxmad02 kernel: NSSLOG ==> [Error] comnPool.c[402]
Mar 3 11:09:26 slnxmad02 kernel: Mar 3, 2011 11:09:26 am NSS<COMN>-4.12a-689:
Mar 3 11:09:26 slnxmad02 kernel: Could not change pool PNAME04 to the ACTIVE state.
Mar 3 11:09:26 slnxmad02 kernel: Status=20833 zfsPool.c[1758].
Mar 3 11:09:26 slnxmad02 kernel: Use 'NSS /ErrorCode=20833' to obtain more information.
Executing lsmod on the node shows that the cvb module is not loaded anywhere. It should be seen in the "Used by" list of clstrlib:
clstrlib 732312 10 cma,cmsg,crm,cvb,css,vipx,sbd,gipc,vll,sbdlib
and in other places in the lsmod output.
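The lsmod check described above can be scripted. The following is a minimal sketch in Python, not part of the original TID: the function names loaded_modules and cvb_is_loaded are hypothetical, and the sample text simply reuses the clstrlib line quoted above under a standard lsmod header.

```python
def loaded_modules(lsmod_text):
    """Parse lsmod output into {module_name: set_of_modules_using_it}."""
    modules = {}
    for line in lsmod_text.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module Size Used by" header.
        if len(fields) < 3 or fields[0] == "Module":
            continue
        users = set(fields[3].split(",")) if len(fields) > 3 else set()
        users.discard("")
        modules[fields[0]] = users
    return modules


def cvb_is_loaded(lsmod_text):
    """True if cvb is listed as a module or as a user of another module."""
    modules = loaded_modules(lsmod_text)
    return "cvb" in modules or any("cvb" in users for users in modules.values())


# Sample built from the clstrlib line quoted in this document.
SAMPLE = (
    "Module                  Size  Used by\n"
    "clstrlib              732312  10 cma,cmsg,crm,cvb,css,vipx,sbd,gipc,vll,sbdlib\n"
)
```

On a live node, the text passed to cvb_is_loaded would be the captured output of the lsmod command; a False result corresponds to the failure described here.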
Looking through /var/log/boot.msg shows the following error:
Check for ndp devicec/dev/ndp
Device /dev/ndp not ready.
.......... ABORTING /etc/init.d/novell-nss ..........
Resolution
TID 7004877 discusses making the following changes:
UDEV:
If there are many ndpapp errors in /var/log/messages, reconfigure /etc/sysconfig/udev so that it matches:
UDEVD_MAX_CHILDS=1024
UDEVD_MAX_CHILDS_RUNNING=1024
Additionally, the following change was made to /etc/sysconfig/boot:
changed "run_parallel" to "no"
After restarting the server, NSS loaded and so did cvb. Resources now load successfully on the node.
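The two sysconfig edits can be illustrated with a small helper. This is a sketch only: set_sysconfig_value is a hypothetical function that operates on the file contents as a string, and the uppercase spelling RUN_PARALLEL for the variable in /etc/sysconfig/boot is an assumption (the text above writes it as "run_parallel"). Verify the actual variable name in the file before applying anything.

```python
import re


def set_sysconfig_value(text, key, value):
    """Return text with an uncommented KEY=... line set to KEY=value.

    If no uncommented assignment for key exists, one is appended.
    """
    # Match only this exact key at the start of a line (not KEY_SUFFIX=...).
    pattern = re.compile(r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$", re.MULTILINE)
    new_line = "%s=%s" % (key, value)
    if pattern.search(text):
        return pattern.sub(lambda match: new_line, text)
    if text and not text.endswith("\n"):
        text += "\n"
    return text + new_line + "\n"


# Illustrative starting contents; the initial values here are made up.
udev_conf = "UDEVD_MAX_CHILDS=64\n# comment\nUDEVD_MAX_CHILDS_RUNNING=16\n"
udev_conf = set_sysconfig_value(udev_conf, "UDEVD_MAX_CHILDS", "1024")
udev_conf = set_sysconfig_value(udev_conf, "UDEVD_MAX_CHILDS_RUNNING", "1024")

# Assumed variable name; see the note above.
boot_conf = set_sysconfig_value('RUN_PARALLEL="yes"\n', "RUN_PARALLEL", '"no"')
```

The helper leaves commented lines untouched and does not write to disk; back up the real files and restart the server after editing them, as described above.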