nss fails to load after applying January 2011 Scheduled Maintenance for OES2

  • 7008088
  • 09-Mar-2011
  • 27-Apr-2012

Environment

Novell Cluster Services 1.8.4
Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 2

Situation

The January 2011 Scheduled Maintenance for OES2 was applied to a three-node cluster. Two of the nodes restarted all services successfully, but the third node failed to load nss. As a result, the Cluster Volume Broker (cvb) module failed to load, and any resource that tried to load on that node went comatose, with the following message seen in /var/log/messages:

Mar  3 11:09:26 slnxmad02 kernel: NSSLOG ==> [Error] comnPool.c[402]
Mar  3 11:09:26 slnxmad02 kernel:      Mar 3, 2011  11:09:26 am  NSS<COMN>-4.12a-689:
Mar  3 11:09:26 slnxmad02 kernel:      Could not change pool PNAME04 to the ACTIVE state.
Mar  3 11:09:26 slnxmad02 kernel: Status=20833 zfsPool.c[1758].
Mar  3 11:09:26 slnxmad02 kernel: Use 'NSS /ErrorCode=20833' to obtain more information.

Executing lsmod on the node shows that the cvb module is not loaded anywhere. It should be seen in lines such as:

clstrlib              732312  10 cma,cmsg,crm,cvb,css,vipx,sbd,gipc,vll,sbdlib

and in other places in the lsmod output.
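
The lsmod check described above can be scripted. The following is a minimal sketch, not part of the original TID: module_listed is a hypothetical helper name, and it assumes the standard lsmod column layout (module, size, use count, used-by list).

```shell
#!/bin/sh
# Hypothetical helper: succeed if module $1 appears in lsmod-format text on
# stdin, either as a loaded module (column 1) or in a "used by" list (column 4).
module_listed() {
    awk -v m="$1" '
        NR == 1 && $1 == "Module" { next }    # skip the lsmod header row
        $1 == m { found = 1 }                 # module is loaded itself
        {
            n = split($4, users, ",")         # comma-separated "used by" list
            for (i = 1; i <= n; i++)
                if (users[i] == m) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on the affected node:
#   lsmod | module_listed cvb && echo "cvb loaded" || echo "cvb missing"
```

On a healthy node the usage line should report that cvb is loaded; on the failing node described here it would report cvb as missing.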

Looking through /var/log/boot.msg shows the following error:


Check for ndp devicec/dev/ndp
Device /dev/ndp not ready.
.......... ABORTING /etc/init.d/novell-nss ..........
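
To find this kind of failure quickly, the boot log can be searched for abort and "not ready" lines. This is a minimal sketch: boot_aborts is a hypothetical helper name, and the two search strings are simply taken from the excerpt above, so other init scripts may word their errors differently.

```shell
#!/bin/sh
# Hypothetical helper: print init-script abort lines and "not ready" device
# messages from a boot log. Defaults to /var/log/boot.msg when no file is given.
boot_aborts() {
    grep -E 'ABORTING|not ready' "${1:-/var/log/boot.msg}"
}

# Usage on the affected node:
#   boot_aborts
#   boot_aborts /path/to/saved/boot.msg
```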


Resolution

TID 7004877 discusses making the following changes:

UDEV:

If /var/log/messages contains many ndpapp errors, edit /etc/sysconfig/udev so that it contains:
UDEVD_MAX_CHILDS=1024
UDEVD_MAX_CHILDS_RUNNING=1024

Additionally, the following change was made to /etc/sysconfig/boot:

Changed "run_parallel" to "no"

After restarting the server, nss loaded and so too did cvb; resources now load successfully on the node.
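
Before restarting, the edited values can be read back to confirm they took effect. This is a minimal sketch: sysconfig_value is a hypothetical helper name, and the upper-case key RUN_PARALLEL is an assumption about how the "run_parallel" setting is spelled in /etc/sysconfig/boot, so check the file itself for the exact name.

```shell
#!/bin/sh
# Hypothetical helper: print the value assigned to a key in a sysconfig-style
# file (KEY=value or KEY="value"). If the key is set more than once, the last
# assignment is printed.
sysconfig_value() {
    # $1 = file, $2 = key
    sed -n "s/^[[:space:]]*$2=\"\{0,1\}\([^\"#]*\)\"\{0,1\}.*/\1/p" "$1" | tail -n 1
}

# Usage on the affected node (RUN_PARALLEL is an assumed key name):
#   sysconfig_value /etc/sysconfig/udev UDEVD_MAX_CHILDS
#   sysconfig_value /etc/sysconfig/udev UDEVD_MAX_CHILDS_RUNNING
#   sysconfig_value /etc/sysconfig/boot RUN_PARALLEL
```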