OES2 SP3 server hangs on "turning off swap"

  • 7009555
  • 12-Oct-2011
  • 30-Apr-2012

Environment

Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 2
Novell Open Enterprise Server 2 (OES 2) Linux Support Pack 3
SUSE Linux Enterprise Server 10 Service Pack 3
NSS Volumes reside on iSCSI targets

Situation

When an OES 2 SP2 or SP3 server is rebooted or halted, it hangs during shutdown. The last message shown on the console is "turning off swap".
The only way to recover is to power off the server.
To confirm that this TID applies, try to stop the iSCSI daemon with "rcopen-iscsi stop". If the command reports "skipped" instead of actually stopping the daemon, this is the issue.
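
The check above can be scripted. The following is a minimal sketch, assuming the init script prints the word "skipped" in its status output as described; the function name iscsi_stop_was_skipped is hypothetical and is not part of any Novell tool.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the given init-script output contains
# the word "skipped" (case-insensitive), 1 otherwise.
iscsi_stop_was_skipped() {
    printf '%s\n' "$1" | grep -qi 'skipped'
}

# Usage (run as root on the affected server; this really tries to stop iSCSI):
#   output=$(rcopen-iscsi stop 2>&1)
#   if iscsi_stop_was_skipped "$output"; then
#       echo "iSCSI stop was skipped: this TID likely applies."
#   fi
```

The helper only inspects text, so it can be tried against captured console output without touching the running daemon.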


Resolution

To fix this permanently, make the following changes.

In the /etc/init.d/novell-nss file, the stop) section currently looks like this:

stop)
       exit 3
       ;;
#     echo "Shutting down Novell Storage Services (NSS)"
#     . /opt/novell/nss/sbin/stopnss.bsh
#     rc_status -v
#     ;;

Edit this file: remove or comment out the "exit 3" line and the ";;" line that follows it, then uncomment the next four lines, so that the section looks like this:

stop)
      echo "Shutting down Novell Storage Services (NSS)"
      . /opt/novell/nss/sbin/stopnss.bsh
      rc_status -v
      ;;
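
Before and after editing, it can be useful to check whether the stop) section is still short-circuited. The following is a rough sketch, assuming the section is laid out as shown above; the function name stop_section_is_disabled and the backup file name in the usage comment are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: returns 0 if the stop) branch of the given init script
# still contains an uncommented "exit 3" before its first active ";;".
stop_section_is_disabled() {
    awk '
        /^[ \t]*stop\)/              { in_stop = 1; next }   # entering stop)
        in_stop && /^[ \t]*;;/       { exit }                # branch ended, no exit 3
        in_stop && /^[ \t]*exit[ \t]+3/ { found = 1; exit }  # active exit 3 found
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Usage (keep a backup copy before editing; the backup name is an assumption):
#   cp -p /etc/init.d/novell-nss /etc/init.d/novell-nss.orig
#   if stop_section_is_disabled /etc/init.d/novell-nss; then
#       echo "stop) still exits with 3; the edit has not been applied."
#   fi
```

Commented-out lines are ignored because each pattern only matches lines that begin with optional whitespace followed by the keyword.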

After that change, the /opt/novell/nss/sbin/stopnss.bsh file must be created manually. Follow these steps:
1.  vi /opt/novell/nss/sbin/stopnss.bsh
2.  Press the "INSERT" key (or "i") to enter insert mode so that text can be written to the file.
3.  Copy this entire block of text into the file:


#! /bin/bash

# Do not unmount the pools if the NODE is a member of a cluster.
# The cluster file "/admin/Novell/Cluster/NodeState.xml" will have the node
# status in the file in the form of XML tags <cluster>. If the node is part
# of a cluster, we should see  "<cluster>****</cluster>" in the said file.

declare -r CLUSTER_STATUS_FILE="/admin/Novell/Cluster/NodeState.xml"
declare -r MEMBER_STATUS="<cluster>"
declare -r NSSCMD="/dev/nsscmd"
declare -r POOLSDEACTCMD="/PoolDeactivate=ALL"

function umountpools
{
    echo "Unmounting all NSS pools and volumes ..."

    umount -a -t nsspool
    echo ${POOLSDEACTCMD} > ${NSSCMD}
}


# See if the CLUSTER_STATUS_FILE exists; if not, proceed to unmount the pools.
if [ ! -e ${CLUSTER_STATUS_FILE} ];
then
    umountpools
    exit $?
fi

# Read the CLUSTER_STATUS_FILE line by line to see the MEMBER_STATUS.
# Do NOT unmount the pools if the MEMBER_STATUS is found.
while read -r line
do
    if [[ $line =~ ${MEMBER_STATUS} ]];
    then
        echo "This Node is part of the cluster;" \
            "so this script cannot be used to unmount NSS pools."
        exit 3
    fi
done < "${CLUSTER_STATUS_FILE}"

# We reached this place since this NODE is not part of a cluster.
umountpools

4.  Save and close the file: press "Esc", type ":wq" (without the quotes), and press Enter.

5.  Change the file to be executable. Type in:

chmod 755 /opt/novell/nss/sbin/stopnss.bsh
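
The result of the steps above can be sanity-checked before rebooting. The following is a minimal sketch, not part of the original procedure; the function name check_script is hypothetical. It uses "bash -n", which parses a script without executing it, so no pools are unmounted.

```shell
#!/bin/bash
# Hypothetical helper: confirm that a script exists, is executable,
# and parses cleanly. Nothing in the script is executed.
check_script() {
    script=$1
    [ -f "$script" ] || { echo "missing: $script"; return 1; }
    [ -x "$script" ] || { echo "not executable: $script"; return 1; }
    bash -n "$script" || { echo "syntax error: $script"; return 1; }
    echo "ok: $script"
}

# Usage:
#   check_script /opt/novell/nss/sbin/stopnss.bsh
```

A syntax error here usually means the block was pasted incompletely or with altered line breaks.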

All that remains is to reboot or halt the server and confirm that it shuts down cleanly without hanging.
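
Before the shutdown test, it may help to know which branch of stopnss.bsh will run on this server. The following sketch mirrors the membership test in that script (a search for the literal "<cluster>" tag in the node state file); the function name node_in_cluster is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper mirroring the membership test in stopnss.bsh:
# returns 0 if the given file exists and contains the literal "<cluster>" tag.
node_in_cluster() {
    [ -e "$1" ] && grep -qF '<cluster>' "$1"
}

# Usage:
#   if node_in_cluster /admin/Novell/Cluster/NodeState.xml; then
#       echo "Cluster member: stopnss.bsh will exit without unmounting pools."
#   else
#       echo "Not a cluster member: stopnss.bsh will unmount NSS pools."
#   fi
```

This check only reads the file, so it is safe to run on a production server.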