Environment
Novell Open Enterprise Server 11 (OES 11) Linux
Novell Cluster Services
SUSE Linux Enterprise Server 10
SUSE Linux Enterprise Server 11
SUSE Linux Enterprise Desktop 10
SUSE Linux Enterprise Desktop 11
Situation
Changing a service from starting automatically to starting manually implies that the service also changes from being stopped automatically to being stopped manually.
Resolution
- insserv -r [service name]
- chkconfig [service name] off
- yast runlevel, then search for [service name] and disable it.
(Replace [service name] with the actual name of the service.)
Disabling automatic start in this way has a consequence, however: from now on it is strongly recommended to stop the service manually before rebooting or shutting down the server.
For instance, disabling Novell Cluster Services (novell-ncs) on a Novell Open Enterprise Server (OES) may cause the server to suffer from "Split Brain" crashes if the service is not stopped manually before a system reboot or shutdown is initiated.
Cause
Once a service is disabled from starting automatically, and it is still active when the system is brought down, the service is killed in the final phase of the shutdown.
For instance, for novell-ncs:
When this service is configured to start automatically, these symbolic links are in place:
- /etc/init.d/rc3.d/K01novell-ncs
- /etc/init.d/rc3.d/S14novell-ncs
- /etc/init.d/rc5.d/K01novell-ncs
- /etc/init.d/rc5.d/S14novell-ncs
These links cause Novell Cluster Services to start as one of the last services in runlevels 3 and 5, and to stop as one of the first services when the system is shutting down.
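To check whether these links are currently in place, something along the following lines could be used. This is a minimal sketch, not part of the product: the function name check_rc_links and its optional second argument (an alternative base directory, useful for trying it out safely) are assumptions made for illustration.

```shell
#!/bin/sh
# Report, for runlevels 3 and 5, whether a kill (K) and a start (S) link
# exist for the given service.
# Usage: check_rc_links SERVICE [BASE_DIR]
# BASE_DIR is a hypothetical override; it defaults to /etc/init.d.
check_rc_links() {
    service="$1"
    base="${2:-/etc/init.d}"
    for level in 3 5; do
        for kind in K S; do
            found=no
            # The two-digit sequence number differs per service, so match it with a glob.
            for link in "$base/rc${level}.d/${kind}"[0-9][0-9]"$service"; do
                # An unmatched glob stays literal, so test that the path really exists.
                if [ -e "$link" ] || [ -L "$link" ]; then
                    found=yes
                fi
            done
            echo "rc${level}.d ${kind}: $found"
        done
    done
}
```

Calling "check_rc_links novell-ncs" prints one line per runlevel and link type. A "no" on every line indicates that the service is no longer stopped automatically during shutdown.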
When the novell-ncs service is disabled from starting automatically, these symlinks are removed. From that moment on, services such as the network and multipathd are potentially stopped before novell-ncs is halted.
As Novell Cluster Services relies on these services for its split brain mechanisms, this can cause "GIPC link is down" messages and can cause the server to suffer from kernel cores during shutdown, triggered by a poison pill or by a suicide initiated by Novell Cluster Services.
Therefore it is recommended to manually stop all services that were not started automatically before initiating a system shutdown or reboot.
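The recommendation above can be sketched as a small wrapper that stops the service first and only then brings the system down. This is an illustration under stated assumptions, not a supported tool: the function name safe_reboot and the INIT_DIR and REBOOT_CMD variables are hypothetical and exist mainly to allow a dry run. The sketch also assumes that the init script follows the LSB convention of returning 0 from "status" when the service is running.

```shell
#!/bin/sh
# Stop a manually started service, then reboot only if the stop succeeded.
# Usage: safe_reboot SERVICE
# INIT_DIR and REBOOT_CMD are hypothetical overrides for dry runs.
safe_reboot() {
    service="$1"
    init_script="${INIT_DIR:-/etc/init.d}/$service"
    reboot_cmd="${REBOOT_CMD:-shutdown -r now}"

    if [ ! -x "$init_script" ]; then
        echo "init script not found: $init_script" >&2
        return 1
    fi

    # Assumption: "status" returns 0 when the service is running (LSB convention).
    if "$init_script" status >/dev/null 2>&1; then
        echo "stopping $service"
        if ! "$init_script" stop; then
            echo "failed to stop $service; not rebooting" >&2
            return 1
        fi
    fi

    echo "running: $reboot_cmd"
    $reboot_cmd
}
```

For a dry run, setting REBOOT_CMD to a harmless command such as "echo reboot skipped" before calling "safe_reboot novell-ncs" shows what would happen without bringing the server down.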
Additional Information
If a secondary node notices that it cannot update the SBD partition for the given fault tolerance (by default 8), it commits suicide and reboots.
If the master node does not update its slot on the SBD partition and does not send any broadcasts over the LAN for the given fault tolerance (in a default setup 8), the master node is deemed offline and one of the secondary nodes assumes the role of the new master node. The new master node then increases the epoch of the cluster and starts broadcasting the new panning ID.
If the old master node reattaches to the SBD or LAN with the old cluster's epoch and panning ID, it receives a poison pill from the new, current master node.
More details on split brains can be found in "The Gory details of Heartbeats, Split Brains and Poison Pills".
Running "chkconfig novell-ncs off" does not automatically shut down novell-ncs, just as running "chkconfig novell-ncs on" does not automatically start it.