OMW cluster services are unable to fail over – what can be done to resolve this issue?

  • KM00788784
  • 13-Mar-2014
  • 27-Oct-2016

Summary

This document explains how to use ovowclustermove.vbs to help fail over the OMW (Operations Manager for Windows) cluster service. It applies to Operations Manager for Windows (OMW) 8.10, 8.16, and 9.00.

Question

When OMW cluster services fail to fail over to the secondary node, what actions can be taken?

Answer

OMW clusters can take some time to fail over. This can be caused by the OMW services not stopping. Typically, OvEpMessageActionServer is stopped successfully but is then restarted a number of times. Other services, such as OvStoreProvider, then do not stop because OvEpMessageActionServer keeps being restarted.
 
To resolve this issue, HP introduced the script %OvInstallDir%\contrib\OVOW\Failover\ovowclustermove.vbs with patch OMW_00138 (OMW 8.19.060)/OMW_00139 (OMW 9.00.060). The script is started on the active node of the cluster. It first sets the start type of the OMW services to disabled, so that OvEpMessageActionServer and OvStoreProvider cannot be restarted, and then starts the failover. Once the failover has finished, the script re-enables the OMW services by setting their start type back to demand (manual start). As a result, the services are able to start again when the resource group is later failed back to the original node.
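
The disable / fail over / re-enable sequence can be sketched in a few lines of Python. This is only an illustration of the ordering, not the contents of ovowclustermove.vbs: the service names are copied from the sample run shown in this article, the function names are hypothetical, and the failover step is a placeholder because the article does not show how the script moves the resource group. The sketch only builds the command lines; it does not execute anything.

```python
# Service names copied from the sample run in this article (assumption:
# the list may differ between OMW versions and patch levels).
OMW_SERVICES = [
    "OvAutoDiscovery Server",
    "OvDnsDscr",
    "OvEpMessageActionServer",
    "OvEpStatusEngine",
    "OvMsmAccessManager",
    "OvOWReqCheckSrv",
    "OvowWmiPlatProv",
    "OvPmad",
    "OvSecurityServer",
    "OVServiceLogger",
    "OvStoreProv",
]


def sc_config_command(service, start_type):
    """Build the 'sc config' command line for one service."""
    # Same syntax as in the sample run: 'start=' followed by a space.
    return f'sc config "{service}" start= {start_type}'


def failover_plan(services=OMW_SERVICES):
    """Return the ordered steps: disable all, fail over, re-enable all."""
    plan = [sc_config_command(s, "disabled") for s in services]
    # Placeholder only: the actual move command is not shown in the article.
    plan.append("<move the OMW cluster resource group to the other node>")
    plan += [sc_config_command(s, "demand") for s in services]
    return plan


if __name__ == "__main__":
    for step in failover_plan():
        print(step)
```

Keeping the services disabled for the whole duration of the move is what prevents the restarts described above; setting them back to demand afterwards leaves them startable by the cluster on a later failback.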
 
The script requires the OMW Cluster Resource Group Name as a parameter:
 
cscript "%OvInstallDir%\contrib\OVOW\Failover\ovowclustermove.vbs" <OMWClusterResourceGroupName>
 
To find the "OMW Cluster Resource Group Name", run "%OvInstallDir%\bin\win64\ovclusterinfo -a" on the active node of the cluster. The output looks like this:
 
#Cluster
type         Microsoft Cluster Server (MSCS)
name         OMWCluster
status       Up
nodes        node1 node2
groups       Cluster Group Available Storage OMW9 Server SQL Server (OVOPS_OMW9) ...
 
 
#Group OMW9 Server <- This is the resource group name
state        Online
nodes        node1 node2
local state  Offline
virtual IP   16.23.28.79
active node  node1 <- This is the active node
 
The resource group name is "OMW9 Server" and the active cluster node is "node1".
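
If the group name and active node need to be picked out of the output programmatically, a small parser along the following lines may help. It is a sketch that assumes the real output has the same layout as the sample above (a "#Group <name>" header followed by label/value lines, without the "<-" annotations added here for explanation). The function name parse_groups and the field list are hypothetical, not part of any OMW tooling.

```python
import re

# Field labels as they appear in the sample output. Some contain a space,
# so splitting each line on the first whitespace would not be enough.
GROUP_FIELDS = ("state", "nodes", "local state", "virtual IP", "active node")

SAMPLE = """\
#Cluster
type         Microsoft Cluster Server (MSCS)
name         OMWCluster
status       Up
nodes        node1 node2

#Group OMW9 Server
state        Online
nodes        node1 node2
local state Offline
virtual IP   16.23.28.79
active node node1
"""


def parse_groups(text):
    """Return {group name: {field: value}} for every '#Group' section."""
    groups = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # '#Group <name>' opens a section; any other header closes it.
            match = re.match(r"#Group\s+(.+)", line)
            current = groups.setdefault(match.group(1).strip(), {}) if match else None
            continue
        if current is None or not line:
            continue
        for field in GROUP_FIELDS:
            if line.startswith(field + " "):
                current[field] = line[len(field):].strip()
                break
    return groups


if __name__ == "__main__":
    for name, fields in parse_groups(SAMPLE).items():
        print(name, "->", fields.get("active node"))
```

The group name returned here is the value to pass to ovowclustermove.vbs; because it can contain spaces, it has to be quoted on the command line.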
 
The following is an example of running %OvInstallDir%\contrib\OVOW\Failover\ovowclustermove.vbs:
 
C:\>cscript "%OvInstallDir%\contrib\OVOW\Failover\ovowclustermove.vbs" "OMW9 Server"
Microsoft (R) Windows Script Host Version 5.8
Copyright (C) Microsoft Corporation. All rights reserved.
Trying to locate omw9 server cluster resource group and active node...
omw9 server resource group found, active node is: node1
Local computer name: node1
Disabling OMW services...
Running command: sc config "OvAutoDiscovery Server" start= disabled
Executed with status code 0
Running command: sc config "OvDnsDscr" start= disabled
Executed with status code 0
Running command: sc config "OvEpMessageActionServer" start= disabled
Executed with status code 0
Running command: sc config "OvEpStatusEngine" start= disabled
Executed with status code 0
Running command: sc config "OvMsmAccessManager" start= disabled
Executed with status code 0
Running command: sc config "OvOWReqCheckSrv" start= disabled
Executed with status code 0
Running command: sc config "OvowWmiPlatProv" start= disabled
Executed with status code 0
Running command: sc config "OvPmad" start= disabled
Executed with status code 0
Running command: sc config "OvSecurityServer" start= disabled
Executed with status code 0
Running command: sc config "OVServiceLogger" start= disabled
Executed with status code 0
Running command: sc config "OvStoreProv" start= disabled
Executed with status code 0
Starting failover ...
Executed with status code 0
Enabling OMW services...
Running command: sc config "OvAutoDiscovery Server" start= demand
Executed with status code 0
Running command: sc config "OvDnsDscr" start= demand
Executed with status code 0
Running command: sc config "OvEpMessageActionServer" start= demand
Executed with status code 0
Running command: sc config "OvEpStatusEngine" start= demand
Executed with status code 0
Running command: sc config "OvMsmAccessManager" start= demand
Executed with status code 0
Running command: sc config "OvOWReqCheckSrv" start= demand
Executed with status code 0
Running command: sc config "OvowWmiPlatProv" start= demand
Executed with status code 0
Running command: sc config "OvPmad" start= demand
Executed with status code 0
Running command: sc config "OvSecurityServer" start= demand
Executed with status code 0
Running command: sc config "OVServiceLogger" start= demand
Executed with status code 0
Running command: sc config "OvStoreProv" start= demand
Executed with status code 0
 
Once the script has finished, confirm that the failover was successful by running "%OvInstallDir%\bin\win64\ovclusterinfo -a" again and checking the active node of the OMW resource group.
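
One way to make this check less error-prone is to compare the active node reported before and after the move. The sketch below assumes that the ovclusterinfo output was captured as text both times and that it follows the layout of the sample in this article; the function names are hypothetical.

```python
import re


def active_node(cluster_info, group):
    """Return the 'active node' value reported for the given group, or None."""
    # Isolate the text between '#Group <name>' and the next '#' header.
    section = re.search(
        r"^#Group[ \t]+" + re.escape(group) + r"[ \t]*$(.*?)(?=^#|\Z)",
        cluster_info,
        flags=re.MULTILINE | re.DOTALL,
    )
    if section is None:
        return None
    node = re.search(r"^active node[ \t]+(\S+)", section.group(1), flags=re.MULTILINE)
    return node.group(1) if node else None


def failover_succeeded(before, after, group):
    """True if the group reports a different active node after the move."""
    old, new = active_node(before, group), active_node(after, group)
    return old is not None and new is not None and old != new
```

A changed active node alone does not prove that all OMW services started on the new node, so the group state in the same output should also read Online.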
 
The enhancement request QCIM1A173655 describes how this can be automated.