How to check the Service Guard package that OML created during configuration?

  • KM01417772
  • 25-Feb-2015
  • 25-Feb-2015

This document has not been formally reviewed for accuracy and is provided "as is" for your convenience.

Summary

This article explains how to check the Service Guard package that the OML installation creates.
osname=Red Hat 6.4
OPC_INSTALLED_VERSION=09.11.120
OPC_INSTALLED_VERSION=11.14.014

Question

What should be checked to verify the Service Guard package that the OML installation configured?

Answer

In an OML Service Guard configuration, the OML installation itself creates the Service Guard configuration for the database resource. Because the installer creates the package, it is not obvious what is monitored to determine that the resources are working. The sections below describe this.
To determine whether the resources are working, OMU HA monitoring performs the following checks:

1. Checks done by /opt/OV/bin/OpC/utils/ha/ha_mon_oracle
- is the virtual IP active (by checking the output of "ip addr show" command)
- does connectivity to database work (this is done by calling /opt/OV/bin/OpC/install/opc_dflt_lang and checking its return status)

2. Checks done by /opt/OV/bin/OpC/utils/ha/ha_mon_cb
- are ovbbccb and the server instance of ovbbccb running; if needed the server instance of ovbbccb is started

3. Checks done by /opt/OV/bin/OpC/utils/ha/ha_mon_ovserver
- are the following processes running: opcactm, opcmsgm, opcttnsm, opcforwm, opccsad, opcbbcdist, opcdispm, ovoareqsdr, opcmsgrb
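
As a rough illustration of checks 1 and 3, the following Python sketch parses the output of "ip addr show" for a virtual IP and compares a list of running process names against the monitored list. It is not the code of the ha_mon_* scripts, whose internals are not shown here. The function names, the example addresses, and the use of "ps -e -o comm=" to list processes are assumptions made for the example.

```python
import re
import subprocess

# Process names taken from check 3 above.
MONITORED_PROCESSES = (
    "opcactm", "opcmsgm", "opcttnsm", "opcforwm", "opccsad",
    "opcbbcdist", "opcdispm", "ovoareqsdr", "opcmsgrb",
)


def virtual_ip_active(ip_addr_output, virtual_ip):
    """Return True if virtual_ip is listed as an 'inet' address in the given text.

    ip_addr_output is expected to be the text printed by `ip addr show`.
    """
    pattern = r"^\s*inet\s+" + re.escape(virtual_ip) + r"(?:/\d+)?\s"
    return re.search(pattern, ip_addr_output, flags=re.MULTILINE) is not None


def missing_processes(running_names, expected=MONITORED_PROCESSES):
    """Return the expected process names that do not appear in running_names."""
    running = set(running_names)
    return [name for name in expected if name not in running]


def collect_live_state():
    """Gather live input on a Linux host (assumes `ip` and `ps` are on PATH)."""
    ip_out = subprocess.run(
        ["ip", "addr", "show"], capture_output=True, text=True, check=True
    ).stdout
    ps_out = subprocess.run(
        ["ps", "-e", "-o", "comm="], capture_output=True, text=True, check=True
    ).stdout
    return ip_out, [line.strip() for line in ps_out.splitlines()]
```

On a cluster node, the result of collect_live_state() can be passed to the two functions together with the virtual IP of the package. An empty list from missing_processes() means every monitored process name was found.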
 
4. Testing the Oracle check, for example by killing or stopping the Oracle listener
In the basic scenario, OMU HA monitoring also checks connectivity to the database by running /opt/OV/bin/OpC/install/opc_dflt_lang. In a decoupled scenario, where the database is separated from the server resource group, OMU HA monitoring checks the Oracle part somewhat differently. The description here applies to the basic scenario, where one service group controls both the server and the database.
If the listener is not running, whether it was stopped or killed, opc_dflt_lang returns exit code 1, so OMU HA monitoring detects this as a problem.
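
The exit-status test can be reproduced by hand with a small Python sketch such as the one below. It is only an illustration. The function name database_reachable, the 60-second timeout, and the treatment of exit status 0 as "connectivity OK" are assumptions, since the article only states that exit code 1 indicates a problem.

```python
import subprocess

# Path as given in this article.
OPC_DFLT_LANG = "/opt/OV/bin/OpC/install/opc_dflt_lang"


def database_reachable(command=(OPC_DFLT_LANG,), timeout_s=60):
    """Run the command and return True only if it exits with status 0.

    Assumption: status 0 means the database connection works. A non-zero
    status, a missing binary, or a timeout is reported as a problem.
    """
    try:
        result = subprocess.run(
            list(command),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Calling database_reachable() on the active node before and after stopping the listener should show the change in result described above.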

> Kill database process

That depends on which process is killed. If ora_mmnl_openview is killed, for example, Oracle restarts it, opc_dflt_lang works fine both before and after the restart, and OMU HA monitoring considers the situation OK. Killing ora_q000_openview does not cause a problem either. But if ora_dbw0_openview is killed, for example, opc_dflt_lang returns exit code 1 (meaning a problem) and Oracle does not attempt a restart. Note that killing a database process might also cause one or more OMU server processes to abort.

> Kill OML process

The following processes are monitored: opcactm, opcmsgm, opcttnsm, opcforwm, opccsad, opcbbcdist, opcdispm, ovoareqsdr, opcmsgrb. Killing any of these could trigger a failover. Killing any other process, opcsvcm for example, will not. Even so, killing one of the monitored processes will most likely not cause a failover, because ovcd will most probably detect that the process is not running before the cluster does and will probably try to restart it. By default, ovcd (an L-Core component) tries to restart the processes it controls 5 times if they have been running for at least a minute.