GroupWise High Availability (gwha) in a Linux Cluster: A Quick Start Guide

  • 7014855
  • 05-Apr-2014
  • 10-Nov-2015

Environment

Novell GroupWise 2012 Support Pack 2

Situation

Concepts:

By default, if a node or the NSS pool on that node goes down, a cluster resource (a GroupWise resource included) fails over to another node, as configured. However, if a GroupWise agent such as the POA goes down while the node and its pool are still healthy, the agent is not restarted automatically. This is normal behavior.

 However, with the GroupWise "High Availability Service" (gwha) you can set up the GroupWise resource in a cluster environment so that a POA, DVA, MTA, or GWIA is restarted automatically if it goes down.

 Some configuration changes are required to make the gwha service work with GroupWise in the clustered environment.

The GroupWise High Availability service relies on the GroupWise Monitor Agent to detect when a GroupWise agent is no longer running. The Monitor Agent notifies the GroupWise High Availability service of any problem, and the GroupWise High Availability service then immediately issues the command to start the affected agent. The GroupWise High Availability service runs as root, as configured in the /etc/xinetd.d/gwha file.

A single Monitor Agent can service multiple instances of the GroupWise High Availability service on multiple servers, as long as all instances use the same user name and password (discussed later) to communicate with the Monitor Agent.

 Although you need a GroupWise High Availability service running on each Linux server where there are GroupWise agents, you need only one Monitor Agent to monitor all agents in your GroupWise system.

 The Monitor Agent uses the --hauser and --hapassword switches to communicate with the GroupWise High Availability service on port 8400.
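
To confirm how the service is configured on a given node, the xinetd entry mentioned above can be inspected. The following is a minimal sketch only: the helper name xinetd_attr is hypothetical, and it assumes the file uses the usual xinetd "attribute = value" layout with simple values (no "=" inside the value).

```shell
#!/bin/bash
# Hypothetical helper: print the value of one attribute from an xinetd
# service file (lines of the form "attribute = value").
xinetd_attr() {
    local file="$1" attr="$2"
    awk -F'=' -v attr="$attr" '
        {
            key = $1
            gsub(/[ \t]/, "", key)      # strip whitespace around the name
        }
        key == attr {
            val = $2
            sub(/^[ \t]+/, "", val)     # trim leading whitespace
            sub(/[ \t]+$/, "", val)     # trim trailing whitespace
            print val
            exit
        }
    ' "$file"
}

# Example (run on a node where gwha is installed):
#   xinetd_attr /etc/xinetd.d/gwha user
#   xinetd_attr /etc/xinetd.d/gwha disable
```

The first example prints the account the service runs as. The second prints the standard xinetd "disable" attribute, if the file sets one.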

Resolution

Action Items: 

 Note: A local Linux user called "hauser" is used in these instructions.  You can choose whatever name you want.

 1.     Open a terminal as “root” and issue these commands to create the gwha user as a local Linux user on the first of the cluster nodes:

a.)  useradd -d /home/hauser -s /bin/bash -c "FHauser LHauser" hauser

b.)  passwd hauser

       c.)  After creating the user, make certain you can log in as that user: at a terminal, issue the command "su hauser" (no quotes) and verify that you can log in with the password you specified. Do this from a non-root session, because "su" run as root does not prompt for a password.

       d.)  When the “hauser” login succeeds, log out of the account with “exit” (no quotes).

       e.)  Create the same “hauser” user, with the same password and the same procedure, on every node in the cluster to which the GroupWise resource could fail over or migrate.
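
Because step 1 is repeated on every node, it can help to check whether the account already exists first. The following is a small sketch, not part of the original procedure: the function name create_ha_user_cmd is hypothetical, and it only prints the command from step 1a so that you can review it before running it as root.

```shell
#!/bin/bash
# Print the useradd command from step 1a only if the account is missing,
# so the check is safe to repeat on a node that already has the user.
create_ha_user_cmd() {
    local user="${1:-hauser}"
    if id -u "$user" >/dev/null 2>&1; then
        echo "# $user already exists; nothing to do"
    else
        echo "useradd -d /home/$user -s /bin/bash -c \"FHauser LHauser\" $user"
    fi
}

# Example: review the output, run it as root, then set the password.
#   create_ha_user_cmd hauser
#   passwd hauser
```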

2.     Install the GroupWise Monitor Agent

Concept:  The Monitor Agent software is installed on all nodes in the cluster, but runs on only 1 node at a time.

a.)     You can check whether you already have the Monitor Agent installed by issuing the following command at a terminal as “root”, on all nodes in the cluster:

a.     /etc/init.d/grpwise-ma status

b.     If you get the error “No such file or directory”, it is not installed; go to Step # 3.

c.     If you get a status of “running”, it is installed; shut it down with “/etc/init.d/grpwise-ma stop” (no quotes), then skip to Step # 3.

d.     If you get a status of “unused”, skip to Step # 3.
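
The three outcomes above can be sketched as a small decision helper. This is an illustration only: the function name next_action_for_status is hypothetical, and it assumes the status text contains the words quoted in this step. Check the real output on your system before relying on it.

```shell
#!/bin/bash
# Map the text returned by "/etc/init.d/grpwise-ma status" to the next
# action described in steps 2b through 2d.
next_action_for_status() {
    case "$1" in
        *"No such file or directory"*)
            echo "not installed: go to step 3" ;;
        *unused*)
            echo "installed but not running: go to step 3" ;;
        *running*)
            echo "running: stop it, then go to step 3" ;;
        *)
            echo "unrecognized status: check manually" ;;
    esac
}

# Example (as root, on each node):
#   next_action_for_status "$(/etc/init.d/grpwise-ma status 2>&1)"
```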

3.     Go To: https://www.novell.com/documentation/groupwise2012/gw2012_guide_interop/data/bxfkhaj.html

a.     Read the short paragraph in the section “Installing and Configuring the Linux Monitor Agent on Each Node in Your Cluster”

b.     As “root”, change to the /opt/novell/groupwise/software/ installation directory (or wherever you keep the GroupWise installation files).

c.     Run ./install

d.     Do Steps 1, 2, and 3 in the section “Running the Linux Monitor Installation Program on the Preferred Node”. For Step # 4 of that section, follow the link listed there, “Installing and Configuring the Linux Monitor Agent”, and at that location do Steps 5 through 9 ONLY.  Also, on this same Linux server, edit the file "/etc/init.d/grpwise-ma": remove the # symbol in front of the following line, set the switches and values as shown, and save the file.

                                  i.    MA_OPTIONS="--hauser hauser --hapassword <passwordYouSpecifiedInStep1b> --hapoll 30"

                                 ii.    Note:  Quotes are used in the above syntax.

e.     Go back to the previous documentation Web URL as listed in Step 3d and continue where you left off on Step # 4, starting with the text - “Pay special attention to the cluster resource information on the System Options page”.  Complete the steps in this section.  Disregard the last bullet list item just before Step #5.

f.      Now do the steps in the section “Running the Linux Monitor Agent Installation Program on Subsequent Nodes”. Remember that this step is done on ALL nodes in the cluster, one at a time.

                                  i.    DO NOT do Step # 5 in this section.  Use of SSL is not covered in this document.  After you are done with this section, Exit the GroupWise installation program.

g.     At this point, copy 1 file from the Linux server where you initially installed the Monitor Agent (Step # 3d) to ALL other nodes.  Copy the file “rcgrpwise-ma” in the /usr/sbin/ directory to the same directory on each node in the cluster.  You can use this command at a terminal as “root”:

                                  i.    scp /usr/sbin/rcgrpwise-ma root@<YourNode2DomainName>:/usr/sbin

                                 ii.    Do the same as above for node3, node4, etc

h.     Likewise, copy 1 more file from the Linux server where you initially installed the Monitor Agent (Step # 3d) to ALL other nodes.  Copy the file “grpwise-ma” in the /etc/init.d/ directory to the same directory on each node in the cluster.  You can use this command at a terminal as “root”:

                                  i.    scp /etc/init.d/grpwise-ma root@<YourNode2DomainName>:/etc/init.d

                                 ii.    Do the same as above for node3, node4, etc

i.      Do the steps in the following section: “Testing the Linux Monitor Agent Installation on Each Node”, however do not do the steps listed in the section: “Configuring the Monitor Agent Cluster Resource To Load and Unload the Linux Monitor Agent”.  This document will go over what is needed later in example Load and Unload scripts.
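
With more than a few nodes, the two copies in steps 3g and 3h can be generated in a loop. The following is a sketch under stated assumptions: the function name print_copy_commands and the node names in the example are hypothetical, and the function only prints the scp commands so they can be reviewed before being run as root.

```shell
#!/bin/bash
# Files copied in steps 3g and 3h.
MA_FILES="/usr/sbin/rcgrpwise-ma /etc/init.d/grpwise-ma"

# Print one scp command per node and file; each file goes to the same
# directory on the target node that it came from.
print_copy_commands() {
    local node file
    for node in "$@"; do
        for file in $MA_FILES; do
            echo "scp $file root@$node:$(dirname "$file")"
        done
    done
}

# Example with hypothetical node names:
#   print_copy_commands node2.example.com node3.example.com
```

Printing the commands rather than executing them keeps the sketch harmless on a node where the files are not present yet.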

4.     Test whether the GWHA daemon is listening, using the command "netstat -tnlp | grep 8400"
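
A plain grep for 8400 also matches other fields that happen to contain those digits, such as a process ID. The following sketch is slightly stricter; the function name port_is_listening is hypothetical, and it assumes the usual Linux netstat column layout (local address in column 4, state in column 6).

```shell
#!/bin/bash
# Read "netstat -tnl" style output on stdin and succeed only if a TCP
# socket is in LISTEN state on the given local port.
port_is_listening() {
    local port="$1"
    awk -v port="$port" '
        $1 ~ /^tcp/ && $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1; exit }
        END { exit(found ? 0 : 1) }
    '
}

# Example (as root on the node that hosts the GroupWise resource):
#   netstat -tnlp | port_is_listening 8400 && echo "gwha is listening"
```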

5.     Make sure to have an “HTTP User Name” and “HTTP Password” defined for the GroupWise Agents to be monitored in ConsoleOne (Properties of the MTA, POA, and GWIA objects).

a.     For the MTA and POA: Under the GroupWise tab , Agent Settings :

                                  i.    HTTP User Name and HTTP Password set under section “HTTP Monitor Settings”

b.     For the GWIA :  Under the GroupWise tab, Optional Gateway Settings :

                                  i.    HTTP User Name and HTTP Password set under section “HTTP Monitor Settings”

6.     The following are my example GroupWise Resource Unload and Load scripts.  Because loading and unloading the GroupWise agents takes many commands, and because the Load and Unload scripts are limited in size, I have placed the GroupWise load and unload commands in separate batch files (gwstart, gwstop) that are called by the GroupWise Cluster Resource Load and Unload scripts.  These are just examples; they work fine in my test environment.

 

7.     Note the comments in the Unload script regarding “gwha” and “xinetd”; they explain why these commands are placed where they are in the script and what they do.

 

8.     Copy the comments and the 2 commands under them (the “chkconfig” and “xinetd” lines, shown in red in the original article) into the Unload script of your GroupWise Resource NOW. Place them under the “ncsfuncs” line, at the top of your Unload script:

 

#!/bin/bash
. /opt/novell/ncs/lib/ncsfuncs
# Unload the xinetd daemon, GWHA used in this NCS system
# This is needed so the GroupWise agents can unload, otherwise
# xinetd and gwha would just restart them
ignore_error /sbin/chkconfig -s gwha off
ignore_error /etc/init.d/xinetd stop
/root/gwstop
ignore_error ncpcon unbind --ncpservername=CLUSTER-DATA-SERVER --ipaddress=10.10.10.10
ignore_error del_secondary_ipaddress 10.10.10.10
ignore_error nss /pooldeact=DATA
exit 0

 

9.     The changes you need to make to your GroupWise Resource Load script are the 2 lines that enable gwha and start xinetd (shown in red in the original article), placed just above the “gwstart” and “exit 0” commands. Here is my script as an example of what works; do it NOW:

#!/bin/bash
. /opt/novell/ncs/lib/ncsfuncs
exit_on_error nss /poolact=DATA
exit_on_error ncpcon mount DATA=254
exit_on_error add_secondary_ipaddress 10.10.10.10
exit_on_error ncpcon bind --ncpservername=CLUSTER-DATA-SERVER --ipaddress=10.10.10.10
ignore_error /sbin/chkconfig -s gwha on
ignore_error /etc/init.d/xinetd start
/root/gwstart
exit 0
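
The two wrappers used in these scripts behave differently on failure, which is why the gwha and xinetd lines use ignore_error while the pool and IP address lines in the Load script use exit_on_error. The following stand-ins are a simplified illustration of that behavior only. They are NOT the real ncsfuncs code, and the sketch_ names are hypothetical.

```shell
#!/bin/bash
# Simplified stand-ins, NOT the real ncsfuncs functions: they only
# illustrate the behavior the Load and Unload scripts rely on.

# Run the command and carry on regardless of its exit status.
sketch_ignore_error() {
    "$@"
    return 0
}

# Run the command; if it fails, stop the whole script with a non-zero
# status so the failure is reported to the caller.
sketch_exit_on_error() {
    "$@" || exit 1
}

# Example:
#   sketch_ignore_error false     # script continues
#   sketch_exit_on_error false    # script stops here with status 1
```

Using the ignoring variant for the gwha and xinetd lines means that a node where xinetd is already running or already stopped does not block the resource from loading or unloading.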

10.   Here are my example GroupWise Start and Stop batch files. You will need to ADD the commands to START and STOP grpwise-ma (the last line of each file below, shown in red in the original article) to your GroupWise Cluster Resource LOAD and UNLOAD scripts or, if you use batch files like me, to the “gwstart” and “gwstop” script files. Do it NOW.

 

-  GWSTART - :

#!/bin/bash
. /opt/novell/ncs/lib/ncsfuncs
exit_on_error /etc/init.d/grpwise start Domain1
sleep 10
exit_on_error /etc/init.d/grpwise start gwdva
sleep 10
exit_on_error /etc/init.d/grpwise start Post1.Domain1
sleep 10
exit_on_error /etc/init.d/grpwise start GWIA.Domain1
sleep 10
exit_on_error /etc/init.d/grpwise-ma start

 

-  GWSTOP - :

#!/bin/bash
. /opt/novell/ncs/lib/ncsfuncs
ignore_error /etc/init.d/grpwise stop Domain1
sleep 10
ignore_error /etc/init.d/grpwise stop gwdva
sleep 10
ignore_error /etc/init.d/grpwise stop Post1.Domain1
sleep 10
ignore_error /etc/init.d/grpwise stop GWIA.Domain1
sleep 10
ignore_error /etc/init.d/grpwise-ma stop

 

11.   Open a browser and go to the URL "http://<ipAddressOfMonitorAgentServer>:8200". You should see the agents up and running and in the listening status.

12.   Test by bringing down one of the agents. After about 45 seconds, the agent should start again automatically.

a.     If you are not sure how to stop your GroupWise agent for a test, run “rcgrpwise status” at a Linux terminal as “root” to find the names of your GroupWise agent objects, so that you can then unload one of them for a test.
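
The test in step 12 can be scripted with a small polling loop. This is a sketch under stated assumptions: the function name wait_until is hypothetical, the agent name is the one used in the example scripts above, and it is assumed (not confirmed by this document) that "rcgrpwise stop <agent>" stops a single agent and that "rcgrpwise status <agent>" returns a zero exit status only while that agent is running.

```shell
#!/bin/bash
# Poll a command until it succeeds or the timeout (in seconds) expires.
# Usage: wait_until TIMEOUT INTERVAL COMMAND [ARGS...]
wait_until() {
    local timeout="$1" interval="$2" waited=0
    shift 2
    until "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1                    # gave up: command never succeeded
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Example for step 12 (as root):
#   rcgrpwise status                    # list the agent names
#   rcgrpwise stop Post1.Domain1        # bring one agent down
#   wait_until 90 5 rcgrpwise status Post1.Domain1 && echo "agent restarted"
```

The 90-second timeout in the example is simply a margin on top of the roughly 45 seconds mentioned in step 12.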

 

Additional Information

Assumptions: 

 1.     This document is not intended to be a complete step-by-step guide to setting up GroupWise in a cluster in the Linux environment.  It is intended to show the minimal requirements and necessary configuration to allow the GroupWise High Availability Service (GWHA) to function properly with an existing GroupWise system that is already installed in the Linux cluster.  For more detailed and complete information, you can review the Novell documentation on clustering GroupWise on Linux:

a.     https://www.novell.com/documentation/groupwise2012/gw2012_guide_interop/data/bwc325u.html

 

2.     If you want complete information about implementing GroupWise Monitor in a Linux Cluster you can go to:

a.     https://www.novell.com/documentation/groupwise2012/gw2012_guide_interop/data/bwe3c4q.html

 

Note:  Running all of the GroupWise agents listed below on 1 node is not a “Best Practice”; it is done here as an instructional example only.

 

3.     It is assumed for the purposes of this document that the GroupWise resource in the cluster has 1 GroupWise domain, 1 Post Office, a DVA, a GWIA, and the GroupWise Monitor Agent.

 

4.     It is assumed that your cluster is running on SLES11 / OES11, and that your existing GroupWise cluster resource (GroupWise MTA, DVA, POA, GWIA) is able to load, unload, migrate and fail over correctly in the cluster.

 

 

5.     It is assumed that the cluster has a shared NSS volume called DATA with the mount point “/media/nss/DATA/”, and that this volume contains a sub-directory called /mail/ under which the directories for the GroupWise domain (domain2) and Post Office (post1) are located.

 

6.      It is assumed in this document that you do not have the GroupWise Monitor Agent installed yet.  If by chance you do have it installed then continue with the next steps that you have not yet accomplished that are listed.