How to Set Up Reload Disaster Recovery

  • 7019488
  • 17-May-2013
  • 29-Aug-2017

Environment

Reload (all versions)

Situation

The post office and/or domain server goes down or is inaccessible for some reason and users need access to email.

Resolution

This is where Reload's Disaster Recovery feature becomes a life saver.  Reload's Disaster Recovery is literally a "push button solution"; however, some configuration must be done beforehand to be ready for it.  This document guides the administrator through all the configuration steps, enabling disaster recovery, basic troubleshooting, and post-disaster-recovery steps.

This document makes some assumptions and has some limitations:


  • The administrator knows how to bind multiple IP addresses to a NIC.  The solution described in this document relies on that concept.  If an administrator doesn't wish to (or can't) bind multiple IPs to a NIC, DR can still be configured but it isn't ideal.  This solution makes the transition to DR completely transparent to the users, requiring absolutely no configuration on the client.
  • DNS is the preferred solution, even though this technically could be accomplished with hosts files.  It would be up to the reader to take these concepts and apply them to a scenario using hosts files if that is what is desired.
  • The administrator or others within the customer's organization knows how to add to and edit records in DNS.
  • The steps given are in SLES 11, although they should be very close to any other supported version of SLES.
  • This guide does not include advanced instructions for creating a GWIA on the Reload server.  The reader can access those instructions from the KB article, "How Do I Configure a GWIA on My Reload Server?"; however, it should be noted that we recommend as a best practice (where possible) that the GWIA be on its own domain on a separate server.  That way, should a domain server and/or a post office server become unavailable, the Internet communication isn't impacted.  It is much easier to re-install GWIA should its server go down.
  • This guide steps through the DR of a post office. The same general concepts apply to domains.
There are six main steps in configuring Disaster Recovery:

1.  Assigning and binding IP addresses for each Reload post office profile.

2.  Creating DNS records for each post office and domain.

3.  Configuring each POA and MTA in ConsoleOne with a DNS name.

4.  Configuring MTP communication between MTAs and POAs.

5.  Configuring the Reload POA and MTA.

6.  Installing ConsoleOne on the Reload server.

I.  Assigning/Binding IP Addresses for Each Reload Post Office Profile

1.  Acquire a list of available IP addresses and assign one to each Reload post office profile.  None of these should be the main IP address of the Reload server.

2.  Bind those IP addresses.  This can be done for the server's NIC in YaST under Network Settings | Overview | Edit.  Each IP is assigned an alias name.  The name itself has no functional effect, so choose one that makes it easy to remember which profile the address belongs to.
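Before binding anything in YaST, it can help to write the assignment plan down.  The following is a minimal Python sketch of that planning step only; it does not bind addresses.  The function name, the profile names, and the 192.0.2.x addresses are illustrative assumptions and do not come from Reload.

```python
import ipaddress


def assign_profile_ips(profiles, available_ips, server_main_ip):
    """Pair each Reload post office profile with its own secondary IP address."""
    main = ipaddress.ip_address(server_main_ip)
    # Skip the server's main address and any duplicates, keeping the input order.
    candidates = []
    for raw in available_ips:
        ip = ipaddress.ip_address(raw)
        if ip != main and ip not in candidates:
            candidates.append(ip)
    if len(candidates) < len(profiles):
        raise ValueError("not enough free IP addresses for all profiles")
    return {name: str(ip) for name, ip in zip(profiles, candidates)}


# Hypothetical profile names and documentation-range addresses.
plan = assign_profile_ips(
    ["po1", "po2"],
    ["192.0.2.10", "192.0.2.11", "192.0.2.12"],
    server_main_ip="192.0.2.10",
)
for profile, ip in plan.items():
    print(f"{profile}: {ip}")
```

The resulting mapping is the list of addresses to bind as aliases on the Reload server's NIC, one per profile.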

II.  Creating DNS Records for Each Post Office and Domain

There should be an A record for every post office and domain in DNS, which will give each IP address for every post office a DNS alias or name.  If multiple post offices reside on the same server, then multiple available IP addresses will need to be bound to that server's NIC.

Should a post office server become unavailable (requiring DR mode to be enabled), all the administrator has to do is change the A record for the post office.  The post office's IP address is temporarily replaced with the IP address of the DR POA on the Reload server that represents the backup of that post office.

This way, the users don't even know anything happened.  They just launch the client as usual.  Since the client is configured to look for a DNS name rather than an IP address, it now connects to the DR POA, and the users think that they're connected to the live post office.

When the original post office is ready to be brought back into production, typically administrators will wait until after normal working hours and have existing users exit GroupWise.  They'll turn off the DR POA on Reload, change the A record to reflect the production post office server again, run the Reload Migration tool, and then launch the production post office POA.  At this point, users can get back into GroupWise.

This is why this configuration is the preferred method.  It eliminates the need for any configuration on the client.  This drastically speeds up enabling users to connect to the DR POA.  In reality, within a minute users can be back into GroupWise when a post office goes down.
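Because the whole failover hinges on where the A record points, a quick check of the current target can be useful before and after each change.  The sketch below is one possible way to do that in Python; the function name, host name, and addresses are hypothetical, and the stand-in resolver is there only so that the example runs without real DNS.

```python
import socket


def active_target(dns_name, production_ip, dr_ip, resolve=socket.gethostbyname):
    """Report which POA a post office DNS name currently points at."""
    resolved = resolve(dns_name)
    if resolved == production_ip:
        return "production"
    if resolved == dr_ip:
        return "disaster recovery"
    return f"unexpected address {resolved}"


# Stand-in resolver: a dictionary lookup instead of a real DNS query.
fake_dns = {"po1.example.com": "192.0.2.11"}
state = active_target(
    "po1.example.com",
    production_ip="192.0.2.50",
    dr_ip="192.0.2.11",
    resolve=fake_dns.__getitem__,
)
print(state)
```

Omitting the resolve argument makes the function query the system resolver, which is also subject to any local DNS caching.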

III.  Configuring Each POA and MTA In ConsoleOne With a DNS Name

Go into ConsoleOne and edit the properties of each POA and MTA object and assign the DNS name you've created in DNS for each of those agents' post offices or domains.  This is assigned in the Network Address setting of the agent in the TCP / IP Address field.

While editing the POA object(s), make a note of its message transfer port (MTP).  The corresponding Reload POA for this post office will need to use this same port so that it can communicate with the production MTA.

While editing the MTA object(s), make a note of its message transfer port (MTP).  This will be the MTP Out port that the Reload POA uses so that it can communicate with the production MTA.

IV. Configuring MTP Communication Between MTAs and POAs


MTAs and POAs should always be configured to communicate with one another via MTP (Message Transfer Protocol) rather than by message file queuing and message file scanning. MTP is not only a faster means of communication between MTAs and POAs, it is also essential for disaster recovery purposes.

Make sure that the MTAs and the POAs have an MTP port specified. Typically the MTP port for the MTA is 7100, and the MTP port for the POA is 7101.

In ConsoleOne, open the Link Configuration utility and make sure that the Domain to Post Office communication is TCP. When this is done, the MTA and the POA will communicate with one another via MTP.

V.  Configuring the Reload POA and MTA


This can be done from either the Reload web interface or from its administration console.  Only the basic settings needed to run DR are covered here.

Web Interface

1.  From the home screen, click on the profile you wish to configure.

2.  Click on the Configure tab.

3.  Click on Configure Disaster Recovery [ FAILOVER ].

TCP / IP Address:  Enter the IP address you assigned to this POA on the Reload server.  It must differ from the production POA's IP address.

Client / Server Port:  This should be the same port the production POA uses.  This way, the user doesn't have to do any configuring of the client.  In fact, if done correctly, the user is completely unaware that the client is connecting to the DR POA on the Reload server.

Inbound Message Transfer Port (MTP):  Enter the message transfer port number that you noted in ConsoleOne when editing the POA object's network address.  This is the port the DR POA listens on to receive messages from the MTA, so it must match the production POA's setting.  The default is port 7101.

Use Disaster Recovery Mode POA HTTP Port:  This is optional.  It's for the GroupWise POA monitoring feature.

Outbound Message Transfer Port (Domain MTA) (MTP) Address:  Enter the IP address of the production MTA, along with the message transfer port number that you noted in ConsoleOne when editing the MTA object's network address.  This is the port the MTA listens on to receive outbound messages from the Reload DR POA.  The default is port 7100.

4.  Click on the Failover Settings button.  Ensure that all settings are enabled here.


Administration Console

1.  Select:  Profiles | POST OFFICE PROFILES | [profile] | Disaster-Recovery | Configure | F [FAILOVER] | Disaster Recovery POA Settings.

2.  Configure the Disaster Recovery POA

Address:  Enter the IP address you assigned to this POA on the Reload server.  It must differ from the production POA's IP address.

Client:  This should be the same port the production POA uses.  This way, the user doesn't have to do any configuring of the client.  In fact, if done correctly, the user is completely unaware that the client is connecting to the DR POA on the Reload server.

Allow:  This enables the optional HTTP Port setting. It's for the GroupWise POA monitoring feature.

HTTP:  Disaster Recovery POA HTTP port (optional).

MTPIN:  Enter the message transfer port number that you noted in ConsoleOne when editing the POA object's network address.  This is the port the DR POA listens on to receive messages from the MTA, so it must match the production POA's setting.  The default is port 7101.

MTA:  Production domain MTA TCP / IP or DNS address.

MTPOUT:  Enter the message transfer port number that you took note of when in ConsoleOne editing the MTA object's network address.  This is the port the MTA listens on to receive outbound messages from the Reload DR POA. Default setting is port 7100.
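The rules behind these fields can be summarized as a small consistency check.  The following Python sketch is only an illustration of those rules, not part of Reload: the class, the function, and the field names are hypothetical, and the client/server port and addresses are example values that should be replaced with whatever ConsoleOne shows for your agents.

```python
from dataclasses import dataclass


@dataclass
class PoaSettings:
    address: str        # TCP/IP address the POA listens on
    client_port: int    # client/server port used by GroupWise clients
    mtp_in_port: int    # port the POA listens on for messages from the MTA
    mtp_out_port: int   # MTA port the POA sends outbound messages to


def check_dr_settings(production, dr, mta_mtp_port):
    """Return a list of mismatches between the DR POA and production settings."""
    problems = []
    if dr.address == production.address:
        problems.append("DR POA address must differ from the production POA address")
    if dr.client_port != production.client_port:
        problems.append("client/server port must match the production POA")
    if dr.mtp_in_port != production.mtp_in_port:
        problems.append("inbound MTP port must match the production POA")
    if dr.mtp_out_port != mta_mtp_port:
        problems.append("outbound MTP port must match the production MTA's MTP port")
    return problems


# Example values only; 7101 and 7100 are the defaults named in this article.
production = PoaSettings("192.0.2.50", 1677, 7101, 7100)
dr = PoaSettings("192.0.2.11", 1677, 7101, 7100)
problems = check_dr_settings(production, dr, mta_mtp_port=7100)
print(problems or "DR POA settings are consistent")
```

An empty list means the DR POA differs from production only in its IP address, which is the arrangement this guide relies on.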

VI.  * Installing ConsoleOne on the Reload Server

1.  Establish graphical connectivity to the Reload server.

2.  Install ConsoleOne to the Reload server.

3.  Install the GroupWise Snapins.

4.  Test connecting ConsoleOne to a domain backed up by Reload.

5.  In ConsoleOne, browse to the /[domain profile path]/connect/current directory.

* Your Reload consultant may be able to provide services to pre-establish ConsoleOne, or GWAVA can provide consulting services to help with this.


Documenting and Testing Disaster Recovery

At this point, all the hard work has been done.  It is strongly recommended that your disaster recovery plan be documented and tested. 

Document Your DR Plan
Reload provides a great sample disaster recovery document during its installation.  It can be found at /opt/beginfinite/reload/web/custom/drplan/drplan.pdf.  A link to that document is provided on the home screen of the Reload web interface.

Test Your DR Setup
It is assumed that this will be done during a scheduled outage and that users are properly notified.

1.  Change the A record for the post office you are taking down.  Set its IP address to the DR POA IP address for the backup profile for this post office.

2.  Unload the production post office POA.

3.  Go to the Reload web interface and click on the ambulance button for the profile in question.  Within about a minute, the icon will change to an ambulance with bright yellow lights shining.  This tells you that the DR POA has been loaded.

  • This disables the backups for the profile while it is in DR mode.
  • All queued backup jobs for this profile will be removed from the job queue.
  • The Access Mode and Restored Mode POAs will be unloaded (if loaded).
  • The DR POA will be loaded against the most current backup.

4.  Launch a GroupWise client that normally connects to the production post office that has been taken down.

Troubleshooting Issues


This is by no means an exhaustive list of troubleshooting steps, but it covers some of the basics.

Client Cannot Connect to POA
If the client cannot connect, first run nslookup on the DNS name for the post office from the client workstation.  Make sure it returns the IP address of the Reload server's DR POA.  This confirms that you changed the A record and that DNS is properly configured.  It is also possible that the workstation's DNS cache is stale or corrupted.

Make sure the workstation can communicate with the Reload server.  There might be some network issues to resolve.

DR POA Fails to Load
Refer to the KB article, "Access / Restore / or DR Mode POA Will Not Load", for a list of troubleshooting steps.

Messages Can't Be Sent and/or Received Outside the DR Post Office
Check the MTA.  Does it show the post office in question closed?

Check your Reload DR POA configuration.  Correct MTP In Port?  Correct MTP Out Port?  These should match the production POA's settings.

Try to telnet from the domain server to the DR POA:  telnet  [post office DNS name] [POA port].  If successful, you won't get any errors and you'll just see a blinking cursor below the command line.

  • Successful?  Then DNS is configured properly and there are no DNS cache issues.
  • Connection refused or other error?  We know that DNS is configured properly because the client connected.  This means that the MTA server cannot reach the DR POA.
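Where telnet is not installed on the domain server, the same reachability test can be approximated with a short script.  This is a minimal Python sketch assuming only the standard library; the function name is hypothetical, and the demonstration uses a throwaway listener on the local machine rather than a real DR POA.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False


# Demonstration against a temporary local listener instead of a real POA.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
open_port = listener.getsockname()[1]

reachable = can_connect("127.0.0.1", open_port)
listener.close()
closed = can_connect("127.0.0.1", open_port)
print(reachable, closed)
```

In practice you would call can_connect with the post office DNS name and the POA port, run from the domain server, and interpret the result the same way as the telnet test above.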

Go into ConsoleOne and check your POA object:  Does it have the correct DNS name?  If so, bring down the MTA and rebuild the domain database.

Refer to Novell documentation on troubleshooting MTA closed locations.

Post Disaster Recovery

1.  Run the migration tool.  This moves the messages that users sent or received during the DR period back to the production post office.  This can only be done from the Reload administration console.

a)  Recovery | POST OFFICE PROFILE | [post office] | Migrate

b)  Start at "Step #1" - highlight it and press ENTER to begin the pre-migration process.

c)  After Step 1 has completed, have the users exit the client and press ENTER on "Step #2".  This takes you to the Disaster Recovery POA Settings menu.  Go to "Unload" and press ENTER.  This unloads the DR POA but leaves Reload in disaster recovery mode.

d)  Once Step 2 has completed, go back to the migration menu and initiate Step #3 (full migration).  Wait until this process has completed before moving to the next task (turning off DR).

2.  Turn off Disaster Recovery (click on the ambulance button from the web interface).

3.  Verify that Reload has properly re-enabled your backup schedules and that the DR POA has unloaded.

4.  Change the DNS A record for the production POA.  It should now reflect the IP address of the actual post office or live POA.

5.  Load the production post office POA.

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2141.