GroupWise MTA(s) not communicating with other MTAs despite all links being open, in a cluster environment

  • 7022238
  • 27-Oct-2017
  • 27-Oct-2017

Environment


GroupWise 2012 Support Pack 2
GroupWise 2014 R2 Support Pack 2
SUSE Linux Enterprise Server 11 Service Pack 4 (SLES 11 SP4)
Open Enterprise Server 2015 (OES 2015) Linux

Situation

In a SLES 11 / OES 2015 server environment, when the GroupWise MTA (Message Transfer Agent) is loaded from an NCS (Novell Cluster Services) cluster load script, the MTA has the following problems:

 - Despite ALL links being OPEN, the MTA does not communicate with other MTAs to transfer messages, and other MTAs cannot communicate with the problem MTA.
 
 - GroupWise messages sent to Post Offices in other domains stay at a status of Pending.
 
 - The verbose MTA log contains only the MTA's configuration information at the top and does not
 log any activity at all.
 
 - When you go to the web console of the affected MTA, you either cannot log in at all or, if you
 can log in, the "Event Logs" list box under "Log Files" shows no log files and is displayed very narrow.
 
 - Initially the verbose MTA log showed the error "Send failed, error = [B300]". After the change to
 the /etc/security/limits.conf file described under Troubleshooting, this error went away, but a new
 error then appeared in the MTA log: "SCA: <problemDomainName>: I/O error while scanning input queue".
 
Troubleshooting:

- Set the following entries in the SLES 11 /etc/security/limits.conf file:

root     hard nofile 65535
root     soft nofile 65535
*     hard nofile 65535
*     soft nofile 65535

- Noted that 5 of the 7 GroupWise domain MTAs in the cluster did not have this issue. The 2 problem
MTAs had links to 125 other GroupWise MTAs, whereas the 5 MTAs without the problem had links
to only about 10 other GroupWise MTAs.
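
To see how close a process is to its limit, the limit in effect can be compared with the number of descriptors the process actually holds. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names are illustrative, and the process name gwmta passed to pgrep is an assumption about how the MTA appears in the process table. Because /etc/security/limits.conf is applied by pam_limits to PAM sessions, a process started outside a login session may not pick up those values, so checking the running process directly avoids guessing.

```shell
#!/bin/bash
# Print the soft open-file limit that applies to a running process.
soft_nofile_of_pid() {
    awk '/^Max open files/ {print $4}' "/proc/$1/limits"
}

# Count the file descriptors a process currently has open.
count_open_fds() {
    ls "/proc/$1/fd" | wc -l
}

# Limits of the current shell (what a child started from here would inherit).
echo "shell soft limit: $(ulimit -Sn), hard limit: $(ulimit -Hn)"

# Assumption: the MTA process is named gwmta. Nothing is printed if none is running.
for pid in $(pgrep -x gwmta 2>/dev/null); do
    echo "PID $pid: $(count_open_fds "$pid") open, soft limit $(soft_nofile_of_pid "$pid")"
done
```

Reading /proc/<pid>/fd for a process owned by another user normally requires root.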

Resolution

Placing the following line in the GroupWise cluster resource load script resolved the problem:
  
   exit_on_error ulimit -n 65535
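
ulimit is a shell builtin, so a limit set in a script applies to that script's shell and is inherited by every process the script starts afterwards. This suggests placing the line before the command that starts the agent, although this document does not state the position. The sketch below demonstrates the inheritance without NCS. It assumes the NCS helper runs the command in the script's own shell, as the stand-in does; exit_on_error here is a hypothetical reimplementation for the demo only, and the soft limit is lowered to 256 rather than raised to 65535 so that the demo runs without root privileges.

```shell
#!/bin/bash
# Hypothetical stand-in for the NCS helper of the same name (demo only):
# run the command in the current shell and abort the script if it fails.
exit_on_error() {
    "$@" || { echo "command failed: $*" >&2; exit 1; }
}

# Change the soft open-file limit of this shell.
# (Lowered to 256 for the demo; the load script raises it to 65535.)
exit_on_error ulimit -S -n 256

# A child process started after the ulimit line inherits the new limit.
child_limit=$(bash -c 'ulimit -Sn')
echo "limit seen by child process: $child_limit"
```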

Cause

Insufficient open file handles in a Linux cluster environment.

Additional Information

When the GroupWise MTA loads, it makes an API call to the Linux operating system to increase the
number of open file handles to 200000.  This call is not working; the defect has been reported
to GroupWise Development.

Until this is fixed, you can try setting the number of open file handles
in the /etc/sysconfig/grpwise file:  GROUPWISE_MAX_OPEN_FILE_HANDLES="200000".  If this file is
not present, you can make this change in the /etc/init.d/grpwise script, or you can use the ulimit
command as described in the Resolution section of this document.
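
A quick way to confirm whether the setting described above is present is to look for it in the file. This is a minimal sketch; the function name check_gw_handles is illustrative and not part of GroupWise.

```shell
#!/bin/bash
# Report whether GROUPWISE_MAX_OPEN_FILE_HANDLES is set in the given file
# (defaults to /etc/sysconfig/grpwise, the path named in this document).
check_gw_handles() {
    conf=${1:-/etc/sysconfig/grpwise}
    if [ -r "$conf" ]; then
        grep '^GROUPWISE_MAX_OPEN_FILE_HANDLES=' "$conf" || echo "not set in $conf"
    else
        echo "$conf not found"
    fi
}

check_gw_handles
```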