GroupWise Internet Agent (GWIA) not receiving internet mail and there is a delay in GWIA sending a 220 response to the Spam appliance fronting the GWIA

  • 7021797
  • 27-Sep-2017
  • 28-Sep-2017

Environment

GroupWise 2012 Support Pack 2
GroupWise 2014 R2 Support Pack 2

Situation

A GroupWise 2012 customer suddenly stopped receiving inbound SMTP messages on the GroupWise
Internet Agent (GWIA).  A Barracuda firewall / spam hardware appliance fronts the GWIA for both inbound
and outbound mail.

Resolution

Symptoms and Findings:

1.  Since the messages never reached the GWIA, we looked at the Barracuda activity log and could
see attempts to transfer the SMTP messages to the GWIA, but the messages had a status of
"queued", which means that there was a problem communicating with the GWIA, so the Barracuda queued
the messages for later delivery.

2.  We ran the Barracuda configuration test of its connection to the GWIA, and it failed with the error:
  "Error performing SMTP test".  We confirmed that the Barracuda configuration had the correct IP address for the GWIA.
  
3.  We looked at a GWIA "verbose" log and saw no errors.  It was not until we set the GWIA to the
diagnostic log level that we saw the following errors repeated frequently:


17:32:47 EF88 NgwResQuery(95.1.168.192.in-addr.arpa, 1, 12)
17:32:47 EF88 Querying server (# 1) address = 192.168.1.62
17:32:52 EF88 timeout
17:32:52 EF88 Querying server (# 1) address = 192.168.1.62
17:33:02 EF88 timeout
17:33:02 EF88 Querying server (# 1) address = 192.168.1.62
17:33:15 F3AF timeout
17:33:15 F3AF NgwResQuery: send error
17:33:15 F3AF NgwResQuery failed

4.  The problem and the GWIA log errors occurred both when we sent an internet e-mail inbound to the customer's
GroupWise system and when we did a telnet to port 25 directly on the GWIA Linux server.

5.  We discovered that when we did a telnet to port 25, it took about 70 to 90 seconds before
we got back the 220 response from the GWIA, which for GW2012 normally looks like this:

  Note: substitute your GWIA server hostname

     "220 bperez6.lab.novell.com Ready"
     
  Since the Barracuda was not getting a timely response from the GWIA, it gave up and
  queued the messages for delivery later, when it no longer detected a communication problem
  with the GWIA.
  
6.  A network packet trace taken on the Linux GWIA server during an attempted mail delivery
confirmed that there was no TCP/IP network communication problem between the Barracuda server
and the GWIA server.  However, the trace showed a 75 second delay immediately after the
Barracuda server opened the connection to the GWIA, while it was waiting for the 220 response
it needs before it can continue.  This 75 second delay caused the Barracuda to conclude that
communication with the GWIA had failed, so it gave up, and the same thing happened on every
subsequent attempt.

7.  The GroupWise GWIA object configuration was checked and no problem was found with
the overall GWIA configuration or its access control database information (gwac.db), including
blacklist server references, which could have been involved but were not.

8.  We could see that the IP address listed in the errors in point 3 above belonged to a Windows server
that happened to have a DNS process running on it.  The GWIA server should not have been querying this
invalid DNS server IP address.  We found that /etc/resolv.conf on the SLES11 / OES11 server contained a
"nameserver <ipAddress>" line pointing to this same IP address, which it should not have.

We modified this GWIA server's /etc/resolv.conf file to have the correct nameserver IP address.  The SLES11/OES11 nameserver daemon was not running, so we started it with "rcnovell-named start".

Once we did this and also restarted the GWIA, the problem went away and the GW2012 GWIA no longer made attempts to query the invalid nameserver IP address.
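
The slow 220 greeting described in point 5 can also be measured with a short script instead of a manual telnet session, both before and after the fix.  The following is a minimal sketch, not a GroupWise or Barracuda tool: the function name time_smtp_banner and the 120 second default timeout are assumptions chosen for illustration.

```python
import socket
import time


def time_smtp_banner(host, port=25, timeout=120.0):
    """Connect to an SMTP listener and time how long the greeting takes.

    Returns (seconds_elapsed, first_line_of_greeting).  The timer starts
    before the TCP connect, so it covers the connect plus the wait for
    the greeting, which is what a fronting appliance experiences.
    """
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        data = b""
        # Read until the first line of the greeting is complete.
        while b"\n" not in data:
            chunk = sock.recv(1024)
            if not chunk:
                break  # peer closed the connection before a full line arrived
            data += chunk
    elapsed = time.monotonic() - start
    banner = data.decode("ascii", errors="replace").splitlines()[0] if data else ""
    return elapsed, banner
```

Calling time_smtp_banner with the GWIA server's hostname (substitute your own) returns the elapsed seconds and the greeting line.  A healthy GWIA should answer almost immediately, whereas in this case the delay was roughly 70 to 90 seconds.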

Cause

The GroupWise 2012 GWIA was repeatedly querying the nameserver listed in the SLES11 server's /etc/resolv.conf file and not getting a response, which tied up connection resources.
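
To check for this cause on another server, each nameserver listed in the resolver configuration can be probed with a single reverse (PTR) lookup and a timeout, much like the queries in the GWIA diagnostic log.  The following is a minimal sketch under stated assumptions: the helper names (parse_nameservers, build_ptr_query, probe_nameserver) are hypothetical, the probe treats any UDP reply as "responding" without validating it, and it does not reproduce how the GWIA itself performs lookups.

```python
import socket
import struct
import time


def parse_nameservers(resolv_conf_text):
    """Return the addresses on 'nameserver' lines of resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_ptr_query(ipv4, query_id=0x1234):
    """Build a minimal DNS PTR query packet for an IPv4 address."""
    name = ".".join(reversed(ipv4.split("."))) + ".in-addr.arpa"
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii") for label in name.split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN.
    return header + qname + struct.pack(">HH", 12, 1)


def probe_nameserver(server, lookup_ip="127.0.0.1", port=53, timeout=5.0):
    """Send one PTR query over UDP.

    Returns the seconds until any reply arrives, or None on timeout or
    another socket error.
    """
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_ptr_query(lookup_ip), (server, port))
            sock.recvfrom(512)
        except OSError:  # socket.timeout is a subclass of OSError
            return None
        return time.monotonic() - start
```

One way to use it is to read /etc/resolv.conf, pass its text to parse_nameservers, and call probe_nameserver on each address.  A result of None for a listed nameserver suggests the kind of timeouts shown in the diagnostic log.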