Slow Exchange Jobs because of long enterMailbox waits

  • 7019119
  • 15-Oct-2015
  • 07-Aug-2017

Environment


Retain 3.x
Exchange Module

Situation


Archive jobs against Exchange have really low throughput and in the logs we see long wait times when Retain attempts to enterMailbox and when the folder list begins to be returned.

In the RetainWorker log you'll see something like this:

2015-09-25 12:00:07,256 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.caapi.MailboxArchivingStats: enterMailbox: JDoe@RETAIN.GWAVAUTAH
2015-09-25 12:00:10,828 DEBUG [http-bio-48080-exec-2] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:00:21,857 DEBUG [http-bio-48080-exec-13] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:00:32,886 DEBUG [http-bio-48080-exec-15] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:00:43,900 DEBUG [http-bio-48080-exec-5] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:00:54,929 DEBUG [http-bio-48080-exec-3] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:01:05,958 DEBUG [http-bio-48080-exec-2] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:01:16,988 DEBUG [http-bio-48080-exec-13] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:01:28,017 DEBUG [http-bio-48080-exec-15] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:01:39,030 DEBUG [http-bio-48080-exec-2] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:01:50,060 DEBUG [http-bio-48080-exec-5] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:02:01,089 DEBUG [http-bio-48080-exec-3] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:02:12,118 DEBUG [http-bio-48080-exec-13] com.gwava.jobs.stats.OutputJobStatistics: processing running jobs...
2015-09-25 12:02:12,399 INFO  [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.process.ExchangeUser:
2015-09-25 12:02:12,399 INFO  [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.autodiscover.AutoDiscover: NOT in discoveredURL cache. We'll check the autodiscover server (if we know it)...
2015-09-25 12:02:12,399 INFO  [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.autodiscover.AutoDiscover: Finding autodiscover service for internet domain: testsystem.org
2015-09-25 12:02:12,414 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.address.exchange.impl.AbstractLDAPOperation: CN=PAPPHPDMAIL10,CN=Autodiscover,CN=Protocols,CN=PAPPHPDMAIL10,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Test System,CN=Microsoft Exchange,CN=Services is name
2015-09-25 12:02:12,414 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.address.exchange.impl.AbstractLDAPOperation: CN=PAPPHPDMAIL11,CN=Autodiscover,CN=Protocols,CN=PAPPHPDMAIL11,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Test System,CN=Microsoft Exchange,CN=Services is name
2015-09-25 12:02:12,414 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.address.exchange.impl.AbstractLDAPOperation: CN=PAPPHPDMAIL12,CN=Autodiscover,CN=Protocols,CN=PAPPHPDMAIL12,CN=Servers,CN=Exchange Administrative Group (FYDIBOHF23SPDLT),CN=Administrative Groups,CN=Test System,CN=Microsoft Exchange,CN=Services is name
2015-09-25 12:02:12,414 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.autodiscover.AutoDiscover: Trying these url [url=https://ad.test.sys/autodiscover/autodiscover.xml,class=scp3, url=https://testsystem.org/autodiscover/autodiscover.xml,class=staticSSL, url=https://autodiscover.testsystem.org/autodiscover/autodiscover.xml,class=staticSSL, url=http://autodiscover.testsystem.org/autodiscover/autodiscover.xml,class=staticRedir]
2015-09-25 12:02:12,414 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.autodiscover.AutoDiscover: Try https://ad.test.sys/autodiscover/autodiscover.xml
2015-09-25 12:02:12,742 INFO  [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.process.ExchangeUser: discovered: https://ad.test.sys/ews/exchange.asmx
2015-09-25 12:02:13,896 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.process.ExchangeUser: ServerVersion (major.minor): 14.3
2015-09-25 12:02:14,162 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.InterceptedExchangePortImpl: InterceptedExchangePortImpl, request Serverversion: Exchange2010_SP1
2015-09-25 12:02:14,177 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.process.ExchangeUser: Discovered endpoint: https://ad.test.sys/ews/exchange.asmx
2015-09-25 12:02:14,193 INFO  [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.EWSFunctions: XmlStreaming impl: com.ctc.wstx.stax.WstxInputFactory
2015-09-25 12:02:14,193 INFO  [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.archiveimpl.EWSFunctions: file:/D:/Program%20Files/Apache%20Software%20Foundation/Tomcat%207.0/lib/webservices-rt.jar!/com/sun/xml/stream/ZephyrParserFactory.class
2015-09-25 12:02:14,193 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.RetainExchangeWebserviceFactory: invoking retainews method: getFolder
2015-09-25 12:02:14,458 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.RetainExchangeWebserviceFactory: invoking retainews method: getFolder
2015-09-25 12:02:14,474 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.RetainExchangeWebserviceFactory: invoking retainews method: getFolder
2015-09-25 12:02:14,489 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.folders.ExchangeFolderInfo: folderClass: IPF.Appointment is Calendar: true
2015-09-25 12:02:14,505 DEBUG [RTWQuartzScheduler_Archive_Worker-1] com.gwava.ews.folders.ExchangeFolderCache: adding folder to the chache: Calendar - namePath :\Doe, John\Calendar, \Root\Calendar, id: AAMkADEzODc0MTI3LTgxODItNGJkYi1iYWZmLTgwM2MwMTk5ODBmMwAuAAAAAAAXyQoixXwhR6212LgnnIFeAQCwjlFYbxhnRocXCHH2LaMtAAAAUuYqAAA=

There will be a long wait, that can be minutes long, between when Retain attempts to enter the user mailbox and when the folder list is returned. In a well configured system it will take only a few hundred milliseconds to enter the mail box.

We have seen this on customer systems where the logon domain name is different from the email domain name.

Jobs run normally if they are run against a single user or journaling mailbox.

In Retain 4.x there is enhanced logging that makes this easier to diagnose. 3.5.x enhanced logging is available as a hotfix.

Resolution


One solution was to add the email domain to the DNS as a Forward Lookup Zone we were able to reduce the wait from 2 minutes to 2 seconds. However, with enhanced logging we see that Retain still fails in the initial attempt to enter the mailbox. It is merely failing faster, mailnly because it can find that the email domain has no autodiscover SCP. This may cause interference with the Exchange server.

Another possible solution is to change the logon domain to the email domain, though this will require adding that to AD Domains and Trusts as an alternative UPN suffix.

see also
Autodiscover: How Retain Connects to Your Exchange Mailboxes
Multi-domain Login of Exchange Users

Archive Jobs Running Very Slowly

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2630.