Analyzing Retain 4 Worker Logs

  • 7019358
  • 19-Feb-2016
  • 07-Aug-2017

Environment


Retain 4.x

Situation


How do I read Retain 4.x worker logs?

Resolution

The RetainWorker logs are somewhat different in Retain 4.x compared to Retain 3.x.

Job Begin Block:

The job will begin with a block of text similar to this wiht details about the job about to be run:
16:25:51, 848[RTWQuartzScheduler_Archive_Worker-2] [INFO ] RetainDredger: Job starting Dredge Everything
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:     Mail System : gw
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:     Profile Settings applied to this job :
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Profile Name : Everything
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Mailbox Type : Users, Resources
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Item Type : Mail, Appointment, Note, Task
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Item Source : All
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Message Status : Opened Doesn't Matter|Read Doesn't Matter|Private Doesn't Matter|Personal Doesn't Matter|Confidential Doesn't Matter
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:     Scope
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Date Range to Scan : All Messages (ignore date)
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Duplicate Check : Try to publish all messages
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Set Storage Flags : Don't set Purge flag, Don't set Retention flag, Set Item Store flag
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:     Miscellaneous
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Store Attachments : Don't store MIME.822 attachments
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Store/index Internet Headers : true
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Include Routing Properties : true
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Include Shared Folders : all
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:     Advanced
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Advanced Criteria : 0
16:25:51, 849[RTWQuartzScheduler_Archive_Worker-2] [DEBUG] RetainDredger:         Folder Scope : all

Mailbox Entry Block:

Each mailbox will have a login sequence:
15:45:23, 836[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainDredger: =============================================================================================
15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [INFO ] RetainDredger: Logging in as : alpha to dredge their mailbox.
15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ProcessFilters: Profile.getDateType(): All Messages (ignore date)
15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [INFO ] RetainDredger:    - Query Range Used:

15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [INFO ] RetainDredger:       - GTE:1970-01-01 00:00:00
15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [INFO ] RetainDredger:       - LTE:2016-02-17 22:45:23
15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [TRACE] GroupWiseUser: Login: get new GWSoapClient...
15:45:24, 055[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainLoginListener: About to Login
15:45:24, 055[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainLoginListener: Initial gwscon is sfgw14.sfgw14po at 10.1.4.214.7191
15:45:24, 055[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainLoginListener: Initial redirect is nil so there was no bounce
15:45:24, 070[RTWQuartzScheduler_Archive_Worker-1] [TRACE] EasySoap: GW-Version: 14.0.1 , build: 120664, AttachmentOverHTTPSupported: true
15:45:24, 070[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainUser: Begin login of alpha
15:45:24, 071[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainServerCommunication: RetainServerCommunication.obtainHTTPResponseCodeAndData() - About to post to : http://10.1.4.60:48080/RetainServer/Server/ServerController2.jsp Params:{emailSystem=gw, password=TVlDVU9LUUNCTlJXQ0haRFRBVQ==, mailbox=alpha, method=checkIfUserExists, abSystem=gw, domain=sfgw14, display=Alpha Datus, uuid=56E72D00-02BF-0000-A473-666733343035, archiverID=5908A19EDD36E005B77A670BE2C7EA03, email=alpha@sfgw14.gwava.net, po=sfgw14po}
15:45:24, 165[RTWQuartzScheduler_Archive_Worker-1] [TRACE] HttpUIDResolver: checkIfUserExists response: OK: USER EXISTS ::: 56E72D00-02BF-0000-A473-666733343035
15:45:24, 165[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainUser: End login of alpha
15:45:24, 165[RTWQuartzScheduler_Archive_Worker-1] [TRACE] StatsUpdater: updating job status information name: lastmailboxnumber
15:45:24, 165[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainServerCommunication: RetainServerCommunication.obtainHTTPResponseCodeAndData() - About to post to : http://10.1.4.60:48080/RetainServer/Server/ServerController2.jsp Params:{emailSystem=gw, jobid=BA938A6A5BE5995A1A3009D94A924CB3, password=TVlDVU9LUUNCTlJXQ0haRFRBVQ==, method=svStatus, proc_id=null, f_order=1, lastmailboxnumber=0, archiverID=5908A19EDD36E005B77A670BE2C7EA03}
15:45:24, 174[RTWQuartzScheduler_Archive_Worker-1] [TRACE] StatsUpdater: updating job status information name: lastmailbox
15:45:24, 175[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainServerCommunication: RetainServerCommunication.obtainHTTPResponseCodeAndData() - About to post to : http://10.1.4.60:48080/RetainServer/Server/ServerController2.jsp Params:{emailSystem=gw, jobid=BA938A6A5BE5995A1A3009D94A924CB3, password=TVlDVU9LUUNCTlJXQ0haRFRBVQ==, lastmailbox=alpha, method=svStatus, proc_id=null, f_order=1, archiverID=5908A19EDD36E005B77A670BE2C7EA03}
15:45:24, 186[RTWQuartzScheduler_Archive_Worker-1] [TRACE] StatsUpdater: updating job status information name: lastpo
15:45:24, 186[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainServerCommunication: RetainServerCommunication.obtainHTTPResponseCodeAndData() - About to post to : http://10.1.4.60:48080/RetainServer/Server/ServerController2.jsp Params:{emailSystem=gw, jobid=BA938A6A5BE5995A1A3009D94A924CB3, password=TVlDVU9LUUNCTlJXQ0haRFRBVQ==, method=svStatus, proc_id=null, f_order=1, lastpo=sfgw14po, archiverID=5908A19EDD36E005B77A670BE2C7EA03}
15:45:24, 203[RTWQuartzScheduler_Archive_Worker-1] [TRACE] JobStatusTracker: Starting the stat mailbox run update for user: 56E72D00-02BF-0000-A473-666733343035
15:45:24, 204[RTWQuartzScheduler_Archive_Worker-1] [ERROR] JobStatusTracker: startMailboxRun called with job run not set.
15:45:24, 207[RTWQuartzScheduler_Archive_Worker-1] [TRACE] MailboxArchivingStats: enterMailbox: alpha
15:45:24, 213[RTWQuartzScheduler_Archive_Worker-1] [TRACE] GWProcessFilters: Date filter: All Messages (ignore date)
15:45:24, 214[RTWQuartzScheduler_Archive_Worker-1] [TRACE] GWProcessFilters: AND
 source IS_OF  sent personal received draft
 @type IS_OF  Mail Appointment Note Task

Message Block:

An individual message will be bracketed between lines. Even for the simplest messages the amount of data provided is overwhelming:
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: =============================================================
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [TRACE] Archiver: Processing ID: 56B1ED81.sfgw14.sfgw14po.100.1756436.1.451.1@1:7.sfgw14.sfgw14po.100.0.1.0.1@16
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Archive Message: 56B1ED80.sfgw14.sfgw14po.200.2000072.1.A9B8.1
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Location:\Beta Datus Home\Sent Items
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Subject:test message noon
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Date:2016-02-03 19:07:29
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Sender:beta@sfgw14.gwava.net
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Recipient(s):alpha@sfgw14.gwava.net
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Number of attachments: 0
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [TRACE] Archiver: ProcessFolderPath called, type is: all
16:14:12, 060[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveMailboxBase: Reset LastArchivedItemRetentionDate to 1. New folder: \Beta Datus Home\Sent Items, old: \Beta Datus Home\Mailbox
16:14:12, 061[RTWQuartzScheduler_Archive_Worker-1] [TRACE] Archiver: itemType: Mail
16:14:12, 061[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveMetaData: hashMapToQueryString: archiver=GW0+local&senderUUID=5BABE100-02BF-0000-A473-666733343035&subject=test+message+noon&senderDisplay=Beta+Datus&delivered=1454526449&uuid=5BABE100-02BF-0000-A473-666733343035&userid=beta&dirPathIDStr=%5CRoot%5CSentItems&folderID=19.sfgw14.sfgw14po.100.0.1.0.1%4030&archiverID=5908A19EDD36E005B77A670BE2C7EA03&itemTypeString=Mail&jobid=BA938A6A5BE5995A1A3009D94A924CB3&password=TVlDVU9LUUNCTlJXQ0haRFRBVQ%3D%3D&dirPath=%5CBeta+Datus+Home%5CSent+Items&from=beta%40sfgw14.gwava.net&categories=&personalSubject=&friendlyname=Dredge+Everything&expireDays=0&created=1454526448&msgid=56B1ED80.sfgw14.sfgw14po.200.2000072.1.A9B8.1&priority=Standard&isSubmail=false&boxTypeString=sent&gw.umsgid=56B1ED81.sfgw14.sfgw14po.100.1756436.1.451.1%401%3A7.sfgw14.sfgw14po.100.0.1.0.1%4016&recipients=0%3AAlpha+Datus%3Aalpha%40sfgw14.gwava.net%3A56E72D00-02BF-0000-A473-666733343035%3B&item.class=com.novell.groupwise.ws2.Mail&attachmentCount=0&internet=false&status=accepted%2Copened%2Cread
16:14:12, 061[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveMetaData: hashMapToQueryString: archiver=GW0+local&senderUUID=5BABE100-02BF-0000-A473-666733343035&subject=test+message+noon&senderDisplay=Beta+Datus&delivered=1454526449&filesize=0&uuid=5BABE100-02BF-0000-A473-666733343035&userid=beta&dirPathIDStr=%5CRoot%5CSentItems&folderID=19.sfgw14.sfgw14po.100.0.1.0.1%4030&archiverID=5908A19EDD36E005B77A670BE2C7EA03&itemTypeString=Mail&finalpart=no&jobid=BA938A6A5BE5995A1A3009D94A924CB3&password=TVlDVU9LUUNCTlJXQ0haRFRBVQ%3D%3D&dirPath=%5CBeta+Datus+Home%5CSent+Items&from=beta%40sfgw14.gwava.net&categories=&contentType=bodytext%2Fplain&personalSubject=&friendlyname=Dredge+Everything&expireDays=0&created=1454526448&msgid=56B1ED80.sfgw14.sfgw14po.200.2000072.1.A9B8.1&priority=Standard&isSubmail=false&boxTypeString=sent&compressedSize=20&filename=test+message+noon.txt&gw.umsgid=56B1ED81.sfgw14.sfgw14po.100.1756436.1.451.1%401%3A7.sfgw14.sfgw14po.100.0.1.0.1%4016&recipients=0%3AAlpha+Datus%3Aalpha%40sfgw14.gwava.net%3A56E72D00-02BF-0000-A473-666733343035%3B&item.class=com.novell.groupwise.ws2.Mail&attachmentCount=0&hash=E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855&internet=false&status=accepted%2Copened%2Cread
16:14:12, 061[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainServerCommunication: Sending data and waiting for response.
16:14:12, 061[RTWQuartzScheduler_Archive_Worker-1] [TRACE] HTTPWrapper2: using ApacheClient with TimedHttpMethodRetryHandler...
16:14:12, 061[RTWQuartzScheduler_Archive_Worker-1] [TRACE] HTTPWrapper2: headerstream.length: 1153 dataStream: 20
16:14:12, 083[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveEmail: sent mail textbody with subject: test message noon, server-response: Archive: OK: All is well. Result=AddedEMails: 1, emailID=2763, parentID=8
16:14:12, 083[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveEmail: +++msgCount: 1
16:14:12, 083[RTWQuartzScheduler_Archive_Worker-1] [TRACE] RetainServerCommunication: Sending data and waiting for response.
16:14:12, 083[RTWQuartzScheduler_Archive_Worker-1] [TRACE] HTTPWrapper2: using ApacheClient with TimedHttpMethodRetryHandler...
16:14:12, 084[RTWQuartzScheduler_Archive_Worker-1] [TRACE] HTTPWrapper2: headerstream.length: 1246 dataStream: 409
16:14:12, 105[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveEmail: EmailID: 2763, parentID: 8, stubbingStage: 0
16:14:12, 105[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Message archived succesfully!
16:14:12, 105[RTWQuartzScheduler_Archive_Worker-1] [TRACE] Archiver: lastArchivedMailDeliveredDate vs. mail.deliveredDate (using delivered-date): Wed Dec 31 17:00:00 MST 1969 vs: Wed Feb 03 12:07:29 MST 2016
16:14:12, 106[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: =============================================================

Error Block:

An error will look something like this, depending on the exact nature of the error:
16:15:49, 881[RTWQuartzScheduler_Archive_Worker-1] [ERROR] GroupWiseUser: Login
com.gwava.gweasysoap.GWSoapAPIException: User Database is temporarily disabled , Code: d714
    at com.gwava.gweasysoap.EasySoap.fire(EasySoap.java:356)
    at com.gwava.gweasysoap.EasySoap.login1(EasySoap.java:317)
    at com.gwava.gweasysoap.EasySoap.<init>(EasySoap.java:104)
    at com.gwava.gweasysoap.GWSoapClient.<init>(GWSoapClient.java:24)
    at com.gwava.gw.GroupWiseUser.login(GroupWiseUser.java:149)
    at com.gwava.caapi.process.RetainDredger.processMailboxes(RetainDredger.java:403)
    at com.gwava.caapi.process.RetainDredger.processNormalJob(RetainDredger.java:300)
    at com.gwava.gw.jobs.LiveGroupWise.executeWork(LiveGroupWise.java:111)
    at com.gwava.jobs.RetainArchiveJob.performDredge(RetainArchiveJob.java:567)
    at com.gwava.jobs.RetainArchiveJob.execute(RetainArchiveJob.java:235)
    at com.gwava.jobs.RetainArchiveJob.execute(RetainArchiveJob.java:448)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
16:15:49, 906[RTWQuartzScheduler_Archive_Worker-1] [ERROR] StandardErrorHandleStrategy: reportError: WorkerJob :: com.gwava.caapi.process.RetainDredger.processMailboxes:472 :: com.gwava.archive.exceptions.RetainWorkerException: com.gwava.gweasysoap.GWSoapAPIException: User Database is temporarily disabled , Code: d714,
 currentUser: disabled
    at com.gwava.caapi.process.RetainDredger.processMailboxes(RetainDredger.java:470)
    at com.gwava.caapi.process.RetainDredger.processNormalJob(RetainDredger.java:300)
    at com.gwava.gw.jobs.LiveGroupWise.executeWork(LiveGroupWise.java:111)
    at com.gwava.jobs.RetainArchiveJob.performDredge(RetainArchiveJob.java:567)
    at com.gwava.jobs.RetainArchiveJob.execute(RetainArchiveJob.java:235)
    at com.gwava.jobs.RetainArchiveJob.execute(RetainArchiveJob.java:448)
    at org.quartz.core.JobRunShell.run(JobRunShell.java:216)
    at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:549)
Caused by: com.gwava.gweasysoap.GWSoapAPIException: User Database is temporarily disabled , Code: d714
    at com.gwava.gweasysoap.EasySoap.fire(EasySoap.java:356)
    at com.gwava.gweasysoap.EasySoap.login1(EasySoap.java:317)
    at com.gwava.gweasysoap.EasySoap.<init>(EasySoap.java:104)
    at com.gwava.gweasysoap.GWSoapClient.<init>(GWSoapClient.java:24)
    at com.gwava.gw.GroupWiseUser.login(GroupWiseUser.java:149)
    at com.gwava.caapi.process.RetainDredger.processMailboxes(RetainDredger.java:403)
    ... 7 more

Simple searching for "[ERROR]" and finding the first one in the stack will most often allow you to start diagnosing the issue.
You can use any text editor that will handle large files such as NotePad++, TextPad, glogg, or LTFViewr.

Pattern Finding:

However, sometimes the issue is a bit more complex and we need to find patterns. We need to boost the signal to noise ratio by focusing on the individual elements so we can begin to discern patterns in the data.
We need to reduce the data to just the most basic: When did the job begin and end, which mailbox and what message is being processed, was it successful and were there any errors.

I prefer to use TextPad as it can bookmark lines, which can be copy (Edit/Copy Other/Bookmarked Lines) and pasted into a new document to make seeing patterns much easier.

Job

To determine the beginning of a job, search for "Job starting":
16:25:51, 848[RTWQuartzScheduler_Archive_Worker-2] [INFO ] RetainDredger: Job starting Dredge Everything

To determine the end of a job, search for "Job ending":
17:05:11, 659[RTWQuartzScheduler_Archive_Worker-2] [INFO ] RetainDredger: Job ending : Dredge Everything

Mailbox

To detemine which mailboxes were entered, search for "Logging in as":
15:45:23, 838[RTWQuartzScheduler_Archive_Worker-1] [INFO ] RetainDredger: Logging in as : alpha to dredge their mailbox.

To determine when a mailbox was logged out of, search for "StatsUpdater: archived", which tells you how many messages were archived and also shows up at the end of the job:
16:05:52, 939[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] StatsUpdater: archived 1773 msgs for this mailbox. Job total Errors: 0

Message

We need just one instance of a message, it might be handy to have as much data as possible and the most unique line is the hash.

To determine what messages were send seach for "&hash=" which also allows you to search the archive to see if the message was actually archived to disk:
15:46:11, 675[RTWQuartzScheduler_Archive_Worker-1] [TRACE] ArchiveMetaData: hashMapToQueryString: archiver=GW0+local&subject=Server+Utilization&senderDisplay=Retain+Reporting+%3Cretain40%40gwava.net%3E&delivered=1447052414&filesize=293&uuid=56E72D00-02BF-0000-A473-666733343035&userid=alpha&dirPathIDStr=%5CRoot%5CMailbox&folderID=7.sfgw14.sfgw14po.100.0.1.0.1%4016&archiverID=5908A19EDD36E005B77A670BE2C7EA03&itemTypeString=Mail&finalpart=no&jobid=BA938A6A5BE5995A1A3009D94A924CB3&password=TVlDVU9LUUNCTlJXQ0haRFRBVQ%3D%3D&dirPath=%5CAlpha+Datus+Home%5CMailbox&from=retain40%40gwava.net&categories=&contentType=bodytext%2Fplain&personalSubject=&friendlyname=Dredge+Everything&expireDays=0&created=1447052413&msgid=563FE20D.sfgw14.sfgw14po.200.200009E.1.3152.1&priority=Standard&isSubmail=false&boxTypeString=received&compressedSize=217&filename=Server+Utilization.txt&gw.umsgid=563FE20D.sfgw14.sfgw14po.100.167336B.1.596.1%401%3A7.sfgw14.sfgw14po.100.0.1.0.1%4016&recipients=0%3AAlpha+Datus%3Aalpha%40sfgw14.gwava.net%3A56E72D00-02BF-0000-A473-666733343035%3B&item.class=com.novell.groupwise.ws2.Mail&attachmentCount=1&hash=FE7ED9CAA21E8524AC1F185C5A0F781BBB8B0D1A0F73F2ECBE19656F981C7001&internet=true&status=

To determine if the message was archived or not, search for "Archiver: Message" and you will see one of these messages:
15:46:12, 196[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Message archived succesfully!
15:45:27, 069[RTWQuartzScheduler_Archive_Worker-1] [DEBUG] Archiver: Message was previously archived so skipped!

To detemine if a message had an error, search for "[ERROR]". There may be a significant amount of noise which is why having job, mailbox and message information is important to provide context:
15:39:50, 606[RTWQuartzScheduler_Archive_Worker-1] [ERROR] DefaultCAAPIErrorHandleStrategy: reportError: ArchiveEmail :: com.gwava.workerutils.DefaultCAAPIErrorHandleStrategy.returnCaller:55 :: Server-ERROR encountered on item: 563FE20D.sfgw14.sfgw14po.200.200009E.1.3152.1com.gwava.archive.exceptions.RetainWorkerException: Server-ERROR encountered on item: 563FE20D.sfgw14.sfgw14po.200.200009E.1.3152.1

Example:

I had a customer that had a job running for a very long time 23 days whic had processed over 15 million messages. However, there were only 3.5 million messages in the system. Something was obviously broken. Gathering the previous day's worker log file I was overwhelmed since it was nearly 3GB in size. The errors were mainly Attachment "null" errors and others that did not appear to have any way to cause whatever was happening.

There was only one mailbox archived during this day as all the "Archiver: Location" folders were from a single user.

All the messages were previously archived and skipped, which told me that we were looping through a single mailbox.

I decided to try seeing if there was a pattern. I found that the Processing ID: is unique and an always growing hexidecimal number when archiving O365. Therefore useless.

Since the hash is unique in Retain I could try that, knowing that some messages could share the hash if an attachment or other email body part was identical, but figuring I could further filter if needed. However I did not need to. The first hash in the list returned eight results and they were all identical. I tried other hashes and found the same pattern. Somehow we were looping through the mailbox every ~3 hours. We stopped the looping job, ran a job against a single user, which ran successfully and ended gracefully. Aborting the job broke the infinite loop and allowed Retain to complete the next job successfully.

Note: Sometimes messages have control characters in them like NUL and TextPad won't allow you to copy&paste, you can use Replace with regular expression and hex to change NUL (\x00) to something innocuous like \t (tab).

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2746.