How to Read Retain Worker Logs (Exchange / O365 Jobs)

  • 7019357
  • 23-Jul-2015
  • 01-Sep-2017

Environment

Retain 3.x
Exchange or O365

Situation

An archive job runs yet the user's Inbox items are not in the Retain mailbox's Inbox folder.

Resolution

This article will take you through the steps of reading the Worker log to determine whether it even received the Inbox folder from Exchange/O365 or whether the item was processed during the job.  There could have been errors on the Inbox itself, or errors on the specific items.  However, there also could be an issue with the profile's settings that cause Retain's query not to match the items in the Exchange Inbox.

However, you might be dealing with other archiving issues and just want to understand how to read the Worker log.  This article helps out with that; however, the Retain 3.x logs are admittedly very difficult to decipher, so after your best efforts, please feel free to open a support ticket, upload the Worker log(s) to Support's FTP server, and we can have a developer decipher it if we cannot. 

It should be noted that the Worker logging has been completely revamped in Retain 4.  They have been made much easier to read and make sense of.



When you see an issue like this, the first thing to do is go right to the Worker log.  The Worker will request items from O365 one folder at a time.  So, it will mention "Inbox" specifically in the log and then it will start processing items. 

The best approach is to search the Worker log for "dredging saga".  That takes you right to the point where the job begins.  Delete everything in the log prior to that line and save the log as a different name.  This makes the log easier to work with and allows the original log to remain unmodified.

It then sends status information to the Server ("updating job status information name: [stat name]").  The "JobStatusTracker" then reports the job ID and other info about the job that the Server has stored in the database.  It gets this through our REST API, which is Retain's application programming interface.  It allows third party developers to hook into Retain and add value. 

Right after that, you see "ProcessExchFilters:".  This is where it reports the profile information for the profile associated with the job.  Looking at a specific customer's log, it reports this:
profile.getDateType(): 8
profile.getDateType(): ndaysOrOlder

The "getDateType" number definitions have not yet been defined.  We'll update this KB article after running some tests with various profile settings to see what date types we get.

Next, it starts an LDAP lookup of all the users.  In O365, the Worker cannot perform an LDAP lookup for users to Active Directory in Azure.  Instead, it looks at your exchangeuser.csv and exchangegroup.csv files in ...\RetainServer\WEB-INF\cfg.

After enumerating all the user mailboxes for the job, it communicates the number of total mailboxes to the Server and begins to login to a mailbox.

When logging into a mailbox, you'll always see: "LOGGING IN AS: [mailbox]".  So, when trying to troubleshoot a specific mailbox, copy all the data from the beginning of the job up to the point where it logs into the first mailbox.  Paste that into a separate window so you can refer to it later.  Then, do a search for "LOGGING IN AS: [mailbox]" for the mailbox you want to research.  Once you get to that login, delete everything in the log before that point and save it.  This assures that - as you start searching through the mailbox processing for that one mailbox - you don't accidentally go into an area of the log outside of the mailbox processing that you are trying to troubleshoot. 

Next, search for "LOGGING IN AS" (only) to take you to the next mailbox FOLLOWING the mailbox you are researching.  Delete everything from that point and forward and save the log.  Now all you should have in the log is the logging information for the Worker processing just that one mailbox.  You'll follow this process for EACH mailbox you are researching.  We recommend focusing on one mailbox at a time, so this is why it was suggested at the start to save the modified log as a different name.

At the beginning of the login, it will discover the mailbox and then enumerate folders:
RetainExchangeWebserviceFactory - invoking retainews method: getFolder
RetainExchangeWebserviceFactory - invoking retainews method: getFolder
RetainExchangeWebserviceFactory - invoking retainews method: getFolder
ExchangeFolderInfo - folderClass: IPF.Appointment is Calendar: true
ExchangeFolderCache - adding folder to the chache: Calendar - namePath :\John Doe\Calendar, \Root\Calendar, id: [long alphanumeric string, which is an ID stored in t_folder in the Retain database]

It should be noted that we are aware of the misspelling of cache as "chache".
 
It does this for every folder.  The first part of the "namePath" shows where the folder path in the mailbox.  The next path path "\Root\Calendar" is the folder ID that Retain is assigning to the folder.  This allows Retain to determine if the folder already exists in the user's Retain mailbox. 

All system folders (other than Drafts and Junk E-mail) will have a folder ID that matches their folder name (i.e., \Root\Calendar or \Root\Mailbox). 

All user-created folders or a "Drafts" or a "Junk E-mail" folder, the folder ID will be \Root\PersonalFolder

This not only tells Retain Server what type of icon to assign to the folder and but helps it to distinguish if the folder already exists.  Thus, a user could create a personal folder called "Calendar" at the root level of the mailbox but Retain will know that it is NOT the actual Calendar system folder because of the folder ID.  NOTE: There is a very old bug in Retain where Drafts and Junk Mail were incorrectly given folder IDs of "\Root\PersonalFolder\Draft" and "\Root\PersonalFolder\JunkMail".  Not only is Drafts missing the "s" and JunkMail should be Junk E-mail, but both those folders are listed as "PersonalFolder".  This is why they show up in the Retain mailbox with the personal folder icon rather than their system folder icon. This issue cannot be fixed because doing so would give existing mailboxes duplicate folders.

Note that there are three folder class types:  IPF.Appointment, IPF.Note, and IPF.Task.  In the aforementioned example, it reads "ExchangeFolderInfo - folderClass: IPF.Appointment is Calendar: true".  That is stating that the "Calendar" folder is an "Appointment" class folder and is a calendar.  It does that same check to see if the given folder is a Calendar, so you will see all the others read something like this: 
RetainExchangeWebserviceFactory - invoking retainews method: findFolder
RetainExchangeWebserviceFactory - invoking retainews method: getFolder
ExchangeFolderInfo - folderClass: IPF.Note is Calendar: false
ExchangeFolderCache - adding folder to the chache: Trash - namePath :\John Doe\Trash, \Root\Trash, id: [string]=

This example shows that the folder class is IPF.Note, which is any non-calendar and non-task folder.  It also states that it is not a Calendar folder.  Since it is the system folder called "Trash", it's folder ID is "\Root\Trash".

Thus, if you feel a folder in the user's Exchange/O365 mailbox is missing, this is the place to look to check if Retain is getting that folder from Exchange and what its folder type is being interpreted to be.

It then includes the scope of the query it will be making to Exchange/O365.

Search forward for the phrase "ProcessExchFilters - storageDateRange =" (or just "storageDateRange") and it should take you right to that area.  It will report the range in Unix time, so convert that number using this website:  http://www.4webhelp.net/us/timestamp.php.

For example:  ProcessExchFilters - storageDateRange = begin=0,end=1429040453, dateRange = begin=0,end=1429040416

The "storageDateRange" is driven by the "Duplicate Check" section of the "Scope" tab of the profile associated with the job.  The example is showing that the mailbox has never been processed because the item store flag ("begin" time) is "0".  The end time is 1429040453, which is the Unix time value representing the time the job started.

The "dateRange" is driven by the "Date Range to Scan" section in the "Scope" tab of the profile.  In this case, the begin=0 with the end equaling the time of the job start tells us that it is set to "All Messages (ignore date)" setting in the profile.

The combination of those two ranges determines the net range for which Retain will query Exchange.  So, in this case, Retain will ask Exchange for all messages from the beginning of time to the start of the job.  If the mailbox had been processed successfully prior to this job run, the "begin" value in the "storageDateRange" would be a Unix timestamp that would equate to a date/time from a day or so ago.

Now you know what date range Exchange/O365 will be processing.  If there are items in the user's Inbox in Exchange or O365 that fall within that date range, then you should be able to perform a search on the exact message's subject in the log because the Worker should be processing it.

Next, we start retrieving ("fetching") items from the mailbox.  We do it one folder at a time.  In the log you'll see:
"CursorFetchThread - fetching from folder([user email address]): [folder name]"
It then requests the items.  This is where it gets very convoluted and confusing in the logs.  Tech support feels your pain.  This is all being corrected in Retain 4.0, but the logs in 3.x can be very difficult to decipher.

Basically, if you are looking for an item and you know the message subject, search for that subject.  It then should give you a hit (and perhaps multiple hits depending on how generic the subject text is), but then you should be able to determine what occurred.  We'll attempt to explain.
2015-04-27 00:01:27,749 TRACE [Thread-437868] com.gwava.ews.archiveimpl.process.CursorFetchThread: fetching from folder(jdoes@company.com): Inbox
2015-04-27 00:01:27,749 TRACE [Thread-437868] com.gwava.ews.RetainExchangeWebserviceFactory: invoking retainews method: findItem
2015-04-27 00:01:27,843 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.caapi.process.ArchiveEmail: sent mail textbody with subject: FW: Time Off Approved, server-response: Archive: OK: All is well. Result=AddedEMails: 1, emailID=41660668, parentID=126600
2015-04-27 00:01:27,843 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.caapi.process.ArchiveEmail: +++msgCount: 1
2015-04-27 00:01:27,843 TRACE [RTWQuartzScheduler_Archive_Worker-1] com.gwava.caapi.process.ArchiveEmail: EmailID: 41660668, parentID: 126600, stubbingStage: 0

It grabbed the item from the user's Inbox.  The subject of the message was "FW: Time Off Approved". 

Right after that, you can see what occurred with the message.  Look where it says "Result=".  If "AddedEmails:" = "1", then the message got added.  If "0", it did not get added and the likely cause (short of an error, which WOULD be displayed in this area if such were the case), a "0" means it was already in the system.

SUMMARY
So, you'll want to go through the Worker log and check for:
  1. Was the Inbox folder seen by the Worker?  If so, did it give any message about excluding it?  If so, check the Advanced tab of the job's associated profile for a setting that is excluding that folder.
  2. What is the storage date scope?  Do any of the items in the Inbox land within that scope?
  3. Search for the messages in the log by searching for their subject.
If you cannot find the messages listed yet the log shows that Retain got the folder from Exchange, there is usually a storage date range issue.  Either the profile has a bad setting or the user's Retain mailbox has an item store flag that is set to a time beyond the time of the messages in the user's mailbox; or, there is an error when Exchange is trying to read the Inbox folder, so there should be clear indications of errors in the log.  But even then, you should be able to find the message by subject in the log if there is an error because there will be a line in the log near the error showing the subject of the message that was having an error.

If the message shows up in the log but is not in Retain, call technical support and a support engineer will be able to look into this a little more deeply with you.

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2586.