Exchange Archiving Overview

  • 7019130
  • 11-Dec-2015
  • 07-Aug-2017

Environment


Retain 3.x/4.x
Exchange Module

Situation


How do I archive Exchange properly?

Resolution


As an Exchange system administrator, you need to balance your organization’s data retention needs against the limitations of Exchange.

Why Archiving to A Separate Server is Important

Your organization’s retention policy requires you to store years of data. An average user who receives 120 emails a day accumulates over 40,000 messages a year. Over 10 years, with calendar items, contact lists and so on added in, that comes to half a million items or more for an average user. Heavy users can receive a million items a year.
You can certainly use Exchange’s retention features, but Exchange has quotas that keep it working, because we have learned from experience that a mailbox with too many items suffers performance issues. http://blogs.technet.com/b/exchange/archive/2005/03/14/395229.aspx
The reality is that the vast majority of messages will never be accessed again by the user. So why keep them on the Exchange server, dragging it down?
Retain allows you to remove messages from the Exchange server so Exchange can do what it does best, which is making sure messages get to the right place, and to offload the long-term storage of messages to a server that specializes in storage.
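
The volume estimate above can be checked with some quick arithmetic. This is a minimal sketch that uses only the 120 emails a day and 10 year figures quoted in this article; the function names are illustrative, and counting every calendar day is an assumption.

```python
# Rough mailbox growth estimate using the figures quoted in this article.
EMAILS_PER_DAY = 120    # average user
DAYS_PER_YEAR = 365     # assumption: mail arrives every calendar day
RETENTION_YEARS = 10


def messages_per_year(emails_per_day: int = EMAILS_PER_DAY) -> int:
    """Messages one user receives in a year."""
    return emails_per_day * DAYS_PER_YEAR


def messages_over_retention(emails_per_day: int = EMAILS_PER_DAY,
                            years: int = RETENTION_YEARS) -> int:
    """Messages one user receives over the whole retention period."""
    return messages_per_year(emails_per_day) * years


if __name__ == "__main__":
    print(messages_per_year())        # 43800
    print(messages_over_retention())  # 438000
```

The result covers email only. Calendar items, contacts and other item types come on top of it, which is how an average user reaches half a million items or more.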

How Not To Lose Data

Out of the box, Exchange allows users to receive mail, then file it or trash it. When a user puts a message in the Deleted Items folder, they can then empty that folder, which moves the message into a semi-hidden folder called Recoverable Items. By default, Exchange keeps items in the Recoverable Items folder for 14 days and then removes them from disk. However, a user can right-click the Deleted Items folder and choose “Recover deleted items…”, which brings up a dialog box where they can either recover the item to their inbox or purge it. Purging immediately removes the item from disk.
While that is good for disk space, it is not so good for you when the men in the dark suits come, especially when they already have some incriminating messages sent from your mail servers, which they always do.

Your job is to store all messages so that when the lawyers come asking about certain messages, you are able to comply with the request and not go to jail. Retain is here to help make that happen, but you have to enable some settings in Exchange to make that work.

When the lawyers come and you put a Litigation hold on one or more users, nothing can magically make deleted messages reappear. You need to have them stored somewhere, preferably someplace easier to reach than the tape backups. However, if you set things up properly, Retain will have all the messages and you can give the lawyers the information they need quickly, easily and securely.

Retain is an archiving solution. It keeps a copy of all the messages and makes them easy to search. Retain is usually run once a day, and it may take a few hours to download messages from Exchange. The initial dredge may take weeks. What you need, then, is a way to prevent users from purging items from Exchange before Retain has a chance to store all the messages.

Exchange does not have item level controls over messages in its own system. Exchange does not know if an individual item has been archived or not, so we need to find a way to copy all the messages.

How to Get A Complete Archive

The first thing you need to do is find a way to keep Exchange from deleting items until Retain has a chance to archive them. You could use a Litigation hold, but it has the downside of making Exchange keep all items forever, filling up your server. An In-Place Hold is very similar to a Litigation hold but is easier to set for the entire system (https://technet.microsoft.com/en-us/library/ff637980%28v=exchg.150%29.aspx).

You also want the hold to be rolling, so that it lasts only a limited amount of time and your Exchange server does not experience performance issues due to excessive storage load. You want this time period to be long enough for your team to detect and resolve issues before they become real problems. We recommend at least 14 days, but 60 days, or even up to 90 days, is certainly a good idea.
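
One way to reason about the length of the hold is to compare it with how long a failed archive job could go unnoticed and unfixed. The sketch below is only an illustration of that comparison. The function name and the detection and repair figures are hypothetical, and it does not model how Exchange itself calculates the hold period.

```python
def hold_margin_days(hold_days: int,
                     job_interval_days: int,
                     detect_days: int,
                     fix_days: int) -> int:
    """Return the slack, in days, between the hold window and the
    worst-case time needed to notice and repair a failed archive job.

    A negative result means items could leave the hold before Retain
    has archived them. All inputs are estimates supplied by the caller.
    """
    worst_case_gap = job_interval_days + detect_days + fix_days
    return hold_days - worst_case_gap


if __name__ == "__main__":
    # Hypothetical team: daily job, 3 days to notice, 5 days to fix.
    print(hold_margin_days(14, 1, 3, 5))   # 5 days of slack
    print(hold_margin_days(60, 1, 3, 5))   # 51 days of slack
```

A small or negative margin is a sign that a longer hold period, or closer monitoring, is worth considering.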

With a Rolling In-Place Hold, when a user attempts to purge an item from their Recoverable Items folder, the item is moved to a hidden folder called Purges. The Purges folder is inaccessible to the user but accessible to the system. When the Retain Profile has “Include user's recoverable items” enabled, Retain uses the ApplicationImpersonation user to traverse the folder structure of each user’s mailbox, including the Purges folder. http://msexchangeguru.com/2013/03/29/totaldeleteditemssize/

Creating a complete archive is straightforward once a Rolling In-Place Hold is enabled in Exchange. You’ll have a Profile for your initial dredge that archives everything:

  • Message Settings should include all kinds of messages.

  • Scope should be set to:

    • Date Range to Scan = “All Messages (ignore date)”

    • Duplicate Check = “Try to publish all messages (SLOW)”

    • Set the Item Store Flag, enabled by default

  • Miscellaneous should be set to:

    • Enable “Store/index Internet Headers”

    • Enable “Include user’s archive mailbox” (if applicable)

    • Enable “Include user’s recoverable items”

    • If you want to include Public Folders, set that option to “Owned by Mailbox”. The impersonation user does not have the rights to access a document that resides in a different mailbox; it can only enter one mailbox at a time.

  • The Profile for the regular jobs that follow the initial dredge should also be set:

    • Duplicate Check = “Ignore all messages older than item store flag (fast)”, with the Item Store Flag enabled
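
The settings above can be kept as a checklist and compared against what is actually configured. This is a minimal sketch, not a Retain API: the script does not read anything from Retain, the dictionary keys are simply the labels used in this article, and the actual values are assumed to be copied by hand from the Profile screens.

```python
# Recommended initial-dredge settings, as listed in this article.
INITIAL_DREDGE = {
    "Date Range to Scan": "All Messages (ignore date)",
    "Duplicate Check": "Try to publish all messages (SLOW)",
    "Item Store Flag": True,
    "Store/index Internet Headers": True,
    "Include user's recoverable items": True,
}


def profile_differences(actual: dict, recommended: dict) -> dict:
    """Return {setting: (actual_value, recommended_value)} for every
    recommended setting that is missing or different in `actual`."""
    return {
        name: (actual.get(name), wanted)
        for name, wanted in recommended.items()
        if actual.get(name) != wanted
    }


if __name__ == "__main__":
    # Hypothetical values typed in from the Profile screens.
    configured = dict(INITIAL_DREDGE)
    configured["Include user's recoverable items"] = False
    for name, (have, want) in profile_differences(configured, INITIAL_DREDGE).items():
        print(f"{name}: is {have!r}, should be {want!r}")
```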

How To Know Retain Is Working

There are a few ways to check to see if Retain is working.

  • You can check the Worker status web page [server]:48080/RetainWorker[add the worker number if you have multiple workers]. This is best to check while a job is running.

  • You can use the reports from Retain’s Reporting and Monitoring Server daily to detect issues. If a user’s Item Store Flag is not up to date, then you know there is something going on. Usually the cause is a message that returns an error when Exchange tries to access it. You can have reports scheduled to be emailed to you on a recurring basis. This is usually easiest.

  • You can also check to see if an individual job is running by checking the Job/[job name]/Status in the Retain Web Console.
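
The Worker status address from the first item can be built and probed from a script. This is a sketch under assumptions: the `http` scheme, the example host name and both function names are illustrative. Only port 48080 and the `RetainWorker` path, with an optional worker number appended, come from this article.

```python
from typing import Optional
import urllib.error
import urllib.request

WORKER_PORT = 48080  # port given in this article


def worker_status_url(server: str,
                      worker_number: Optional[int] = None,
                      scheme: str = "http") -> str:
    """Build the Worker status page address.

    The worker number is appended only when several workers exist.
    The default scheme is an assumption; adjust it for your server.
    """
    suffix = "" if worker_number is None else str(worker_number)
    return f"{scheme}://{server}:{WORKER_PORT}/RetainWorker{suffix}"


def worker_page_reachable(url: str, timeout: float = 5.0) -> bool:
    """Return True if the page answers with a success or redirect status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    url = worker_status_url("retain.example.com", worker_number=2)
    print(url, "reachable:", worker_page_reachable(url))
```

A reachable page only shows that the Worker is answering requests. The scheduled reports remain the better way to confirm that items are actually being archived.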

What If Your Users Want Messages In Their Proper Folders

Most users remember where they stored an item rather than the item itself. So while they could use search, it may make good sense to set things up so your users can look in their folders. Unfortunately, Retain is not a file manager, so the best way to pull this off is to run a job over the weekend with a profile that goes back 7-14 days and updates the location of the messages.

Archiving Alternative: Journaling

If you have on-premises Exchange, you can use a different archiving method called journaling. This doubles the amount of storage Exchange requires, as it keeps a copy in the journal mailbox. Journaling is also not available in Office 365.

In Exchange Admin Center you can set up a special mailbox called a journaling mailbox that collects all the mail for your domain. Retain can dredge this mailbox and delete items as they are stored, which brings up the major downside: with a large message volume, the mailbox may become too large for Exchange to serve it to anyone, including Retain. It is therefore very important to monitor the journal mailbox so it doesn’t become too large. The number of items matters more than the size of those items. I have seen a mailbox become unservable with 125,000 messages in it, while another with more than 1 million messages could still be dredged. It seems to depend on the hardware that Exchange has access to.
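
Monitoring the journal mailbox can be as simple as comparing its item count against thresholds. The sketch below assumes the count is obtained elsewhere, for example from the ItemCount value that Get-MailboxStatistics reports. The thresholds are hypothetical, chosen only to sit below the 125,000 item case mentioned above, and should be tuned to your own hardware.

```python
def journal_mailbox_alert(item_count: int,
                          warn_at: int = 50_000,
                          critical_at: int = 100_000) -> str:
    """Classify a journal mailbox by number of items, not by size.

    The default thresholds are illustrative assumptions, not limits
    documented by Exchange or Retain.
    """
    if item_count < 0:
        raise ValueError("item_count cannot be negative")
    if item_count >= critical_at:
        return "critical"
    if item_count >= warn_at:
        return "warning"
    return "ok"


if __name__ == "__main__":
    for count in (12_000, 60_000, 125_000):
        print(count, journal_mailbox_alert(count))
```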

Journaling also has the limitation of keeping everything it gets in one mailbox, so anyone who searches it will see everyone’s email. It is often not desirable for ordinary users to have that ability.

Additional Things To Make Life Easier

Exchange has a set of mailboxes called HealthMailbox that the system uses to make sure that it is functioning properly. Mostly they just contain lots of messages that say: “This is a mailbox delivery probe”.
You can and should exclude these users from your production Retain system.
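
When preparing a list of mailboxes to archive, the health mailboxes can be filtered out by name. This is a minimal sketch: matching on a "HealthMailbox" name prefix is an assumption about how these mailboxes are named in your environment, and the sample names are made up.

```python
from typing import Iterable, List

HEALTH_PREFIX = "healthmailbox"  # assumed naming prefix, compared case-insensitively


def exclude_health_mailboxes(names: Iterable[str]) -> List[str]:
    """Return the mailbox names that do not look like health mailboxes."""
    return [
        name for name in names
        if not name.strip().lower().startswith(HEALTH_PREFIX)
    ]


if __name__ == "__main__":
    mailboxes = ["alice", "HealthMailbox1a2b3c", "bob", "healthmailboxDEF456"]
    print(exclude_health_mailboxes(mailboxes))  # ['alice', 'bob']
```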

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2678.