Exchange Archiving Strategies

  • 7019285
  • 20-May-2015
  • 05-Sep-2017

Environment

Retain 3.x
Exchange Module

Situation

What is the best way to archive Exchange?

Resolution

As the system administrator, it is your job to comply with your organization’s data retention policy. It is best to talk to your organization’s legal counsel to determine what you are required to keep and for how long. Part of that is retaining an archive of the messages in your Exchange system. This is where Retain makes your job easier: it copies messages from Exchange and keeps them on a separate server, where they don’t put extra stress on the Exchange server.

You have to balance your organization’s required message retention policy against the limits of your Exchange system. You could certainly enable a Litigation Hold on everyone and keep everything forever, but it would not take long before your heaviest users ran into performance and quota issues. Any message system runs best when it stores the fewest messages. Moving older messages to the Retain archive leaves the Exchange server free to process messages rather than store and search them.

Out of the box, an Exchange system and Retain are set up to take snapshots of the state of user mailboxes at a given point in time. Items that were deleted and purged between archive jobs would not be in Retain. That might not be good enough for your retention policy: there are holes that messages can leak through and be lost instead of being retained properly.

There are options in Exchange and Retain that you can enable to plug those holes.

Exchange Message Lifecycle
Archiving Exchange requires some understanding of how Exchange is designed. Exchange does not have message level flags that can tell you if a message has been archived or not. So you have to set things up to give Retain a chance to archive messages before they are deleted from Exchange completely.

The lifecycle of a message in Exchange goes like this: a message enters the user’s mailbox, gets read, is maybe moved to a folder for a while, and is then trashed and deleted. At that point it is only moved into the user’s Recoverable Items folder, where it sits for 14 days (the default, which can be extended to a maximum of 30 days) before the system purges it.
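The timing described above can be worked through in a few lines of Python. This is only an illustrative sketch: the function name and constants are hypothetical, the 14-day and 30-day values are simply the ones quoted in this article, and nothing here talks to Exchange.

```python
from datetime import date, timedelta

# Values quoted in this article; check them against your own Exchange settings.
DEFAULT_RETENTION_DAYS = 14
MAX_RETENTION_DAYS = 30


def purge_date(deleted_on: date, retention_days: int = DEFAULT_RETENTION_DAYS) -> date:
    """Return the day an item deleted on `deleted_on` leaves Recoverable Items."""
    if not 0 <= retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(f"retention_days must be between 0 and {MAX_RETENTION_DAYS}")
    return deleted_on + timedelta(days=retention_days)


if __name__ == "__main__":
    print(purge_date(date(2017, 9, 5)))      # default retention
    print(purge_date(date(2017, 9, 5), 30))  # maximum retention
```

An archive job has to see the item before that date, or earlier if the user purges it by hand.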

During this period, users can restore items from the Recoverable Items folder on their own. But they can also purge items from Recoverable Items permanently and immediately.

The one thing you don’t want is for a user to receive an important message, delete it, and then purge it from their Recoverable Items folder before there is a chance to archive it.
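To make the gap concrete, here is a minimal Python sketch that checks whether any archive job ran while a message still existed in Exchange. The function and its parameters are hypothetical and only model the timing; they are not part of Retain or Exchange.

```python
from datetime import datetime
from typing import Iterable


def captured_by_archive(received: datetime, purged: datetime,
                        job_runs: Iterable[datetime]) -> bool:
    """True if at least one archive job ran while the item was still in Exchange."""
    return any(received <= run < purged for run in job_runs)
```

With a job that runs once a night, a message that is received, deleted, and purged within the same working day never overlaps a job run, so it is never archived.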

There are a couple of options to prevent a user from permanently deleting a message before it can be archived properly.

Exchange Journaling
Microsoft used to recommend the Journaling Mailbox feature to make sure you had a copy of every message passing through your system. But since that feature is disabled in Office 365, and using an O365 mailbox in that manner is against the O365 Terms of Service, it appears Microsoft is moving away from Journaling mailboxes.

And it is understandable. An average email user receives about 120 messages a day, on top of the messages they send. Since a Journaling Mailbox is just a mailbox that gathers copies of all those messages, it will fill up quickly in all but the smallest systems. We know from experience that Exchange struggles to open a 50+ GB mailbox. And if Exchange is struggling, then everyone is struggling.
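A rough back-of-the-envelope estimate shows how quickly this adds up. The sketch below is hypothetical: the 120 messages per day and the 50 GB mark come from this article, while the average message size and the sent-message count are assumptions you should replace with figures from your own system.

```python
def days_until_full(users: int,
                    received_per_day: int = 120,    # figure quoted in this article
                    sent_per_day: int = 0,          # assumption: use your own average
                    avg_message_kb: float = 100.0,  # assumption: not from this article
                    limit_gb: float = 50.0) -> float:
    """Estimate how many days a journaling mailbox takes to reach `limit_gb`."""
    daily_kb = users * (received_per_day + sent_per_day) * avg_message_kb
    if daily_kb <= 0:
        raise ValueError("users, message counts, and message size must be positive")
    limit_kb = limit_gb * 1024 * 1024
    return limit_kb / daily_kb


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, "users:", round(days_until_full(n), 1), "days")
```

With these assumed numbers, a 100-user system would reach 50 GB in roughly six weeks if nothing were removed from the journaling mailbox.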

We don’t recommend Journaling Mailboxes because they mean extra work for you, the system administrator. You have to monitor the journaling mailbox daily to make sure it is not too large.

Retain does work with Journaling mailboxes and will even delete items from the journaling mailbox after archiving them. But you must keep an eye on the Journaling mailbox. If a connection glitch or a large influx of messages occurs, Exchange may be unable to serve up a very large mailbox, so Retain won’t be able to archive those messages, or the quota will prevent the mailbox from accepting more messages and messages will be lost. You can create a new Journaling Mailbox, but that has to be done manually, and you have to know that the old one is full. Then you’ll have to deal with the old mailbox that cannot be accessed.

An example Journal Profile would include:

Scope:

Date Range to Scan: All Messages (Ignore date)
Duplicate Check: Try to publish all messages (SLOW)
Set Storage Flags: enable Item Store Flag

Miscellaneous:

Store all attachments
Enable Store/index Internet Headers
Select Store by year and month (yyyyMM)

An example Journal Job would include:

Journaling:

Enable Journaling
Specify the journaling mailbox or mailboxes that you are using.
Enable “Delete archived items from journal” which is highly recommended.

See Also: Exchange Journaling Mailbox Recommendations

Exchange In-Place Hold
The other technique, and the one now recommended, is to use In-Place Holds. You can set up an In-Place Hold that prevents items from being purged from the system before Retain has a chance to archive them. You still want to run a dredge every night and grab the messages promptly, because having the messages backed up to Retain is vital to data integrity.

You can set the rolling In-Place Hold for a few days, a couple of weeks, a quarter, or more if you wish. We recommend 90 days. With that window you can go on a two-week vacation and still have time to deal with any issues that came up while you were gone, with little risk of data loss even if there is a failure the day you leave, unless the Exchange server fails completely.
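The vacation reasoning is simple arithmetic, sketched here in Python. The function is hypothetical and deliberately conservative: it assumes an item arrived and was deleted right after the last successful archive job, so its hold ends `hold_days` after that job.

```python
from datetime import date


def days_left_to_recover(hold_days: int, last_good_archive: date, today: date) -> int:
    """Days remaining before the oldest unarchived held item can be released.

    A result of zero or less means items may already have been purged
    without ever being archived.
    """
    elapsed = (today - last_good_archive).days
    return hold_days - elapsed
```

If archiving fails the day you leave and you return 14 days later, a 90-day hold still leaves 76 days to fix the problem, while a 7-day hold would already have expired.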

With a rolling In-Place Hold, no single mailbox has to receive copies of all messages in addition to the users’ own copies. Your system keeps all the messages where they went anyway, with little additional stress on the system. Your users keep their messages in their mailboxes as normal, and even delete them as normal, but they cannot remove the messages from disk themselves. The only real cost is a slight increase in disk usage until the hold is automatically released and the message is purged by Exchange as it expires.

A nice thing about Rolling In-Place Hold is that you can delay the archiving of messages for a short period of time, for example a week or two. Most users would have finished acting on a message by that time and filed it where they wanted it. Now you have Retain archive the messages in the user’s mailbox including the trash. This way, when they log into their Retain archive mailbox they are greeted by their familiar folder structure.

An example Daily Profile would include:

Scope:

Date Range to Scan: All Messages (Ignore date)
Duplicate Check: Ignore all messages older than item store flag (fast)
Set Storage Flags: enable Item Store Flag

Miscellaneous:

Store all attachments
Enable Store/index Internet Headers
Enable Include user's recoverable items 

Putting Messages into Folders

Retain does not do file management, so users won't be able to put items in folders once they are in Retain. However, Retain can update the location of items to match where they are in Exchange. You can have this run, say, every weekend. This will update the locations of the messages in the database so they match where the messages are in Exchange. Since these messages will already be in Retain, the job should go pretty fast, though in the case of O365 you might want to limit it to 7-14 days. This way users will be able to find their messages by folder, and the job won't take too long.

The best thing to do is to archive every night so you have all the messages. Retain will store each message wherever it finds it. You can then run a job with Profile settings like the following.

An example Folder Update Profile would include:

Scope:

Date Range to Scan: "Number of days from job start", set to 14-90 days or newer
Duplicate Check: Try to publish all messages (SLOW)
Set Storage Flags: enable Item Store Flag

Miscellaneous:

Store all attachments
Enable Store/index Internet Headers
Enable Include user's recoverable items

End of Message Life
Almost all data retention policies set a time limit on how long you must keep a message; after that time you are free to delete it. A little thought now can make your life easier in the future.

You can use Deletion Management to remove messages from the Retain archive. At the simplest, if your retention policy is seven years, you can use Deletion Management to delete messages whose delivered date is more than seven years ago.

You may have different retention policies for different groups of users. You may have to keep the messages of corporate-level users for ten years, while those of hourly workers may only need to be retained for three years. In Exchange, you can group users into distribution groups. In Retain, you can create jobs for those distribution groups under Jobs/Mailboxes/Distribution lists and use “Enable data expiration” under Jobs/Core Settings to set when the messages can expire in Retain. Then have Deletion Management use the expiration date as the criterion.
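The per-group policy can be pictured as a lookup from group to retention period. The Python sketch below is purely illustrative: the group names, the dictionary, and the functions are hypothetical and do not reflect how Retain stores expiration dates internally.

```python
from datetime import date

# Hypothetical policy table using the example periods from this article.
RETENTION_YEARS = {"corporate": 10, "hourly": 3}


def expiration_date(delivered: date, group: str) -> date:
    """Return the date on which a message delivered on `delivered` may expire."""
    years = RETENTION_YEARS[group]
    try:
        return delivered.replace(year=delivered.year + years)
    except ValueError:
        # Delivered on 29 February and the target year is not a leap year.
        return delivered.replace(year=delivered.year + years, day=28)


def is_expired(delivered: date, group: str, today: date) -> bool:
    """True once the retention period for the message's group has passed."""
    return today >= expiration_date(delivered, group)
```

The same message date gives different answers per group: something delivered in May 2015 would be eligible for deletion in May 2018 for an hourly worker, but not until May 2025 for a corporate-level user.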

Since Retain uses Single-Instance Storage and stores only one copy of a particular message, an item is held until the last user releases it. So an organization-wide announcement might be held far longer than you expect. If a user is under Litigation Hold in Retain, all messages linked to that user are kept until that hold is released.
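The effect of single-instance storage on deletion can be modeled as "every holder must have released the message". The sketch below is an assumption-laden illustration in Python; the `Holder` class and `can_delete` function are hypothetical and are not Retain's actual implementation.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable


@dataclass
class Holder:
    """One user linked to a single stored copy of a message (hypothetical model)."""
    user: str
    expires: date
    litigation_hold: bool = False


def can_delete(holders: Iterable[Holder], today: date) -> bool:
    """The stored copy can go only when every holder has released it."""
    return all(not h.litigation_hold and h.expires <= today for h in holders)
```

In this model, an announcement sent to both an hourly worker and a corporate-level user stays in the archive until the longer of the two retention periods ends, and a single user under hold keeps it indefinitely.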

Additional Information

This article was originally published in the GWAVA knowledgebase as article ID 2547.