In some of the archiving projects I have worked on, I frequently get asked whether there is an easy way to determine how big an archive will be once it has been enabled. Although the question may seem a bit odd at first, there are actually good reasons why you’d want to know the answer.

First of all, knowing the archive size allows you to better size (or plan for) the storage required for the archives. There are other ways to do this, but knowing up front how big an archive will be once it is enabled is very helpful.

Secondly, if you’re using Exchange Online Archiving (EOA), it allows you to determine the amount of data that will pass through your internet connection for a specific mailbox. If the amount of data is large compared to the available bandwidth, I personally prefer to provision the archive on-premises, after which I can move it to Office 365 using the Mailbox Replication Service (MRS). But that’s another discussion. Especially in this scenario it is useful to know how much archive data you can (temporarily) host on-premises before sending it off to Office 365 and freeing up the disk space again.
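
To get a feel for what "large compared to the available bandwidth" means, a rough back-of-the-envelope calculation can help. The Python sketch below is not part of the script; the function name, its parameters and the sample figures are made up for illustration, and it ignores protocol overhead and throttling, so treat the result as a lower bound at best.

```python
def transfer_hours(archive_size_mb, bandwidth_mbps, usable_fraction=1.0):
    """Rough time (in hours) to push archive_size_mb over the link.

    archive_size_mb  -- estimated archive size in MB (1 MB = 1024 * 1024 bytes)
    bandwidth_mbps   -- nominal upload bandwidth in megabits per second
    usable_fraction  -- share of the link you expect to have available (0-1]
    """
    if bandwidth_mbps <= 0 or not 0 < usable_fraction <= 1:
        raise ValueError("bandwidth and usable_fraction must be positive")
    total_bits = archive_size_mb * 1024 * 1024 * 8
    usable_bits_per_second = bandwidth_mbps * 1_000_000 * usable_fraction
    return total_bits / usable_bits_per_second / 3600


# Hypothetical example: a 10 GB archive over a 20 Mbit/s uplink,
# of which only half is assumed to be available for the move.
print(round(transfer_hours(10 * 1024, 20, usable_fraction=0.5), 2))
```

Multiply this by the number of mailboxes you plan to enable and it quickly becomes clear whether provisioning on-premises first is worth considering.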

In order to calculate how big an archive would be, I’ve created a script which will go through all the items in one (or more) mailbox(es) and calculate the total size of all the items that will expire. When an item expires (and thus is eligible to be moved to the archive) depends on the Retention Policy you assign to a mailbox and what retention policy tags are included in that policy.
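
The core idea can be sketched in a few lines. The Python snippet below is only an illustration of the logic, not the actual script: it works on a hypothetical list of (received date, size in bytes) pairs instead of talking to EWS, and it assumes that an item’s age is measured from its received date.

```python
from datetime import datetime, timedelta


def estimate_archive_size_mb(items, age_limit_days, now=None):
    """Return the total size (in MB) of all items older than age_limit_days.

    items is an iterable of (received, size_in_bytes) tuples; this shape
    is an assumption made for the sketch, not the script's data model.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=age_limit_days)
    # Only items received before the cutoff would be moved to the archive.
    total_bytes = sum(size for received, size in items if received < cutoff)
    return round(total_bytes / (1024 * 1024), 2)


# Hypothetical sample data, evaluated as if today were 1 January 2024.
now = datetime(2024, 1, 1)
items = [
    (datetime(2023, 6, 1), 5 * 1024 * 1024),    # well past 60 days: counted
    (datetime(2023, 10, 1), 3 * 1024 * 1024),   # 92 days old: counted
    (datetime(2023, 12, 20), 2 * 1024 * 1024),  # 12 days old: not counted
]
print(estimate_archive_size_mb(items, 60, now=now))  # prints 8.0
```

The real script has to do the same thing per folder and per mailbox, which is where most of the actual work sits.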

As the name of the script implies, it’s important to understand that the result is an estimate of the archive size. There are situations in which the results of the script will differ from the real world. This could be the case when you have enabled the archive and a user has assigned personal tags to items before the Managed Folder Assistant has processed the mailbox. In such a scenario, items whose retention tag uses an age limit different from the AgeLimit defined in the script will be calculated incorrectly. Then again, the script is meant to be run before an archive is created.

Secondly, the script will go through all the folders in a mailbox. If you disabled archiving of calendar items, these items will be incorrectly included in the calculation as well. I will try to build this into the script in a future release, but it has a lower priority as the script was built to provide a pretty good estimate, not a 100% correct number.

The script, which you can download here, accepts multiple parameters:

UserPrimarySMTPAddresses: The primary SMTP address(es) of the mailbox(es) for which you want to estimate the archive size.
Report: Full path to a txt file which will contain the archive sizes.
AgeLimit: The retention time (in days) against which items should be probed. If you have a 60-day retention before items get moved to the archive, enter 60.
Server: Optional. Used for connecting to EWS. Can be used if Autodiscover is unable to determine the connection URI.
Credentials: The credentials of an account that has the ApplicationImpersonation management role assigned to it.

The output of the script will be an object that contains the user’s Primary SMTP Address and the size of the archive in MB (TotalPRMessageSize).

Credit where credit is due! I would like to thank Michel de Rooij for his immensely insane PowerShell scripting skills and for helping me clean up this script into its current form. Before I sent it off to Michel, the code was pretty inefficient [but hey! it was working]; what you’ll download has been cleaned up and greatly enhanced. You now get clean code, additional error handling and some more parameters than in my original script [see parameters above].

I hope you’ll enjoy the script and find it useful. I’ve used it in multiple projects so far and it has really helped me plan the provisioning of the archives.

Note: To run the script, you’ll need to have the Exchange Web Services (EWS) Managed API installed and run it with an account that has the ApplicationImpersonation management role assigned to it.

Cheers,

Michael