
July 26, 2010

VMware SRM and NetApp Solution for automated DR and advanced backup/recovery of Microsoft Exchange 2010

Posted by Abhinav Joshi – Reference Architect (Virtualization and Cloud Computing)


A few months back, I discussed the power of the VMware SRM and NetApp SnapManager/SnapMirror solution to achieve automated DR and advanced backup/recovery for mission-critical SQL Server VMs. Customers constantly ask me how similar benefits can be achieved for a Microsoft Exchange environment using the joint VMware and NetApp solution.

I want to take this opportunity to expand on the VMware SRM and NetApp solution that provides the same value for MS Exchange as achieved for SQL Server and SharePoint (as detailed in the solution guide TR-3822). NetApp SnapManager for Exchange (SME) and SnapMirror technologies play a key role in this solution and provide the following benefits:

  • VSS aware application consistent backups of Exchange databases
  • Space efficient, instantaneous NetApp storage array based snapshots
  • Automated backup verification (Microsoft supported) and transaction log truncation
  • Remote replication with WAN acceleration (enabled by NetApp SnapMirror compression) to enable efficient disaster recovery
  • Granular recovery (up to the minute and individual mailbox restores)

Being able to achieve automated disaster recovery (DR) in a matter of a few minutes, along with advanced backup and recovery, can provide you with a robust DR and business continuance solution with minimum downtime. This is where VMware vCenter Site Recovery Manager (SRM) strongly complements the NetApp SME/SnapMirror solution. While NetApp SME provides advanced backup, replication, and recovery capabilities, VMware SRM provides DR workflow automation so that at the click of a button in vCenter Server, the Exchange environment (Mailbox, CAS, and HUB servers) can be recovered at the DR site in a matter of a few minutes.

The integration of space-efficient NetApp FlexClones in SRM provides the capability to test the recovery of the entire Exchange environment at the DR site. Only a minimal amount of storage is required to power on the Exchange VMs in the DR test network and verify their functionality.

In the 3000 seat Exchange 2010 environment I expand on later in this blog, only 20GB of new storage (including VM C: drives with OS and app binaries, databases, and logs) was consumed to power on seven Exchange VMs in the DR test network, and perform Exchange functional testing.
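
To put that 20GB figure in perspective, here is a quick back-of-the-envelope calculation. It only divides the numbers quoted above (20GB, seven VMs, 3000 seats); the helper name `per_unit` is my own and not part of any NetApp or VMware tool, and the averages are illustrative rather than measured per VM.

```python
# Figures quoted in this post for the DR test.
NEW_STORAGE_GB = 20   # new storage consumed during the DR test
EXCHANGE_VMS = 7      # Exchange VMs powered on in the DR test network
SEATS = 3000          # mailboxes in the Exchange environment


def per_unit(total_gb: float, units: int) -> float:
    """Return the average storage consumed per unit, in GB."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total_gb / units


gb_per_vm = per_unit(NEW_STORAGE_GB, EXCHANGE_VMS)
mb_per_seat = per_unit(NEW_STORAGE_GB, SEATS) * 1024  # GB -> MB (binary)

print(f"{gb_per_vm:.2f} GB of new storage per VM on average")
print(f"{mb_per_seat:.2f} MB of new storage per seat on average")
```

On average that works out to under 3GB per VM and only a few MB per mailbox for the whole test.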

Let's take a deep dive into how the VMware SRM and NetApp SnapManager/SnapMirror solution provides value for a mission-critical Exchange environment.

1. Below is a screenshot from one of the solution labs hosting several Microsoft Exchange 2010, SQL Server 2008, SharePoint Server 2007, and AD/DNS VMs on VMware vSphere 4 and NetApp storage arrays. Each application was hosted on a different NetApp MultiStore vFiler unit with a corresponding vFiler on the SRM Recovery Site.

[Screenshot: solution lab environment hosting the Exchange, SQL Server, SharePoint, and AD/DNS VMs]


2. The 3000-user Exchange environment was hosted on its own MultiStore vFiler unit. In this architecture, we performed several days of Exchange Load Generator testing without any issues. Both read and write latencies were well within the Microsoft recommendations. NetApp primary storage deduplication helps reduce storage cost for Exchange 2010 environments, with average savings of 25-30% across several customer deployments, as discussed in this blog post by our Microsoft Alliance team.

3. The Exchange mailbox server VMs leverage NetApp SME to provide application consistent backups, granular recovery, and remote replication for the databases.

4. The backups for the Exchange VMs have already been configured to replicate the data to the DR site using NetApp SnapMirror. The DR site has a VMware vSphere 4 environment pre-configured with the NetApp storage (with MultiStore vFiler units) that hosts the SnapMirror replicated data for the VMs from the primary site.

5. Leveraging VMware SRM on the primary site, all the Exchange VMs have been configured in an SRM protection group, as shown below.

[Screenshot: SRM protection group containing the Exchange VMs]


6. SRM recovery plans for these VMs have been created on the vCenter server on the DR site as shown below.

[Screenshot: SRM recovery plans on the DR site vCenter server]

7. Next, we performed an application-consistent backup (with verification) of the Exchange databases using NetApp SME. The backups are automatically replicated to the DR site using NetApp SnapMirror, which is invoked directly from within NetApp SME.

8. Leveraging the WAN acceleration capabilities natively available in NetApp SnapMirror, we were able to achieve a very high compression ratio, which significantly reduced the WAN traffic.
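
For rough capacity planning around steps 2 and 8, the arithmetic can be sketched as below. The 25-30% deduplication range comes from step 2; everything else (the 1000GB logical size, the 50GB of changed data, the 2:1 compression ratio, and the function names) is a hypothetical placeholder I chose for illustration, since this post does not quote a specific compression ratio.

```python
def capacity_after_dedup(logical_gb: float, savings_fraction: float) -> float:
    """Physical capacity needed after deduplication savings are applied."""
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings_fraction must be in [0, 1)")
    return logical_gb * (1.0 - savings_fraction)


def wan_transfer_gb(changed_gb: float, compression_ratio: float) -> float:
    """Data sent over the WAN for a given compression ratio (e.g. 2.0 = 2:1)."""
    if compression_ratio < 1.0:
        raise ValueError("compression_ratio must be >= 1.0")
    return changed_gb / compression_ratio


LOGICAL_GB = 1000.0        # hypothetical Exchange data set size
CHANGED_GB = 50.0          # hypothetical changed data per replication cycle
ASSUMED_COMPRESSION = 2.0  # hypothetical ratio, not a measured result

best_case = capacity_after_dedup(LOGICAL_GB, 0.30)   # 30% savings
worst_case = capacity_after_dedup(LOGICAL_GB, 0.25)  # 25% savings
sent_gb = wan_transfer_gb(CHANGED_GB, ASSUMED_COMPRESSION)

print(f"Physical capacity: {best_case:.0f}-{worst_case:.0f} GB")
print(f"WAN transfer per cycle: {sent_gb:.0f} GB")
```

Plug in your own data set size and measured compression ratio to get numbers that mean something for your environment.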

 

DR Testing

9. Leveraging the DR test capabilities in VMware SRM, we performed a DR test for the Exchange environment in the middle of the day while it was running production at the primary site. With the NetApp FlexClone capability integrated into VMware SRM, zero-cost copies of the Exchange VMs and databases were created on the DR site. The VMs were automatically powered on in a private test network. The powered-on VMs and databases (leveraging NetApp FlexClones) consumed very little additional storage, because only new writes to the VMs or Exchange databases consume space (20GB for this 3000-seat Exchange environment). We highlighted this unique capability in an earlier post.


10. Once the test recovery for the Exchange VMs was complete, we verified that the Exchange users were able to send and receive emails.

Note: To provide name resolution and user authentication services in the DR test network, the AD server at the SRM recovery site was cloned just prior to running the DR test. Once the cloning was complete, and before powering on the VM, we connected the cloned AD server to the DR test network. After the AD VM was powered on in the test network, the five FSMO roles in the Active Directory forest were seized per the procedure described in the following Microsoft KB: http://support.microsoft.com/kb/255504. The five roles are Schema master, Domain naming master, RID master, PDC emulator, and Infrastructure master.


11. Next, we validated that only a minimal amount of storage was used for this economical and highly efficient Exchange DR test (only 20GB for the 3000-seat Exchange environment). This validates the NetApp storage efficiency capability that we discussed earlier. Once the database validation was complete, we finished the SRM test recovery plan. Finishing the plan destroys the Exchange VMs created in the DR test network and also destroys the writable FlexClone volumes created from the SnapMirror volumes on the DR site.
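
To illustrate why the DR test consumes so little space, here is a toy model of a writable clone that shares blocks with its parent and only stores new writes. This is a conceptual sketch of the copy-on-write idea, not the FlexClone implementation or any NetApp API; the class names and block contents are made up for the example.

```python
class SnapshotVolume:
    """Toy read-only volume standing in for the replicated data at the DR site."""

    def __init__(self, blocks):
        self._blocks = dict(blocks)

    def read(self, block_no):
        return self._blocks[block_no]


class WritableClone:
    """Toy clone: reads fall through to the parent, writes are stored locally."""

    def __init__(self, parent):
        self._parent = parent
        self._delta = {}  # holds only blocks written after the clone was created

    def read(self, block_no):
        if block_no in self._delta:
            return self._delta[block_no]
        return self._parent.read(block_no)

    def write(self, block_no, data):
        self._delta[block_no] = data

    def new_blocks_consumed(self):
        return len(self._delta)


# Hypothetical replica with 1000 blocks of data.
replica = SnapshotVolume({n: f"data-{n}" for n in range(1000)})
clone = WritableClone(replica)

consumed_at_power_on = clone.new_blocks_consumed()  # nothing copied yet
clone.write(7, "test-mail")   # overwrite an existing block during the test
clone.write(1000, "new-log")  # append a brand new block
consumed_after_writes = clone.new_blocks_consumed()

print(consumed_at_power_on, consumed_after_writes)
```

In this model the clone starts at zero additional blocks and grows only with test writes, and discarding the clone leaves the replica untouched, which mirrors what finishing the test recovery plan does.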

Perform Real Site Failover

12. The next step was to simulate a real disaster at the primary site and perform a failover of the Exchange environment to the DR site. VMware SRM automates all the storage, networking, ESX host, and VM operations required to make sure that the VMs power on at the DR site and the applications continue to function. This involves breaking the NetApp SnapMirror relationship, mounting the LUNs / NFS volumes on the ESX hosts, and powering on the VMs.


13. Once the failover of the Exchange VMs to the DR site was complete, we logged into the recovered Exchange mailbox VMs and validated that the RDM LUNs were still visible in NetApp SnapDrive.

 

14. Next, we validated that we were able to continue operating the Exchange environment at the DR site. Exchange users were able to successfully send and receive emails in this environment.


15. Next, we validated that the Exchange database backups created at the primary site were still available for restore at the DR site. We also validated that new backups of the Exchange databases could be performed at the DR site using NetApp SME.

 

16. The last step was to validate that both the new Exchange database backups performed in the previous step and the backups originally performed at the primary site were still available for restore.
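
The order of operations described in step 12 can be sketched as a simple simulation. This is only a model of the sequence named in this post (break the SnapMirror relationship, mount the storage, power on the VMs); it does not call SRM or any storage API, and the class, function, and VM names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DrSite:
    """Toy state of the recovery site during a failover."""
    mirror_broken: bool = False
    storage_mounted: bool = False
    powered_on: List[str] = field(default_factory=list)


def run_failover(site: DrSite, vms: List[str]) -> List[str]:
    """Apply the failover steps in order and return a log of what was done."""
    log = []
    site.mirror_broken = True  # replicated volumes become writable
    log.append("break SnapMirror relationship")
    site.storage_mounted = True  # storage is presented to the ESX hosts
    log.append("mount LUNs / NFS volumes on ESX hosts")
    for vm in vms:
        site.powered_on.append(vm)
        log.append(f"power on {vm}")
    return log


# Hypothetical VM names for illustration only.
exchange_vms = ["exch-mbx-01", "exch-cas-01", "exch-hub-01"]
site = DrSite()
failover_log = run_failover(site, exchange_vms)

for entry in failover_log:
    print(entry)
```

The point of the sketch is the ordering: the VMs can only be powered on after the replicated storage has been made writable and mounted, which is the sequencing SRM handles for you.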

I hope this blog provided you insight into the unique data protection and automated disaster recovery capabilities available for MS Exchange with the joint VMware SRM and NetApp solution. Detailed information around the architecture for this solution is covered in the NetApp/VMware/Cisco tri-branded solution guide, available for download here.

As always, comments and feedback are highly appreciated.

Follow me on Twitter @abhinav_josh   

Comments

Francesco Duranti

As always, a really nice article. We configured the whole environment with vSphere 4, SRM 4.0.1, NFS for the VM OS disks, iSCSI RDMs for the Exchange mailbox stores, SMVI for VM backup, and SnapDrive/SME for Exchange backup, and everything seems to work correctly (we used SRM to migrate all our dev/test machines to the DR site).
The only actual problem is that there's no SRA for SRM 4.1, so it's currently not possible to upgrade to vSphere 4.1 using SRM and NetApp storage... Any information on when the SRA will be out?

vSphere 4.1 is out with some good changes for NFS, and not being able to use/test it is not a nice thing... almost all other SRAs are already certified for it...

Abhinav Joshi

@Francesco

Glad to hear from happy customers like you. I understand your concerns. The SRA for vSphere 4.1 should be available for download very soon (next few days).

What other blogs would you like to see from us?

Francesco Duranti

Thanks for the reply :) I hope it will be soon so that we will be able to start 4.1 upgrade :)
It would be nice to see something dedicated to new minor releases of NetApp software that are normally not published on the software change page but often contain bug fixes needed by most users (e.g., SnapDrive 6.2Px and so on). This probably doesn't need a blog, just a better software change page :)

Francesco Duranti

Hi Abhinav... any news on SRM 4.1 Netapp SRA? Time's passing and no information about it is out... I'm starting to be a little worried about it...

Dave

Abhinav, any news on SRM 4.1? Anxiously waiting so I can upgrade to vSphere 4.1; it's been over a month!

Abhinav Joshi

Hi Francesco and Dave,
Good news! The adapter should be out anytime now. Thanks for the patience. Stay tuned, I will update this thread as soon as the SRA 4.1 is released.

Francesco Duranti

I've just downloaded the new 1.4.3 SRA for SRM 4.1 from the VMware SRM 4.1 page.
From the description it seems that only the release notes have changed. Is this true?
On the NetApp NOW site there's a 1.4.3P1 version that solves 2 bugs in 1.4.3. Is this version also good for SRM 4.1 now that 1.4.3 is certified?

Abhinav Joshi

@Francesco Yes, 1.4.3P1 is good for SRM 4.1 :)

Bob Beverage

This post really helped me move on to the next step of our DR plan. Prior to reading it we hadn't really thought about using vFilers to make the magic happen, but once we looked at them everything kind of came together.

http://www.grumpyadmin.com/wordpress/?p=85

The comments to this entry are closed.
