
March 26, 2010

VMware SRM and NetApp SnapManager/SnapMirror for automated DR and advanced backup/recovery of Microsoft SQL Server VMs

Posted by Abhinav Joshi – Reference Architect (Server and Desktop Virtualization)

 

In a recent blog post on advanced backup and recovery for SQL Server database VMs, I briefly discussed the merits of using NetApp SnapManager for SQL Server (SMSQL) to meet aggressive recovery time objectives (RTO) and recovery point objectives (RPO) for your SQL Server VMs. Some of the key benefits that the NetApp SMSQL solution provides are:

  • VSS-aware, application-consistent backups of databases
  • Space-efficient, instantaneous snapshots based on the NetApp storage array
  • Automated backup verification and transaction log truncation
  • Remote replication with WAN acceleration (enabled by NetApp SnapMirror compression) to enable efficient disaster recovery
  • Granular recovery (both individual database transactions and up-to-the-minute restores)

Automated disaster recovery (DR) in a matter of minutes, combined with advanced backup and recovery, gives you a robust disaster recovery and business continuance solution with minimal downtime. This is where VMware vCenter Site Recovery Manager (SRM) strongly complements the NetApp SMSQL solution. While NetApp SMSQL provides advanced backup and recovery capabilities, VMware SRM provides DR workflow automation so that, at the click of a button, the SQL Server databases can be recovered at the DR site in a matter of minutes.

 

Using space-efficient NetApp FlexClone volumes, the combined solution also lets you test the recovery of the SQL Server VMs at the DR site. Only a minimal amount of storage is required to power on the SQL Server VMs in the DR test network and verify the database functionality.

 

Let's take a deeper dive into how the VMware SRM and NetApp SnapManager/SnapMirror solutions complement each other.

 

1. Below is a screenshot from one of the solution labs, which hosts several Microsoft SQL Server, Exchange, SharePoint, and AD/DNS VMs on VMware vSphere 4 and NetApp storage.

 

1 

 

2. The SQL Server VMs use NetApp SMSQL to provide application-consistent backups, granular recovery, and remote replication for the databases.

 

3. The backups for the SQL Server, Exchange, and SharePoint VMs have already been configured to replicate the data to the DR site using NetApp SnapMirror. The DR site has a VMware vSphere 4 environment pre-configured with the NetApp storage that hosts the SnapMirror replicated data for the VMs from the primary site.

 

4. Using VMware SRM on the primary site, we configured all the SQL Server, Exchange, and SharePoint VMs in SRM protection groups, as shown below.

 

2

5. SRM recovery plans for these enterprise application VMs have been created on the vCenter server on the DR site as shown below.

 

25 
  
6. Next, we created a new table in the SQL Server database and performed a full database backup using NetApp SMSQL. The backup was also automatically replicated to the DR site by NetApp SnapMirror, which was invoked directly from NetApp SMSQL.

 

3

7. Using the WAN acceleration capabilities natively available in NetApp SnapMirror, we achieved a compression ratio of 37:1, which reduced WAN traffic significantly.

 

4
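To put the 37:1 ratio from step 7 in perspective, here is a small sketch of the arithmetic. It assumes the ratio means logical bytes divided by bytes actually sent over the WAN; the function names and the 10 GiB figure are made up for illustration and are not lab measurements.

```python
def wan_savings(ratio: float) -> float:
    """Fraction of WAN traffic avoided for a compression ratio of ratio:1."""
    if ratio < 1:
        raise ValueError("compression ratio must be at least 1")
    return 1.0 - 1.0 / ratio


def bytes_on_wire(logical_bytes: float, ratio: float) -> float:
    """Approximate bytes sent over the WAN for a given amount of changed data."""
    if ratio < 1:
        raise ValueError("compression ratio must be at least 1")
    return logical_bytes / ratio


if __name__ == "__main__":
    ratio = 37.0  # the 37:1 ratio reported in step 7
    print(f"WAN traffic avoided: {wan_savings(ratio):.1%}")

    # Hypothetical amount of changed data, chosen only for illustration.
    changed = 10 * 1024**3  # 10 GiB
    print(f"Sent over the WAN: {bytes_on_wire(changed, ratio) / 1024**2:.0f} MiB")
```

At 37:1, a little over 97% of the replication traffic never crosses the WAN, so even sizable backups fit into a modest link.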

DR Testing

8. With the DR test capabilities in VMware SRM, let's perform a DR test for the SQL Server VM while the databases on the primary site are up and running. Using NetApp FlexClone, zero-cost copies of the SQL Server VM and databases are created on the DR site, and the VM is powered on in a private test network. The cloned VM consumes no additional storage when it is created; only new writes to the VM or databases consume additional storage. We highlighted this unique capability earlier here.

 

5
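The reason a test copy costs almost nothing up front can be illustrated with a toy copy-on-write model. This is only a conceptual sketch, not how FlexClone is implemented: the `Volume` and `Clone` classes are hypothetical, blocks are modeled as dictionary entries, and the parent is assumed not to change after the clone is taken.

```python
from typing import Dict


class Volume:
    """A toy replicated volume: block number -> data. Treated as read-only here."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.blocks = dict(blocks)


class Clone:
    """A writable clone that shares the parent's blocks until they are overwritten."""

    def __init__(self, parent: Volume) -> None:
        self._parent = parent
        self._delta: Dict[int, bytes] = {}  # only blocks written after cloning

    def read(self, block_no: int) -> bytes:
        # Prefer the clone's private copy, otherwise fall back to the shared block.
        if block_no in self._delta:
            return self._delta[block_no]
        return self._parent.blocks[block_no]

    def write(self, block_no: int, data: bytes) -> None:
        # New writes land in the clone's private delta; the parent is untouched.
        self._delta[block_no] = data

    def new_blocks_used(self) -> int:
        """Number of blocks the clone occupies beyond what it shares with the parent."""
        return len(self._delta)


if __name__ == "__main__":
    replica = Volume({i: b"db-page" for i in range(1000)})
    test_copy = Clone(replica)
    print("blocks used right after cloning:", test_copy.new_blocks_used())

    test_copy.write(7, b"new-table")  # e.g. adding a table during the DR test
    print("blocks used after one write:", test_copy.new_blocks_used())
    print("replica block 7 unchanged:", replica.blocks[7] == b"db-page")
```

The clone starts out using no space of its own, and its footprint grows only with the data written during the test, which is the behavior the DR test relies on.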

9. Once the test recovery for the SQL Server VM is complete, let's log in to the VM and verify the database functionality by adding a new table.

 

6

 7

10. The next step is to validate that only a minimal amount of storage has been used to power on the SQL Server VM for the DR test. As the screenshot below shows, thanks to NetApp FlexClone, only 2GB of new storage has been consumed to power on the SQL Server VM in the DR test network. This validates the NetApp storage efficiency capability that we discussed earlier here. Once the database validation is complete, the SRM test recovery plan can be finished. This destroys the SQL Server VM created in the test network as well as the writable FlexClone volumes created from the SnapMirror volumes on the DR site.

 

8

Perform Real Site Failover

11. The next step is to simulate a real disaster at the primary site and fail over the SQL Server VM to the DR site. VMware SRM automates all the storage, networking, ESX host, and VM operations required to make sure that the VM powers on at the DR site and the applications continue to function. This involves breaking the SnapMirror relationship, mounting the LUNs / NFS volumes to the ESX host, and powering on the VMs.

9
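The failover sequence described in step 11 is essentially an ordered list of operations where a failure must stop everything that follows. The sketch below models only that idea. It does not call any VMware SRM or NetApp API: `run_recovery_plan` and the placeholder actions are hypothetical, and the ordering is my reading of the description above rather than SRM's documented internals.

```python
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


def run_recovery_plan(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first failure.

    Returns the names of the completed steps and an error message
    ("<step name>: <reason>") if a step failed, otherwise None.
    """
    completed: List[str] = []
    for name, action in steps:
        try:
            action()
        except Exception as exc:
            # Later steps depend on earlier ones, so do not continue.
            return completed, f"{name}: {exc}"
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    log: List[str] = []

    # Placeholder actions: each one only records what a real plan would do.
    plan: List[Step] = [
        ("break SnapMirror relationship", lambda: log.append("mirror broken")),
        ("mount LUNs / NFS volumes on ESX host", lambda: log.append("storage mounted")),
        ("power on VMs", lambda: log.append("VMs powered on")),
    ]

    completed, error = run_recovery_plan(plan)
    print("completed:", completed)
    print("error:", error)
```

Stopping at the first failed step matters here: there is no point powering on a VM whose datastore was never mounted.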

12. Once the failover of the SQL Server VM to the DR site is complete, let's log in to the recovered SQL Server VM and validate that the LUNs are still visible in NetApp SnapDrive.

 

11

 

13. The next step is to validate that we can continue operating the SQL Server VM at the DR site. In this example, we will add a new table to a sample database called Adventure Works.

12

 

14. The next step is to validate that the SQL Server database backups that were created at the primary site are still available at the DR site for restore.

13

 

15. Now let's perform a new full backup of the SQL Server database.

14

16. The last step is to validate that the new full database backup we just performed, along with the backups originally performed at the primary site, is still available for restore.

 

15

I hope this gave you insight into the unique data protection and automated disaster recovery capabilities available in the joint VMware SRM and NetApp SnapManager solution. Detailed information about this solution is covered in the tri-branded NetApp/VMware/Cisco solution guide, available for download here.

 

Follow me on Twitter @abhinav_josh

As always, comments and feedback are highly appreciated.

Comments

Matt

Hello Abhinav,

Great blog! I'm very interested in talking to you about this solution as we are working on a project very similar with a customer. I'm particularly interested in the details of how you automated the SQL volume snapmirror break and LUN mounts within the VMs with SRM.

-Matt

Abhinav Joshi

Hi Matt,
Thanks for the feedback on the blog. All the solution details are covered in the Solution Guide that we discussed earlier here:

http://blogs.netapp.com/virtualization/2010/03/new-solution-guide-disaster-recovery-of-microsoft-exchange-sql-server-and-sharepoint-using-vmware-srm.html

I would be glad to talk to you about how everything works. Please feel free to email me at abhinavj@netapp.com

The comments to this entry are closed.
