It has been one year to the day since I last wrote a post on the topic of partition alignment. I think it's fair to say this issue has gained awareness over the last year thanks to posts from a number of industry experts such as Duncan Epping, Aaron Delp, and Chad Sakac, just to name a few. However, alignment remains an open issue, which is unfortunate for customers and partners.
In this post I hope to shed some new light on this old topic, with a goal to revisit it one year from now and proclaim misalignment is dead!
Let's Get on the Same Page
From where I sit it appears that this topic is somewhat confusing. It seems that some of the confusion is a direct result of some sales teams attempting to leverage the confusion around misalignment as incentive for the customer to purchase an alternative storage platform. I've heard the following messages far too many times...
Storage vendor sales pitch: “Mr. Customer, your current storage array suffers an awful performance penalty with misalignment. Our storage is unaffected by misalignment, so if you'd purchase our storage all of your problems will be resolved...”
These statements are simply rubbish. If you have a storage sales rep who has made statements such as these, maybe they're taking you and your business for granted. Don't take my word for it; here are quotes, with direct links, from the technology partners which power your data center.
EMC Symmetrix with Windows - “Misalignment with these storage boundaries could potentially lead to performance problems.”
EMC Symmetrix with Oracle - “Because of the first partition misalignment on x86 systems relative to the storage array tracks, all data on that partition will continue to be misaligned and that misalignment has been shown to cause performance degradation.”
EMC Symmetrix with Oracle - “Aligning partitions to a 64KB offset is a general requirement on Linux x86_64 with ASM and Symmetrix storage arrays.”
EMC Celerra - “Performance improvement as high as 40% was observed on partitioning the drive using DISKPART and aligning the disk.”
As you can see, the IT industry is aligned on alignment!
The impact of a single misaligned VM may be nearly undetectable; however, today's NAS & SAN arrays store exponentially more data than before the onset of server virtualization. At scale the effect of misalignment becomes compounded and debilitating, impacting all of the VMs on the storage array.
Could you imagine if server virtualization resulted in a 30% penalty in CPU performance? Would you be eager to virtualize the majority of your data center? Would you throw in 30% more CPUs to offset the overhead? This may sound far-fetched, but this is exactly what some are doing: deploying additional storage hardware to offset the performance impact of misalignment. I think we know this step is a stop-gap measure at best.
It's Time to Get Busy
Misalignment isn't going to solve itself, so let's discuss how one can start to tackle this issue and return our storage platforms to optimal performance levels. Your CIO will thank you for this as it will result in a reduction in storage expenditures as you increase the performance of your existing arrays.
Step 1 - Stop Deploying Misaligned VMs from Templates
If you are deploying Windows VMs you need to review your templates based on operating system version. The primary offenders are Windows NT, 2000 and 2003. These platforms have a default starting partition offset of 32,256 bytes. For alignment the offset needs to be evenly divisible by 4,096 bytes; thus the minimum value for these systems is 32,768, the next multiple of 4,096 above the default. This is the smallest value resulting in properly aligned partitions for NetApp and most arrays, with the exception of EMC's Symmetrix DMX & VMAX.
When it comes to current versions of Windows like Windows 7, 2008 and Vista, these systems are properly aligned by default as they all have a 1MB starting partition offset, which works universally with all arrays. Kudos to Microsoft for stepping up to the plate to assist their entire customer base with this change!
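The arithmetic above can be checked with a short sketch. The 4,096-byte boundary and the three offsets come from this post; the function names and the simplified model of one guest I/O landing on fixed-size array blocks are my own illustrative assumptions, not any vendor's tool.

```python
ARRAY_BLOCK = 4096  # alignment boundary in bytes, per the text above


def is_aligned(partition_offset, block=ARRAY_BLOCK):
    """True when the partition's starting offset is a multiple of the block size."""
    return partition_offset % block == 0


def blocks_touched(partition_offset, io_offset, io_size, block=ARRAY_BLOCK):
    """Count the array blocks spanned by one guest I/O (hypothetical model).

    io_offset is measured in bytes from the start of the partition.
    """
    start = partition_offset + io_offset
    end = start + io_size - 1  # last byte written by the I/O
    return end // block - start // block + 1


# Legacy Windows default, the corrected minimum, and the 1MB default.
for offset in (32256, 32768, 1048576):
    print(offset, is_aligned(offset), blocks_touched(offset, 0, 4096))
```

In this model a 4KB guest I/O at the legacy 32,256-byte offset spans two array blocks, while the same I/O at 32,768 bytes or 1MB spans exactly one.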
While you may feel comfortable with the recent versions of Windows, your templates can still be misaligned if you upgraded an older version of Windows to create the template.
I'd suggest you verify all of your templates and correct any that are not properly aligned. If you’re a NetApp customer you can complete an audit with MBRscan and corrective actions with MBRAlign. If you're not a customer and/or prefer to not use the MBRTools, you have a plethora of additional tools including (but not limited to):
VizionCore vOptimizerPro
PlateSpin
Leostream
vContinuum
Virtuozzo
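For readers curious what such an audit looks at, here is a minimal sketch, not a stand-in for MBRscan or any tool listed above. It assumes a raw disk image with a classic MBR (boot signature at bytes 510-511, four 16-byte entries starting at byte 446, starting LBA at byte 8 of each entry) and 512-byte sectors. The function names are hypothetical, and extended partitions and GPT disks are deliberately ignored.

```python
import struct

SECTOR = 512        # classic MBR sector size in bytes (assumed)
ARRAY_BLOCK = 4096  # alignment boundary used in this post


def mbr_partition_offsets(mbr):
    """Return the starting byte offset of each used primary partition entry."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    offsets = []
    for i in range(4):
        entry = mbr[446 + 16 * i: 446 + 16 * (i + 1)]
        part_type = entry[4]                               # 0 marks an unused slot
        start_lba = struct.unpack_from("<I", entry, 8)[0]  # little-endian uint32
        if part_type != 0:
            offsets.append(start_lba * SECTOR)
    return offsets


def audit_image(path):
    """Read the first sector of a raw image; return (offset, aligned?) per partition."""
    with open(path, "rb") as f:
        mbr = f.read(512)
    return [(off, off % ARRAY_BLOCK == 0) for off in mbr_partition_offsets(mbr)]
```

A partition starting at sector 63 would be reported as (32256, False). Run something like this only against a copy or a powered-off template, never a disk that is in use.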
I should add that misalignment also occurs with almost every release of Linux and has only recently been addressed in default settings. At the time I wrote this post I couldn't verify which recent releases and distros have moved to a 1MB partition offset. If someone sends me this info, I will add it to this post or an addendum.
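Until that list exists, a Linux guest can be checked directly. The kernel publishes each partition's starting sector in sysfs at /sys/block/<disk>/<partition>/start, counted in 512-byte units. The sketch below reads that file; the function names and the sysfs_root parameter (there only to make the code testable) are my own assumptions.

```python
from pathlib import Path

SYSFS_SECTOR = 512  # sysfs reports partition starts in 512-byte units


def partition_start_bytes(disk, partition, sysfs_root="/sys/block"):
    """Return the starting byte offset of a partition, e.g. ("sda", "sda1")."""
    start_file = Path(sysfs_root) / disk / partition / "start"
    return int(start_file.read_text().strip()) * SYSFS_SECTOR


def is_4k_aligned(offset_bytes):
    """True when the offset is a multiple of 4,096 bytes."""
    return offset_bytes % 4096 == 0
```

For example, is_4k_aligned(partition_start_bytes("sda", "sda1")) is False for a partition that starts at sector 63 and True for one that starts at sector 2048 (a 1MB offset).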
Step 2 - Stop Deploying Misaligned VMs with your P2V Process
Unless you are using a physical-to-virtual migration tool that explicitly states it aligns partitions, you are likely creating misaligned VMs. I hate to single out VMware here, but the (free) VM Converter creates misaligned VMs.
If your P2V process requires refinement you have two choices, either...
a) Upgrade your P2V tools to one from the list above
or
b) Continue using the misbehaving tool, but run MBRAlign on the newly migrated VM prior to powering it on.
Frankly option a) seems much more elegant, but that's just my opinion.
Step 3 - Identify the Misaligned VMs in Production
If you have completed the above actions, you should feel confident you have started down the path of getting healthy, which is good. However, it only gets more difficult from here, as we need to turn our attention to the VMs which are already running, and this phase is going to require a service disruption for each misaligned VM.
Before we jump to step 4, we need to begin by identifying the running VMs that are misaligned. Again, NetApp customers can use MBRScan or our new tool Balance (formerly Akorri BalancePoint). As in step 2, if you're not a NetApp customer and/or prefer other tools, you have many to choose from (see the list above).
Step 4 - Correct Misaligned VMs
This is the final phase, and as long as you are no longer proliferating misaligned VMs, soon this process will be a distant memory. There are no shortcuts to this last step, well not today, so prepare yourself to embark on a substantial project which requires taking each VM offline while the misalignment is corrected.
The most difficult part of this process tends to be obtaining permission from application owners to take their systems offline. Frankly, you may find some application owners will be unwilling to do so, while others are more than happy to in hopes of increased performance. If you are replicating these VMs for disaster recovery purposes, you should also be prepared to consider the bandwidth requirements to re-replicate these VMs. WAN bandwidth can sometimes act as a capacity limiter on alignment projects.
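To get a feel for how WAN bandwidth can cap the pace of an alignment project, here is a rough back-of-the-envelope sketch. Every number in it is hypothetical, the function name is mine, and the 0.8 efficiency factor is simply an assumed allowance for protocol overhead and link sharing. It also assumes each rewritten virtual disk has to be re-sent in full.

```python
def rereplication_hours(total_gb, wan_mbps, efficiency=0.8):
    """Estimate hours needed to re-send total_gb of realigned VMDKs over a WAN.

    total_gb   -- data to re-replicate, in decimal gigabytes (hypothetical input)
    wan_mbps   -- nominal link speed in megabits per second
    efficiency -- assumed usable fraction of the link
    """
    bits = total_gb * 8 * 1000**3
    usable_bps = wan_mbps * 1_000_000 * efficiency
    return bits / usable_bps / 3600


# Hypothetical example: 500 GB of realigned VMs over a 100 Mbps link.
print(round(rereplication_hours(500, 100), 1))  # prints 13.9
```

Dividing the hours in your maintenance or replication window by an estimate like this gives a rough idea of how many VMs can be aligned per cycle before replication falls behind.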
MBRAlign and the other tools listed above all complete the alignment process by rewriting the virtual disk (the *-flat.vmdk file) with an offset more friendly for storage arrays. Some tools send data between the hypervisor and the storage array, while others may require a third host to act as a proxy. Be sure you understand the data flow before embarking on this last step.
Now an alternative to the traditional method of rewriting the file is to migrate the application to a new VM. With the maturation of Windows 2008 I am seeing more customers go down this path. While it is not the norm, it is a viable option that may bring other benefits.
Recent NetApp Enhancements Around Alignment
NetApp engineering is committed to helping customers correct the issue of alignment, and I’d like to share with you a few recent updates...
- MBRAlign has been updated and now supports I/O offload, or hardware acceleration, for NFS datastores. You can download this update in the latest release of the EHU on NOW. Test results with an 8GB VMDK containing 4.1GB of data show a performance improvement of 66% (time reduced from 5:43 to 1:58)!
The use of the offload capabilities differs a bit by Data Ontap release: thin VMDKs are created with 7.3.x, and thick VMDKs with 8.0.1. This difference will not impact data deduplication results; in fact, by aligning you should see improved dedupe savings.
My apologies to VMFS customers. As I often state, NFS is a networked file system and as such it allows for direct access to storage virtualization layers by hypervisors, orchestration tools, etc. As a result, NFS commonly receives points of integration before we can do so with VMFS.
- Data Ontap 8.0.1 has an update (burt 167599) that allows the array to reduce the performance impact of I/Os created by misaligned VMDKs. While this update does not eliminate the need for alignment, it is a start.
I love what we can do with WAFL! I guess I should have shared this info in the reasons to upgrade to 8.0.1 post. ☺
- NetApp professional services have launched VMware Alignment Services (VMAS), a turnkey solution available to customers that will execute your alignment project following best practices in order to ensure the smallest disruption to your environment. If you need to correct your alignment issue right away, VMAS may be your best bet.
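As a quick sanity check on the MBRAlign test figures quoted in the first item above, the two timings can be turned into a percentage with a few lines of arithmetic. The helper name is mine; the two timings are the only inputs taken from the post.

```python
def to_seconds(mmss):
    """Convert an "M:SS" string into a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


before = to_seconds("5:43")  # 343 seconds without offload
after = to_seconds("1:58")   # 118 seconds with offload

reduction_pct = (before - after) / before * 100
speedup = before / after
print(f"{reduction_pct:.0f}% less time, {speedup:.1f}x faster")
```

This prints "66% less time, 2.9x faster", so the 66% figure describes the reduction in elapsed time.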
Looking Forward to the Future…
In future releases NetApp will deliver… oh how I wish I could publicly share what we’ve got cooking. Damn those NDAs!
I realize this opening may have been a cruel move on my part, but while I can’t share specifically what we are doing I want to assure you the NetApp and VMware engineering teams are stepping up to provide more advanced methods of addressing misalignment and we are doing so on a number of fronts. As each capability comes to market I will make sure you can read about it here first.
If you are a NetApp customer or partner with an NDA, you can get the inside scoop by contacting your NetApp representative and asking for a roadmap presentation on this topic.
Wrapping Up This Post
Wow – this post was a bit longer than I had planned... I apologize for that. In review I believe we've covered the following points around misalignment:
- Alignment is an industry wide problem
- It impacts performance
- Adding hardware will only mask and not solve the issue
- Discussed the current methods to correct misalignment
- Shared NetApp enhancements to the situation
- Reinforced that we are working on some additional technologies to remediate this issue
I hope you find this information helpful and that it aids in your plans around alignment. I look forward to sharing more as we make progress on our roadmap. Cheers!
Great post Vaughn! One thing I will note on the Linux side of the house: if you are using LVM (Logical Volume Manager) or Oracle's ASM, as long as you use the entire device, i.e. /dev/sda, and do not put a partition table down, i.e. /dev/sda1, then you will not be misaligned. It's only when there is a partition table laid down by the OS that you are misaligned.
Posted by: David Robertson | April 08, 2011 at 10:15 AM
Definitely a great "add-on" to your original post. I am wondering what kind of impact the realignment of VMDKs could have on a large architecture.
As you correctly state "Be sure you understand the data flow before embarking on this last step.".
Posted by: Alfwebcom | April 08, 2011 at 11:14 AM
This is very good information... Can someone please advise if PlateSpin actually does align the disks after a P2V? Thank you!
Posted by: Parikshith Reddy | April 08, 2011 at 11:52 AM
alignment info on gparted for Linux: They've recently updated gparted to allow you to switch to a 1MB offset as well.
http://gparted.sourceforge.net/display-doc.php?name=help-manual&lang=C#gparted-specify-partition-alignment
Posted by: Nick Howell | April 08, 2011 at 12:16 PM
Hey Vaughn - First off, thanks for the link! A quick point of emphasis to your step 4. When you say you can migrate the application to a 2008 instance, you mean a fresh install (that is aligned by default) of 2008. You can't upgrade from 2003 to 2008 and gain alignment. If you were unaligned in 2003, you would be unaligned if you upgrade the vm to 2008 because it doesn't rewrite the partition tables. Makes sense when you think about it but I have seen that be a point of confusion in the past.
Thanks!
-Aaron
Posted by: Aaron Delp | April 08, 2011 at 02:35 PM
Will any of the new tools address thin provisioning after alignment? Correct me if I'm wrong, but if you have a thin provisioned VM that needs alignment, you not only have to take the VM offline to perform mbralign, but afterwards you would have to storage migrate it to a different datastore and then back? Without the storage migration step to re-thin provision it, I will make all my thin VMDKs "fat". Frankly, that's a PITA and is what has held us back from correcting all of ours. :)
I'm liking your idea of just migrating them to new 2008 servers. Lots of other benefits to go with it, like being able to resize the system partition with diskpart on the fly.
Posted by: Brian | April 09, 2011 at 07:42 AM
I'm pretty sure you can actually get the most recent version of Converter from VMware to produce aligned VMs -- you need to precreate the vmdk with an aligned partition and then point VMware Converter at an existing vmdk rather than creating one from scratch. A bit tedious but feasible (and not too bad if you create the vmdk thin and copy it as needed).
Posted by: Andriven | April 09, 2011 at 09:40 PM
NetApp Syncsort Integrated Backup or NSB -- the joint data protection solution from Syncsort and NetApp -- also provides built-in P2V migration capabilities. Couple of cool things about it.
1. It starts with backups of your physical servers, stored as NetApp snapshots.
2. When the P2V migration takes place, it first uses the snapshot to boot the VM (creates a FlexClone). This means that your "conversion" time is about 5-10 minutes. That is, the new VM is up and running in that time.
3. The migration of the data to the VMDK takes place behind the scenes after the VM is already running off the FlexClone.
4. When the data is all moved to the VMDK, it invokes Storage vMotion to switch from the FlexClone to the VMDK. Zero additional downtime.
5. As part of the data migration from P2V, storage alignment takes place. So your new VMDK is correctly aligned.
6. When the server is migrated, data protection is already in place and backups just continue to run.
7. If needed, NSB can also migrate systems back to physical servers. V2V is also available.
All this is included as part of the licensing cost. There is no additional fees for the migration tools. And it's capacity based so you can migrate all the systems you want.
Note: primary storage does NOT need to be NetApp. Works with any DAS or SAN storage. Supports Windows and Linux systems.
It's pretty neat stuff!
Peter Eicher
Syncsort
Posted by: Peter Eicher | April 10, 2011 at 11:08 AM
ZFS misalignment due to variable block size is the pain point we're seeing now on our Netapp storage. The VMware side of things has been sorted for some time.
Posted by: Tim | April 11, 2011 at 03:37 AM
@All - thanks for the great dialog & feedback
@David - thanks for the additional info
@Alfwebcom - I'll see if I can get to this request, but it may be beyond my scope.
@Parikshith - yes platespin aligns vmdks
@Nick - great point on gparted
@Aaron - right!
@Brian - MBRAlign will preserve the thin attribute with the exception of using I/O offload with Data Ontap 8.0.1. Now, I'd suggest thin or thick provisioning is a non-issue if you are using data deduplication... you are using dedupe aren't you? :)
@Andrew - good points, but may be a bit difficult for mass adoption. I have found if it isn't easy then most won't adopt.
@Peter - thanks for the info on Syncsort
@Tim - NetApp arrays don't use ZFS, so may I ask you to clarify your statement?
Posted by: Vaughn Stewart | April 11, 2011 at 07:24 AM
What about the problems in aligning Windows Server 2008 pointed out here:
http://communities.netapp.com/message/45740
Dave
Posted by: Dave | April 11, 2011 at 08:59 AM
@David: While not using a partition table works for ASM and maybe LVM, this is not supported by SnapManager for Oracle on RedHat.
And, how does LVM make sure that its logical partitions (or filesystems) are aligned?
Posted by: Wayne | April 11, 2011 at 10:21 AM
If you use Hyper-V I'd add one additional step: Use fixed VHDs. Dynamic VHDs insert container metadata inline with the filesystem data (as the VHD grows) resulting in misalignment regardless of your partition layout. So if you're fixing partition misalignment on a VM with dynamic VHDs make sure you also migrate (not convert) to fixed VHDs as well.
Posted by: Maddenca | April 12, 2011 at 06:38 AM
I thought I read that if you created the VM using vCenter, then vCenter will automatically align them for you. Is that correct?
Posted by: Malhoit | April 13, 2011 at 06:42 AM
I noticed there is not much conversation around aligning VMs that exist in ESXi, just using the host utilities for ESX or third party tools that barely work on ESXi. I take it this is some of the NDA stuff? :(
Posted by: Carl Skow | April 14, 2011 at 12:52 PM
@Malhoit: I'm afraid that's incorrect. It would be great if ESX had insight into how the guest OS writes data to disk and how the underlying storage handles disk blocks, but it doesn't, and that's why aligning is such a hassle. The only hope is that newer OS releases are aware of virtualization unlike the previous generation, so things are bound to get better once we start upgrading our old Windows 2003 and older installations.
Posted by: Daniel | April 16, 2011 at 01:45 AM
Great blog, Vaughn. Thank you!
Posted by: Dale Wickizer | April 18, 2011 at 11:20 AM
As an update, a co-worker pointed out that ASM does require a partition table, but with LVM you can use a raw device. Thanks @wayne and @allen for pointing out my error.
Posted by: David Robertson | April 20, 2011 at 02:00 AM
Great post Vaughn. Really appreciate the insight!
Posted by: Jonathan Adair | April 25, 2011 at 06:26 PM
Great insight Vaughn, thank you.
Posted by: mash | April 26, 2011 at 02:54 AM
@Dave - thanks for this, I'll forward it on to the engineers
@Carl - as soon as I can share what is going on with ESXi, I will.
Posted by: Vaughn Stewart | May 03, 2011 at 11:16 AM
@Vaughn - a few of us have noticed the mbrscan and (the newer) nfsstat -d output are not agreeing on the state of alignment - can you weigh in on whether this is just an issue of interpreting the nfsstat -d output, or whether we actually have unaligned IO happening on what mbrscan says are aligned VMs?
http://communities.netapp.com/message/56904
thanks
Posted by: www.facebook.com/profile.php?id=658313066 | June 22, 2011 at 10:49 AM
@Fletcher - I don't have intimate knowledge around the data you are seeing with nfsstat, but I'd suggest you have an application in your VM that is creating a number of writes that are less than 4KB in size.
If what I suggest is accurate there's no need to be concerned, as it is normal behavior for the application.
I know you're well aware of misalignment, but please allow me to restate for those who may not be... What we want to avoid is having misaligned I/O for hundreds or thousands of VMs on an array. The inefficiency in I/O transfers due to a large mass of misaligned VMs will stress the array and lead to the eventual need to upgrade hardware.
Let me know whether or not small writes are the cause. I'd be happy to engage others to continue the conversation if needed.
Vaughn
Posted by: Vaughn Stewart | July 09, 2011 at 11:45 AM