I hope many of you read my recent post: Myth Busting: Storage Guarantees wherein I raised questions around the storage savings guarantee from EMC. The basic question is how can they deliver greater storage savings with fewer technical capabilities and use cases than what is available from NetApp. I believe this is a fair question and was hoping for some information sharing.
Now, I don't expect most readers to be storage experts, so to help raise the level of understanding I created the following chart.
Simply put, when one understands the technologies, their capabilities, and the parameters where each technology can be enabled, such a claim doesn’t appear to hold water.
As I have in the past, I extended an invitation to EMC to review the chart and to educate the public as to how the storage savings in their guarantee can be accomplished.
Unfortunately the only response (or compliment) the post received was from Chuck Hollis who shared a less than complimentary comment of "chartsmithing" in reference to the accuracy of the contents in the chart.
Honestly, I'm a bit concerned that the technical folks at EMC may have been asked to not engage in technical discussions. It appears they want to communicate a message that advanced storage technologies are fundamentally the same from vendor to vendor, yet when asked to elaborate they become deafeningly silent. In my opinion, this behavior is like saying another hypervisor on the market today has the same capabilities as VMware ESXi, and one shouldn’t be concerned with the differences.
As for the lack of dialog, the silence speaks volumes…
The Lone Standout
To my surprise I recently learned of and read the post “Lies, Damn Lies, and Marketing…” from Richard Anderson, an engineer at EMC.
I commend Richard for his efforts, for sharing his insight, and for his professional courtesy. The post begins by verifying the content of the chart...
“As presented, Vaughn’s chart (below) is technically factual (with one exception which I’ll note), but it plays on the human emotion of Good vs Bad.”
&
“As far as keeping things factual, some of the EMC and NetApp features in this chart are not necessarily shipping today (very soon though, and since it affects both vendors I’ll allow it here).”
Houston, We Have Agreement
It is always great to see eye-to-eye with a competitor, however brief the mutual agreement may last.
Agreement over
At this point in the post Richard begins to change the subject. He shifts from measuring the storage savings technologies available on EMC arrays to arguing that a SAN array and a unified array are two different types of storage controllers, and thus that the comparison is incorrect.
“The first and biggest problem is the chart compares EMC Symmetrix and EMC Clariion dedicated-block storage arrays with NetApp FAS, EMC Celerra, and NetApp vSeries which are all Unified storage systems or gateways. Rather than put n/a or leave the field blank for NAS features on the block-only arrays, the chart shows a resounding and red NO, leading the reader to assume that the feature should be there but somehow EMC left it out.”
Technically speaking, Richard’s point is correct; however, his application of this distinction is unrelated to the topic of storage savings capabilities.
The storage savings capabilities that are available with SAN deployments CAN be directly compared, such as between a NetApp Unified FAS array and a traditional EMC SAN array. To do so, one need only look at the SAN capabilities. I believe I have done this in the original chart.
“I’ve taken the liberty of massaging Vaughn’s chart to provide a more balanced view of the feature comparison.”
&
“The goal of my post here really is to show how the same data can be presented in different ways to give readers a different impression.”
(click on chart to view at full size)
From what is shared I have two comments. First, Zero Space Reclamation is the ability to reclaim the capacity held by data which has been deleted in a VMDK or LUN but still resides in the guest's file system. This is not the same capability as compression.
Second, some of NetApp's and EMC's largest customers have vSeries in front of Symmetrix DMX, Symmetrix VMAX, and Clariion and receive full support from both companies. The claim to the contrary is simply untrue (but it's understandable why this configuration is not promoted).
Myth Busted
I do truly desire to have open discussions around technical differences in amicable ways, and I am very appreciative of Richard’s efforts. I believe he has expanded the conversation.
To show my gratitude and willingness to share information I have integrated Richard’s feedback in version 2 of the storage savings capabilities chart and made it easy to understand by separating the features by where they support SAN & NAS.
(click on chart to view at full size)
From the chart it appears that a vendor shouldn't simply make a claim, like 20% additional storage savings, and then, when asked to support the claim, avoid explaining how the savings are provided. There are too many holes in the feature set. EMC and NetApp are technology companies; it is our goal to advance storage technologies, and this advancement is only possible through open dialog.
Things which may appear similar are not necessarily the same
Stating storage savings technologies are equal between EMC & NetApp is like stating that all hypervisors provide the same functionality as ESX/ESXi.
Such a comparison is somewhat true when considering the hypervisor alone; we all agree that all hypervisors provide the ability to run more than a single system (or guest) on a host. Yet such a statement is myopic and misleading, as it leaves out the reasons why customers select a hypervisor (shared infrastructure, app availability, ease of management, etc.). When these features are understood and viewed with greater perspective, it becomes clear why ESX/ESXi has such dominance in the market.
Hopefully this post helps raise the level of understanding relative to storage savings capabilities in the market today.
One housekeeping note: whenever I discuss a non-NetApp technology there's a chance I may make an error. Should this happen, please note that it is not intentional. If you find that I have any incorrect info, let me know and it will promptly be corrected.
What do you mean when you claim "Zero-cost cloning"? Zero as in "don't-copy-data clone" or "zero-cost license required"?
Because if you mean the latter, the FlexClone license wasn't free of charge last time I checked.
A clarifying remark would be in order for that claim.
Posted by: Dejan Ilic | July 30, 2010 at 02:04 PM
Maybe I don't understand how things work, but I think perhaps you've overstated (or at least overcounted) some of the NTAP advantages.
Specifically:
1) I don't think it's possible for a block device to understand what a "file" is, so in the SAN section, I don't see how anyone could claim to do "identical file dedupe." Now, I'll grant you that your Filers' "block or sub-file" dedupe will perhaps eliminate duplicates of all the blocks in a file, but I don't think that's the same as file level dedupe. File level dedupe thus should be deleted or N/A across the board for the SAN section.
2) Noting your response to Richard about "Space reclamation", if your intent in the SAN section is that the targets must support one (or both) of the SCSI UNMAP / WRITE_SAME(Unmap) commands, then I believe your table is incorrect for both CLARiiON and Celerra (block) today. And Symmetrix gets support for both APIs in an upcoming software update.
3) Similarly, if your intent under the NAS section is that deleted file space is recovered/reused for other writes, then your table is incorrect for the Celerra case - in fact, I'm pretty sure that ALL NFS and CIFS servers will free/release/reuse space from deleted files as required for new writes.
Changing these No's to Yes's would leave only disk/cache dedupe as a NetApp advantage, and those are really one and the same; together they hardly justify separate line items.
On the other hand, you've excluded many features which would tilt the discussion away from your products. For example, NTAP does not offer Flash as a storage tier, only as a cache. Without arguing the merits of this strategy, it does mean that there are always 2 copies of any data stored in a PAM - one in the flash, and the "persistent" copy on disk. While CLARiiON does offer something similar with FASTcache, both Symm and CLARiiON can also use Flash as the persistent store (with or without FAST), eliminating X gigabytes of redundant data.
And if we were to expand the comparison tables beyond just capacity-related features, there's a plethora of features and capabilities that Symmetrix, CLARiiON and Celerra products offer that NetApp Filers (and vFilers) do not. But since they are mostly unrelated to the "Storage Guarantees" topic, I won't dive into those here.
Suffice to say that a truly balanced product comparison between these products will display at least one red "NO!" for each product.
Posted by: the storage anarchist | August 02, 2010 at 09:09 AM
@Barry (aka the storage anarchist) Thanks for taking the time to comment and share your thoughts.
On item 1 - file & block level dedupe. I think it easiest to remember that files are comprised of blocks and as such an intelligent array can eliminate identical files, and one with a bit more intelligence can eliminate the blocks which are common within dissimilar files.
Now I'm saying files, but I am not limiting the use case of dedupe to NAS. I am referring to any dataset, be it stored on SAN or NAS.
I included file level dedupe because it is a function available in Celerra for NAS files and it is provided by NetApp block-level data deduplication for SAN & NAS. In this way a customer can understand the available capabilities of storage savings technologies between the two vendors, their platforms, and storage interconnects.
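To make the file-versus-block distinction concrete, here is a minimal Python sketch. It is a toy model under stated assumptions, not how any NetApp or EMC array implements deduplication: the 4 KB fixed block size, the SHA-256 fingerprints, and the three sample "files" are all hypothetical and chosen only for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, for illustration only


def file_level_unique_bytes(files):
    """Bytes stored if only whole, identical files are deduplicated."""
    seen = {}
    for data in files:
        # One fingerprint per file: only byte-for-byte identical files match.
        seen.setdefault(hashlib.sha256(data).hexdigest(), len(data))
    return sum(seen.values())


def block_level_unique_bytes(files, block_size=BLOCK_SIZE):
    """Bytes stored if identical fixed-size blocks are deduplicated."""
    seen = {}
    for data in files:
        for offset in range(0, len(data), block_size):
            block = data[offset:offset + block_size]
            # One fingerprint per block: shared blocks match across files.
            seen.setdefault(hashlib.sha256(block).hexdigest(), len(block))
    return sum(seen.values())


# Hypothetical dataset: two identical files plus one dissimilar file
# that happens to share its first two blocks with the others.
common = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE
file1 = common + b"C" * BLOCK_SIZE
file2 = common + b"C" * BLOCK_SIZE   # identical to file1
file3 = common + b"D" * BLOCK_SIZE   # differs only in its last block
files = [file1, file2, file3]

raw = sum(len(f) for f in files)
file_level = file_level_unique_bytes(files)
block_level = block_level_unique_bytes(files)

print(f"raw bytes:          {raw}")
print(f"file-level dedupe:  {file_level}")
print(f"block-level dedupe: {block_level}")
```

In this toy dataset, file-level dedupe collapses only the two identical files, while block-level dedupe also collapses the blocks shared with the dissimilar third file. Block-level dedupe therefore covers the identical-file case as a subset.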
On items 2 & 3 regarding space reclamation: I should clarify that my measure is whether the array can reclaim storage capacity from data deleted from within a virtual disk file (a la VMware VMDKs, Hyper-V VHDs, etc.).
I believe that this functionality, which requires coding to coordinate the vSCSI calls within a VM, isn't available from VMware or EMC today.
With that said, I will revise my post based on your feedback. I realize I did not define my use case.
On SSD (or EFD) and the idea that there are redundant copies of data in flash cache: all controllers (industry wide) store data in cache and on disk; however, only NetApp offers TSCS, and thus a single object in cache can be used to serve multiple external references.
I believe you may be equating Flash Cache with SSD, which it is not. NetApp will be offering SSD in the near future in order to address the small number of workloads which actually require a 10x performance gain in read access and which cannot be addressed by Flash Cache.
I'll give you that FAST is a brilliant means to increase performance with EMC architectures. I'm just not sure customers can afford the 20X price premium of EMC's EFDs (aka SSDs) for the majority of their data sets.
As for the complete comparison of storage features, I'm truly only interested in the capabilities of storage arrays with virtual infrastructures and cloud deployments. I understand EMC has many capabilities that NetApp does not offer, like connectivity to mainframes. While this capability is very valuable it provides little value to cloud deployments.
Barry, thank you for sharing. I understand I see the world a bit differently than traditional storage guys. My 'all-in' approach to virtual infrastructures results in me only evaluating offerings relative to the cloud, and thus my positioning of technologies probably frustrates those purely focused on storage capabilities independent of the application.
Posted by: Vaughn Stewart | August 02, 2010 at 12:09 PM
A few minor clarifications:
I indeed was pointing out that NTAP's Flash Cache (PAM) created duplicate storage for its content, while FAST's use of SSD/EFD as a tier does not. Good to know that NTAP too sees applications for which SSD-as-a-tier is more appropriate than -as-a-cache.
I have to take issue with your asserted 20x price factor for EFDs - it today is more like 8x the $/GB of 15K FC drives, and still declining.
More importantly, FAST does not require customers purchase EFDs for "the majority of their data sets" - in fact, exactly the contrary. The operational objective of FAST is to intelligently place only the BUSY portions of files/datasets/LUNs on Flash (actually, the subsets that are cache-misses). In typical applications this working set is usually only 5-10% of the total assigned/used/consumed capacity. Putting the rest on SATA can more than offset the costs of EFDs while delivering significantly better performance than a disk-only approach (in other words, a similar value proposition to PAM, without the duplicated disk space).
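The tiering arithmetic above can be sketched in a few lines of Python. This is only an illustration: the 8x EFD factor and the 5-10% working set come from this comment and the 20x factor from the earlier reply, but the SATA price ratio is a hypothetical placeholder, and the `blended_cost` helper is not any vendor's sizing tool. Whether the blend lands above or below an all-FC layout depends entirely on the inputs you plug in.

```python
def blended_cost(hot_fraction, efd_ratio, sata_ratio):
    """Relative $/GB of a two-tier layout, where all-15K-FC equals 1.0.

    hot_fraction: share of capacity placed on flash (0.0 to 1.0)
    efd_ratio:    flash $/GB relative to 15K FC
    sata_ratio:   SATA $/GB relative to 15K FC
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return hot_fraction * efd_ratio + (1.0 - hot_fraction) * sata_ratio


SATA_RATIO = 0.4  # hypothetical placeholder, not a quoted price

# Compare both price factors mentioned in this thread across the
# 5-10% working-set range cited above.
for efd_ratio in (8.0, 20.0):
    for hot in (0.05, 0.10):
        cost = blended_cost(hot, efd_ratio, SATA_RATIO)
        print(f"EFD at {efd_ratio:.0f}x, {hot:.0%} on flash: "
              f"{cost:.2f}x the cost of all-FC")
```

Substituting real quotes for the three ratios shows quickly how sensitive the outcome is to the size of the working set and to the flash price premium.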
My understanding is that the space reclamation SCSI commands I mentioned are supposed to be supported in the recent VMware 4.1 update, and will also be supported by HyperV, W2K8 Server and numerous other platforms by the end of this year. And the new VAAI BlockZero API also enables space reclamation, at least on Symmetrix & CLARiiON (in the not-too-distant future).
It's good to get these things clarified, and a pleasure to do so in a non-combative manner. But Vaughn, one word of advice: be very careful about casting aspersions as to the value of z Series to the cloud. Remember, they invented virtualization (and practically every other technology) decades ago. And they don't take kindly to being called irrelevant :^)
Posted by: the storage anarchist | August 03, 2010 at 09:19 AM