
July 30, 2010


Dejan Ilic

What do you mean when you claim "zero-cost cloning"? Zero as in "don't-copy-data clone" or "zero-cost license required"?
Because if you mean the latter, the FlexClone license wasn't free of charge last time I checked.
A clarifying remark would be in order for that claim.

the storage anarchist

Maybe I don't understand how things work, but I think perhaps you've overstated (or at least overcounted) some of the NTAP advantages.


1) I don't think it's possible for a block device to understand what a "file" is, so in the SAN section, I don't see how anyone could claim to do "identical file dedupe." Now, I'll grant you that your Filers' "block or sub-file" dedupe will perhaps eliminate duplicates of all the blocks in a file, but I don't think that's the same as file level dedupe. File level dedupe thus should be deleted or N/A across the board for the SAN section.

2) Noting your response to Richard about "Space reclamation", if your intent in the SAN section is that the targets must support one (or both) of the SCSI UNMAP / WRITE_SAME(Unmap) commands, then I believe your table is incorrect for both CLARiiON and Celerra (block) today. And Symmetrix gets support for both APIs in an upcoming software update.

3) Similarly, if your intent under the NAS section is that deleted file space is recovered/reused for other writes, then your table is incorrect for the Celerra case - in fact, I'm pretty sure that ALL NFS and CIFS servers will free/release/reuse space from deleted files as required for new writes.

Changing these No's to Yes's would leave only disk/cache dedupe as a NetApp advantage, which are really one and the same, and together they hardly justify separate line items.

On the other hand, you've excluded many features which would tilt the discussion away from your products. For example, NTAP does not offer Flash as a storage tier, only as a cache. Without arguing the merits of this strategy, it does mean that there are always 2 copies of any data stored in a PAM - one in the flash, and the "persistent" copy on disk. While CLARiiON does offer something similar with FASTcache, both Symm and CLARiiON can also use Flash as the persistent store (with or without FAST), eliminating X gigabytes of redundant data.

And if we were to expand the comparison tables beyond just capacity-related features, there's a plethora of features and capabilities that Symmetrix, CLARiiON and Celerra products offer that NetApp Filers (and vFilers) do not. But since they are mostly unrelated to the "Storage Guarantees" topic, I won't dive into those here.

Suffice to say that a truly balanced product comparison between these products will display at least one red "NO!" for each product.

Vaughn Stewart

@Barry (aka the storage anarchist) Thanks for taking the time to comment and share your thoughts.

On item 1 - file & block level dedupe. I think it's easiest to remember that files are composed of blocks; as such, an intelligent array can eliminate identical files, and one with a bit more intelligence can eliminate the blocks which are common across dissimilar files.

Now I'm saying files, but I am not limiting the use case of dedupe to NAS. I am referring to any dataset, be it stored on SAN or NAS.

I included file level dedupe because it is a function available in Celerra for NAS files, and it is provided by NetApp's block-level data deduplication for SAN & NAS. In this way a customer can understand the available storage savings technologies between the two vendors, their platforms, and storage interconnects.
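To illustrate the point that block-level dedupe subsumes identical-file dedupe, here is a minimal sketch. It is a toy model, not any vendor's implementation: it assumes fixed-size blocks and content hashing, and the file names and block size are made up for the example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed 4 KiB block size for this sketch


def dedupe_blocks(files):
    """Return (logical_blocks, unique_blocks) for a dict of name -> bytes.

    Blocks with identical content are stored only once, so two
    identical files automatically dedupe against each other - the
    "identical file" case falls out of block-level dedupe for free.
    """
    seen = set()
    logical = 0
    for data in files.values():
        for off in range(0, len(data), BLOCK_SIZE):
            block = data[off:off + BLOCK_SIZE]
            logical += 1
            seen.add(hashlib.sha256(block).hexdigest())
    return logical, len(seen)


# A 3-block file with three distinct blocks, stored twice under two names:
a = b"".join(bytes([i]) * BLOCK_SIZE for i in range(3))
logical, unique = dedupe_blocks({"vm1.vmdk": a, "vm2.vmdk": a})
print(logical, unique)  # 6 logical blocks, only 3 stored
```

The second copy of the file contributes zero new unique blocks, which is exactly the savings file-level dedupe would deliver; sub-file sharing across *dissimilar* files comes on top of that.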

On items 2 & 3 regarding space reclamation: I should clarify that my form of measurement is based on whether the array can reclaim storage capacity from data deleted within a virtual disk file (a la VMware VMDKs, Hyper-V VHDs, etc.).

I believe that this functionality, which requires coding to coordinate the vSCSI calls within a VM, isn't available from VMware or EMC today.

With that said, I will revise my post based on your feedback. I realize I did not define my use case.

On SSD (or EFD) and the idea that there are redundant copies of data in flash cache: all controllers (industry wide) store data in cache and on disk; however, only NetApp offers TSCS, and thus a single object in cache can be used to serve multiple external references.

I believe you may be equating Flash Cache with SSD, which it is not. NetApp will be offering SSD in the near future, in order to address the small number of workloads which actually require a 10x performance gain with read access and that cannot be addressed by Flash Cache.

I'll grant you that FAST is a brilliant means to increase performance with EMC architectures. I'm just not sure customers can afford the 20X price premium of EMC's EFDs (aka SSDs) for the majority of their data sets.

As for the complete comparison of storage features, I'm truly only interested in the capabilities of storage arrays with virtual infrastructures and cloud deployments. I understand EMC has many capabilities that NetApp does not offer, like connectivity to mainframes. While this capability is very valuable, it provides little value to cloud deployments.

Barry, thank you for sharing. I understand I see the world a bit differently than traditional storage guys. My 'all-in' approach to virtual infrastructures means I only evaluate offerings relative to the cloud, and thus my positioning of technologies probably frustrates those purely focused on storage capabilities independent of application.

the storage anarchist

A few minor clarifications:

I indeed was pointing out that NTAP's Flash Cache (PAM) created duplicate storage for its content, while FAST's use of SSD/EFD as a tier does not. Good to know that NTAP too sees applications for which SSD-as-a-tier is more appropriate than -as-a-cache.

I have to take issue with your asserted 20x price factor for EFDs - today it is more like 8x the $/GB of 15K FC drives, and still declining.

More importantly, FAST does not require customers purchase EFDs for "the majority of their data sets" - in fact, exactly the contrary. The operational objective of FAST is to intelligently place only the BUSY portions of files/datasets/LUNs on Flash (actually, the subsets that are cache-misses). In typical applications this working set is usually only 5-10% of the total assigned/used/consumed capacity. Putting the rest on SATA can more than offset the costs of EFDs while delivering significantly better performance than a disk-only approach (in other words, a similar value proposition to PAM, without the duplicated disk space).
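The economics being argued above can be made concrete with some back-of-the-envelope arithmetic. The $/GB figures below are hypothetical placeholders chosen only to reflect the ratios mentioned in this thread (EFD at ~8x FC), not quoted list prices, and the 5% working set is the low end of the 5-10% range cited:

```python
# Illustrative-only cost comparison: all-FC vs. EFD+SATA tiering.
fc_cost_per_gb = 1.0      # baseline: 15K FC (normalized)
efd_cost_per_gb = 8.0     # ~8x FC $/GB, per the comment above
sata_cost_per_gb = 0.4    # assumed fraction of FC cost

capacity_gb = 10_000
working_set = 0.05        # "busy" subset placed on Flash (5-10% typical)

all_fc = capacity_gb * fc_cost_per_gb
tiered = (capacity_gb * working_set * efd_cost_per_gb
          + capacity_gb * (1 - working_set) * sata_cost_per_gb)

print(f"all-FC: {all_fc:.0f}, tiered: {tiered:.0f}")
```

Under these assumed prices the tiered configuration (500 GB on EFD, 9,500 GB on SATA) comes out cheaper than all-FC despite the 8x flash premium, which is the offset argument in a nutshell; different price ratios or a larger working set can of course flip the result.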

My understanding is that the space reclamation SCSI commands I mentioned are supposed to be supported in the recent VMware 4.1 update, and will also be supported by Hyper-V, W2K8 Server and numerous other platforms by the end of this year. And the new VAAI BlockZero API also enables space reclamation, at least on Symmetrix & CLARiiON (in the not-too-distant future).

It's good to get these things clarified, and a pleasure to do so in a non-combative manner. But Vaughn, one word of advice: be very careful about casting aspersions as to the value of z Series to the cloud. Remember, they invented virtualization (and practically every other technology) decades ago. And they don't take kindly to being called irrelevant :^)
