
June 25, 2010

Comments

Storagezilla

Fixed block deduplication, which is what you've shown in your diagram above, only works when there are actual duplicate segments to remove.

That's why every version of Exchange you listed except 2010 offers miserable dedup savings for the time spent when Exchange SIS is enabled.

I won't speak for anyone else's archiving products, but with SourceOne archiving enabled, Exchange 2010 gains SIS and compression: instead of the 8MB file sent to 50 people being stored on disk, each mailbox holds a sub-10KB message pointing back to a single-instanced, compressed file in an archive, which may or may not be sitting on a storage system offering data deduplication.

For the majority of datasets which aren't full backups or VMware images, compression can consistently deliver savings unless the data is already compressed.
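A minimal way to see this, using Python's zlib purely as an illustrative stand-in for whatever compression engine a given array might use (toy data, not any vendor's implementation):

import os
import zlib

# Repetitive, text-like data compresses well; already-compressed data (simulated
# here by compressing random bytes once, then compressing the result again)
# yields essentially nothing.
text_like = b"the quick brown fox jumps over the lazy dog " * 1000
precompressed = zlib.compress(os.urandom(64 * 1024))  # stand-in for a zip/JPEG/MP3

for label, payload in (("text-like", text_like), ("already compressed", precompressed)):
    out = zlib.compress(payload)
    saved = 100 * (1 - len(out) / len(payload))
    print(f"{label}: {len(payload)} -> {len(out)} bytes ({saved:.0f}% saved)")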

Oracle Advanced Compression on production databases is a case in point, and as with Celerra Deduplication, the savings also show up as reduced replication traffic and less data sent to external backup media.

I don't favor any one of the approaches you've covered over the others, as the technologies are different, but trying to diminish any of them is a mistake.

Chad Sakac

Disclosure - EMCer here.

A good post Vaughn. I hear 'Zilla's perspective that it tends to highlight the strengths of one approach (the one which is a NetApp strength) and dismisses the strengths of other approaches in a variety of contexts - but hey, that goes with the territory I suppose.

A couple quick comments...

1) Generally, all data reduction algorithms can be called "deduplication" in a sense - after all:

- "single instancing files with indirection" dedupes identical files
- "single instancing blocks with indirection" dedupes identical blocks
- "single instancing bit sequences with substitution" is a form of compression (compression replaces information sequences with more efficient information sequences).

2) The key is that their effectiveness (in "information efficiency") and cost (in terms of computational or IO load) depends on a lot of things - the dataset and the implementation.

3) The question of implementation has to include the topic of post-process vs. inline processing.

It's an interesting area of emerging technology, no doubt. Inline sub-file dedupe is widely deployed and effectively a technological ante for backup targets, and is now starting to appear in primary storage use cases (Ocarina, ZFS), though it still seems to have caveats (as always - don't we all - not casting a stone, as we all live in glass houses).
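To make those three forms concrete, here is a minimal sketch (illustrative only - fixed 4KB chunks, SHA-256 fingerprints, and zlib as stand-ins, not any vendor's engine) that measures what each approach would keep on the same toy dataset. Which one wins depends entirely on the dataset, which is exactly point 2:

import hashlib
import zlib

CHUNK = 4096  # illustrative fixed block size

def file_level(files):
    # Single-instance identical whole files: keep one copy per unique content.
    unique = {}
    for data in files.values():
        unique.setdefault(hashlib.sha256(data).digest(), len(data))
    return sum(unique.values())

def block_level(files):
    # Single-instance identical fixed-size blocks across all files.
    unique = {}
    for data in files.values():
        for i in range(0, len(data), CHUNK):
            block = data[i:i + CHUNK]
            unique.setdefault(hashlib.sha256(block).digest(), len(block))
    return sum(unique.values())

def compressed(files):
    # Replace bit sequences with shorter ones: compress each file independently.
    return sum(len(zlib.compress(data)) for data in files.values())

# Toy dataset: two identical "VM images" plus one that shares most of its blocks.
base = (b"OS image block " * 300).ljust(8 * CHUNK, b"\x00")
files = {"vm1": base, "vm2": base, "vm3": base[:6 * CHUNK] + b"unique app data " * 512}

raw = sum(len(d) for d in files.values())
for name, fn in (("file-level SIS", file_level),
                 ("block-level dedupe", block_level),
                 ("compression", compressed)):
    print(f"{name}: {raw} -> {fn(files)} bytes")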

I LOVED this topic back in 2nd year electrical engineering - information theory (Claude Shannon) http://en.wikipedia.org/wiki/Claude_Shannon (one wicked smart dude for his day).

Example - in general purpose NAS use cases, file-level dedupe (or "single instance files" if you prefer, or "file system compression using file-level object analysis" if that suits you better) is very basic, but also remarkably effective. It eliminates a LOT of redundancy and introduces very little CPU or storage load as part of the process.

Conversely, in NFS-with-VMware or VMFS use cases, file-level dedupe does virtually nothing. Compression with that dataset results in 30-50% capacity efficiency.

In storage arrays, another big consideration is the "adjacent effects" of any feature or function - something that affects us all in myriad ways.

Within EMC, we mentally lump all 3 of these techniques (file level, sub-file level, and compression) into "Data Reduction Engine" technologies. We apply them to our archive products, our backup products, and our primary storage products. VMware is increasingly applying them in its stack as well.

BTW - "Zip" or "NTFS compression" are bad analogies for how we do compression (and I suspect this applies to other array vendors' implementations).

Consider.... When you hit "play" on your iPod, there's no visible "decompressing". Yet of course the MP3 or AAC format is natively compressed. Likewise, when you take a picture on a camera, there's no visible pause as it gets stored to the flash card. Yet of course the JPEG, TIFF format is natively compressed.

Those are also not perfect analogies - they are very rapid compression algorithms, but they are also lossy.

What is used across EMC for compression (Celerra, CLARiiON, and RecoverPoint all use it today, and now that it's part of the core data reduction engine tech, it's being used in more places) is a patented, very rapid, lossless compression algorithm.

Today, the compression is still a post-process on primary storage, with decompression done in real time on reads (hence the small latency impact). The whole file isn't decompressed, only the portion of the file/set of blocks being accessed.
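Here's a generic sketch of the idea (illustrative only - this is not EMC's algorithm, just a common way to show why a read doesn't require decompressing the whole file): compress fixed-size chunks independently and keep an index, so a read decompresses only the chunks covering the requested range.

import zlib

CHUNK = 32 * 1024  # illustrative chunk size

def compress_file(data):
    # Post-process: compress fixed-size chunks independently, keep an ordered index.
    return [zlib.compress(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

def read(chunks, offset, length):
    # Real-time read: decompress only the chunks covering [offset, offset+length).
    first, last = offset // CHUNK, (offset + length - 1) // CHUNK
    buf = b"".join(zlib.decompress(chunks[i]) for i in range(first, last + 1))
    start = offset - first * CHUNK
    return buf[start:start + length]

data = b"0123456789" * 100_000            # ~1 MB of sample "file" data
chunks = compress_file(data)               # whole file compressed once (post-process)
assert read(chunks, 123_456, 50) == data[123_456:123_506]  # partial read touches one chunk
print(f"{len(data)} bytes stored as {sum(map(len, chunks))} compressed bytes "
      f"in {len(chunks)} chunks")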

We're not stopping here (and I'm sure NetApp and all the startups too aren't stopping either!).

There's loads of R&D going into expanding our primary storage data reduction engine to even more efficient blended approaches, bringing our existing block-level dedupe into primary storage use cases, and applying the IP across EMC to the greatest effect.

Likewise, with the emergence of massively multicore CPUs, and of CPU instructions that are particularly well suited to the hash calculations (used in many places for this topic) for finding identical block or file objects and to compression, more and more can be done inline.

The design goal is to make data reduction - and the application of the most efficient underlying technique - an "invisible data services" attribute of the system.

We certainly aren't perfect (who is?), but this pursuit for ongoing efficiency in every dimension, with the broadest set of use cases is something that serves our mutual customers (and the industry as a whole) well IMO.

twitter.com/HPStorageGuy

Hey Vaughn - add HP to your list of deduplication vendors. In case you missed it, we announced HP StoreOnce this past week at HP Technology Forum. I won't take up a lot of space here talking about it but you can read, watch, and listen more about it on several articles I have on my blog: http://h30507.www3.hp.com/t5/Around-the-Storage-Block-Blog/bg-p/139/label-name/storeonce. And of course, I work for HP.

MAC

Hi Vaughn,

I just wanted to check on how exactly NetApp provides data compression.

Thanks for the great post

regards
MAC

Nicholas Weaver

Nice write-up, Vaughn. This has definitely got my curiosity engaged on the differences and use cases around the topic.

JohnFul

Oh, where to begin?

The golden rule is “know thy workload”.

The MP3 example is actually a good one. The application that creates the MP3 does the compression. If you've ever converted a WMV to MP3, or imported your current CDs into iTunes, you've got an idea of the work that entails. The MP3 (or MP4) can then be stored somewhere and accessed by an MP3 player application or dedicated device. It's at the endpoint, not the storage, that the MP3 is decompressed and played. It's not too much work for a single dedicated device; it appears to happen instantaneously. Now just imagine transferring the work of a few thousand such devices to the storage controller; you'd likely bring it to its knees.

Now imagine a CIFS volume full of home directories. In many of those home directories you have the very same MP3 file. If you enable file-level deduplication, you would certainly save some space, because the file would only be stored once with multiple references to it. Now, what happens when 1000 users attempt to play the same MP3 at the same time? If your cache isn't SIS-aware, you immediately fill up your cache and all those reads go to disk. Of course, at the disk, all those reads are hitting the same blocks over and over again, bringing even the most stout storage controller to its knees.

Now, what if the thousands of users of this MP3 or the latest viral video like to share it with their friends? They open up Outlook and, instead of a link, attach the file and spread the love. File-level deduplication of 2TB Exchange 2010 databases isn't all that practical. The "when in doubt, sub it out" approach of MAPI stubbing and storing the file somewhere else will generally bring your Exchange server to its knees with MAPI RPC traffic in addition to the disk IO. Sort of goes against the "large, cheap mailboxes" thrust of the IO improvements in Exchange 2010 anyway, doesn't it? If you stub it out, you're increasing the IOPS density of a smaller storage pool, forcing more expensive SAS or FC disk or EFD. What if instead you did very fine-grained block-level deduplication, had a very large cache implemented in flash, and your cache was aware of the deduplicated blocks?
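A rough sketch of what a dedupe-aware cache buys you in that scenario (purely illustrative, not any vendor's implementation): key the cache on the physical, deduplicated block rather than on the logical (file, offset) address, so a thousand logical reads of the same shared block are served from one cached copy.

# Purely illustrative: a read cache keyed by physical block ID instead of
# logical (file, offset), so files that dedupe to the same block share one entry.
class DedupeAwareCache:
    def __init__(self, backend):
        self.backend = backend        # maps physical block ID -> data (the "disks")
        self.cache = {}               # physical block ID -> cached data
        self.disk_reads = 0

    def read(self, file_map, filename, block_no):
        phys = file_map[filename][block_no]   # logical -> physical mapping (dedupe metadata)
        if phys not in self.cache:
            self.cache[phys] = self.backend[phys]
            self.disk_reads += 1              # only the first miss goes to disk
        return self.cache[phys]

# 1000 home directories all holding the same MP3, deduped to one physical block.
backend = {"P1": b"<mp3 frame data>"}
file_map = {f"user{i}/song.mp3": {0: "P1"} for i in range(1000)}

cache = DedupeAwareCache(backend)
for name in file_map:
    cache.read(file_map, name, 0)
print(f"1000 logical reads, {cache.disk_reads} disk read(s)")   # -> 1 disk read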


JohnFul

B Riley

Thanks for the clarification. These kinds of posts / conversations really help all of us to cut through the marketing and get to the nuts and bolts.

Vaughn Stewart

To all who have left comments, thank you for sharing your thoughts and feedback. It is very rewarding to post a blog and to generate a semi-real-time discussion.

@Mark (aka Zilla) - I think we are on similar paths (EMC & NetApp that is), which is to provide hardware acceleration to storage related tasks critical in software based ecosystems.

@Chad - You make a solid point when you state that an MP3 is natively in a compressed format and the host (laptop, iPod, etc.) decompresses the data as it plays the file. The difference here is that your iPod isn't attempting to decompress hundreds or thousands of MP3s simultaneously, nor does it have the responsibility to compress raw audio files.

On post-processing versus in-line data deduplication… post-processing makes a lot of sense, particularly on a production data set. It ensures optimal performance during peak operations, and most data sets have a daily change rate which is sub 2%. The slight lag in reclaiming capacity is a small sacrifice to make for ensuring performance.
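As a rough back-of-the-envelope (the numbers here are illustrative, not from any particular customer), the daily post-process work scales with the change rate rather than the full dataset:

# Illustrative numbers only: a post-process dedupe pass only has to examine the
# blocks written since the last pass, not the entire dataset.
dataset_gb = 10 * 1024              # a hypothetical 10 TB dataset
daily_change_rate = 0.02            # sub-2% daily change rate
changed_gb = dataset_gb * daily_change_rate
print(f"~{changed_gb:.0f} GB of new/changed data to examine per day, "
      f"out of {dataset_gb} GB total")   # ~205 GB vs 10240 GB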

As I've said many times, Chad & I see the world with the same perspective almost all of the time (say 95% of the time). The other 5% is where we differ. Dense storage is in everyone's future, and it is our job to ensure that the chosen technology is used with the appropriate data set.

@Calvin (HP Storage Guy) - I'm glad to see HP jump into the storage efficiency game. Thanks for the link, I'll read up on the advances.

@Mac & @B - I'll dive into this technology in greater depth next week.

@Nick - I'm blown away by your work (my friends rave about you non-stop). Thank you for reading my babble.

@John - Again, another person much smarter than I; thanks for continuing the discussion.

Brian Gracely

Vaughn,

This is well written. You do a nice job of laying out the differences between the technologies. There are a number of things I like about the post and the subsequent discussion between NetApp and EMC.

1 - As much as EMC and NetApp will debate terminology, implementations and use-cases, the most important thing is how hard they push each other to advance these important technologies. Competition leads to great results, especially for customers.

2 - Both NetApp and EMC are on their 2nd, 3rd or 4th generation of these technologies. This stuff isn't easy. It takes many years to work out the nuances between theory/labs and real life implementations.

3 - The discussion is based on feedback from customers about specific use cases - customers that have had the technology in production, either in primary storage or backup. Those customers know and trust the technology because it works, and they are asking for even more savings than they have already received.

4 - From all the meetings that I've recently been in, NetApp and EMC customers are no longer concerned about the technologies. Two years ago, that wasn't the case. It takes quite a while for customers to trust the technology:
- How much does it impact the CPU on the arrays?
- Will it lose data?
- Will it corrupt the data?
- Will the application developers trust that their data is safe?

Considering how fast virtualization is changing the data center, making shared storage even more important than ever, I'm glad to be partnering with the two best deduplication vendors on the planet. Being able to match the efficiency story that VMware brings on the server-side is equally important for NetApp and EMC to bring on the storage side.

Thanks to both companies for making great deduplication technologies. Keep pushing each other. Customers love the value it brings to their virtualized data centers.

Christopher Kusek

Vaughn,

I like how pretty the images you used are; they look so clean and very smooth in the presentation of the topic.

I am also fortunate that everyone in the thread before me covered pretty much every point of consideration I might mention, bring up, discuss, etc. :)

Fortunately, there are things on both of our sides of the fence (so to speak) that I know about, yet sadly cannot discuss; nonetheless I'm glad to see you continuing being active and producing great content like this to get the conversation going on it.

- Christopher

Chad Sakac

@John - my point was just that algorithms around compression differ wildly.

People, as consumers of technology, are used to fast lossy decompression of MP3 in hardware, fast lossy compression in JPEG, and slow compression/decompression with Zip. There are, of course, fast lossless compress/decompress algorithms (though they often reclaim less capacity than the other examples).

I was just saying that the words we all use carry a ton of implicit meaning in the listener's ears (sometimes correct, often not).

I'm also certainly not poo-poohing the value of sub-file level dedupe where there is benefit, as in the example you point out.

Today, to achieve a similar effect in the View case, for example (and similar cache scaling), we use View Composer. Of course, that is use-case specific; done at the infrastructure layer, the effect is more general.

In the same way that I'm sure there are folks at NetApp working feverishly away in areas where EMC has a technology that they don't, we are of course furiously working away on sub-file, pointer-based dedupe to augment our current primary storage techniques, in many cases leveraging the great technologies we have in the backup source and target space.

Thanks for the dialog!

JohnFul

@Chad

There are a wide variety of data reduction methods, and it seems new ones are invented every day. My point was that workloads vary widely as well. Understanding the workload goes a long way toward picking an effective data reduction method.


Thanks

JohnFul

Storagesavvy

Vaughn,

I really appreciated this post. Combined with all of the comments you received, it shows that we all, as vendors, should be educating our customers on the different approaches so they can understand where each will help them.

Aaron Delp

Morning Vaughn - It appears you're attracting EMCers to your posts like moths to a flame! Great dialog all around, and thanks to everyone for the professional conversation!

As somebody on the outside, here is how I see this developing. EMC and NetApp are taking fundamentally different approaches (not for the first time) to solving the same issue. The fact that both are trying to solve the issue is a good thing, because the concept of "data space reduction" is at the top of everyone's list these days.

Of course the first question to go along with this is what penalty will I encounter? For many, saying 10% or less is money in the bank. If you go much higher than 10%, many customers start to get a little uncomfortable. 10% is "in the noise" and a non-factor.

Based on what I have seen so far (I'm willing to bet Chad is writing furiously as I type this) it is hard to beat NetApp's savings vs. performance trade-offs. Combine this with the dedupe aware cache and replication and it is an amazing solution for virtualization.

I have many customers with dedupe ratios of 70%+ (the record is one customer at 91% the last time I checked). There are some caveats around dedupe on VMFS LUNs vs NFS volumes, but that is a conversation (or blog post) for another day.

Conclusion: DeDupe sells a LOT of arrays. It's not about the drives, it's about the efficiency behind the drives.

MAC

Hi Vaughn,

I saw an interesting article about snapcompress and snapencrypt from NetApp.

ONTAP has included compression and encryption - is this for real?

The Storage Alchemist

Hey Vaughn,

Good to see that primary storage optimization is really getting a lot of virtual ink these days. I have a couple of comments.

While I think this post does a great job of helping to distinguish the differences between the technologies, I find it a bit odd that you would say customers don't have a 'solid' understanding of the differences; that is not the case when we are speaking to customers. Also, I think you need to be a bit careful about the color added to some of the 'definitions' (it is the FUD that gets spread that ensures customers don't have a 'solid' understanding). For example, compression, when done right, doesn't add any performance penalty and in fact increases performance. The Storwize technology sits in front of over 200 NTAP filers at over 100 customers, and all of them claim they are seeing no performance degradation at all. I will point out that most NTAP users (I say most) aren't really taxing their systems, so a 10% performance impact (per Aaron) may in fact be just that - noise, given the benefits. However, I say, if I don't need to worry about performance degradation, then why even think about it, provided the optimization is done properly.

The other thing I would add is that when you make your comparisons, you should stick to a common baseline. Showing single instancing with a Word document and then deduplication with a VMware image are two different use cases. (Not to mention both are essentially file use cases, and there is no block example.)

I would concede that deduplication is a great fit for .vmdk files that DON'T store unique data inside the .vmdk but store this data on a shared system. However, when users are storing data inside the .vmdk, deduplication goes out the window. Again, we have a number of customers (in both VMware accounts and non-VMware accounts) who tell us that NTAP deduplication provides them with anywhere from 9% to 18% optimization on primary storage, consistently, because there just isn't that much repetitive data in primary storage systems (unlike backup, where data is sent day after day after day). (Now, if you are storing .vmdk files without unique data in the file, then you can get 90+% optimization.)

Since we all know that primary storage optimization is a must for the data center due to the growth of data, this is a great topic to discuss. So, in the spirit of helping end users make good business decisions, let's arm them with the right information.

Vaughn Stewart

@Brian & Aaron - well said, I think you've hit the nail on the head. Storing data in a space-efficient format is a requirement. Understanding which format is best for a particular data set is where an understanding of these technologies is of value. This last point is why I started this post (because of the loose usage of the term dedupe).

Vaughn Stewart

@Steve (aka the Alchemist) – Thank you for reading and sharing your thoughts. I was a bit perplexed by some of your comments; I believe the source of this 'confusion' may lie in a lack of understanding of the shared storage model found in virtual infrastructures and cloud deployments. With technologies from VMware and other server virtualization vendors, it is common to store multiple VMs on large shared storage pools. This data layout introduces a number of new capabilities and is central to the ability of block-level data deduplication to significantly reduce one's storage footprint.

I do have a few points from your comments which I'd like to dive into...

You stated that when storing data inside of a VMDK, deduplication goes out the window. That claim flies in the face of what the industry is witnessing, and it is just plain wrong.

1. See my post http://blogs.netapp.com/virtualstorageguy/2009/01/deduplication-g.html
2. See Aaron’s comments on this post alone. Aaron does not work for NetApp.
3. Try Google: http://www.netapp.com/us/company/news/news_rel_20071127.html

From a technical perspective…
4. Data deduplication is block level, thus sub-file (or sub-VMDK), and provides savings with both SAN & NAS datasets.
5. Deduplication results would be identical for a dataset whether it resided on a NAS share or in a VMDK.
6. As multiple VMs are stored on a storage object, dedupe occurs both within a single VM and across all of the VMs on the storage object.

As for your comment that, when done right, compression increases array performance... As compression requires data to be stored and served in a non-native format, which requires manipulation of the data as it is either read or written, can you elaborate on your statement and provide supporting data?

And your comment on common baselines… In the post I did not show leveraging single instance storage with a virtual machine, as this scenario does not exist in a production environment. Two identical VMs would become a support issue as soon as they were powered on. Thus I used the NAS analogy, where identical files are commonly found.

Is it possible that your negative claims around the capabilities of NetApp may be more related to a desire to promote your product rather than furthering this technical discussion?

I would welcome being proven wrong on my assertions. Would you mind supporting one of your claims? I'd like to be put in contact with a purported NetApp customer who is obtaining 9%-18% storage savings with VMware on NetApp. There is nothing to lose here: if such customers exist, we have an opportunity to provide them with free storage as part of our 50% storage savings guarantee.

http://www.netapp.com/us/solutions/infrastructure/virtualization/guarantee.html

Alchemy never really panned out, did it? One of the cornerstone desires of the practice was the ability to change the commonplace into something rare & precious. It sounds fantastic, the ability to turn lead into gold; however, if this ability were possible it would have devastated the value of gold. Gold would be as commonplace as lead, and thus worthless.

Maybe alchemy is best left for the history books and fictional stories.

invisible

Gentlemen,

First, here is the output:

Netapp2> df -g -s
Filesystem used saved %saved
/vol/TRMN_E00_N/ 1343GB 7240GB 84%

Above is a volume from a production NFS datastore hosting >50 VMs. Volume dedupe (ASIS) is NOT enabled - we use file-based cloning, hence the results.
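For reference, the %saved column works out from the other two numbers (assuming ONTAP computes %saved as saved divided by used plus saved, which matches the output above):

# Assumed formula: %saved = saved / (used + saved), using the df -s figures above.
used_gb, saved_gb = 1343, 7240
pct_saved = saved_gb / (used_gb + saved_gb) * 100
print(f"{pct_saved:.0f}% saved")   # -> 84%, matching the df -s output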

However, I am a little bit concerned - what would happen (in theory) if a block that 5 others refer to goes bad?

Has anyone seen a situation where people are NOT deploying dedupe solutions for that particular reason?

Giacomo

Although until a few months ago I had some doubts - high CPU utilization - regarding post-process deduplication on DoT (I have known and appreciated DD appliances for a long time), I recently started two new environments based on 3140A and 3160A clusters, at two different customers and in two different environments, but both based on a mix of FC SAN and NAS (CIFS and NFS).
In both cases, where the goal was to migrate every single bit of data from other storage (IBM DS and EMC CX), I prepared the volumes and LUNs each time as absolutely thin provisioned: no volume reservation, zero fractional reserve, vol autosize, snap autodelete, and non-reserved LUNs - even the ones used for databases such as SQL or Oracle RAC on raw partitions, or the datastores for VMware built on FC and SATA disks.
Well, now that more than a month in production has passed, the customers and I are very satisfied with the results!
Space is saved even on the tempdb LUNs of SQL, is greatly saved on thin-provisioned (from the vSphere perspective and tools) VMDKs of server images (a lot of Windows 2K3 and 2K8), and is greatly saved on other LUNs and above all on the CIFS areas dedicated to user data.

And last, although the CPU rises to 90 percent and more while dedupe processes run simultaneously on 5-8 volumes, in my opinion this value is a false "alert". DoT uses all the CPU it has available; if the system is otherwise idle, it clearly uses that CPU for dedupe while maintaining at the same time excellent responsiveness for all other applications (SAN, file sharing, snaps, and others...).

Here you can find some pics extracted from the "storage utilization" section of the MyAutosupport area for these two environments.

http://files.splinder.com/e490d453a0047e1392d74be991a578f2_medium.jpg

http://files.splinder.com/c733f1d22af4eeab19b69dd63e20dc1b_medium.jpg


eric

@invisible

Can you explain how you got your figures without turning SIS on please?

As far as I am aware, that command run on NTAP shows the savings that WAFL can see, and if SIS is not on, then what is being shown here? What file cloning are you referring to?

Thanks in advance.
Eric

