
May 20, 2010

Comments

EMC doesn't need 14+1 R5 numbers to hit the 20% savings number. It works with all sorts of RAID configs and does not rely on 14+1s.

Mike,

I don't quite understand how you can say that if I have 135 TB of usable capacity requirements, you can accomplish this with 139.2 TB of raw spinning disk.

When I use the NetApp storage efficiency calculator, it shows that 135 TB of usable capacity would require 204 TB of raw. I did not change any values; I simply entered 135 TB as my company's capacity requirement. That is more along the lines of 67% raw to usable capacity.

As for the EMC tool, and so we don't start a debate over best practices or over how a real EMC customer would deploy this, I used a very conservative all-4+1 R5 configuration. In reality, most customers will deploy a mixture of RAID group sizes. In a roughly 35 TB NAS and 100 TB SAN configuration, I end up with 62% raw to usable.
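To make the two percentages above easier to compare, here is a minimal sketch of the arithmetic. It assumes "raw to usable" means usable capacity divided by raw capacity; the function names are hypothetical and are not taken from either vendor's calculator.

```python
def usable_ratio(usable_tb, raw_tb):
    """Fraction of raw capacity that ends up usable (assumed definition)."""
    return usable_tb / raw_tb


def raw_needed(usable_tb, ratio):
    """Raw capacity needed to deliver usable_tb at a given usable/raw ratio."""
    return usable_tb / ratio


# NetApp calculator figures quoted in the comment: 135 TB usable on 204 TB raw.
netapp_ratio = usable_ratio(135, 204)

# EMC figure quoted in the comment: 62% raw to usable with all 4+1 R5.
emc_raw_tb = raw_needed(135, 0.62)

print(f"NetApp usable/raw: {netapp_ratio:.1%}")
print(f"EMC raw needed for 135 TB usable: {emc_raw_tb:.1f} TB")
```

Under that definition the quoted NetApp figures work out to roughly two thirds, in line with the percentage given above.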

Now here is the interesting part. These EMC numbers are BEFORE I account for Virtual Provisioning and dedupe. The NetApp numbers already take all of this into account. When I then add things like FAST v2 and FAST Cache, I can get even more efficient.

Hey Pete,

I wonder if we are related :). Thanks for the comment.

The 135TB on 139TB of physical disk is achieved by block deduplication reducing the amount of physical storage the 135TB of data would consume.

The NetApp efficiency calculator is a top-down look at storage efficiency, whereas the EMC calculator is a bottom-up look at RAW to usable capacity. The tools aren't directly comparable.

All LUNs have some wasted space in them. Typical customers provision nearly 60% more storage than they need. For example, a filesystem on a 1TB LUN may only be consuming 300GB of capacity. The additional 700GB is allocated, but not holding data. This is over-provisioned storage. The EMC calculator is just looking for the usable capacity that will be provisioned to the host (e.g. the 1TB LUN). The NetApp calculator is looking for the actual "consumed" capacity (e.g. the 300GB within the LUN) as the input. By having you put in the estimated consumed capacity and allowing the tool to estimate typical wasted/over-provisioned space, we can get a rough idea of how much thin provisioning may save you.
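The difference between the two inputs can be sketched with the 1TB LUN example above. This is only an illustration of provisioned versus consumed capacity; the function name is hypothetical, 1TB is treated as 1000GB for simplicity, and neither calculator's internal formula is known here.

```python
def overprovisioning(provisioned_gb, consumed_gb):
    """Return (allocated-but-unused GB, fraction of the LUN holding data)."""
    unused_gb = provisioned_gb - consumed_gb
    utilization = consumed_gb / provisioned_gb
    return unused_gb, utilization


# The example from the comment: a 1TB LUN with 300GB of data in it.
unused_gb, utilization = overprovisioning(1000, 300)

print(f"Provisioned to host (EMC calculator input): 1000 GB")
print(f"Consumed by data (NetApp calculator input): 300 GB")
print(f"Allocated but unused: {unused_gb} GB, utilization {utilization:.0%}")
```

Feeding the same number into both tools therefore describes two different things: 135TB provisioned in one case, 135TB actually consumed in the other.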

If you just want a RAW to usable comparison with no additional efficiency features, NetApp uses another tool for that, which is almost identical to EMC's calculator. If you think about it, the RAW to usable numbers (without other features) should almost always be similar, as both NetApp and EMC provision RAID groups into usable capacity.

NetApp does have an additional 10% overhead for our virtualization layer that EMC doesn't have, but NetApp also recommends much larger RAID group sizes (14+2 through 20+2) for all production workloads regardless of performance needs, which improves our efficiency. When considering these together, the RAW to usable numbers should be a near wash. I'll demonstrate that in my next post.
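As a rough sketch of why these two effects can offset each other, the parity cost of a RAID group can be computed as data disks over total disks, with the 10% layer applied on top. This deliberately ignores hot spares, checksum overhead, and drive right-sizing, and the function name is hypothetical.

```python
def raid_efficiency(data_disks, parity_disks, extra_overhead=0.0):
    """Usable fraction of a RAID group after parity and an optional extra overhead."""
    parity_fraction = data_disks / (data_disks + parity_disks)
    return parity_fraction * (1.0 - extra_overhead)


# Group sizes mentioned in this thread; 10% is the virtualization overhead cited above.
configs = [
    ("4+1, no extra overhead", 4, 1, 0.0),
    ("14+2, 10% overhead", 14, 2, 0.10),
    ("20+2, 10% overhead", 20, 2, 0.10),
]

for label, data, parity, overhead in configs:
    print(f"{label}: {raid_efficiency(data, parity, overhead):.1%}")
```

With only these two factors, a 20+2 group with 10% overhead lands within a couple of points of a plain 4+1 group, which is consistent with the "near wash" claim but is not a substitute for either vendor's sizing tool.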

The discussion here SHOULD be on how we can reduce usable capacity after the raidgroups are built, as that is where the differences lie.

I hope this helps. The calculator can seem a bit confusing and I'll touch base with some contacts internally to see if we can have the wording updated to prevent misunderstandings like this. Thanks again for the post!

Guys, the secalc.com site has a bug (that Kusek exploited) when calculating NetApp storage with all efficiencies turned off.

226 base-10 TB (or about 200 base-2 TB) are needed to provide over 150TB usable without any of the space efficiency features turned on.
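For readers unsure about the base-10 versus base-2 distinction, here is a small conversion sketch. It assumes "base 10 TB" means 10^12 bytes and "base 2" means 2^40 bytes (TiB); the function name is hypothetical.

```python
BYTES_PER_TB = 10 ** 12   # decimal terabyte, as drive vendors count
BYTES_PER_TIB = 2 ** 40   # binary terabyte (tebibyte), as most operating systems count


def tb_to_tib(tb):
    """Convert decimal terabytes to binary tebibytes."""
    return tb * BYTES_PER_TB / BYTES_PER_TIB


print(f"226 TB (base 10) = {tb_to_tib(226):.1f} TiB (base 2)")
```

The exact conversion comes out slightly above the round figure of 200 quoted above, so that figure should be read as an approximation.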

I used the internal Synergy calculator to figure that out.

So, it looks to me like EMC will have to start giving away a bunch of storage.

D

Dimitri, I honestly have to say I think it is absolutely adorable and sweet that you are claiming I "exploited a bug on the secalc". As I stated in my post and as I'll state here, had I KNOWN there was a clear bug, I would not have published what I found. This is actually the first time I have used the NetApp Space Efficiency calculator in the several years since it was published. You can't blame me for assuming that the data it reported to me (while I was at NetApp, and now that I no longer am) would be accurate, considering how long it has been out there.

I do appreciate all of the attention you guys have been paying to this matter, and please do let me know when the calculator has been corrected so I can run through the figures again. On both of our sides these are public, customer-facing calculators intended to reflect an accurate picture of a customer's storage needs, so I'd like that message to continue to be honest, forthright and true, and not let things like bugs misrepresent the data.

Thanks and I look forward to putting this matter to bed! Until that time, I won't have a whole lot to share on this subject!

Christopher!

What happens when you use 2TB SATA drives under ONTAP 7? Isn't there still a 16TB limit under ONTAP 7? Wouldn't the aggr max out at 8 or 10 drives (6+2 or is it 8+2)? Please recalculate the numbers with 2TB SATA drives on ONTAP 7 (what most of your customers use) to put the calculation to the test, even with your misleading use of RAID group sizes. Wow, 20+2s! By the way, I've never seen any NetApp customer run 20+2s, ever, and I've come across 100s of NetApp environments.

So again, I'm trying to compare real-world configs, not the configs EMC would like to compare. I would not recommend 2TB drives on 7 because of the aggr size limits. 64-bit aggrs were designed for these types of large-capacity drives. There is more to this than capacity alone, as 64-bit aggrs provide much better performance as well. I'm not even sure 2TB drives are supported in 7.3. However, when running similar configs in 8.0 with 16-disk RAID groups and 2TB drives, I do show that, from a pure RAW to usable comparison, the EMC config would require about 16% less RAW. The primary reason SATA is less capacity-efficient for us than FC is the block checksums we use. We've opted to use a checksum layout that requires 11% overhead on SATA disks, but significantly enhances the performance we can get out of the drive.

Mike Riley blogged on this a while back. http://blogs.netapp.com/efficiency/2009/08/measuring-storage-efficiency-part-ii-taxes.html

As Mike states, the checksums make sense, especially when you start running additional space-saving technologies on top of the drives, such as deduplication and compression, and they are part of the reason why our dedup performance is so high, even on SATA.

As far as the 20+2 comment, the RAID group sizes are based on the drives used and how large the aggr can grow. 20+2 is perfect for 450GB drives. Since 450GB drives are newer, that may explain why you haven't seen the config. I don't see how using 20+2s is misleading or risky, especially since they have many times the failure protection of a RAID 4+1. Are you also discouraging the use of 4+1?

While you guys are fixing your calculator, perhaps you could take a look at the RAID 6 numbers? Turn off all features EXCEPT RAID 6 and the NetApp calculator claims storage savings (huh? for RAID 6??).

The comments to this entry are closed.
