
February 11, 2010

Comments

Travers Nicholas

Full disclosure: I work for EMC as a vSpecialist.

Hi Vaughn, I hope you're well. I would love to read more detail behind the solution you mention:

"the FAS3140, with two shelves of 450GB 15K FC drives and outfitted with PAM, has been certified to drive over 5,000 desktops."

Specifically, I'm interested in understanding how the configuration supports a heavy write workload (as are many desktop environments that I have personally examined; I'm also interested to hear if your experiences are similar).

Thanks mate

Vaughn Stewart

@Travers - hey mate, how's things? I'm still waiting on that New Zealand All Blacks jersey. ;)


The sizing was completed by NetApp and VMware engineering. As I sit in the terminal at LAS, departing from VMware Partner Exchange, I don't have the details of the working set available to me. I believe the testing was driven by an 8 IOPS per desktop load, but again, allow me to follow up with my team.


Regarding how NetApp can achieve such high performance numbers whereas traditional array architectures like EMC struggle to achieve similar performance with similar hardware... the answer is found in the differences in our controller configs and kernel.


Before you scoff at such a position, consider the differences found in ESX/ESXi, Hyper-V and XenServer when each is run on similar hardware.


So let's start with what we have in common:


- I believe we agree that a 15K FC drive provides ~220 IOPS @ 20 ms latency, whether on EMC or NetApp (a rough back-of-envelope sketch follows this list).

- I believe we agree that memory devices serve data faster than disk based devices.

- I believe we agree that array read cache enhances storage I/O performance by serving recently accessed data from memory versus disk.
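For anyone who wants to sanity-check that ~220 IOPS figure, here's a rough back-of-envelope sketch in Python; the seek time and queue-depth figures are illustrative assumptions of mine, not vendor specs.

# Rough sanity check of the ~220 IOPS @ 20 ms figure for a 15K FC drive.
# Seek time and queue-depth assumptions are illustrative, not vendor specs.

rpm = 15000
avg_seek_ms = 3.5                          # typical published average seek for 15K FC
rotational_latency_ms = 60000 / rpm / 2    # half a rotation on average = 2.0 ms
service_time_ms = avg_seek_ms + rotational_latency_ms   # ~5.5 ms per random I/O

# With a single outstanding I/O the drive tops out around:
iops_qd1 = 1000 / service_time_ms          # ~180 IOPS

# Allowing requests to queue until response time reaches ~20 ms lets the drive
# reorder seeks and push somewhat higher -- in the ballpark of 220 IOPS.
print(f"Service time: {service_time_ms:.1f} ms -> ~{iops_qd1:.0f} IOPS at queue depth 1")
print("At ~20 ms response time, ~200-220 IOPS is a common planning number")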


So far so good? OK, let's differentiate...


All writes to an array powered by Data ONTAP are acknowledged from NVRAM (battery-backed memory). This results in very fast I/O to the client.
NVRAM acts as a log that queues write requests and writes long stripes of data across data and parity drives. This eases write pressure on the array, thus increasing performance.
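To make the NVRAM-as-log idea concrete, here is a minimal, purely illustrative Python sketch (my own simplification, not Data ONTAP code): writes are acknowledged as soon as they land in the log, and are later flushed as full stripes rather than as individual random disk writes.

# Toy model of acknowledge-from-NVRAM, flush-as-full-stripes. Not Data ONTAP internals.

class ToyWriteLog:
    def __init__(self, stripe_width_blocks=14):
        self.stripe_width = stripe_width_blocks
        self.pending = []          # blocks buffered in "NVRAM"
        self.stripe_writes = 0     # large sequential writes issued to disk

    def write(self, block):
        self.pending.append(block)
        # The client gets its acknowledgement here, before any disk I/O happens.
        return "ack"

    def flush(self):
        # Drain the log in full stripes: one large sequential write per stripe
        # (data + parity written together) instead of one seek per block.
        while len(self.pending) >= self.stripe_width:
            del self.pending[:self.stripe_width]
            self.stripe_writes += 1

log = ToyWriteLog()
for i in range(380):               # e.g. 380 random 4KB client writes
    log.write(f"block-{i}")
log.flush()
print(f"Client writes acknowledged: 380, stripe writes to disk: {log.stripe_writes}")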


Traditionally speaking, reads on a traditional array (like EMC) operate like they do on NetApp; however, that all changed with dedupe and storing data in VMDKs.


To begin, Data ONTAP allows any storage protocol to be deduped at the block level. This is much different than EMC's dedupe line, where dedupe is either for backup (aka Data Domain) or production (Celerra NAS only, and objects which can be deduplicated must be 100% identical).


I must add, I am not a fan of EMC's marketing, where the ability to compress NAS files is also labeled as dedupe. Does EMC marketing really believe customers can't understand the differences between deduplication and compression? What next, will you claim that thin-provisioned VMDKs provide the same storage savings as deduplicated VMDKs? (yes, that was a rhetorical question)


Back to NetApp's dedupe technology: with VMDKs it allows the array to understand that an I/O request from one VM may actually have already been served and stored in the same disk block as a request from another VM. This understanding of the common contents of a VMDK reduces drive seeks (I/O load) and increases read performance.


Now, deduping the storage footprint significantly increases the ratio of IOP requests per drive, and this scenario can lead to overtaxing the drives' IOP capabilities. This is where intelligent caching comes into play. By having a dedupe-aware storage controller cache (in both the system cache and via PAM expansion modules), NetApp gains an expanded I/O benefit of reading one block into cache and serving it to 1,000s of VMs. This results in increased read efficiency and performance.
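A tiny sketch of what dedupe-aware caching means in practice, with made-up block addresses (my illustration, not the actual implementation): when many VMs' logical blocks map to the same physical block, one cache fill answers reads from all of them.

# Toy model of a dedupe-aware read cache: many logical (VM, block) addresses map
# to one physical block, so one disk read can serve reads from many VMs.

physical_cache = {}        # physical block id -> data
disk_reads = 0

def read(vm_logical_block, block_map, read_from_disk):
    global disk_reads
    phys = block_map[vm_logical_block]          # dedupe: shared physical block
    if phys not in physical_cache:
        physical_cache[phys] = read_from_disk(phys)
        disk_reads += 1
    return physical_cache[phys]

# 1,000 VMs whose copy of the same OS binary block all dedupe to physical block 42.
block_map = {(f"vm{i}", "os-binary-block-7"): 42 for i in range(1000)}
fake_disk = lambda phys: f"data@{phys}"

for i in range(1000):
    read((f"vm{i}", "os-binary-block-7"), block_map, fake_disk)

print(f"Client reads served: 1000, disk reads required: {disk_reads}")   # -> 1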


No other storage vendor can do this today.


So let's put this all together...


NVRAM saves write IOPS, dedupe-aware disk and array cache literally reinvents the construct of read overhead, and in conjunction these technologies free up significant amounts of disk drive IOPS. The freed IOPS are now available, almost exclusively, for servicing the unique block-level I/O requests of the running VMs.


Taking I/O further...


Add in other technologies like NetApp NFS, which is devoid of the shallow I/O queues (seen with EMC and traditional array architectures) and the semantics associated with clustered file systems (in this case VMFS), and one can scale an ESX/ESXi cluster beyond the traditional 8-node VMware recommendation. Note the 8-node limit (or strong recommendation) exists because of LUN queues and other constructs.

Also consider that NetApp clones are disk based, whereas Linked Clones leverage VMware's snapshot technology (in their delta disks). The delta disks are actually SCSI transaction logs and operate differently than a block device. This would be similar to asking which is faster: reading data from an Oracle database, or from an Oracle transaction log?


One can gain extra I/O efficiencies with the disk-based clone, and when combined with all of the above technologies, a View on NetApp solution can scale better, and address heavy I/O loads better, than a traditional array architecture.


Looking ahead, you have inspired me. I will put together a post or posts which break this process down, complete with I/O flow. I think the customer base and partner community could really use this type of info.

Thanks for the comments, and go All Blacks!


Boarding doors just closed - gotta go before I can proofread!

This message was sent by my thumbs... fault them for the brevity and poor spelling

Chad Sakac

Disclosure - EMC employee here.

Vaughn - Dedupe and compression are both variants of data-reduction technologies. I don't want to be TOO much of a nerd here, but the point is valid.

Data reduction techniques have varying effectiveness/cost (and here cost means "CPU cycles, processing time, performance impact", ergo not $$ but "engineering costs") depending on the dataset. A trivial example:

1) filesystem containing ten files. Four Files are EXACTLY the same.
2) filesystem containing ten files. Files are similar, but not the same.

- file-level dedupe is extremely efficient in the first example (low impact, high efficiency gain).
- compression is moderately efficient in the second example (low impact, moderate efficiency gain).

Celerra F-RDEv2 (the nerdy engineering name - "File Redundant Data Elimination") is accurately characterized as dedupe and compression. It finds and deduplicates files at the file-object level (which is the most efficient, and largest immediate savings, for general-purpose NAS), and compresses within files. F-RDEv1 skipped files >200MB.

While VMware on NFS is very important to you and me both, the broad use of our NAS devices tends to be dominated by general-purpose NAS, and if we can deliver MASSIVE efficiency gains there, that was our original design focus. Example: "if we can demonstrate a 60% efficiency gain with the dataset which consumes the majority of the storage - which today is general-purpose unstructured NAS and backup storage - then we are helping our customers".

The march of storage efficiency technologies continues for us as I'm sure it does for you.

F-RDEv2 (which is GA, and has been and will continue to be free) now has no file-size restriction. This means that ON TOP of Thin Provisioning, it provides about an additional 40-50% capacity savings gain when applied directly to the VMware on NFS use case.

BTW - the same release that expanded the application of redundant data elimination to customers using Celerras also added the ability to snapshot and clone individual files within a filesystem (in fact, also clone a file ACROSS filesystems). This is analogous to the "hardware accelerated" fast/full clone. There is also a vCenter plugin for that. My post on that topic is going up shortly. BTW, we think that in the client virtualization use case, more customer pain is about client image management and composition, not storage I/O. Our guidance for customers will not be to use that function to eliminate the stuff that View Composer/ThinApp can be used for. The VM hardware-accelerated snap/clone is useful in many use cases, but the "big" challenge in the VDI use case is overall image mgmt transformation.

Ok - back to the "dedupe/compress" discussion....

Now, again, being an engineer about this - in cases where file-level commonality is very low (this is the VMware on NFS use case), and where block-level commonality is high (common in the VMware on NFS use case), a block-level dedupe can result in even higher capacity reduction effects. Again - that's that specific dataset. It's not uncommon for people to see 80% (and in some cases even higher) capacity efficiency gains with block-level dedupe approaches (a la NetApp dedupe, or EMC's backup dedupe with Data Domain and Avamar).

But - on the other hand:
1) as you noted - does an 80% efficiency gain result in a configuration that is 80% smaller? Not without dealing with performance density. NetApp's general response (in my experience) is PAM. PAM is of course not free. PAM drives UP the platform cost, offsetting (not eliminating) some of the cost benefit of the block-level deduplication in the first place. For example, if you could reduce the footprint by half but the impact of PAM adds 20% to the configuration, then the CAPEX cost delta (and CAPEX is the trickiest part of View campaigns with the customers I talk to) isn't 50%, it's 30% (a simple worked version of this arithmetic follows point 2 below). Note that this is just the CAPEX side; application of solid state and memory generally has a sustained OPEX benefit (space/cooling).


2) one upside of our approach is that it has no impact on filesystem size, is unaffected by any other feature (snapshots, etc.), and has certain other properties that are nice.
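Here's the worked version of the arithmetic from point 1, using my illustrative numbers (50% footprint reduction, PAM adding 20% to the configuration) - not real pricing from either vendor.

# Worked version of the CAPEX argument in point 1 above, using illustrative
# numbers -- not real pricing for either vendor.

baseline_cost = 1.00            # normalized cost of the un-deduplicated config
footprint_reduction = 0.50      # dedupe halves the disk footprint
pam_uplift = 0.20               # PAM adds ~20% back onto the configuration

deduped_cost = baseline_cost * (1 - footprint_reduction) + baseline_cost * pam_uplift
capex_delta = baseline_cost - deduped_cost

print(f"Net CAPEX saving: {capex_delta:.0%}")   # 30%, not 50%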

Now, on PAM... Is it accurate (yes/no) to observe the following: PAM is a cache, as you state. It is a well-designed and efficient cache (one that is effectively larger than it would be otherwise due to the similar application of dedupe to the cache itself). I publicly embrace the fact that NetApp engineering is filled with bright people. I would suggest that EMC engineering is also filled with bright people.

So, let's put aside the marketing. How does it affect the I/O density for I/Os that can't be ELIMINATED (common read I/O), but only BUFFERED (i.e. write caches help with absorbing I/O bursts, but eventually need a destage to disk - not enough disk I/O and the cache overflows - resulting in write I/O that is governed by the back-end disk config)? Duncan Epping at VMware had a great post on this topic here - strongly suggest reading the comments between many folks (myself included): http://www.yellow-bricks.com/2009/12/23/iops Note that his post was triggered by the overselling of the things you list above in your post.
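To make the buffered-vs-eliminated distinction concrete, here is a toy calculation (the numbers are purely illustrative and mine alone): a write cache rides out a burst only as long as the back-end disks can drain it before it fills.

# Toy illustration of "write cache buffers, back-end disks destage": if incoming
# writes exceed back-end throughput, the cache fills and latency falls back to
# disk speed. Numbers are illustrative only.

cache_capacity_ios = 50_000      # how many writes the cache can hold
backend_iops = 1_200             # what the disk config can destage per second
burst_iops = 5_000               # incoming write rate during the storm

net_fill_rate = burst_iops - backend_iops            # 3,800 IO/s accumulating
seconds_until_full = cache_capacity_ios / net_fill_rate

print(f"Cache absorbs the burst for ~{seconds_until_full:.0f} seconds; "
      "after that, write latency is governed by the back-end disks.")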


Again, not implying "dedupe = bad" or "PAM = bad", but rather that there are always two sides to every story - with all of us vendors. We each try to "simplify" the message, but these topics are deep and long. We all "market". It's also impossible in a single blog post, or a 1hr presentation, to cover every up and down of any given technology.

Also, each of us vendors sees the world through our own eyes, but in the end, the customer's eyes are the ones that matter.

Speaking of not liking another's marketing, here's two for you:

1) As much as I know you like calling everyone but NetApp "a traditional storage array" - our core array architectures have actually both been around for roughly the same amount of time. There are certain things we can do that you cannot, and vice versa. Pros and cons, man, pros and cons.

2) that "90%" number that was thrown out at PEX by NetApp about VDI. Will you EVER be able to provide the source? It doesn't jibe at all with the data I have directly from the View product team. Will post shortly what data I do have, but would love for you to provide ANY supporting data behind that position.

Hope you have a safe flight, and was good to see you and your colleagues at PEX.

Vaughn Stewart




@Chad Thanks for the follow up, many good points of agreement. Many items to touch on...

To start, this thread is getting religious, but let's be frank: the move from physical servers to virtual servers began with similar heated debates. Only a few years ago, those who dismissed server virtualization challenged the ability of ESX to meet the needs of every application in the data center. That was the wrong perspective for one to take.

I'll posit this is the same argument we are having with EMC versus NetApp.

EMC has great technology, with lots of functions which NetApp does not and will never offer. For example, EMC provides connectivity to VAX, IBM System/360, UNIVAC 1100/2200, and the like. To your point, there are multiple ways to skin a cat, and this is the crux of our debate. I believe that the capabilities of Data ONTAP align more closely with the goals of VMware deployments than what is available with traditional array architectures. This doesn't make the latter bad; it just makes the discussion of how they interoperate and deliver value one which highlights obvious differences.

So let's look at a few of your points...

On dedupe of NAS: I noticed you elegantly turned the topic of dedupe to speak only to what EMC can provide... dedupe of backup data sets and NAS file-level dedupe. In having an open dialog, I'd expect a discussion acknowledging that NetApp provides block-level deduplication with everything: FC, FCoE, NFS, iSCSI, CIFS, HTTP, FTP, backups, archives, and replication bandwidth.

On the IOPS blog post: There's too much oversimplification and "EMC knows all storage" positioning going on in the comments. Case in point, you said, "...not claiming to be an expert on NetApp, so please correct me if I'm wrong on any of the below. The parity cost of R6 is the same as R-DP..." RAID-DP is a superset of RAID-4; it merely meets the spec requirements of RAID-6 (which is supporting the loss of two disks through double parity sets versus mirroring data). EMC's RAID-6 is like traditional storage array implementations, where it is a superset of RAID-5.

On the comment "PAM is a cache": EMC's caching mechanism reads ahead by following the disk drive track. It has no logic or understanding of the data being served. In other words, it provides benefits via brute-force effort and is quite inefficient in terms of cache hit ratio. NetApp Intelligent Caching reads ahead by following inodes; by default it reads ahead by caching the data which was written after the subsequently read data was stored. This results in higher cache efficiency. Now add dedupe intelligence to the cache, and cache hit ratios go greater than 100%, as we are serving multiple I/O requests, from different virtual and physical clients accessing a shared storage pool (aka datastore), from single blocks in cache. PAM simply increases the amount of intelligent cache by adding up to 1TB of L2 cache. Please don't discount hardware utilization and efficiency; this is a core driver of virtualization initiatives.

On goals of reducing costs and the cost of PAM: Let's just play with simple numbers. EMC requires 1TB (no RAID here) + DR + D2D (say one year = 100%) + offsite backup = 4TB. NetApp requires 100GB + DR + D2D + offsite backup = 400GB. I saved 90% across the board, so what if PAM costs a few thousand dollars?
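For clarity, here is the arithmetic behind those numbers exactly as stated above; the 90% dedupe ratio is the assumption doing all the work.

# The footprint math from the paragraph above. The 90% dedupe ratio is the
# assumption driving the result; adjust it and the gap scales accordingly.

copies = 4                      # production + DR + D2D backup + offsite backup

emc_production_gb    = 1000     # 1 TB, no block-level dedupe assumed
netapp_production_gb = 100      # same data at an assumed 90% dedupe ratio

emc_total_gb    = emc_production_gb * copies        # 4,000 GB
netapp_total_gb = netapp_production_gb * copies     #   400 GB

savings = 1 - netapp_total_gb / emc_total_gb
print(f"EMC: {emc_total_gb} GB, NetApp: {netapp_total_gb} GB, savings: {savings:.0%}")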

On SSD: SSD is very energy efficient and packed full of IOPS. It also costs 10X that of FC. Case closed for general-purpose use until commodity prices are the norm. Add in the lack of reliability of SSD, factor in the need for more than what RAID 5 protection provides, and what do we have? Fast but really expensive RAID-10, or lose much of the performance of SSD with RAID-6.

A summary of SSD reliability information is on Wikipedia: http://en.wikipedia.org/wiki/Solid-state_drive

On "Compression is Close Enough to Dedupe": Compression copies data and stores it in a non-usable format. Access requires additional IOPS on the array to decompress the data and, when it is saved, to recompress it. More IOPS wasted here. By contrast, dedupe stores data in a state in which it can be directly read. There is a requirement for dedupe to fold new writes to disk on an interval basis, which is like compression; however, it does not recreate the file as compression does. Instead it simply moves pointers and deletes blocks which are redundant - a very easy I/O load for the array to complete.

On NetApp's claim of VDI market share: Why is it critical for EMC to discredit NetApp's share of VDI? Is it because desktop virtualization forces one to implement advanced storage features, and as this happens it places pressure on enabling the same features in the virtual server environment?

If my positioning of this tech versus that tech is inaccurate, maybe you could explain to the readers EMC's recent take-out program targeted at NetApp specifically in VMware deployments? May I offer a reason for this program... because what I am sharing and evangelizing is spot on. If I spun lies and half-truths, there would be no need for any such programs.

Scott Lowe

Vaughn, with regard to this part of your comment:

"On goals of reducing costs and the costs of PAM: Lets just play with simple numbers. EMC requires 1TB (no RAID here) + DR + D2D (say one year = 100%) + offsite backup = 4TB. NetApp requires 100GB + DR + D2D + offsite backup = 400GB. I saved 90% across the board, sp what is PAM costs a few thousands of dollars."

Where did your starting storage requirement numbers (the 1TB vs. 100GB) come from?

Vaughn Stewart

@Scott - great question. I hope congratulations are in order on your VCDX defense.


As the post is referencing View, I used a simple number of 90% dedupe. Feel free to adjust the numbers up or down, and what you will find is NetApp will require a significantly smaller footprint as compared to solutions designed to deliver the same value with EMC. Note, I don't find value in making two configs match; it is much better to qualify business or customer requirements and have each vendor architect their best solution. It is from that point one can compare NetApp to EMC.


One may ask, how can that be? It's quite simple: all of the storage savings technologies inherent in VMware products (thin VMDKs, Linked Clones, etc.) provide the exact same storage savings with any storage vendor.


Between the two storage vendors, NetApp offers a significant technological advantage in terms of reducing the storage footprint over offerings from EMC. Block-level dedupe on everything versus file-level for only NAS is one area. Another is the impact of reducing production footprints and how this reduces DR footprints. Or we can discuss backup offerings: consider that serving D2D backups as a part of the production array means that the replication of this data offsite results in an array which serves both the purposes of DR (with SRM) and offsite backup storage. This last design builds off of the previous one and reduces the total array count from 4 (for EMC) to 2 (for NetApp).


So if VMware software provides the same value on all, the differentiation is obtained from the array controller kernel, data management tools, vCenter plug-ins, and cost of solution.

This message was sent by my thumbs... fault them for the brevity and poor spelling

Dillon

@ EMC guys,

Proceed in reducing the rhetoric and the essays and see what EMC can deliver...TODAY.

Like NetApp suggested, provide the equivalent NS platform and show the world what the system is capable of delivering.

Chad Sakac

Vaughn - the impact of cache on boot storms is well understood.

Boot storms are also NOT the core View/VDI scaling issue - period. When linked clones are used, or our new VM-level snapshot approach that is RCU-like (so that the common blocks in the "base replica" are common), all the I/Os nicely fit into the caches in the smallest arrays.

We have also produced a TON of data on that.

I'll add that you dramatically underestimate how read-ahead caching works in our arrays.

But, time for bed for me; following the guidelines set by @Dillon, I'm going to reduce the rhetoric.

I will do the same thing that Vaughn has done here, and get a customer to provide their experiences. Would that help?

Vaughn Stewart


Am I correct when I clarify your statement of "our new VM-level snapshot approach..." to mean that, with the new storage options in VMware's next release of View, all storage vendors gain enhanced I/O scaling?


This is great for VDI as a whole, and it will raise the performance of all solutions.


On spin and accuracy in communications... I am failing to see what you mean by "our," as it implies an EMC technology and unique value, and I believe you are referring to a VMware technology and value prop. Am I incorrect here?

This message was sent by my thumbs... fault them for the brevity and poor spelling

Vaughn Stewart


@Chad - one more thought. You suggest that I have underestimated the caching mechanism of EMC arrays, yet you provided no data to support such a statement. Can you please share?

This message was sent by my thumbs... fault them for the brevity and poor spelling


Travers Nicholas

Hey Vaughn,

Sorry about the delayed response, just got back home after a week in the freezing cold and have spent a nice day in the pool with my son. Good times!

First up I have to say bravo on pitching me the gospel as a u-turn to my question, I did have a giggle and should have known I was walking right into it! Nevertheless I thank you for your fast response, and hope to see a follow up with some data or a link to a document I can read on the solution you mentioned, not a biggie - I'm just curious.

As you'll no-doubt agree, we're in an industry that suffers issues with information asymmetry between the vendor and the customer. In my experience, this applies especially in the case of virtual desktop environments, due to the fact that many VDI deployments are owned and operated by the group of the IT team that manages desktops and might not have had experience purchasing consolidated shared storage systems before. It is with this apparent lack of information symmetry in mind that I would like to provide my two cents and ask for some clarification of FAS/V-Filer operations.

It seems to me that much focus is applied to point solutions that match features, rather than complete architectures, that optimise business process. What do I mean by this? Well let's consider this IO storm discussion. It is implied by many technologists that infrastructure systems associated with VDI environments should be able to maintain operations during IO storms, and with that point I agree. Implying that this anomaly will most certainly occur on a frequent basis is where information asymmetry comes into play. It is easy for customers to picture boot storms happening: user arrives in the office at the same time as 1,000 of his or her colleagues and boots up their PC, KABLAM! I would suggest though, that this is an architectural issue, rather than a technical one - e.g. Why not leave the virtual desktops on, or if DPM is in play, suspend them.

So we have run into a situation where the vendor has a solution to which they are mapping a problem, rather than the other way around. I'm not trying to single-out NetApp here by the way, EMC too have published documents and references on how to deal with IO storms in Virtual Desktop environments, and continue to educate our customers on how our products and solutions can quite capably address those problems. The problem with this "positioning" is that the customer's true requirements may be overlooked in light of the "prioritised recommendations from the industry experts", which are solutions to the anomaly not the norm.

So what are typical challenges in VDI environments? I think Chad has done a great job as usual of articulating what a lot of them are (above) so I won't repeat or regurgitate his words. I will add something though. In a virtual desktop infrastructure rollout (and this applies particularly to VMware View), we have a substantial opportunity to improve the desktop service to the customer. This may be through "instant-on" type usability, the ability to access new applications instantly and on-demand, or even providing a faster response time while performing everyday desktop user tasks (after all, making 1,000 users more productive every day of the week will generally be more valuable to the business than ensuring the system can handle 1,000 reboots once per year).

Being an Infrastructure Manager in my previous role I became semi-attuned to the practice of vendors retrospectively applying requirements to their solutions, rather than determining what MY problems were and helping me with those, and I feel like I have identified a pinch of this behaviour in your response to my question above. In light of this, I'm going to ask you for help and further clarification of my understanding of NetApp's FAS/V-Filer platform.

To use your example of the 1TB worth of VMs which have been deduplicated down to 100GB, let's look at a virtual desktop scenario with the following average IOPS (these numbers were taken from production VMs not in a lab, though I'm happy to apply the same formula to other numbers you have from running-in-production VMs):

7.6 IOPS per desktop
35% Read Hit
15% Random Read
50% Random Write

If we assumed 10GB per VM (although the above mentioned VMs ranged from 8 GB to 10 GB), we'd fit 100 VMs into our 1TB FAS/V-Filer environment (again, hard and fast numbers here to keep it simple) generating a total of 760 front end IOs. Once the post-processing deduplication job occurs, our VMs have been deduplicated down to 100 GB (using your example).

What does the IO profile look like?

Read Hit: Well, our read hits will most likely be served by PAM and some may be served by previous writes that are still in cache, so even though we have to read this from disk into cache at some point, for simplicity let's count this number as 0 IOs (generous IMO, but let's move on).

Random Read: Our random reads will most likely not come from cache/PAM (please correct me if I'm wrong here, though I can't imagine PAM predicting and pre-fetching all reads, I confess to not being a NetApp products specialist) so we will probably incur some read commands on disk here (0.15 x 760 = 114 IOs).

Random Writes: Our writes will also be served by cache, but must be destaged at some point prior to being deduplicated and therefore must be counted. It is my understanding that RAID-DP calculates parity in NVRAM and incurs 3 back-end IOs for 1 front-end write, though these numbers are impacted as the file system (WAFL) becomes full and has less contiguous space to write to (again, please correct me if I'm wrong, this information came to me through a conversation I had with a NetApp customer at a VMUG who was underutilising his FAS for performance reasons, I guess you could consider this a form of short-stroking, which all storage vendors implement under certain circumstances, some before the sale as planned architectures and some after the sale as unplanned architectures). Let's assume I'm wrong and FAS/V-Filers are able to perform the same regardless of how full they are and go with 0.5 x 760 x 3 = 1,140 IOs. Now let's take the totals:

Front end IOs = 760 IOs

Read Hit (back end) = 0 IOs
Random Read (back end) = 114 IOs
Random Write (back end) = 1,140 IOs

TOTAL (back end) = 1,254 IOs

If we used 450GB 15K RPM drives we'd be servicing about 180-200 IOs per drive before response time became bad, which means 7 drives to service the workload. 7 drives is 3,150 GB raw. As I understand it (please correct me if I'm wrong), NetApp FAS environments should deduct about 30% from raw capacity for aggregate creation, file system formatting, snapshot space, etc., which leaves us with 2,205 GB usable. Say we consume 100 GB - what would we do with the rest of the space if we couldn't serve any more IOs with the drives? We'd either have empty space that we couldn't use, or full space that we couldn't service.

So it seems like I've come to a point in the road that I need to ask you to provide some clarity and confirm or correct my assumptions. Even if you reduced the back end IOs for writes down to an average of 1.5 IOs per front end IO, you would still need to serve 684 IOs on the drives which would equate to 4 drives, or 1,800 GB raw (~1,260 usable) which doesn't make sense to me. This makes things especially unclear when considering that delivering a consistent level of service in VDI environments generally means handling heavy bursts of IOs (acknowledged reads are less of an issue than writes, but using the above configuration, a burst of writes would result in sub-optimal response times to virtual desktop applications).
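To make my arithmetic easy to follow (and to tweak), here it is as a short script; the write penalty, per-drive IOPS, and ~30% overhead are my assumptions as stated above, and exactly the points I'm asking you to confirm or correct.

# Travers' sizing arithmetic as a script. All assumptions (write penalty,
# per-drive IOPS, ~30% overhead) are the ones stated above and are exactly
# what the replies below take issue with.
import math

vms = 100
iops_per_vm = 7.6
front_end_iops = vms * iops_per_vm              # 760

read_hit, random_read, random_write = 0.35, 0.15, 0.50

def backend_iops(write_penalty):
    reads  = front_end_iops * random_read                   # served from disk
    writes = front_end_iops * random_write * write_penalty  # destaged writes
    return reads + writes     # read hits assumed to cost 0 back-end IOs

def drives_needed(total_iops, iops_per_drive=180):
    return math.ceil(total_iops / iops_per_drive)

for penalty in (3.0, 1.5):
    total = backend_iops(penalty)
    n = drives_needed(total)
    raw_gb = n * 450
    usable_gb = raw_gb * 0.70                   # assumed ~30% overhead
    print(f"write penalty {penalty}: {total:.0f} back-end IOPS, "
          f"{n} drives, {raw_gb} GB raw, {usable_gb:.0f} GB usable")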

My conclusion (though perhaps premature?) is that the technology works, but the results will vary based on the customer's environment. NetApp and EMC both have technologies that can help customers reduce their infrastructure spend, but we must pay close attention to the customer's requirements and the information we are providing pre-sale if we are to ensure satisfaction and healthy ongoing customer relations.

I wondered if you had forgotten about the All Blacks/Pistons swap! I certainly remember, how about I bring one for you when I see you at VMworld? I might even be able to get a World Cup 2011 jersey by then. ;-)

All the best mate,

Travers (@traversn)

vsr

@Travers said "Why not leave the virtual desktops on, or if DPM is in play, suspend them."

I wanted to point out something on this: Boot storms are not the only I/O storm that VDI environments can experience.

For example, there's login storms (i.e. the images are left powered on, or suspended as you suggest, but concurrent system logon processes can easily generate an I/O peak way above the average usage level etc.)

Sure, the I/O generated by logon is not as much as boot, but if you go beyond the number of IOPS the storage system is sized to provide, then it doesn't really matter if you're over it by 1,000 IOPS or 100,000 IOPS.

I do agree that image management is a very important consideration, but the first and biggest challenge is I/O storms.
Haven't we all heard the horror stories of VDI environments which have been crippled by I/O storm issues, employees being sent home because they couldn't log on and do any work, etc.?
If you don't overcome the I/O storm hurdle first, then image management seems a bit like making beds on the Titanic.

By the way, do EMC have a solution for dealing with I/O storms (honest question)?

I only ask because, while I've seen lots of comments from EMC'ers on this & other blogs questioning NetApp's PAM module etc., I'm not sure I've actually seen EMC's answer to the same problem.
Is questioning NetApp's solution a substitute for not having a solution yourselves?

Scott Werntz

@Travers

You said "(after all, making 1,000 users more productive every day of the week will generally be more valuable to the business than ensuring the system can handle 1,000 reboots once per year). "

What OS (ie. Windows) only needs a reboot once a year?

Travers Nicholas

@vsr,

I understand where you're coming from here and agree, boot-up is not the only kind of I/O storm that could possibly occur. Are there any particular I/O storms you're talking about? Antivirus scans? Antivirus definitions rollout? Software patching rollout? It seems we're back to where my point started, asking for a solution to a problem that hasn't been defined. Is it user initiated or system initiated? Is it reads or writes? Is it low probability high impact or high probability low impact? Depending on the answers here, we might have a technical issue, an architectural issue, or a business issue (e.g. accept the risk or spend the $$$).

If we deployed EMC's user folder solution as well as VMware's linked clone solution we would be serving user data from EMC Celerra NAS (CIFS) (with deduplication, antivirus capability, etc), while non-user data will be served from VMFS or NFS datastores which can be deployed to enterprise flash drives. The combination of load distribution, built-in array cache, and high speed low latency drives provides significant protection from all sorts of I/O storms. Adding fully automated storage tiering (FAST) and PowerPath/VE to the mix is icing on the cake.

Again, to add to my previous "conclusion" I'm not trying to say that NetApp can not deliver what they promise, and I'm certainly not implying anything negative about NetApp's products, I'm simply stating that I believe that in many areas of our industry the realised value post-sale is not equal to the perceived value pre-sale, in part as a result of information asymmetry between vendor and customer.

@Scott Werntz,

My apologies, thanks for picking it up, what I meant to type was this:

(after all, making 1,000 users more productive every day of the week will generally be more valuable to the business than ensuring the system can handle 1,000 simultaneous reboots once per year)

Thanks,

Travers (@traversn)

Vaughn Stewart



@Travers - I'm not so sure you'd still want a Pistons jersey with the way this season is going. Shoot me your size, preferred player number, and address at [email protected] and I'll get this out to you.

I think you raised an excellent point in your comments; way too often vendors look at tackling issues solely based on what they sell. It's true, and it happens way too often in our industry (aka the IT industry). With that said, I don't believe this is the case when considering the unique storage challenges of virtual desktops.

Consider the suggestion to leave the VMs running... We find that ~85% of the users log into their desktops during a 90-minute window each morning. This phenomenon is fairly well known and is referred to as a login storm.

Consider the suggestion to suspend the VMs... The VMware suspension process requires the contents of the VM's memory to be written out to disk, and the resume process requires this content to be read in order to restore the VM. The amount of memory assigned to a desktop VM, commonly between 512MB and 1GB, is more data than what is read to boot a VM. In addition, this data has much less commonality than the OS and application binaries and as such is less likely to be able to serve multiple requests from array cache.
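To put rough numbers on this (illustrative figures of my own, using the 512MB - 1GB range above and the login window just mentioned):

# Rough sizing of a resume storm, using the 512 MB - 1 GB per-VM memory range
# and the ~85% / 90-minute login window above. Figures are illustrative only.

desktops = 1000
avg_vm_memory_gb = 0.75           # midway between 512 MB and 1 GB
login_window_min = 90             # ~85% of users in a 90-minute window
concurrency = 0.85

data_read_gb = desktops * concurrency * avg_vm_memory_gb       # ~640 GB
window_sec = login_window_min * 60
avg_throughput_mbs = data_read_gb * 1024 / window_sec          # sustained read rate

print(f"~{data_read_gb:.0f} GB of largely non-deduplicated memory images read back in, "
      f"~{avg_throughput_mbs:.0f} MB/s sustained over the window")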

I/O storms happen often and in many situations. One can either deploy gobs of storage array hardware (i.e. CLARiiON, Celerra, Symmetrix, and other traditional array architectures), expensive storage array hardware (i.e. EFDs at current prices with highly redundant RAID protection), or intelligent array technologies on commodity hardware, like PAM and RAID-DP on FC drives.

I do want to thank you for your comments, as they help me understand that we need to share more knowledge on this topic.

Adriaan

@Travers,
Vaughn needs to direct you to a WAFL specialist, because your analysis of:

Front end IOs = 760 IOs

Read Hit (back end) = 0 IOs
Random Read (back end) = 114 IOs
Random Write (back end) = 1,140 IOs

TOTAL (back end) = 1,254 IOs

is EXACTLY where NetApp does not do what you expect - fundamentally the elephant in the room. Those 1,140 random write IOs are dealt with by the redirect-on-write technique, turning them into a much smaller number of large sequential IOs written in stripes with minimal head movement and thus extra-low latency - the whole purpose of NVRAM.

This dramatically changes the wall-clock time required to complete them (latency), leaving significant capacity for dealing with the few read IOs. Hence the ability to use significantly fewer disks.
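To illustrate with made-up stripe geometry (a hypothetical 14 data + 2 parity group with 4 KB blocks - my example, not a NetApp spec): the 380 front-end random writes per second (0.5 x 760) coalesce into a handful of large sequential stripe writes, rather than 1,140 random back-end IOs.

# Illustration of the redirect-on-write point. Stripe geometry is a made-up
# example (14 data + 2 parity, 4 KB blocks), not a NetApp spec.

front_end_iops = 760
random_write_fraction = 0.50
write_ios_per_sec = front_end_iops * random_write_fraction      # 380 x 4 KB writes

data_blocks_per_stripe = 14        # hypothetical RAID-DP group: 14 data + 2 parity

# Instead of updating blocks in place (read-modify-write penalties), the log is
# drained as full stripes: each stripe is one large sequential write.
stripe_writes_per_sec = write_ios_per_sec / data_blocks_per_stripe

print(f"{write_ios_per_sec:.0f} random client writes/s coalesce into "
      f"~{stripe_writes_per_sec:.0f} full-stripe sequential writes/s")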

On the boot storm issue - which one would you buy: the one that can weather it, or the one with instructions on how you may avoid it?

