Wednesday 10 December 2008

VMware View - Linked Clones Not A Panacea for VDI Storage Pain!

Why do I always seem to be the bad guy? I know the other guys don't gloss over the limitations or realities of product features on purpose, but sheesh, it always seems to be me cutting a swathe of destruction through the marketing hype. Somehow I don't think I will be in contention for VVP status anytime soon (I'm sure the profanity doesn't help either but... fuck it. Hey, it's not like kids read this, and I'm certainly not out there doing this kind of thing).

As any reader of this blog will be aware, VMware View was launched a week or so ago. Amongst its many touted features was the oft-requested "linked clone" functionality, designed to reduce the storage overheads associated with VDI. But it may not be the panacea it's being made out to be.

My 2 main concerns are:

1) Snapshots can grow up to the same size as the source disk. Although this situation will probably not be too common, you can bet your life that they will grow to the size of the free space in the base image in a matter of weeks. I've spent an _extensive_ amount of time testing this stuff out to try and battle the VDI storage cost problem. But no matter how much you tune the guest OS, there's just no way to overcome the fact that the NTFS filesystem will _always_ write to available zeroed blocks before it writes to blocks containing deleted files. This means if you have 10GB free space, and you create and delete a 1GB file 10 times, you will end up with no net change in used storage within the guest, but your snapshot will now be 10GB in size. Don't believe me? Try it yourself. Now users creating and deleting large files in this manner may again be uncommon, but temporary internet files, software distribution / patching application caches, patch uninstall directories, AV caches and definition files, temporary office files... these things all add up. Fast. Now of course each environment will differ in this regard, so the best thing you can do to get an idea of the storage savings you can expect is to take a fresh VDI machine, snapshot it, and then use it normally for a month. Or if you think you're not the average user, pick a newly provisioned user VDI machine and do the test. Monitor the growth of that snap file weekly. I think you'll be surprised at what you find. Linked clones are just snapshots at the end of the day. They don't do anything tricky with deduplication, they don't do anything tricky in the guest filesystem, they are just snapshots. Which leads me to my next major concern.

2) LUN locking. We all know that a lock is acquired on a VMFS volume whenever volume metadata is updated. Metadata updates occur every time a snapshot file grows, and at the moment the growth increment is hardcoded to 16MB. For this reason, the recommendation from VMware has always been to minimise the number of snapshotted machines on a single LUN. Something like 8 per LUN I think was the ballpark maximum. Now if the whole point of linked clones is to reduce storage, it's fair to assume you'd be planning to increase the number of VMs per LUN, not decrease them. In which case, Houston, we may have a problem. Perhaps linked clones increment in blocks of greater than 16MB, which may go some way towards solving the problem. But at this time I don't know if that's the case or not. Someone feel free to check this out and let me know (Rod, I'm looking at you :-)
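To put rough numbers on both concerns, here is a minimal back-of-the-envelope sketch in Python. It is a toy model, not a measurement: it simply takes the two claims above at face value (new writes always land on never-written free blocks first, and the snapshot file grows in 16MB steps with one metadata update per step). The function names are made up for illustration.

```python
GROW_STEP_MB = 16  # growth increment quoted above; an assumption for linked clones


def simulate_delta_growth(free_mb, file_mb, cycles):
    """Toy model of snapshot growth under repeated create/delete cycles.

    Assumes the worst case described in the post: every new file is written
    to free blocks that have never been written since the snapshot was taken,
    for as long as any such blocks remain.
    """
    fresh_mb = free_mb  # free space not yet written since the snapshot
    delta_mb = 0        # current size of the snapshot delta
    for _ in range(cycles):
        newly_touched = min(file_mb, fresh_mb)
        fresh_mb -= newly_touched
        delta_mb += newly_touched
        # Deleting the file frees space in the guest, but the delta never shrinks.
    return delta_mb


def metadata_updates(delta_mb, step_mb=GROW_STEP_MB):
    """Number of grow steps (one metadata update each) to reach delta_mb."""
    return -(-delta_mb // step_mb)  # ceiling division


delta = simulate_delta_growth(free_mb=10 * 1024, file_mb=1024, cycles=10)
print(delta)                    # 10240 (MB, i.e. the full 10GB of free space)
print(metadata_updates(delta))  # 640 grow steps for that one VM
```

Multiply that 640 by however many clones you intend to put on one LUN and you get a feel for how many lock acquisitions the model implies. Whether that actually hurts in practice is exactly the kind of thing that needs real testing.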

Now of course there are many other design considerations, such as having a single small LUN for the master image and partitioning your array cache so that LUN is _always_ cached. It's a snapshotted vmdk and therefore read only, so this won't be a problem. Unless you don't have enough array cache to do this (which will likely be the case outside of large enterprises, or even within them in some cases).

In my mind, the real panacea for the storage problem is statelessness. User environment virtualisation / streaming and app virtualisation / streaming are going to make the biggest dent in the storage footprint, while at the same time allowing us to use much cheaper storage. When the machine state is empty until a user logs on, it doesn't matter so much if the storage containing 50 base VM images disappears (assuming you catered for this scenario, which you did because you know that cheaper disk normally means increased failure rate).

So by all means, check View out. If high latency isn't a problem for you, there's little reason to choose another broker (aside from cost of course). But don't offset the cost of the broker with promises of massive storage cost reduction without trying out my snap experiment in your environment first, or you may get burnt.
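If you want to run the snap experiment without eyeballing file sizes by hand, something like the following Python sketch would do the logging. It assumes the VM's directory is reachable as an ordinary filesystem path from wherever you run it, and that the snapshot files match a pattern like `*-delta.vmdk`. Both are assumptions, so check what your delta files are actually called and adjust the pattern. The function names and the CSV layout are just for illustration.

```python
import csv
import time
from pathlib import Path


def delta_sizes(vm_dir, pattern="*-delta.vmdk"):
    """Return {filename: size_in_bytes} for files in vm_dir matching pattern.

    The default pattern is an assumption about snapshot file naming.
    """
    return {
        p.name: p.stat().st_size
        for p in sorted(Path(vm_dir).glob(pattern))
        if p.is_file()
    }


def append_sample(vm_dir, log_path, pattern="*-delta.vmdk", now=None):
    """Append one timestamped CSV row per matching file and return the sizes."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    sizes = delta_sizes(vm_dir, pattern)
    with open(log_path, "a", newline="") as fh:
        writer = csv.writer(fh)
        for name, size in sizes.items():
            writer.writerow([stamp, name, size])
    return sizes
```

Call `append_sample("<path to the VM's directory>", "delta_growth.csv")` once a week from whatever scheduler you like, then graph the CSV after a month.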

UPDATE Rod has a great post detailing his thoughts on this. Go read it!

UPDATE 2 Another excellent post on this topic from the man himself, Chad Sakac. Chad removes any fears regarding LUN locking, which IMHO is only possible with empirical testing, which is exactly what Chad and his team have done. Excellent work, and thank you. With regards to my other concern of snapshot growth and the reality of regular snapshot reversion, he also clarifies the absolutely vital point in all of this, which I didn't do a very good job of in my initial post: every environment will differ. Although I still believe the place I work is fairly typical of any large enterprise at this point in time, the absolute best thing for you to do in your shop is test it and see what the business are willing to accept in terms of rebuilds or investment in other technologies such as app virtualisation and profile management. They may or may not offset any storage cost reductions. Again, every place will differ.