Sunday 4 January 2009

The Rise of the Stupid Endpoint

Welcome to the first SAAAP of 2009. And boy am I feeling philosophical today :-). The similarities between this post and David Isenberg's article Rise of the Stupid Network end with the title, but if you haven't read Isenberg's article you should do so - the principles he outlined over 10 years ago are just as applicable today to network infrastructure.

The Failure of Distributed Computing
The proliferation of distributed computing in the datacenter was essentially driven by the low cost of x86 hardware. Rather than spending huge sums on mainframe or proprietary Unix infrastructure and having it sit there underutilised for a long period (hmmm, sound familiar?) or reserving it for non-critical applications, why not buy smaller, cheaper x86 servers instead? The management of these many small systems was never really factored into the equation, however, and before too long there were more applications for managing this infrastructure than you could poke a stick at. But for a long time, those management systems were generally lacking in intelligence. And because of that, more staff were needed to manage the management systems, respond to alerts, and so on. x86 didn't turn out to be as cheap as promised - not because of the hardware, but because of the lack of intelligent management systems.

What We Learned from Management Systems
The distributed computing management systems taught a new generation of system administrators (myself included) what had existed in mainframe infrastructure for decades - the advantages of centralisation. Monitoring baselines, patch and software distribution, backup, job scheduling... these things have always been centralised in enterprise distributed computing environments. Such environments would be extremely expensive to operate otherwise. Yet for some reason this train of thought never really made it all the way to the personality of the endpoint. Because of this, system-level backup and restore is of paramount importance in today's datacenter. DR and lifecycle management are painfully manual, labour-intensive processes. Administrative overhead is exacerbated.

Endpoint Stupidity, Centralised Intelligence
You cannot make something stupid without simultaneously making something else intelligent. And that intelligence needs to be centralised. A single endpoint has many touch points in the enterprise. Are all these touch points still required? Is the cost of removing them greater than the cost of living with them? Is the pain simply due to a lack of intelligence in the tools, or in the processes that you follow? Orchestration tools may be the solution here, but do not fall into the trap of merely alleviating the pain rather than addressing the underlying cause.

Infrastructure Like Water
"The height of cultivation is really nothing special. It is merely simplicity; the ability to express the utmost with the minimum." - Bruce Lee
That's what it all boils down to. In order for distributed computing infrastructure to work, we need to simplify the endpoint. When the endpoint is simple, it can be formless like a cloud (sorry, I couldn't resist), empty like a cup. You cannot have elasticity in the datacenter without this simplification. A piece of hardware may be running ESX today, Windows tomorrow, and be shut down the next day. In order to do that, the personality of the hardware cannot exist on the hardware itself. Likewise with a workload. It may ask for one level of resources today and another tomorrow; it doesn't matter. It may be running in one location today and another tomorrow; it shouldn't matter. Virtualisation has obviously made this much easier than it would have been otherwise, and I'm taking virtualisation as a whole here - not just hypervisors, but VLANs, VSANs, VIPs, DDNS and so on. But the same principles need to apply to both physical and virtual infrastructure.
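To make the idea concrete, here's a minimal sketch (all names and MAC addresses are hypothetical, not from any real product) of what "personality off the hardware" looks like: a central store decides what each node becomes at boot, and the node itself knows nothing until it asks.

```python
# Hypothetical sketch: the endpoint's personality lives in a central
# store, not on the hardware. At boot, a node identifies itself (here
# by MAC address) and is told what to become today.

PERSONALITIES = {
    # MAC address         -> role the central brain has assigned
    "00:1a:2b:3c:4d:5e": {"image": "esx-3.5",    "state": "running"},
    "00:1a:2b:3c:4d:5f": {"image": "win2k3-web", "state": "running"},
    "00:1a:2b:3c:4d:60": {"image": None,         "state": "powered-off"},
}

def boot_target(mac):
    """Decide what a node should boot, purely from central state."""
    node = PERSONALITIES.get(mac)
    if node is None or node["state"] == "powered-off":
        return "local-shutdown"   # nothing assigned: stay dark
    return node["image"]          # netboot the assigned image

print(boot_target("00:1a:2b:3c:4d:5e"))  # esx-3.5
```

Repurposing a box is then a one-line change in the central map - the hardware itself never accumulates state that has to be backed up, restored or migrated.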

End Game
So that's the what; it's up to people like you and me to figure out the how. With problems like these to solve, how can 2009 not be a good year? :-)