To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.
Steve talked about how infrastructure architectures have changed over the years, from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure out a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running to its maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game with their S3 and EC2 offerings. Even though these services are not quite ready for prime time in terms of production deployments, Amazon will soon get there. As proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.
In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in this day and age:
- it is expensive to create, deploy and duplicate a new system, running a Linux image of your choice (see Instalinux as a counter-example)
- system failure is unusual and 100% availability can be achieved
- databases are the best form of storage
- you need physical access to the data center
- a single server farm needs to scale to infinity
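To put a rough number on the second assumption in the list above, here is a back-of-the-envelope sketch in Python. It is my own illustration, not something from Steve's slides: the 99.9% per-server availability figure, the function names, and the assumption that servers fail independently of each other are all made up for the example.

```python
def prob_all_up(per_server_availability: float, servers: int) -> float:
    """Chance that every server is up at the same moment.

    Assumes failures are independent, which is a simplification.
    """
    return per_server_availability ** servers


def expected_down(per_server_availability: float, servers: int) -> float:
    """Average number of servers that are down at any given moment."""
    return servers * (1 - per_server_availability)


# 0.999 is an assumed, illustrative per-server availability.
for n in (1, 10, 100, 1000):
    print(n, round(prob_all_up(0.999, n), 4), round(expected_down(0.999, n), 3))
```

Under these assumptions, the chance of finding the whole farm healthy shrinks quickly as the farm grows, so at a large enough scale some machine being down is the normal state of affairs rather than an exceptional event.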
I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.
Here are a couple more links that might prove useful:
- Anubis: a tuple-space implementation that uses multicast to share information between hosts within a site
- SmartFrog: a technology from HP used to distribute and manage applications (think Puppet, but geared towards application deployment); see also Google video
Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.