Saturday, December 22, 2007

The power of checklists (especially when automated)

Just stumbled on this post at InfoQ on the power of checklists. It talks about a low-tech approach to improving care in hospitals, by writing down the steps needed in various medical procedures and putting together a checklist for each case. I've seen the power of this approach at my own company -- until we put together checklists with things we have to do when setting up various servers or applications, we were guaranteed to skip one or more small but important steps.

I'd like to take this approach up a notch though: if you're in the software business, you actually need to AUTOMATE your checklists. Otherwise it's still very easy for a human being to skip a step. Scripts don't usually make that mistake. Yes, a human being still needs to run the script and to make intelligent decisions about the overall outcome of its execution. If you do take this approach, make sure your scripts also have checks and balances embedded in them -- also known as tests. For example, if your script retrieves a file over the network with wget, make sure the file actually gets on your local file system. A simple 'ls' of the file will convince you that the operation succeeded.
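
Here is a minimal sketch of such a self-checking script, written in Python rather than shell. The names fetch_file and run_checklist are hypothetical, and urllib from the standard library stands in for wget. The point is only that every step carries its own check, and the run stops at the first step that fails.

```python
import os
import shutil
import urllib.request


def fetch_file(url, dest_path):
    """Checklist step: download url to dest_path, then verify the result."""
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out)

    # The embedded check: the scripted equivalent of running 'ls' on the file.
    if not os.path.isfile(dest_path):
        raise RuntimeError(f"{dest_path} is missing after download")
    if os.path.getsize(dest_path) == 0:
        raise RuntimeError(f"{dest_path} was created but is empty")
    return dest_path


def run_checklist(steps):
    """Run (description, callable) pairs in order; stop at the first failure."""
    for number, (description, step) in enumerate(steps, start=1):
        try:
            step()
        except Exception as exc:
            print(f"[FAIL] step {number}: {description}: {exc}")
            return False
        print(f"[ OK ] step {number}: {description}")
    return True
```

A checklist is then just a list of described steps, for example run_checklist([("fetch installer", lambda: fetch_file(url, "/tmp/installer.tgz"))]), where url and the destination path are placeholders. A human still reads the OK/FAIL lines and decides what to do next.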

As somebody else once said, the goal here is to replace you (the sysadmin or the developer) with a small script. That will free you up to do more fun work.


Anonymous said...

Good practice to be followed in every organization


Anonymous said...

Checklists in software development and testing sound like a good idea.

A similar process I discovered recently, but in reverse, is this: I try to write down exactly what I am doing throughout the day in a journal. I find this helps organize my thoughts, keeps a good record of what I've done so I don't repeat the same thing, and quickly refreshes my memory about what's happened. Also - I have good notes to refer to a month from now when I want to recall why I came to whatever conclusion I came to.

This has been especially useful while debugging code and writing new code. Possibly it is a modified and less rigorous version of test-driven development. Anyway - it seems to help me.

Grig Gheorghiu said...

Hi, Denali

I agree with your approach, and in fact I use an internal blog instead of a journal. I have an earlier blog post on that:


Anonymous said...

Hi Grig,

Thanks for the link to your earlier post. Yes - a blog might be better and I'll give it a try. Until now, I was using a journal with the idea that it was more portable than a computer and I might need it somewhere outside of work and home. But so far this hasn't been true. I like the fact that a blog will truthfully keep track of dates - rather than a wiki which could contain time stamping typos.

Hopefully my co-workers and boss won't laugh and wonder why it took me five attempts to uncover the cause of a particular bug. The vulnerability of an internal blog covering one's thought process and work might be one reason why it was so hard to get others to adopt the practice, as described in a comment to your original post. Eh - oh well. The organizational gain seems greater than the cost of embarrassment. Another reason might be the sense that it is a time-suck. Surprisingly, taking notes doesn't seem to take much time at all, and the benefits become obvious quickly.

