Thursday, July 02, 2009

Dark launching and other lessons from Facebook on massive deployments

I came across this note from the Engineering team at Facebook which talks about how they managed to smoothly launch their recent 'pick a username' feature. The title of the note is, appropriately enough, 'Hammering usernames' -- this is of course because they were expecting their infrastructure to be hammered.

In the note I saw for the first time a name for a strategy that teams I've been involved with have applied before: 'dark launching'. Essentially, dark launching means releasing a new feature to a subset of your users, mostly with no UI changes, while still exercising all the parts of your infrastructure involved in serving that feature. It's a good strategy to apply when you're dealing with large-scale deployments and you want to see how your infrastructure behaves in conditions that are as close to production as possible. Because remember, there's nothing like production! Your careful load/stress testing exercises in a lab environment ain't gonna cut it.
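To make the idea concrete, here is a minimal sketch in Python of how a dark launch gate might look. This is my own illustration, not Facebook's code: the function names (`in_dark_launch`, `handle_request`), the feature key `"usernames"`, and the hashing scheme are all assumptions. The point is that users are assigned to the cohort deterministically, the new backend path is exercised for them, and what they see does not change.

```python
import hashlib
import logging

log = logging.getLogger("dark_launch")


def in_dark_launch(user_id: str, feature: str, percent: float) -> bool:
    """Return True if this user falls inside the dark-launch cohort.

    The user is hashed into one of 10,000 buckets, so the same user always
    gets the same answer, and raising `percent` only ever adds users.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000
    return bucket < percent * 100


def handle_request(user_id: str, percent: float, new_backend_call) -> str:
    """Serve the existing page; silently exercise the new code path for the cohort."""
    if in_dark_launch(user_id, "usernames", percent):
        try:
            # Hypothetical call into the new feature's backend.
            # The result is discarded: nothing changes in the UI.
            new_backend_call(user_id)
        except Exception:
            # A failure in the dark path must never break the real page.
            log.exception("dark launch call failed for user %s", user_id)
    return "existing page"
```

Starting with a small `percent` and raising it over time gives you a gradual ramp-up, while the logs and your monitoring tell you how the new path behaves under real traffic.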

The note from Facebook has all sorts of other nuggets of wisdom related to massive infrastructure deployments. I recommend subscribing to the RSS feed for 'Engineering @ Facebook's Notes'.

While googling for 'dark launching', I also came across this very good post by Dare Obasanjo. Recommended reading.


Unknown said...

They also discuss it in this great presentation: ... among other nice ideas about large-scale deployments to production.

Anonymous said...

That's an interesting enough talk that Evgeny linked to, but John Allspaw sure swears a lot during it.

Inder P Singh said...

You are right, there is nothing like production unless someone has gone through the pains of replicating an entire production environment (which you may not see in the case of massive deployments). Dark launching is adding another step in the deployment process. It is done before the new feature is made available to a group of users. You can gradually increase the number of users in the group.

Since the feature is not visible to the users in the case of a dark launch, you will not see any complaints logged by users. The onus is on the production support team to monitor the system and review the logs on an ongoing basis to discover any problems that their infrastructure is facing.

Inder P Singh

Paul Hildebrandt said...

Nice, I got to use "dark launching" in a sentence today and everyone thought it was cool. It's also how I am rolling out our new message broker.

Peter Gfader said...

Interesting that others (e.g. Paul M Duvall in his DZone Refcard) refer to dark launching as:
Launch new features when it affects the least amount of users.

Link to Paul's Refcard
