No snowflakes allowed

"Snowflake" is a term I learned from my colleague Jeff Roberts. It is used in the Chef community (maybe in the configuration management community at large as well) to designate a server/node that is 'unique', i.e. not in configuration management control. In a Chef environment, it means that the node in question was never added to Chef and never had chef-client run on it.

We've all been in situations where it seems overkill to go through the effort of automating the setup of a server. Maybe the server has a unique purpose within our infrastructure. Maybe we didn't feel like spending the time to create Chef recipes for that server. Whatever the reasoning, it seemed low-risk at the time.

Well, I am here to tell you there is danger in this way of thinking. Example: we deployed a server in EC2 manually. We installed the Sensu client on it manually and pointed it at our Sensu server. Everything seemed fine. Then one day we updated our Sensu configuration (via Chef) both on the Sensu server and on all the Sensu clients. Of course, the Sensu configuration on our snowflake server never got updated, since chef-client wasn't running on that server. As a result, the Sensu client stopped checking in properly with the Sensu server, and as far as our monitoring system was concerned, the snowflake looked as if it were falling off the map. We had to manually update Sensu on the snowflake to bring it in sync with our configuration changes.

Basically, the result of having snowflake servers is that they do fall off the map as far as the overall automation of your infrastructure is concerned. They suffer bitrot, and you end up spending lots of time on their care and feeding, which wipes out whatever time you saved by not automating them in the first place.

That said, it's hard to be disciplined enough to run chef-client periodically on every single server in your infrastructure. I had never managed to do that before, but we are doing it now, mostly because of Jeff's insistence. I do see the advantages of this discipline, and I recommend it to everybody.
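
One way to keep yourself honest about this is to compare the list of servers you actually have against the list of servers that have recently run chef-client. Here is a minimal Python sketch of that idea, not something we run in production. The function name find_snowflakes, its parameters, the host names, and the two-hour threshold are all hypothetical. It assumes you already have a list of hostnames (from an EC2 listing, for example) and a mapping of hostname to last successful chef-client run, however you choose to collect them.

```python
from datetime import datetime, timedelta, timezone


def find_snowflakes(inventory, last_checkin, now, max_age=timedelta(hours=2)):
    """Return (unmanaged, stale) hostnames, each as a sorted list.

    inventory:    iterable of hostnames that exist in your infrastructure
    last_checkin: dict mapping hostname -> datetime of the last chef-client run
    now:          the reference time to measure staleness against
    max_age:      how old a check-in may be before the node counts as stale
                  (the two-hour default is an arbitrary assumption)
    """
    hosts = list(inventory)
    # Never seen by configuration management at all: true snowflakes.
    unmanaged = sorted(h for h in hosts if h not in last_checkin)
    # Known to configuration management, but chef-client has not run lately.
    stale = sorted(
        h for h in hosts
        if h in last_checkin and now - last_checkin[h] > max_age
    )
    return unmanaged, stale


# Made-up data for illustration only.
now = datetime(2020, 1, 1, 12, 0, tzinfo=timezone.utc)
inventory = ["web1", "db1", "legacy1"]
last_checkin = {
    "web1": now - timedelta(minutes=30),  # checked in recently
    "db1": now - timedelta(hours=5),      # has not run chef-client in a while
}

unmanaged, stale = find_snowflakes(inventory, last_checkin, now)
print(unmanaged, stale)  # prints: ['legacy1'] ['db1']
```

In the Sensu story above, the manually deployed EC2 server would have shown up in the first list, and we would have known about it before our monitoring lost track of it.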

