Saturday, February 26, 2011

AWS CloudFormation is a provisioning and not a config mgmt tool

There's a lot of buzz on Twitter on how the recently announced AWS CloudFormation service spells the death of configuration management tools such as Puppet/Chef/cfengine/bcfg2. I happen to think that the opposite is true.

CloudFormation is a great way to provision what it calls a 'stack' in your EC2 infrastructure. A stack comprises several AWS resources such as EC2 instances, EBS volumes, Elastic Load Balancers, Elastic IPs, RDS databases, etc. Note that it was always possible to do this via your own homegrown tools by calling in concert the various APIs offered by these services/resources. What CloudFormation brings to the table is an easy way to describe the relationships between these resources via a JSON file which they call a template.

Some people get tripped by the inclusion in the CloudFormation sample templates of applications such as WordPress, Joomla or Redmine -- they think that CloudFormation deals with application deployments and configuration management. If you look closely at one of these sample templates, let's say the Joomla one, you'll see that what happens is simply that a pre-baked AMI containing the Joomla installation is used when launching the EC2 instances included in the CloudFormation stack. Also, the UserData mechanism is used to pass certain values to the instance. They do add a nice feature here where you can reference attributes defined in other parts of the stack template, such as DB endpoint address in this example:

"UserData": {
          "Fn::Base64": {
            "Fn::Join": [
              ":",
              [
                {
                  "Ref": "JoomlaDBName"
                },
                {
                  "Ref": "JoomlaDBUser"
                },
                {
                  "Ref": "JoomlaDBPwd"
                },
                {
                  "Ref": "JoomlaDBPort"
                },
                {
                  "Fn::GetAtt": [
                    "JoomlaDB",
                    "Endpoint.Address"
                  ]
                },
                {
                  "Ref": "WebServerPort"
                },
                {
                  "Fn::GetAtt": [
                    "ElasticLoadBalancer",
                    "DNSName"
                  ]
                }
              ]
            ]
          }
        },

However, all this was also possible before CloudFormation. You were always able to bake your own AMI containing your own application, and use the UserData mechanism to run whatever you want at instance creation time. Nothing new here. This is NOT configuration management. This will NOT replace the need for a solid deployment and configuration management tool. Why? Because rolling your own AMI results in an opaque 'black box' deployment. You need to document and version your per-baked AMIs carefully, then develop a mechanism for associating an AMI ID with a list of packages installed on that AMI. If you think about it, you actually end up writing an asset management tool. Then if you need to deploy a new version of the application, you either bake a new AMI (painful), or you reach for a real deployment/config mgmt tool to do it.

The alternative, which I espouse, is to start with a bare-bone AMI (I use the official Ubuntu AMIs provided by Canonical) and employ the UserData mechanism to bootstrap the installation of a configuration management client such as chef-client or the Puppet client. The newly created instance then 'phones home' to your central configuration management server (Chef server or Puppetmaster for example) and finds out how to configure itself. The beauty of this approach is that the config mgmt server keeps track of the customizations made on the client. No need for you to document that separately -- just use the search functions provided by the config mgmt tool to find out which packages and applications have been installed on the client.

The barebone AMI + config mgmt mechanism does result in EC2 instances taking longer to get fully configured initially (as opposed to the pre-baked AMI technique), but the flexibility and control you gain over those instances is well worth it.

One other argument, that I almost don't need to make, is that the pre-baked AMI technique is very specific to EC2. You will have to reinvent the wheel if you want to deploy your infrastructure to a different cloud provider, or inside your private cloud or datacenter.

So.....do continue to hone your skills at learning how to fully utilize a good configuration management tool. It will serve you well, both in EC2 and in other environments.

3 comments:

Joe said...

I love reading what you post, but I think you are missing the potential here. It isnt going to kill chef/puppet/etc - it shifts when we do configuration - pushes it earlier, so tools like Chef, Puppet, et all needd to start thinking about generating AMI as an end result, not starting there. The benefit to a declarative setup is the set of lego blocks that are prebuilt that you can use. The appliance model... look to some of the tools that create these appliances - appliance factories: rPath, UshareSoft. Their UI is young, and hideously painful, but they make the tools that will want to use.

We want solutions, and the CloudFormation JSON structure provides a means to declaratively asert soution sets. Its the next level of commodization, and it will ultimately be the place that disassociates getting something done from the hardware and instances its run on.

In short, its the next level that will let us scale our solutions.

Grig Gheorghiu said...

Hi Joe -- thanks for your comment. Ideally something like OVF would work for what you're saying -- a common virtual machine format that you could create with config mgmt tools and that you could deploy in any cloud. We're not there yet, but I agree with you that it would be nice to have this sort of 'Lego set' interoperability.

Unknown said...

Hi,
Interesting article.

As a disclaimer I actually work at UShareSoft.

If you are interested in multi-tier VM deployments, then UShareSoft is also proposing a framework (Open Appliance Studio) that is allowing you to include the configuration logic into the template and a way to allow the end user to provide their configuration parameters (passwords etc). The configuration across VMs is done completely automatically. The framework can be used in any cloud also. Mixing this with OVF, Puppet or even CloudFormation can provide you interesting new ways to provision and configure complex multi-tier solutions.

Modifying EC2 security groups via AWS Lambda functions

One task that comes up again and again is adding, removing or updating source CIDR blocks in various security groups in an EC2 infrastructur...