Thursday, June 16, 2016

Running Jenkins jobs in Docker containers

One of my main tasks at work is to configure Jenkins to act as a hub for all the deployment and automated testing jobs we run. We use CloudBees Jenkins Enterprise, mostly for its Role-Based Access Control plugin, which allows us to create one Jenkins folder per project/application and establish fine grained access control to that folder for groups of users. We also make heavy use of the Jenkins Enterprise Pipeline features (which I think are also available these days in the open source version).

Our Jenkins infrastructure is composed of a master node and several executor nodes which can run jobs in parallel if needed.

One pattern that my colleague Will Wright and I have decided upon is to run all Jenkins jobs as Docker containers. This way, we only need to install Docker Engine on the master node and the executor nodes. No need to install any project-specific pre-requisites or dependencies on every Jenkins node. All of these dependencies and pre-reqs are instead packaged in the Docker containers. It's a simple but powerful idea, that has worked very well for us. One of the nice things about this pattern is that you can keep adding various types of automated tests. If it can run from the command line, then it can run in a Docker container, which means you can run it from Jenkins!

I have seen this pattern discussed in multiple places recently, for example in this blog post about "Using Docker for a more flexible Jenkins".

Here are some examples of Jenkins jobs that we create for a given project/application:
  • a deployment job that runs Capistrano in its own Docker container, against targets in various environments (development, staging, production); this is a Pipeline script written in Groovy, which can call other jobs below
  • a Web UI testing job that runs the Selenium Python WebDriver and drives Firefox in headless mode (see my previous post on how to do this with Docker)
  • a JavaScript syntax checking job that runs JSHint against the application's JS files
  • an SSL scanner/checker that runs SSLyze against the application endpoints
We also run other types of tasks, such as running an AWS CLI command to perform certain actions, for example to invalidate a CloudFront resource. I am going to show here how we create a Docker image for one of these jobs, how we test it locally, and how we then integrate it in Jenkins.

I'll use as an example a simple Docker image that installs the AWS CLI package and runs a command when the container is invoked via 'docker run'.

I assume you have a local version of Docker installed. If you are on a Mac, you can use Docker Toolbox, or, if you are lucky and got access to it, you can use the native Docker for Mac. In any case,  I will assume that you have a local directory called awscli with the following Dockerfile in it:

FROM ubuntu:14.04

MAINTAINER You Yourself <you@example.com>

# disable interactive functions
ARG DEBIAN_FRONTEND=noninteractive
ENV AWS_ACCESS_KEY_ID=""
ENV AWS_SECRET_ACCESS_KEY=""
ENV AWS_COMMAND=""

RUN apt-get update && \
    apt-get install -y python-pip && \
    pip install awscli

WORKDIR /root

CMD (export AWS_ACCESS_KEY_ID=$AWS_ACCESS_KEY_ID; export AWS_SECRET_ACCESS_KEY=$AWS_SECRET_ACCESS_KEY; $AWS_COMMAND)

As I mentioned, this simply installs the awscli Python package via pip, then runs a command given as an environment variable when you invoke 'docker run'. It also uses two other environment variables that contain the AWS access key ID and secret access key. You don't want to hardcode these secrets in the Dockerfile and have them end up on GitHub.

The next step is to build an image based on this Dockerfile. I'll call the image awscli and I'll tag it as local:

$ docker build -t awscli:local .

Then you can run a container based on this image. The command line looks a bit complicated because I am passing (via the -e switch) the 3 environment variables discussed above:


$ docker run --rm -e AWS_ACCESS_KEY_ID=YOUR_ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=YOUR_SECRET_ACCESS_KEY -e AWS_COMMAND='aws cloudfront create-invalidation --distribution-id=abcdef --invalidation-batch Paths={Quantity=1,Items=[/media/*]},CallerReference=my-invalidation-123456' awscli:local

(where distribution-id needs to be the actual ID of your CloudFront distribution, and CallerReference needs to be unique per invalidation)

If all goes well, you should see the output of the 'aws cloudfront create-invalidation' command.

In our infrastructure, we have a special GitHub repository where we check in the various folders containing the Dockerfiles and any static files that need to be copied over to the Docker images. When we push the awscli directory to GitHub for example, we have a Jenkins job that will be notified of that commit and that will build the Docker image (similarly to how we did it locally with 'docker build'), then it will 'docker push' the image to a private AWS ECR repository we have.

Now let's assume we want to create a Jenkins job that will run this image as a container. First we define 2 secret credentials, specific to the Jenkins folder where we want to create the job (there are also global Jenkins credentials that can apply to all folders). These credentials are of type "Secret text" and contain the AWS access key ID and the AWS secret access key.

Then we create a new Jenkins job of type "Freestyle project" and call it cloudfront.invalidate. The build for this job is parameterized and contains 2 parameters: CF_ENVIRONMENT which is a drop-down containing the values "Staging" and "Production" referring to the CloudFront distribution we want to invalidate; and CF_RESOURCE, which is a text variable that needs to be set to the resource that needs to be invalidated (e.g. /media/*).

In the Build Environment section of the Jenkins job, we check "Use secret text(s) or file(s)" and add 2 Bindings, one for the first secret text credential containing the AWS access key ID, which we save in a variable called AWS_ACCESS_KEY_ID, and the other one for the second secret text credential containing the AWS secret access key, which we save in a variable called AWS_SECRET_ACCESS_KEY.

The Build section for this Jenkins job has a step of type "Execute shell" which uses the parameters and variables defined above and invokes 'docker run' using the path to the Docker image from our private ECR repository:

DISTRIBUTION_ID=MY_CF_DISTRIBUTION_ID_FOR_STAGING
if [ $CF_ENVIRONMENT == "PRODUCTION" ]; then
    DISTRIBUTION_ID=MY_CF_DISTRIBUTION_ID_FOR_PRODUCTION
fi

INVALIDATION_ID=jenkins-invalidation-`date +%Y%m%d%H%M%S`

COMMAND="aws cloudfront create-invalidation --distribution-id=$DISTRIBUTION_ID --invalidation-batch Paths={Quantity=1,Items=[$CF_RESOURCE]},CallerReference=$INVALIDATION_ID"

docker run --rm -e AWS_ACCESS_KEY_ID=$ACCESS_KEY_ID -e AWS_SECRET_ACCESS_KEY=$SECRET_ACCESS_KEY -e AWS_COMMAND="$COMMAND" MY_PRIVATE_ECR_ID.dkr.ecr.us-west-2.amazonaws.com/awscli


When this job is run, the Docker image gets pulled down from AWS ECR, then a container based on the image is run and then removed upon completion (that's what --rm does, so that no old containers are left around).

I'll write another post soon with some more examples of Jenkins jobs that we run as Docker containers to do Selenium test, JSHint testing and SSLyze scanning.






2 comments:

Leighski said...

Hi there. thanks for this article.
Does the log output from the job running inside docker get passed back to jenkins to be stored there in the job execution history?
Can you see job progress from within jenkins?
How easy is it to fail the job in jenkins if the execution fails?
thanks

Unknown said...

Good article, are you using any orchestrator for Docker - jenkins job?

looking forward to the selenium post

Modifying EC2 security groups via AWS Lambda functions

One task that comes up again and again is adding, removing or updating source CIDR blocks in various security groups in an EC2 infrastructur...