Tuesday, February 28, 2006

Thoughts on giving a successful talk

I just came back from PyCon and while it's still fresh on my mind I want to jot down some of the thoughts I have regarding talks and storytelling.

Tell a story

People like stories. A good story transports you into a different world, if only for a short time, and teaches you important things even without appearing to do so. People's attention is captivated by good stories, and more importantly people tend to remember good stories.

The conclusion is simple: if you want your talk to be remembered, try to tell it as a story. This is hard work though, harder than just throwing bullet points at a slide, but it forces you as a presenter to get to the essence of what you're trying to convey to the audience.

A good story teaches a lesson, just like a good fable has a moral. In the case of a technical presentation, you need to talk about "lessons learned". To be even more effective, you need to talk about both positive and negative lessons learned, in other words about what worked and what didn't. Maybe not surprisingly, many people actually appreciate finding out more about what didn't work than about what did. So when you encounter failures during your software development and testing, write them down in a wiki -- they're all important, they're all lessons learned material. Don't be discouraged by them, but appreciate the lessons that they offer.

A good story provides context. It happens in a certain place, at a certain time. It doesn't happen in a void. Similarly, you need to anchor your talk in a certain context, you need to give people some points of reference so that they can see the big picture. Often, metaphors help (and metaphors are also important in XP practices: see this fascinating post by Kent Beck.)

For example, when talking about testing, Titus and I used a great metaphor that Jason Huggins came up with -- the testing pyramid. Here's a reproduction of the autographed copy of the testing pyramid drawn by Jason. Some day this will be worth millions :-)



Jason compares the different types of testing with the different layers of the food pyramid. For good software health, you need generous servings of unit tests, ample servings of functional/acceptance tests, and just a smattering of GUI tests (which is a bit ironic, since Jason is the author of one of the best Web GUI testing tools out there, Selenium).

However, note that for optimal results you do need ALL types of testing, just as a balanced diet contains all types of food. It's better to have composite test coverage by testing your application from multiple angles than by having close to 100% test coverage via unit testing only. I call this "holistic testing", and the results that Titus and I obtained by applying this strategy to the MailOnnaStick application suggest that it is indeed a very good approach.

Coming back to context, I am reminded of two mini-stories that illustrate the importance of paying attention to context. We had the SoCal Piggies meeting before PyCon, and we were supposed to meet at 7 PM at USC. None of us checked whether there were any events around USC that night, and as a consequence we were faced with nightmarish traffic, due to the Mexico-South Korea soccer game taking place at the Coliseum, next to USC, starting around the same time. I personally spent more than an hour circling USC, trying to break through traffic and find a parking spot. We managed to get 8 of the 12 people who had said they would be there, and the meeting started after 8:30 PM. Lesson learned? Pay attention to context and check for events that may coincide with your planned meetings.

The second story is about Johnny Weir, the American figure skater who a couple of weeks ago was in second place at the Olympics after the short program. He had his sights on a medal, but he failed to note that the schedule for the buses transporting the athletes from the Olympic housing to the arena had changed. As a result, he missed the bus to the arena and had to wander the streets looking for a ride. He almost didn't make it in time, and he was so flustered and unfocused that he delivered a sub-par performance and ended up in fifth place. Lesson learned? Pay attention to the context, or it may very well happen that you'll miss the bus.

In my "Agile documentation -- using tests as documentation" talk at PyCon I mentioned that a particularly effective test-writing technique is to tell a story and pepper it with tests (doctest and Fit/FitNesse are tools that allow you to do this.) You end up having more fun writing the tests, you produce better documentation, and you improve your test coverage. All good stuff.
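As a small sketch of that technique (the function and its examples here are illustrative, not taken from the actual talk), a doctest-style narrative might look like this:

```python
# A story-style docstring whose interpreter sessions double as tests.
# The function and examples are illustrative, not from the talk itself.

def word_count(text):
    """Count the words in a chunk of text.

    A subject line has just a few words:

    >>> word_count("Re: agile testing tutorial")
    4

    And an empty body has none at all:

    >>> word_count("")
    0
    """
    return len(text.split())

if __name__ == "__main__":
    import doctest
    doctest.testmod()  # runs every example embedded in the docstrings
```

Running the module executes the embedded examples, so the documentation can never silently drift away from the code.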

(I was also reminded of storytelling while reading Mark Ramm's post from yesterday.)

Show, don't just tell

This ties again into the theme of storytelling. The best narratives have lots of examples (the funnier the better of course), so the story you're trying to tell in your presentation should have examples too. This is especially important for a technical conference such as PyCon. I think an official rule should be: "presenters need to show a short demo on the subject they're talking about."

Slides alone don't cut it, as I noticed again and again during PyCon. When the presenter does nothing but talk while showing bullet points, the audience is quickly lulled to sleep. If nothing else, periodically breaking out of the slideshow changes the pace of the presentation and keeps the audience more awake.

Slides containing illegible screenshots don't cut it either. If you need to show some screenshots because you can't run live software, show them on the whole screen so that people can actually make some sense out of them.

A side effect of the "show, don't tell" rule: when people have to actually show some running demo of the stuff they're presenting, they'll hopefully shy away from paperware/vaporware (or, to put it in Steve Holden's words, from software that only runs on the PowerPoint platform).

Even if the talk has only 25 minutes allotted to it, I think there's ample time to work in a short demo. There's nothing like small practical examples to help people remember the most important points you're trying to make. And for 5-minute lightning talks, don't even think about only showing slides. Have at most two slides -- one introductory slide and one "lesson learned" slide -- and get on with the demo, show some cool stuff, wow the audience, have some fun. This is exactly what the best lightning talks did this year at PyCon.

Provide technical details

This is specific to talks given at technical conferences such as PyCon. What I saw in some cases at PyCon this year was that the talk was too high-level, glossed over technical details, and thus it put all the hackers in the audience to sleep. So another rule is: don't go overboard with telling stories and anecdotes, but provide enough juicy technical bits to keep the hackers awake.

The best approach is to mix and match technical details with examples and demos, while staying focused on the story you want to tell and on the conclusions/lessons learned that will benefit the audience. A good example that comes to mind is Jim Hugunin's keynote on IronPython at PyCon 2005. He provided many technical insights, while telling some entertaining stories and popping up .NET-driven widgets from the IronPython interpreter prompt. I for one enjoyed his presentation a lot.

So that you don't get the wrong idea that I'm somehow immune to all the mistakes I've highlighted in here, I hasten to say that my presentation on "Agile testing with Python test frameworks" at PyCon 2005 pretty much hit all the sore spots: all slides, lots of code, no demos, few lessons learned. As a result, it kind of tanked. Fortunately, I've been able to use most of the tools and techniques I talked about last year in the tutorial for this year, so at least I learned from my mistakes and I moved on :-)

Saturday, February 25, 2006

PyCon notes part 2: Guido's keynote

Some notes I took during Guido's "State of the Python Universe" keynote:

Guido started by praising a few community activities:
  • converting the source code management system for Python from cvs to subversion, hosted away from SourceForge at svn.python.org
    • they got tired of waiting for SourceForge to sync the read-only cvs repositories with the live repository
  • deploying buildbot for continuous integration of the build-and-test process of Python on many different platforms (I was particularly pleased by this remark, since it validates some of the points Titus and I emphasized during our tutorial; Guido went as far as saying "I hope all of you will start deploying buildbot for your own projects")
  • deploying and hosting Python packages at the CheeseShop, and using tools such as easy_install to download and install them (here Guido gave the example of TurboGears, which depends on many different packages, yet is not very hard to install via setuptools)
  • converting the python.org Web site to a more modern look and feel -- the new site is still in beta, but is supposed to go live next Sunday March 5th
The bulk of Guido's talk consisted of a presentation of the new features in Python 2.5. Here are some of the things I wrote down:

The release candidate and the final version for 2.5 are slated for Sept. 2006, but might happen a bit earlier if Anthony Baxter, the release manager, has his way.

New in Python 2.5
  • absolute/relative imports (PEP 328) -- hurray!
    • you will be able to write:
  • from . import foo # current package
  • from .. import foo # parent package
  • from .bar import foo # bar sub-package
  • from ..bar import foo # bar sibling
    • any number of leading dots will be allowed
  • conditional expressions (PEP 308)
    • Guido's syntax choice: EXPR1 if COND else EXPR2
  • try/except/finally reunited (PEP 341)
    • you will be able to write:
    • try:
      • BLOCK1
    • except:
      • BLOCK2
    • finally:
      • BLOCK3
  • generator enhancements (PEP 342, PJE)
    • yield will be an expression, as well as a statement
    • send method will basically make it possible to have coroutines
    • can also send exceptions
  • with statement (PEP 343, Mike Bland)
    • with EXPR [as VAR]:
      • BLOCK
    • useful for mutexes, database commit/rollback
    • context manager object will be returned by EXPR.__context__ and will have __enter__ and __exit__ methods
    • some standard objects will have context managers:
      • mutexes
      • Decimal
      • files
        • you will be able to write:
        • with open(filename) as f:
          • # read data from f (NOTE: f will be closed at block end)
  • exceptions revisited (PEP 352, Brett Cannon)
    • make standard Exception class a new style class
    • have new class BaseException as root of Exception hierarchy
    • all exceptions except KeyboardInterrupt and SystemExit are supposed to continue to inherit from Exception, not from BaseException directly
  • AST-based compiler (Jeremy Hylton)
    • new bytecode compiler using ABSTRACT syntax trees instead of CONCRETE
    • new compiler is easier to modify
    • Python code will be able to get access to the AST as well
  • ssize_t (PEP 353, Martin v. Loewis)
    • use C type ssize_t instead of int for all indexing (and things such as length of strings)
    • needed for 64-bit platforms
    • allows indexing of strings with more than 2 GB of characters (and of lists too, although a list with more than 2 billion elements is not likely to fit in today's standard RAM amounts; Guido said we'll be ready for the future)
  • python -m package.module (Nick Coghlan)
    • will run given module in given package as __main__
    • it was already possible to run python -m in Python 2.4
    • expansion to submodules of packages was tricky (PEP 302)
    • solution proved to be the runpy.py module
    • use case: python -m test.regrtest
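Two of the language features above can be sketched together in a few lines. This is my own illustration (not from the keynote), and since the syntax survived unchanged it runs on modern Pythons too:

```python
# Conditional expressions (PEP 308) and generator send() (PEP 342) combined:
# a counter whose next value can be overridden by the caller.

def counter(start):
    value = start
    while True:
        received = (yield value)  # yield used as an expression
        # conditional expression: keep a sent value, otherwise just increment
        value = received if received is not None else value + 1

gen = counter(10)
print(next(gen))      # 10
print(next(gen))      # 11
print(gen.send(100))  # 100 -- send() resumes the generator with a value
print(next(gen))      # 101
```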
New in the Python 2.5 standard library
  • hashlib (supports SHA-224 to -512)
  • cProfile (Armin Rigo)
  • cElementTree (Fredrik Lundh)
  • Hopefully also
    • ctypes
    • wsgiref
    • setuptools
    • bdist_msi, bdist_egg options for distutils
Other miscellaneous stuff:
  • collections.defaultdict (for multimaps) will enable you to write code such as:
    • from collections import defaultdict
    • mm = defaultdict(list) # IMPORTANT: list is a factory, not a value
    • def add(key, *values):
      • mm[key] += values
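Spelled out as a runnable snippet (my own expansion of the multimap pattern above):

```python
from collections import defaultdict

mm = defaultdict(list)  # IMPORTANT: list is a factory, not a value

def add(key, *values):
    mm[key] += values   # += on a list happily accepts the values tuple

add('fruit', 'apple', 'pear')
add('fruit', 'plum')
print(mm['fruit'])      # ['apple', 'pear', 'plum']
print(mm['missing'])    # [] -- the factory kicks in for absent keys
```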
And finally....lambda lives! (thunderous applause)

The Q&A session wasn't that great, mainly due to some less-than-fortunate questions about design choices that Guido did his best to answer without insulting anybody. He got asked at the very end if he likes working at Google and he said something like "Talk to me after the talk. I LOVE it there!"

Friday, February 24, 2006

"Agile Development and Testing" Wiki and notes

For those interested, you can now peruse the Trac-based Wiki that Titus and I used to collaborate on MailOnnaStick, the application we developed and tested for our PyCon tutorial.
The main wiki page contains links to the slides and the handout notes we had for the tutorial, but I'll link here too:

PyCon notes part 1

I'll post some more PyCon-related stuff shortly, but for now I just want to let people who are here at PyCon know (in case they're subscribed to Planet Python, which I assume most of them are...) that Titus and I will present a shorter, 1-hour version of our Agile Development and Testing tutorial this Sunday Feb.26th. We reserved two consecutive Open Space sessions, from 10:50 AM to around 12:00 PM, in one of the Bent Tree rooms. Hopefully this will benefit people who wanted to get into our tutorial but were not able to at the time. We had some good feedback from the people who attended the tutorial, and we'll try to present some of the highlights on Sunday.

Friday, February 10, 2006

Recommended blog: AYE conference

If you're like me, you've never attended Jerry Weinberg's AYE conferences, but you'd like to at least read nuggets of wisdom from the people who organize them. Well, you can do that by subscribing to the AYE conference blog. Here's one wisdom nugget, courtesy of Jerry Weinberg himself, who quotes some Japanese proverbs:

"We learn little from victory, much from defeat.

So, do not think in terms of Win or Lose, because you cannot always win.
Think instead of Learn, for Win or Lose, you can always learn."

BTW, AYE stands for Amplifying Your Effectiveness.

Update

I had the chance to observe the truth of this proverb while cleaning up the mess caused by my post on using setuptools when you don't have root access. The swift and decisive defeat I suffered by incurring the wrath of the author of setuptools (see PJE's comments to that post) contributed a lot to my understanding of how you're really supposed to use setuptools. Lose, but learn -- I'm OK with that, even when I'm subjected to some unnecessarily vile language, IMO.

Thursday, February 09, 2006

Installing a Python package to a custom location with setuptools

Update 2/10/06

According to PJE, the author of setuptools, my approach below is horribly wrong. Although I personally don't see why what I attempted to show below is so mind-blowingly stupid (according again to PJE; see his 2nd comment to this post), I do respect his wish of pointing people instead to the official documentation of setuptools, in particular to the Custom Installation Location section.

OK, maybe I see why he says my approach is stupid -- it would require modifying the PYTHONPATH for each and every package you install. To be honest, I never used easy_install this way; I always had root access on my machines, so the non-root scenario is not something I see every day. The PYTHONPATH hack below can be avoided if you go through the motions of setting up an environment for easy_install, as explained in the Custom Installation instructions.

So, again, if you are really interested in getting easy_install to work as simply as possible when you don't have root access to your box, please READ THE OFFICIAL INSTRUCTIONS (and pretend the link is blinking to attract your attention.)

If you still want to read my original post, warts and all, here it is:

My previous post on setuptools generated a couple of comments from people who pointed out that I didn't really read Joe Gregorio's post well enough; they said his main problem was not being able to install Routes using setuptools as a non-root user, since he doesn't have root access to his server (BTW, Joe, if you're reading this, JohnCompanies offers great Linux "virtual private server" hosting plans where you have your own Linux virtual machine to play with).

So...in response to these comments, here is the 2-minute guide to installing a Python package to a custom location as a non-root user using setuptools:

1. Use easy_install with the -d option to indicate the target installation directory for the desired package

[ggheo@concord ggheo]$ easy_install -d ~/Routes Routes
Searching for Routes
Reading http://www.python.org/pypi/Routes/
Reading http://routes.groovie.org/
Best match: Routes 1.1
Downloading http://cheeseshop.python.org/packages/2.4/R/Routes/Routes-1.1-py2.4.egg#md5=be7fe3368cbeb159591e07fa6cfbf398
Processing Routes-1.1-py2.4.egg
Moving Routes-1.1-py2.4.egg to /home/ggheo/Routes

Installed /home/ggheo/Routes/Routes-1.1-py2.4.egg

Because this distribution was installed --multi-version or --install-dir,
before you can import modules from this package in an application, you
will need to 'import pkg_resources' and then use a 'require()' call
similar to one of these examples, in order to select the desired version:

pkg_resources.require("Routes") # latest installed version
pkg_resources.require("Routes==1.1") # this exact version
pkg_resources.require("Routes>=1.1") # this version or higher


Note also that the installation directory must be on sys.path at runtime for
this to work. (e.g. by being the application's script directory, by being on
PYTHONPATH, or by being added to sys.path by your code.)

Processing dependencies for Routes

[ggheo@concord ggheo]$ cd ~/Routes
[ggheo@concord Routes]$ ls -la
total 40
drwxrwxr-x 2 ggheo ggheo 4096 Feb 9 14:13 .
drwxr-xr-x 90 ggheo ggheo 8192 Feb 9 14:17 ..
-rw-rw-r-- 1 ggheo ggheo 27026 Feb 9 14:13 Routes-1.1-py2.4.egg
[ggheo@concord Routes]$ file Routes-1.1-py2.4.egg
Routes-1.1-py2.4.egg: Zip archive data, at least v2.0 to extract

As you can see, a single egg file was installed in ~/Routes. For other packages, easy_install will create a directory called for example twill-0.8.3-py2.4.egg. This is all dependent on the way the package creator wrote the setup file.

2. Add the egg file (or directory) to your PYTHONPATH. This is one way of letting python know where to find the package; as the output of easy_install -d says above, there are other ways too:

[ggheo@concord ggheo]$ export PYTHONPATH=$PYTHONPATH:/home/ggheo/Routes/Routes-1.1-py2.4.egg
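Alternatively -- one of the "other ways" the easy_install output mentions -- you can add the egg to sys.path from within your own code. A sketch, using the path from the example above:

```python
# Prepend the installed egg to sys.path instead of exporting PYTHONPATH.
# The path below is the one from the easy_install -d example above.
import sys

egg_path = '/home/ggheo/Routes/Routes-1.1-py2.4.egg'
if egg_path not in sys.path:
    sys.path.insert(0, egg_path)  # eggs are zip files; Python can import from them
```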

3. Start using the package:

[ggheo@concord ggheo]$ python
Python 2.4 (#1, Nov 30 2004, 16:42:53)
[GCC 3.2.2 20030222 (Red Hat Linux 3.2.2-5)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import routes
>>> dir(routes)
['Mapper', '_RequestConfig', '__all__', '__builtins__', '__doc__', '__file__', '__loader__', '__name__', '__path__', 'base', 'redirect_to', 'request_config', 'sys', 'threadinglocal', 'url_for', 'util']
>>> routes.__file__
'/home/ggheo/Routes/Routes-1.1-py2.4.egg/routes/__init__.pyc'

I printed the __file__ attribute to show that the module routes is imported from the custom location I chose for the install.

All in all, I still maintain this was pretty easy and painless.

Please start (or continue) using setuptools

You might have seen Joe Gregorio's blog post entitled "Please stop using setuptools". Well, I'm here to tell you that you should run, not walk, towards embracing setuptools as your Python package deployment tool of choice. I've been using setuptools for a while now and it makes life amazingly easy. It's the closest thing we have to CPAN in the Python world, and many times it does a better job than CPAN.

I think part, if not all, of the problem Joe Gregorio had was that instead of bootstrapping the process by downloading ez_setup.py and running it in order to get setuptools and easy_install installed on his machine, he painstakingly attempted to install Routes via ez_setup.py. I do find the two names ez_setup.py and easy_install confusingly similar myself, and I sometimes start typing easy_setup when I really want easy_install. So I wish PJE would change the name of at least one of these two utilities, to make them easier to tell apart. I'd vote for changing easy_install to something with fewer characters and no underscore in it, but I don't have any good ideas myself.

If Joe had run "easy_install Routes", he would have seen this:

$ sudo easy_install Routes
Password:
Searching for Routes
Reading http://www.python.org/pypi/Routes/
Reading http://routes.groovie.org/
Best match: Routes 1.1
Downloading http://cheeseshop.python.org/packages/2.4/R/Routes/Routes-1.1-py2.4.egg#md5=be7fe3368cbeb159591e07fa6cfbf398
Processing Routes-1.1-py2.4.egg
Moving Routes-1.1-py2.4.egg to /usr/local/lib/python2.4/site-packages
Adding Routes 1.1 to easy-install.pth file

Installed /usr/local/lib/python2.4/site-packages/Routes-1.1-py2.4.egg
Processing dependencies for Routes

I don't see how it can get easier than this, to be honest. If you've never used setuptools before, here's a quick howto from Titus. EZ indeed :-)

Update

See this other blog post of mine for instructions on installing a Python package via setuptools, to a custom location and as a non-root user.

Tuesday, February 07, 2006

Running FitNesse tests from the command line with PyFIT

After a few back-and-forth posts on the FitNesse mailing list and a lot of assistance from John Roth, the developer and maintainer of PyFIT, I managed to run FitNesse tests from the command line so that they can be integrated in buildbot.

First off, you need fairly recent versions of both PyFIT and FitNesse. I used PyFIT 0.8a1, available in zip format from the CheeseShop. To install it, run the customary "python setup.py install" command, which will create a directory called fit under the site-packages directory of your Python installation. Note that PyFIT 0.8a1 also includes a Python port of Rick Mugridge's FIT library.

As for FitNesse, you need at least release 20050405. I used the latest release at this time, 20050731.

One word of advice: don't name the directory which contains your FitNesse fixtures fitnesse; I did that, and I spent a lot of time trying to understand why PyFIT suddenly couldn't find my fixtures. The reason was that there is a fitnesse directory under site-packages/fit which takes precedence in the sys.path list maintained by PyFIT. Since that directory obviously doesn't contain my fixtures, they were never found by PyFIT. The fitnesse directory wasn't there in PyFIT 0.6a1, so keep this caveat in mind if you're upgrading from 0.6a1 to a later version of PyFIT.

Since PyFIT doesn't have online documentation yet, I copied the documentation included in the zip file to one of my Web servers, so you can check out this FIT Overview page.

Let's assume that your FitNesse server is running on localhost on port 8080, and that you have an existing test suite set up as a FitNesse Wiki page (of type Suite) which contains 4 sub-pages of type Test. Let's say your test suite is called YourAcceptanceTests, and it resides directly under the FitNesseRoot directory of your FitNesse distribution, which means that its URL would be http://your.website.org:8080/YourAcceptanceTests. Let's say your test pages are called YourTestPageN with N from 1 to 4.

If you want to run your FitNesse tests at the command line, you need to use the TestRunner. The command line to use looks something like this:

python /usr/local/bin/TestRunner.py -o /tmp/fitoutput localhost 8080 YourAcceptanceTests

in connect. host: 'localhost' port: '8080'

http request sent
validating connection...
...ok
Classpath received from server: /proj/myproj/tests:fitnesse.jar:fitlibrary.jar
processing document of size: 4681
new document: 'YourTestPage1'
processing document of size: 11698
new document: 'YourTestPage2'
processing document of size: 6004
new document: 'YourTestPage3'
processing document of size: 3924
new document: 'YourTestPage4'
completion signal received
Test Pages: 4 right, 0 wrong, 0 ignored, 0 exceptions
Assertions: 239 right, 0 wrong, 0 ignored, 0 exceptions

The general syntax for TestRunner.py is:

TestRunner.py [options] host port pagename

From the multitude of options for TestRunner I used only -o, which specifies a directory to save the output files in. By default, TestRunner saves the HTML output for each test page, as well as a summary page for the test suite in XML format.

Another caveat: there is a bug in PyFIT 0.8a1 which manifests itself in PyFIT not recognizing the elements of your test tables if they're not in camel case format. So for example if you have a test table like this:

!|MyFitnesseFixtures.SetUp|
|initialize_test_database?|
|true|

you will need to change it to:

!|MyFitnesseFixtures.SetUp|
|initializeTestDatabase?|
|true|

You will of course need to have a corresponding method named initializeTestDatabase defined in the SetUp.py fixture and declared in the _typeDict type adapter dictionary.
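For illustration, here is a minimal, hypothetical sketch of what such a fixture might look like (stripped down for readability: a real PyFIT fixture would subclass the appropriate fixture class from the fit package, and the type adapter entry shown here is an assumption):

```python
# Hypothetical, stripped-down sketch of the SetUp fixture described above.
class SetUp:
    # maps the camel-case column name to its type adapter (entry is illustrative)
    _typeDict = {"initializeTestDatabase": "Boolean"}

    def initializeTestDatabase(self):
        # camel case, matching the |initializeTestDatabase?| cell in the table
        return True
```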

Recommended blog: Games from Within

I just stumbled across Games from Within, Noel Llopis's blog on agile technologies and processes applied to game development. Check out A day in the life for an example of how the High Moon Studios guys organize their work day. I drooled reading the post :-) There are also interesting entries on their C++ unit test framework and on their selection of a build system (SCons, a Python-based build system, apparently didn't cut it in terms of speed).

Wednesday, February 01, 2006

Continuous integration with buildbot

From the buildbot manual:

"The BuildBot is a system to automate the compile/test cycle required by most software projects to validate code changes. By automatically rebuilding and testing the tree each time something has changed, build problems are pinpointed quickly, before other developers are inconvenienced by the failure. The guilty developer can be identified and harassed without human intervention. By running the builds on a variety of platforms, developers who do not have the facilities to test their changes everywhere before checkin will at least know shortly afterwards whether they have broken the build or not. Warning counts, lint checks, image size, compile time, and other build parameters can be tracked over time, are more visible, and are therefore easier to improve."

All this sounded very promising, so I embarked on the journey of installing and configuring buildbot for the application that Titus and I will be presenting at our PyCon tutorial later this month. I have to say it wasn't trivial to get buildbot to work, and I was hoping to find a simple HOWTO somewhere on the Web, but since I haven't found it, I'm jotting down these notes for future reference. I used the latest version of buildbot, 0.7.1, on a Red Hat 9 Linux box. In the following discussion, I will refer to the application built and tested via buildbot as APP.

Installing buildbot

This step is easy. Just get the package from its SourceForge download page and run "python setup.py install" to install it. A special utility called buildbot will be installed in /usr/local/bin.

Update 2/21/06

I didn't mention in my initial post that you also need to install a number of pre-requisite packages before you can install and run buildbot (thanks to Titus for pointing this out):

a) install ZopeInterface; one way of quickly doing it is running the following command as root:

# easy_install http://www.zope.org/Products/ZopeInterface/3.1.0c1/ZopeInterface-3.1.0c1.tgz

b) install CVSToys; the quick way:

# easy_install http://twistedmatrix.com/users/acapnotic/wares/code/CVSToys/CVSToys-1.0.9.tar.bz2

c) install Twisted; there is no quick way, so I just downloaded the latest version of TwistedSumo (although technically you just need Twisted and TwistedWeb):

# wget http://tmrc.mit.edu/mirror/twisted/Twisted/2.2/TwistedSumo-2006-02-12.tar.bz2
# tar jxvf TwistedSumo-2006-02-12.tar.bz2
# cd TwistedSumo-2006-02-12
# python setup.py install


Creating the buildmaster

The buildmaster is the machine which triggers the build-and-test process by sending commands to other machines known as the buildslaves. The buildmaster itself does not run the build-and-test commands; the slaves do that, then they send the results back to the master, which displays them in a nice HTML format.

The build-and-test process can be scheduled periodically, or can be triggered by source code changes. I took the easy way of just triggering it periodically, every 6 hours.

On my Linux box, I created a user account called buildmaster, logged in as the buildmaster user and created a directory called APP. Then I ran this command:

buildbot master /home/buildmaster/APP

This created some files in the APP directory, the most important of them being a sample configuration file called master.cfg.sample. I copied that file to master.cfg.

All this was easy. Now comes the hard part.

Configuring the buildmaster

The configuration file master.cfg is really just Python code, and as such it is easy to modify and extend -- if you know where to modify and what to extend :-).

Here are the most important sections of this file, with my modifications:

Defining the project name and URL

Search for c['projectName'] in the configuration file. The default lines are:

c['projectName'] = "Buildbot"
c['projectURL'] = "http://buildbot.sourceforge.net/"


I replaced them with:

c['projectName'] = "App"
c['projectURL'] = "http://www.app.org/"

where App and www.app.org are the name of the application and its URL, respectively. These values are displayed by buildbot in its HTML status page.

Defining the URL for the buildbot status page

Search for c['buildbotURL'] in the configuration file. The default line is:

c['buildbotURL'] = "http://localhost:8010/"

I changed it to:

c['buildbotURL'] = "http://www.app.org:9000/"

You need to make sure that whatever port you choose here is actually available on the host machine, and is externally reachable if you want to see the HTML status page from another machine.

If you replace the default port 8010 with another value (9000 in my case), you also need to specify that value in this line:

c['status'].append(html.Waterfall(http_port=9000))

Defining the buildslaves

Search for c['bots'] in the configuration file. The default line is:

c['bots'] = [("bot1name", "bot1passwd")]

I modified the line to look like this:

c['bots'] = [("x86_rh9", "slavepassword")]

Here I defined a buildslave called x86_rh9 with the given password. If you have more slave machines, just add more tuples to the above list. Make a note of these values, because you will need to use the exact same ones when configuring the buildslaves. More on this when we get there.

Configuring the schedulers

Search for c['schedulers'] in the configuration file. I commented out all the lines in that section and I added these lines:

# We build and test every 6 hours
periodic = Periodic("every_6_hours", ["x86_rh9_trunk"], 6*60*60)
c['schedulers'] = [periodic]

Here I defined a scheduler of type Periodic with the name every_6_hours, which will run a builder called x86_rh9_trunk with a periodicity of 6*60*60 seconds (i.e. 6 hours). The builder name needs to correspond to an actual builder, which we will define in the next section.

I also modified the import line at the top of the config file from:

from buildbot.scheduler import Scheduler

to:

from buildbot.scheduler import Scheduler, Periodic


Configuring the build steps

This is the core of the config file, because this is where you define all the steps that your build-and-test process will consist of.

Search for c['builders'] in the configuration file. I commented out all the lines from:

cvsroot = ":pserver:anonymous@cvs.sourceforge.net:/cvsroot/buildbot"


to:

c['builders'] = [b1]


I added instead these lines:

source = s(step.SVN, mode='update',
           baseURL='http://svn.app.org/repos/app/',
           defaultBranch='trunk')

unit_tests = s(UnitTests, command="/usr/local/bin/python setup.py test")
text_tests = s(TextTests, command="/usr/local/texttest/texttest.py")
build_egg = s(BuildEgg, command="%s/build_egg.py" % BUILDBOT_SCRIPT_DIR)
install_egg = s(InstallEgg, command="%s/install_egg.py" % BUILDBOT_SCRIPT_DIR)

f = factory.BuildFactory([source,
                          unit_tests,
                          text_tests,
                          build_egg,
                          install_egg])

c['builders'] = [
    {'name': 'x86_rh9_trunk',
     'slavename': 'x86_rh9',
     'builddir': 'test-APP-linux',
     'factory': f},
]

First off, here's what the buildbot manual has to say about build steps:

BuildSteps are usually specified in the buildmaster's configuration file, in a list of “step specifications” that is used to create the BuildFactory. These “step specifications” are not actual steps, but rather a tuple of the BuildStep subclass to be created and a dictionary of arguments. There is a convenience function named “s” in the buildbot.process.factory module for creating these specification tuples.
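In other words, s() does little more than pair a step class with its keyword arguments. A minimal sketch of what it produces (this is an illustration of the idea, not buildbot's actual source):

```python
def s(steptype, **kwargs):
    """Return a step specification: (step class, constructor arguments)."""
    return (steptype, kwargs)

# So s(step.SVN, mode='update') yields (step.SVN, {'mode': 'update'})
```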

In my example above, I have the following build steps: source, unit_tests, text_tests, build_egg and install_egg.

source is a build step of type SVN which does an SVN update of the source code from the specified SVN URL; the default branch is trunk, which was fine with me. If you need to check out a different branch, see the buildbot documentation on SVN operations.

For different types (i.e. classes) of steps, it's a good idea to look at the file step.py in the buildbot/process directory (which got installed in my case in /usr/local/lib/python2.4/site-packages/buildbot/process/step.py).

The step.py file already contains pre-canned steps for configuring, compiling and testing your freshly-updated source code. They are called respectively Configure, Compile and Test, and are subclasses of the ShellCommand class, which basically executes a given command, captures its stdout and stderr and returns the exit code for that command.
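For a project with a more traditional configure/make cycle, those stock steps could be used directly. A hypothetical master.cfg fragment (the commands are placeholders for your own build system):

```python
# master.cfg fragment using the pre-canned ShellCommand subclasses
configure = s(step.Configure, command="./configure")
compile = s(step.Compile, command=["make", "all"])
test = s(step.Test, command=["make", "check"])
f = factory.BuildFactory([source, configure, compile, test])
```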

However, I wanted to have some control at least over the text that appears in the buildbot HTML status page next to my steps. For example, I wanted my UnitTests step to say "unit tests" instead of the default "test". For this, I derived a class from step.ShellCommand and called it UnitTests. I created a file called extensions.py in the same directory as master.cfg and added my own classes, which basically just redefine 3 variables. Here is my entire extensions.py file:

from buildbot.process.step import ShellCommand

class UnitTests(ShellCommand):
    name = "unit tests"
    description = ["running unit tests"]
    descriptionDone = [name]

class TextTests(ShellCommand):
    name = "texttest regression tests"
    description = ["running texttest regression tests"]
    descriptionDone = [name]

class BuildEgg(ShellCommand):
    name = "egg creation"
    description = ["building egg"]
    descriptionDone = [name]

class InstallEgg(ShellCommand):
    name = "egg installation"
    description = ["installing egg"]
    descriptionDone = [name]

Besides name, the two variables I wanted to customize are description, which appears in the buildbot HTML status page while that particular step is being executed, and descriptionDone, which appears in the status page once the step is finished.

To make master.cfg aware of my custom classes, I added this line to the top of the config file:

from extensions import UnitTests, TextTests, BuildEgg, InstallEgg

Let's look at the custom build steps I added. For the unit_tests step, I'm telling buildbot to run the command python setup.py test on the buildslaves and report back the results. For the text_tests step, the command is /usr/local/texttest/texttest.py, which is where I installed the TextTest acceptance/regression test package. For build_egg and install_egg, I'm running my own custom scripts build_egg.py and install_egg.py on the buildslave, using the BUILDBOT_SCRIPT_DIR variable which I defined at the top of the configuration file as:

BUILDBOT_SCRIPT_DIR = "/home/buildbot/APP/bot_scripts"
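The post doesn't show the scripts themselves, but build_egg.py might be as simple as the following sketch (an assumption on my part: it invokes setuptools' bdist_egg from the build directory, using whatever Python runs the step):

```python
# Hypothetical sketch of build_egg.py; the real script is not shown in the post.
import subprocess
import sys

def egg_command(python=sys.executable):
    """Command line that builds an egg via setuptools' bdist_egg."""
    return [python, "setup.py", "bdist_egg"]

def build_egg():
    """Run the egg build in the current directory; return the exit code."""
    return subprocess.call(egg_command())
```

install_egg.py would be analogous, e.g. running easy_install on the freshly built egg from the dist/ directory.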

As you add more build steps, you need to also add them to the factory object:

f = factory.BuildFactory([source,
                          unit_tests,
                          text_tests,
                          build_egg,
                          install_egg])

The final step in dealing with build steps is defining the builders, which correspond to the buildslaves. In my case, I only have one buildslave machine, so I'm only defining one builder called x86_rh9_trunk which is running on the slave called x86_rh9. The slave will use a builddir named test-APP-linux; this is the directory where the source code will get checked out and where all the build steps will be performed.

Note: the name of the builder x86_rh9_trunk needs to correspond with the name you indicated when defining the scheduler.

Here is again the code fragment which defines the builder:

c['builders'] = [
    {'name': 'x86_rh9_trunk',
     'slavename': 'x86_rh9',
     'builddir': 'test-APP-linux',
     'factory': f},
]

We're pretty much done with configuring the buildmaster. Now it's time to create and configure a buildslave.

Creating and configuring a buildslave

On my Linux box, I created a user account called buildbot, logged in as the buildbot user, and created a directory called APP. Then I ran this command:

buildbot slave /home/buildbot/APP localhost:9989 x86_rh9 slavepassword

Note that most of these values have already been defined in the buildmaster's master.cfg file:
  • localhost is the host where the buildmaster is running (if you're running the master on a different machine from the one running the slave, you need to indicate here a name or an IP address which is reachable from the slave machine)
  • 9989 is the default port that the buildmaster listens on (it is assigned to c['slavePortnum'] in master.cfg)
  • x86_rh9 is the name of this slave, and slavepassword is the password for this slave (both values are assigned in master.cfg to c['bots'])
This command creates a file called buildbot.tac in the APP directory. You can edit the file and change the values of all the elements indicated above.

I also created my custom scripts for building and installing a Python egg. I created a sub-directory of APP called bot_scripts, and in there I put build_egg.py and install_egg.py, the 2 scripts that are referenced in the "Build steps" section of the buildmaster's configuration file.

Starting and stopping the buildmaster and the buildslave

To start the buildmaster, I ran this command as user buildmaster:

buildbot start /home/buildmaster/APP

To stop the buildmaster, I used this command:

buildbot stop /home/buildmaster/APP

When I needed the buildmaster to re-read its configuration file, I used this command:

buildbot sighup /home/buildmaster/APP

I used similar commands to start and stop the buildslave, the only difference being that I was logged in as user buildbot and I indicated /home/buildbot/APP as the BASEDIR directory for the buildbot start/stop/sighup commands.

If everything went well, you should be able at this point to see the buildbot HTML status page at the URL that you defined in the buildmaster's master.cfg file (in my case this was http://www.app.org:9000/).

If you can't reach the status page, something might have gone wrong during the startup of the buildmaster. Inspect the file /home/buildmaster/APP/twistd.log for details. I had some configuration file errors initially which prevented the buildmaster from starting.

Whenever the buildmaster is started, it will initiate a build-and-test process. If it can't contact the buildslave, you will see a red cell on the status page with a message such as:

18:50:52  ping  no slave

In this case, you need to look at the slave's log file, which in my case is in /home/buildbot/APP/twistd.log. Make sure the host name and port numbers, as well as the slave name and password are the same in the slave's buildbot.tac file and in the master's master.cfg file.

If the slave is reachable from the master, then the build-and-test process should unfold, and you should end up with something like this on the status page:

[Waterfall status page: the APP project's x86_rh9_trunk column shows "last build: successful" and current activity "idle", with a green cell for each step -- update, unit tests, texttest regression tests, egg creation and egg installation -- each linking to its log (Build 29, around 13:06 PDT).]

That's about it. I'm sure there are many more intricacies that I have yet to discover, but I hope that this HOWTO will be useful to people who are trying to give buildbot a chance, only to be discouraged by its somewhat steep learning curve.

And I can't finish this post without pointing you to the live buildbot status page for the Python code base.
