Friday, April 28, 2006

Selenium test creation and maintenance with make_selenium

Michał Kwiatkowski is the author of make_selenium, a useful Python module that can help you create and maintain Selenium tests. With make_selenium, you can go back and forth between Selenium tests written in HTML table format and the same tests written in Python.

When they start using Selenium, most people are drawn into using the Selenium IDE, which considerably simplifies the task of writing tests in HTML table format -- especially writing Selenium "action"-type commands such as clicking on links, typing text, selecting drop-down items, submitting forms, etc. However, maintaining the tests in HTML format can be cumbersome. Enter make_selenium, which can turn a Selenium test table into a Python script.

Here's an example of an HTML-based Selenium test that deals with Ajax functionality (see this post for more details on this kind of testing):

open /message/
dblclick //blockquote
waitForCondition var value = selenium.getText("//textarea[@name='comment']"); value == "" 10000
store javascript{Math.round(1000*Math.random())} var
type username user${var}
type email user${var}
type comment hello there from user${var}
click //form//button[1]
waitForCondition var value = selenium.getText("//div[@class='commentary-comment commentary-inline']"); value.match(/hello there from user${var}/); 10000
verifyText //div[@class="commentary-comment commentary-inline"] regexp:hello there from user${var}
clickAndWait //div/div[position()="1" and @style="font-size: 80%;"]/a[position()="2" and @href="/search"]
type q user${var}
clickAndWait //input[@type='submit' and @value='search']
verifyValue q user${var}
assertTextPresent Query: user${var}
assertTextPresent in Re: [socal-piggies] meeting Tues Apr 12th: confirmed
open /message/
assertTextPresent hello there from user${var}
assertTextPresent delete
click link=delete
waitForCondition var allText =; var unexpectedText = "hello there from user${var}" allText.indexOf(unexpectedText) == -1; 10000
assertTextNotPresent hello there from user${var}
assertTextNotPresent delete
clickAndWait //div/div[position()="1" and @style="font-size: 80%;"]/a[position()="2" and @href="/search"]
type q user${var}
clickAndWait //input[@type='submit' and @value='search']
verifyValue q user${var}
assertTextPresent Query: user${var}
assertTextPresent no matches

Let's assume we want to add more assertion commands to this table. Let's turn the TestCommentary.html file into a Python script with make_selenium:

python -p TestCommentary.html

This command generates a Python file with the following contents:

S.open('/message/')
S.waitForCondition('''var value = selenium.getText("//textarea[@name=\'comment\']");
value == ""
''', '10000')
S.pause('2000')
S.store('javascript{Math.round(1000*Math.random())}', 'var')
S.type('username', 'user${var}')
S.type('email', 'user${var}')
S.type('comment', 'hello there from user${var}')
S.click('//form//button[1]')
S.waitForCondition('var value = selenium.getText("//div[@class=\'commentary-comment commentary-inline\']"); value.match(/hello there from user${var}/);', '10000')
S.verifyText('//div[@class="commentary-comment commentary-inline"]', 'regexp:hello there from user${var}')
S.clickAndWait('//div/div[position()="1" and @style="font-size: 80%;"]/a[position()="2" and @href="/search"]')
S.type('q', 'user${var}')
S.clickAndWait('//input[@type=\'submit\' and @value=\'search\']')
S.verifyValue('q', 'user${var}')
S.assertTextPresent('Query: user${var}')
S.assertTextPresent('in Re: [socal-piggies] meeting Tues Apr 12th: confirmed')
S.open('/message/')
S.assertTextPresent('hello there from user${var}')
S.assertTextPresent('delete')
S.click('link=delete')
S.waitForCondition('''var allText =;
var unexpectedText = "hello there from user${var}"
allText.indexOf(unexpectedText) == -1;
''', '10000')
S.assertTextNotPresent('hello there from user${var}')
S.clickAndWait('//div/div[position()="1" and @style="font-size: 80%;"]/a[position()="2" and @href="/search"]')
S.type('q', 'user${var}')
S.clickAndWait('//input[@type=\'submit\' and @value=\'search\']')
S.verifyValue('q', 'user${var}')
S.assertTextPresent('Query: user${var}')
S.assertTextPresent('no matches')

Now we can just add more commands using the special S.command(args) syntax, where command can be any Selenium command, and args are the arguments specific to that command. When we're done, we run make_selenium again, this time without the -p switch, in order to generate an HTML file. We can also specify a different name for the target HTML file:

python TestCommentary2.html
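To make the HTML-to-Python direction more concrete, here is a minimal sketch of the idea using only the standard library. It is not make_selenium's actual implementation: the class name TableRows, the function table_to_python and the SAMPLE table are hypothetical, and the sketch assumes each command row has exactly three cells.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def table_to_python(html):
    """Turn three-cell command rows into S.command(args) lines."""
    parser = TableRows()
    parser.feed(html)
    lines = []
    for row in parser.rows:
        if len(row) != 3:
            continue  # skip the title row, which has a single cell
        command = row[0]
        args = [cell for cell in row[1:] if cell]  # drop empty cells
        lines.append("S.%s(%s)" % (command, ", ".join(repr(a) for a in args)))
    return "\n".join(lines)


# Hypothetical two-command test table, for illustration only.
SAMPLE = """
<table>
<tr><td colspan="3">TestCommentary</td></tr>
<tr><td>open</td><td>/message/</td><td></td></tr>
<tr><td>type</td><td>q</td><td>user1</td></tr>
</table>
"""
script = table_to_python(SAMPLE)
print(script)
```

The real tool handles much more (multi-line arguments, for one), but the core of the translation is this mapping from table rows to method calls on S.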

But there's more to make_selenium than moving tests back and forth between HTML table format and Python syntax. In fact, Michał's initial goal in writing make_selenium was to offer Python programmers an easy way of creating Selenium tests by writing Python code which would then be translated into HTML.

With make_selenium, you can take full advantage of Python constructs such as for loops when you write your Selenium tests. Here's an example inspired by Michał's documentation. Assume you have a large form with various elements that you need to type in. Then you click a submit button, and you want to make sure that after submitting, the values of the elements are the same. Here's how you would do it in a make_selenium-aware Python script:

def type_values(mapping):
    for key, value in mapping.iteritems():
        S.type(key, value)

def verify_values(mapping):
    for key, value in mapping.iteritems():
        S.assertValue(key, value)

data = {
    'first_name': 'John',
    'last_name': 'Smith',
    'age': '25',
}

type_values(data)
S.clickAndWait('submit_data')
verify_values(data)

The only convention you need to follow is to use the special S object when you want to indicate a Selenium command. The functions type_values and verify_values are normal Python functions which repeatedly call S.type and S.assertValue on all the elements of the data dictionary. If you save this code in a file and run make_selenium on it, you get a file called test1.html with the following contents:

type first_name John
type last_name Smith
type age 25
clickAndWait submit_data
assertValue first_name John
assertValue last_name Smith
assertValue age 25

You can then run this file in a TestRunner-based Selenium test suite.
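For readers curious how an object like S can accept any Selenium command name, here is a rough sketch of one way to do it. This is an assumption about the general technique, not make_selenium's actual code: the CommandRecorder class and its to_html method are hypothetical names, and the sketch uses Python 3.

```python
from html import escape


class CommandRecorder:
    """Records S.command(target, value) calls as Selenium table rows."""

    def __init__(self):
        self.rows = []

    def __getattr__(self, command):
        # Called only for names not found normally, so S.type,
        # S.assertValue, etc. all end up here and become commands.
        def record(target="", value=""):
            self.rows.append((command, target, value))
        return record

    def to_html(self, title="test"):
        lines = ["<table>", '<tr><td colspan="3">%s</td></tr>' % escape(title)]
        for row in self.rows:
            cells = "".join("<td>%s</td>" % escape(cell) for cell in row)
            lines.append("<tr>%s</tr>" % cells)
        lines.append("</table>")
        return "\n".join(lines)


S = CommandRecorder()
data = {"first_name": "John", "last_name": "Smith", "age": "25"}
for key, value in data.items():
    S.type(key, value)
S.clickAndWait("submit_data")
for key, value in data.items():
    S.assertValue(key, value)

html_table = S.to_html("test1")
print(html_table)
```

Because every unknown attribute turns into a recorded command, ordinary Python control flow (loops, functions, dictionaries) can drive the generation of the table.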

You may ask why you need make_selenium when Selenium RC is available. I think they both have their place. Selenium RC is your friend if you need the full-blown power of Python in your Selenium scripts. But sometimes, if only for documentation purposes, it's nice to have equivalent HTML-based tests around. On the other hand, HTML is cumbersome to modify and extend. I think that make_selenium bridges the gap between the "scripted Selenium" world and the "HTML TestRunner" world. In fact, it may be possible to write scripts that work in Selenium RC mode, and modify them minimally so that make_selenium understands them and is able to turn them back and forth, to and from HTML. I know Michał was working on this integration, but I haven't had a chance to test it yet with the current version of make_selenium (maybe Michał can leave a comment with a working example?). Assuming this integration is working, you can truly say your code is DRY -- you have one Python script expressing a Selenium test which can be run against a Selenium RC server, or turned into HTML for TestRunner consumption.

The current version of make_selenium is 0.9.5. You can download it from here. I encourage you to read the documentation (which BTW is almost entirely automatically generated from docstrings, in an agile fashion :-) Congratulations to Michał for a fine piece of work!

Wednesday, April 26, 2006

In-process Web app testing with twill, wsgi_intercept and doctest

At the SoCal Piggies meeting last night, Titus showed us the world's simplest WSGI application:
def simple_app(environ, start_response):
    status = '200 OK'
    response_headers = [('Content-type','text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']
(see Titus's intro on WSGI for more details on what it takes for a Web server and a Web app to talk WSGI)

How do you test this application though? One possibility is to hook it up to a WSGI-compliant server such as Apache + mod_scgi, then connect to the server on a port number and use a Web app testing tool such as twill to make sure the application serves up the canonical 'Hello world!' text.

But there's a much easier way to do it, with twill's wsgi_intercept hook. Titus wrote a nice howto about it, so I won't go into the gory details, except to show you all the code you need to test it (code provided courtesy of twill's author):
import twill
twill.add_wsgi_intercept('localhost', 8001, lambda: simple_app)
That's it! Running the code above hooks up twill's wsgi_intercept into your simple_app, then drops you into a twill shell where you can execute commands such as:
>>> go("http://localhost:8001/")
==> at http://localhost:8001/
>>> show()
Hello world!

What happened is that wsgi_intercept mocked the HTTP interface by inserting hooks into httplib. So twill acts just as if it were going to an http URL, when in fact it's communicating with the application within the same process. This opens up all kinds of interesting avenues for testing:
  • easy test setup for unit/functional tests of Web application code (no HTTP setup necessary)
  • easy integration with code coverage and profiling tools, since both twill and your Web application are running in the same process
  • easy integration with PDB, the Python debugger -- which was the subject of Titus's lightning talk at PyCon06, where he showed how an exception in your Web app code is caught by twill, which then drops you into a pdb session that lets you examine your Web app code
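The in-process idea can be illustrated without twill at all. The sketch below is not how wsgi_intercept works internally (it does not touch httplib); it simply calls the WSGI callable directly with a fake environ, which is the essence of testing without a server. The helper name call_in_process is hypothetical, and the code targets Python 3, where WSGI response bodies are bytes.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    start_response("200 OK", [("Content-type", "text/plain")])
    return [b"Hello world!\n"]


def call_in_process(app, path="/"):
    """Call a WSGI app directly and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Capture what the app would have sent to the server.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


status, headers, body = call_in_process(simple_app)
print(status, body)
```

Since the application runs in the same process as the test, coverage tools, profilers and the debugger all see the application code directly.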
But it gets even better: you can write doctests for your Web application code and embed twill commands in them. It will all work because of the wsgi_intercept trick. Here's a working example (yes, I accepted Titus's challenge and I made it work, at least with Python 2.4):

def simple_app(environ, start_response):
    """
    >>> import twill
    >>> twill.add_wsgi_intercept('localhost', 8001, lambda: simple_app)
    >>> from twill.commands import *
    >>> go("http://localhost:8001")
    ==> at http://localhost:8001
    >>> show()
    Hello world!
    \'Hello world!\\n\'
    """
    status = '200 OK'
    response_headers = [('Content-type','text/plain')]
    start_response(status, response_headers)
    return ['Hello world!\n']

if __name__ == '__main__':
    import doctest
    doctest.testmod()
(note that BLANKLINE needs to be surrounded with angle brackets, but the Blogger editor removes them...)

If you run this through python, you get a passing test. Some tricks I had to use:
  • indicating that the expected output contains a blank line -- I had to use the BLANKLINE directive, only available in Python 2.4
  • escaping single quotes and \n in the expected output -- otherwise doctest was choking on them
But here you have it: documentation and tests in one place, for Web application code that is generally hard to unit test. Doesn't get more agile than this :-)
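The blank-line trick is easier to see in isolation. Here is a small standalone sketch, independent of twill, that shows the BLANKLINE marker (written with its angle brackets) in a doctest; the banner function is a made-up example, and the doctest is run through DocTestParser and DocTestRunner so the result can be inspected programmatically.

```python
import doctest


def banner(text):
    """Return text framed by blank lines.

    >>> print(banner("Hello world!"))
    <BLANKLINE>
    Hello world!
    <BLANKLINE>
    """
    return "\n" + text + "\n"


# Parse and run the docstring examples explicitly instead of relying
# on doctest.testmod(), so this works regardless of how it is invoked.
parser = doctest.DocTestParser()
test = parser.get_doctest(banner.__doc__, {"banner": banner}, "banner", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results)
```

Without the marker, doctest would treat the empty line as the end of the expected output and the example would fail.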

Friday, April 14, 2006

Should acceptance tests be included in the continuous build process?

This is the title of a post by Dave Nicolette, a post prompted by some back-and-forth comments Dave and I left to each other on my blog regarding the frequency of running acceptance tests. Dave argues that acceptance tests do not really belong in a continuous integration build, because they do not have the same scope as unit tests, and they do not give the developers the feedback they need, regardless of how fast they actually run.

Here are some of my thoughts on this subject. First of all -- thank you, Dave, for your comments and blog post, which prompted me to better clarify to myself some of these things. I will argue in what follows that the speed of tests is of the essence, and is a big factor in determining which tests are run when.

Following Brian Marick's terminology, let's first distinguish between customer-facing (or business-facing) tests and code-facing (or technology-facing) tests. I think it's an important distinction.

Customer-facing tests are high-level tests expressed in the business domain language, and they are created (ideally) through the collaboration of customers, business analysts, testers and developers. When a customer-facing test passes, it gives the customer a warm fuzzy feeling that the application does what it's supposed to do. Customer-facing tests are usually called acceptance tests. They can operate at the business logic level (in which case, at least in agile environments, they're usually created and executed with tools such as Fit or FitNesse), or at the GUI level (in which case a variety of tools can be used; for Web applications, a combination of twill and Selenium will usually do the trick).

Code-facing tests are lower-level tests expressed in the language of the programmers. They deal with the nitty-gritty of the application, and when they pass, they give developers a warm fuzzy feeling that their code does what they intended it to do, and that they didn't break any existing code when they refactored it. Unit tests are a prime example of code-facing tests.

That said, there are some types of tests that can be seen as both customer-facing and code-facing. I'll talk more about them later on in this post.

Let me now discuss the various types of testing that Dave mentions in his post.

Unit tests

Unit tests are clearly code-facing tests. Michael Feathers, in his now-classic book "Working Effectively With Legacy Code", says that good unit tests have two qualities:
  • they run fast
  • they help us localize problems
Note that the very first quality is related to the speed of execution. If unit tests are not fast, that usually means that they depend on external interfaces such as databases and various network-related services, and thus they are not truly unit tests. Here is Michael Feathers again:

"Unit tests run fast. If they don't run fast, they aren't unit tests.

Other kinds of tests often masquerade as unit tests. A test is not a unit test if:

  1. It talks to a database.

  2. It communicates across a network.

  3. It touches the file system.

  4. You have to do special things to your environment (such as editing configuration files) to run it.

Tests that do these things aren't bad. Often they are worth writing, and you generally will write them in unit test harnesses. However, it is important to be able to separate them from true unit tests so that you can keep a set of tests that you can run fast whenever you make changes."

For more great advice on writing good unit tests, see Roy Osherove's blog post on "Achieving and Recognizing Testable Software Designs". Fast run time is again one of Roy's criteria for good unit tests.

How fast should a unit test be? Michael Feathers says that if a unit test takes more than 1/10 of a second to run, it's too slow. I'd say a good number to shoot for in terms of the time it should take for all your unit tests to run is 2 minutes, give or take.
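One way to keep an eye on the 1/10-of-a-second rule is to time each test and flag the offenders. The sketch below is only an illustration of that idea; the function slow_tests and the two sample tests are hypothetical, and a real project would more likely use its test runner's own timing report.

```python
import time


def slow_tests(test_functions, budget_seconds=0.1):
    """Run each callable and return the names of those over budget."""
    offenders = []
    for test in test_functions:
        started = time.perf_counter()
        test()
        elapsed = time.perf_counter() - started
        if elapsed > budget_seconds:
            offenders.append(test.__name__)
    return offenders


def test_addition():
    assert 1 + 1 == 2


def test_pretend_network_call():
    time.sleep(0.2)  # stands in for a slow external dependency


over_budget = slow_tests([test_addition, test_pretend_network_call])
print(over_budget)
```

Tests that show up in such a list are candidates for stubbing, or for moving out of the unit test suite into a slower tier.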

Integration/functional tests

The terminology becomes a bit muddier from now on. While most people agree on what a unit test is, there are many different definitions for integration/functional/acceptance/system tests.

By integration testing, I mean testing several pieces of your code together. An integration test exercises a certain path through your code, a path that is usually described by means of an acceptance test. The boundary between integration tests and acceptance tests is somewhat fuzzy -- which is why I think it helps to keep Brian Marick's categories in mind. In this discussion, I look at integration tests as code-facing tests that are extracted from customer-facing acceptance tests by shunting/stubbing/mocking external interfaces -- databases, network services, even heavy-duty file system operations.

This has the immediate effect of speeding up the tests. Another effect of stubbing/mocking the interfaces is that errors can be easily simulated by the stub objects, so that your error checking and exception-handling code can be thoroughly exercised. It's much harder to exercise these aspects of your code if you depend on real failures from external interfaces, which tend to be random and hard to reproduce.
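As a small illustration of simulating failures with a stub, consider the sketch below. Everything in it is hypothetical (the WeatherServiceDown exception, the StubWeatherClient class and the forecast_message function); the point is only that the stub can be told to fail on demand, so the error-handling branch gets exercised deterministically.

```python
class WeatherServiceDown(Exception):
    """Raised by a weather client when the remote service is unreachable."""


def forecast_message(client, city):
    """Code under test: must degrade gracefully when the service fails."""
    try:
        temperature = client.get_temperature(city)
    except WeatherServiceDown:
        return "Forecast for %s is temporarily unavailable" % city
    return "%s: %d degrees" % (city, temperature)


class StubWeatherClient:
    """Stands in for the real network client; can be told to fail."""

    def __init__(self, temperature=None, fail=False):
        self.temperature = temperature
        self.fail = fail

    def get_temperature(self, city):
        if self.fail:
            raise WeatherServiceDown(city)
        return self.temperature


ok_message = forecast_message(StubWeatherClient(temperature=21), "Los Angeles")
error_message = forecast_message(StubWeatherClient(fail=True), "Los Angeles")
print(ok_message)
print(error_message)
```

Note that forecast_message receives its client as an argument, which is what makes swapping in the stub trivial.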

One other benefit I found in stubbing/mocking external interfaces is that it keeps you on your toes when you write code. You tend to think more about dependencies between different parts of your code (see Jim Shore's post on "Dependency Injection Demystified") and as a result, your code becomes cleaner and more testable.

The downside of stubbing/mocking all the external interfaces is that it can take considerable time, which is the main reason why not many teams do it. But one can also say that writing unit tests takes precious time away from development, and we all know the dark and scary places where that concept leads...

To come back to the question expressed in the title of this post, and in Dave's post, I think that the type of integration/acceptance testing that I just described is a prime candidate for inclusion in the continuous integration build. It runs fast, it is code-facing, it gives developers instant feedback, and it exercises paths through the application that are not being exercised by unit tests alone. Apart from the fact that it does take time to write these types of tests, I see no downside in including them in the continuous build.

One caveat: if you're using a tool such as Fit or FitNesse to describe and run your acceptance tests, then you'll probably be using the same tool for running the integration tests with stubbed/mocked interfaces. This has the potential of confusing the customers, who will not necessarily know whether they're looking at the real deal or at mock testing (see this story by David Chelimsky on "Fostering Credibility in Customer Tests" for more details on what can go wrong in this context). In cases like these, I think it's worth labelling the tests in big bold letters on the FitNesse wiki pages: "END TO END TESTS AGAINST A LIVE DATABASE" vs. "PURE BUSINESS LOGIC TESTS AGAINST MOCK OBJECTS".

Although it is a somewhat artificial distinction, for the purpose of this discussion it might help to differentiate between integration tests (those tests that have all the external interfaces stubbed) and functional tests, which I define as tests that have some, but not all, external interfaces stubbed. For example, in the MailOnnaStick application, we stubbed the HTTP interface by using twill and WSGI in-process testing. This made the tests much faster, since we didn't have to start up a Web server or incur the network latency penalty.

I argue that if such functional tests are fast, they should be included in the continuous integration process, for the same reasons that integration tests should. Continuous integration is all about frequent feedback, and the more feedback you have about different paths through your application, as well as about different interactions between the components of your application, the better off you are. The main criterion for deciding whether to include functional tests in the build that happens on every check-in is again the execution time of these tests.

And to touch on another point raised by Dave, I think you shouldn't include in your continuous integration process those tests that you know will fail because they have no code to support them yet. I agree with Dave that those tests serve no purpose in terms of feedback to the developers. Those are the true, "classical" acceptance tests that I discuss next.

However, here's a comment Titus had on this feedback issue when I showed him a draft of this post:
"One thing to point out about the
acceptance tests is that (depending on how you do your iteration
planning) you may well write the acceptance tests well in advance of
the code that will make them succeed -- TDD on a long leash, I guess.
My bet is that people would take a different view of them if we
agreed that the green/red status of the continuous integration tests
would be independent of the acceptance tests, or if you could flag
the acceptance tests that are *supposed* to pass vs those that aren't.

That way you retain the feel-good aspect of knowing everything is
working to date."
I think flagging acceptance tests as "must pass" vs. "should pass at some point", and gradually moving tests from the second category into the first is a good way to go. There's nothing like having a green bar to give you energy and boost your morale :-)
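In Python, one lightweight way to express the "must pass" vs. "should pass at some point" split is the expectedFailure decorator from the standard unittest module. The sketch below is just one possible approach; the apply_discount function and the two tests are hypothetical.

```python
import unittest


def apply_discount(total):
    """Hypothetical business rule; only the basic case is implemented so far."""
    return total


class AcceptanceTests(unittest.TestCase):
    def test_total_without_discount(self):
        # "Must pass": the supporting code exists today.
        self.assertEqual(apply_discount(50), 50)

    @unittest.expectedFailure
    def test_ten_percent_discount_over_100(self):
        # "Should pass at some point": written ahead of the code.
        self.assertEqual(apply_discount(200), 180)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AcceptanceTests)
result = unittest.TestResult()
suite.run(result)
print(result.wasSuccessful(), len(result.expectedFailures))
```

The run stays green while the flagged test fails as expected. Once the feature is implemented, the test is reported as an unexpected success, which is the cue to remove the decorator and promote it to the "must pass" category.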

Acceptance tests

In an agile environment, acceptance tests are often expressed as "storytests" by capturing the user stories/requirements together with acceptance criteria (tests) that validate their implementation. I believe Joshua Kerievsky first coined the term storytesting, a term which is being used more and more in the agile community. I'm a firm believer in writing tests that serve as documentation -- I call this Agile Documentation, and I gave a talk about it at PyCon 2006.

Acceptance tests are customer-facing tests par excellence. As such, I believe they do need to exercise your application in environments that are as close as possible to your customers' environments. I wrote a blog post about this, where I argued that acceptance tests should talk to the database -- and you can replace database with any external interface that your application talks to.

However, even with these restrictions, I believe you can still put together a subset of acceptance tests that run fast and can be included in a continuous integration process -- a process which runs maybe not on each and every check-in, but definitely every 3 hours, for example.

For example, an acceptance test that talks to a database can be sped up by having a small test database, but a database which still contains corner cases that can potentially wreak havoc on unsuspecting code. In-memory databases can sometimes be used successfully in cases like this. For network services -- for example getting weather information -- you can build a mock service running on a local server. This avoids dependencies on external services, while still exercising the application in a way that is much closer to the real environment.
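Here is a minimal sketch of the small-database idea, using SQLite's in-memory mode from the standard library. The customers table, the find_customers function and the sample rows are all hypothetical; the point is that a handful of rows can still carry the corner cases (embedded quotes, non-ASCII characters, empty strings) that tend to break real code.

```python
import sqlite3


def find_customers(conn, name_fragment):
    """Code under test: look up customers by partial name."""
    rows = conn.execute(
        "SELECT name FROM customers WHERE name LIKE ? ORDER BY id",
        ("%" + name_fragment + "%",),
    )
    return [row[0] for row in rows]


# A tiny database that lives only in memory and is rebuilt on every run.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customers (name) VALUES (?)",
    [
        ("John Smith",),
        ("O'Brien",),  # embedded quote
        ("Michał",),   # non-ASCII character
        ("",),         # empty string
    ],
)

quoted = find_customers(conn, "O'B")
accented = find_customers(conn, "ł")
print(quoted, accented)
```

An in-memory database will not behave identically to the production database engine, so tests like this complement, rather than replace, runs against the real thing.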

System tests

By system testing, people usually understand testing the application in an environment that reproduces the production environment as closely as possible. Many times, these types of tests are hard to automate, and a certain amount of manual exploratory testing is definitely needed. This being said, there is ample room for automation, if for nothing else than regression testing/smoke testing purposes -- by which I mean deploying a new build of the application in the system test environment and making sure that nothing is horribly broken.

Now this is a type of test that would be hard to run in a continuous integration process, mostly because of time constraints. I think it should still be run periodically, perhaps overnight.

Performance/load/stress tests

Here there be dragons. I'm not going to cover this topic here, as I devoted a few blog posts to it already: "Performance vs. load vs. stress testing", "More on performance vs. load testing", "HTTP performance testing with httperf, autobench and openload".


This has been a somewhat long-winded post, but I hope my point is clear: the more tests you run on a continuous basis, and the more aspects of your application these tests cover, the better off you are. I call this holistic testing, and I touch on it in this post where I mention Elisabeth Hendrickson's "Better Testing, Worse Testing" article. One of the main conclusions Titus and I reached in our Agile Testing work and tutorial was that holistic testing is the way to go, as no one type of testing does it all.

As a developer, you need the comfort of knowing that your refactoring hasn't broken existing code. If your continuous integration process can run unit, integration and functional tests fast and give you instant feedback, then your comfort level is so much higher (fast is the operative word here though). Here is an excerpt from a great post by Jeffrey Fredrick to the agile-testing mailing list:
"the sooner you learn about a problem the cheaper it is to fix,
thus the highest value comes from the test that provides feedback the
soonest. in practical terms this means that tests that run
automatically after a check-in/build will provide more value than
those that require human intervention, and among automated tests those
that run faster will provide more value than those that run slower.
thus my question about "how long does the test take to execute?" -- I
don't want a slow test getting in the way of the fast feedback I could
be getting from other tests; better to buy a separate machine to run
those slow ones.

related note: imperfect tests that are actually running provide
infinitely more value than long automation projects that spend months
writing testing frameworks but no actual tests. 'nuff said?"
My thoughts exactly :-)

If you're curious to see how Titus and I integrated our tests in buildbot, read Titus's Buildbot Technology Narrative.

To summarize, here's a continuous integration strategy that I think would work well for most people:

1. run unit tests and integration tests (with all external interfaces stubbed) on every check-in
2. also run fast functional tests (with some external interfaces stubbed) on every check-in
3. run acceptance tests with small data sets every 2 or 3 hours
4. run acceptance tests with large data sets and system tests every night

As usual, comments are very welcome.

Tuesday, April 04, 2006

30-second Selenium installation tutorial

For people new to Selenium, here's a 30-second (32.78 seconds to be precise) tutorial on how to install Selenium on a Linux box running Apache. I'll assume the DocumentRoot of the Apache installation is /var/www/html.

1. Download Selenium. The latest release of Selenium as of this writing is 0.6.
2. Unzip and cd into the selenium-0.6 directory.
3. Move or copy the selenium sub-directory somewhere under the DocumentRoot of your Apache server.

That's it -- you have a working Selenium installation!

Here's a screencast which contains the above steps. As I said, it took me 32.78 seconds to do all these things, and almost all of that time was taken by the download of the zip file: QuickTime format and AVI format.

Now that you've installed Selenium, you might wonder what's your next step. Easy: open the following URL in a browser: ; you'll see the default test suite that is used to test Selenium itself. Click on the All button in the Control Panel frame on the right, and all the tests in the suite will be executed one by one. If you want to run only a specific test, click on that test name, then click Selected instead of All in the Control Panel.

Again, pictures are worth many words, so here's a screencast of running the default test suite in Firefox: QuickTime format (warning: 28 MB file) and AVI format. Note that some tests involving pop-up windows are failing, because Firefox doesn't allow pop-ups.

Next question: how do you write your own test? A Selenium test is basically an HTML file containing a table with Selenium commands and assertions. The default tests and test suite are in /selenium/tests. You can take any file that starts with Test as a model for writing your own tests. To create a test suite, just create a file similar to the default TestSuite.html, and include links to your custom tests. Assuming you created CustomTestSuite.html, you can see it in the TestRunner page by going to this URL:

For more introductory articles on Selenium, see the Getting Started section of the OpenQA Selenium Wiki.
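If you have many custom tests, the test suite file described above can be generated rather than written by hand. The sketch below assumes the suite is a one-column HTML table whose first row is a title and whose remaining rows link to the test files, modeled on the default TestSuite.html; check it against the file shipped with your Selenium version. The function build_suite and the test file names are hypothetical.

```python
from html import escape


def build_suite(title, test_files):
    """Return the HTML for a suite table linking to each test file."""
    rows = ["<tr><td><b>%s</b></td></tr>" % escape(title)]
    for filename in test_files:
        name = filename.rsplit(".", 1)[0]  # link text: file name without extension
        rows.append(
            '<tr><td><a href="./%s">%s</a></td></tr>'
            % (escape(filename, quote=True), escape(name))
        )
    return "<table>\n%s\n</table>" % "\n".join(rows)


suite_html = build_suite("Custom Test Suite", ["TestLogin.html", "TestSearch.html"])
print(suite_html)
```

The result can be saved as CustomTestSuite.html next to the test files it links to.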

Update: At Jason Huggins's suggestion, here is a screencast that shows the whole enchilada -- downloading Selenium, installing it and running the default test suite. The whole thing takes a mere 2 minutes and 12 seconds: Quicktime format (warning: 36 MB file), Windows Media format (1 MB file), AVI format (8 MB file).

Bunch O'Links on technical book writing and publishing

As synchronicity would have it, I've seen a lot of blog posts and articles lately on technical book writing and publishing. The one that has generated a lot of discussion was DHH's post on "Shaking up tech publishing". David talks about the success that 37signals have had in selling their Getting Real book online, in PDF format, and bypassing the traditional book publishing channels (here's a post from the 37signals blog with a 30-day update on their book sales). It's fascinating to read the comments on David's post, especially the ones from Tim O'Reilly (who doesn't need any introduction) and Gary Cornell (who does -- he's the publisher of APress). Tim talks at length about bestsellers, economies of scale, royalties, coping with too much success, etc. Gary Cornell responds with his point of view, then DHH throws in another wrench, and so on. Very entertaining.

I wish Tim and Gary would post their comments in their blogs, so that people could then link to them. As it is, their thoughts are scattered throughout the Comments section of David's post.

At least one book publisher did record his comments in a blog post format -- that would be Daniel Read, the creator of the developerdotstar blog/community/publishing house. In his "Tech publishing and developer.* Books" post, Daniel talks about his experiences as a "small, independent publisher using digital printing and on-demand distribution for niche titles." Developer.* just published their first book, "Software Conflict 2.0: The Art and Science of Software Engineering", containing 30 essays by Robert L. Glass.

So there you have it: self-publishing via PDF downloads, traditional book publishing at O'Reilly and APress, and printing-on-demand. At least you have lots of options if you're thinking about writing a technical book. Of course, before you start on that road, make sure you actually have something worthy to write about :-) And if you want your book to be successful, pay attention to the style of your writing, not only to the substance. Here's an enlightening post from Kathy Sierra -- she of "Head First" book fame and Creating Passionate Users blog -- on "Two more reasons why so many tech docs suck".

To end on a "Joel-on-software"-esque note, here's a quote from Joel Spolsky himself:

"The software development world desperately needs better writing. If I have to read another 2000 page book about some class library written by 16 separate people in broken ESL, I’m going to flip out."

Update 4/5/06

Via Weinberg on Writing, a great post from Steve York on "Writers and other delusional people". Must read for all aspiring writers out there.
