Saturday, December 22, 2007

The power of checklists (especially when automated)

Just stumbled on this post at InfoQ on the power of checklists. It talks about a low-tech approach to improving care in hospitals, by writing down the steps needed in various medical procedures and putting together a checklist for each case. I've seen the power of this approach at my own company -- until we put together checklists with things we have to do when setting up various servers or applications, we were guaranteed to skip one or more small but important steps.

I'd like to take this approach up a notch though: if you're in the software business, you actually need to AUTOMATE your checklists. Otherwise it's still very easy for a human being to skip a step. Scripts don't usually make that mistake. Yes, a human being still needs to run the script and to make intelligent decisions about the overall outcome of its execution. If you do take this approach, make sure your scripts also have checks and balances embedded in them -- also known as tests. For example, if your script retrieves a file over the network with wget, make sure the file actually gets on your local file system. A simple 'ls' of the file will convince you that the operation succeeded.

As somebody else once said, the goal here is to replace you (the sysadmin or the developer) with a small script. That will free you up to do more fun work.

Wednesday, December 05, 2007

GHOP students ROCK!

I've been involved in the GHOP project for the last couple of weeks (although not as much as I'd have liked, due to time constraints) and I've been constantly amazed by the high quality of the work produced by the GHOP participants, who, let's not forget, are all still in high-school! I think all mentors were surprised at the speed with which the tasks were claimed, and at the level of proficiency showed by the students.

This bodes very well for Open Source in general, and for the Python community in particular. I hope that the students will continue to contribute to existing Python projects and start their own.

Here are some examples from tasks that I've been involved with:
Other students have already submitted patches that were accepted and applied to projects such as the stdlib logging module.

If you want to witness all this for yourself, and maybe get some help for your project from some really smart students, send an email with your proposal for tasks to the GHOP discussion list.

Thursday, November 29, 2007

Interview with Jerry Weinberg at Citerus

Read it here. As usual, Jerry Weinberg has many thought-provoking things to say. My favorite:

"Q: If you're the J.K Rowling of software development, who's Harry P then?

A: Well, first of all, I'm not a billionaire, so it's probably not correct to say I'm the J.K. Rowling of software development. But if I were, I suspect my Harry Potter would be a test manager, expected to do magic but discounted by software developers because "he's only a tester." As for Voldemort, I think he's any project manager who can't say "no" or hear what Harry is telling him."

Testers are finally redeemed :-)

Vonnegut's last interview

Via Tim Ferriss's blog, an inspiring interview with Kurt Vonnegut. BTW, if you haven't read Tim's '4-hour workweek' book, I highly recommend it. It will make you green with envy, but it will also offer you some good ideas about organizing your work and your life a bit differently.

Monday, November 12, 2007

PyCon'08 Testing Tutorial proposal

It's that time of the year, when the PyCon organizers are asking for talk, panel, and tutorial proposals. Titus and I are thinking about doing a three-peat of our Testing tutorial, but this time....with a twist. Read about it on Titus's blog; then send us your code/application that you'd like to test, or problems you have related to testing. Should be lots of fun.

Friday, October 19, 2007

Pybots updates

Time for the periodical update on the Pybots project. Since my last post in July, John Hampton added a buildslave running Gentoo on x86 and testing Trac and SQLAlchemy. A belated thank you to John.

I also had to disable the tests for bzr dev on my RH 9 buildslave, because for some reason they were leaving a lot of orphaned/zombie processes around.

With help from Jean-Paul Calderone from the Twisted team, we managed to get the Twisted buildslave (running RH 9) past some annoying multicast-related failures. Jean-Paul had me add an explicit iptables rule to allow multicast traffic. The rule is:
iptables -A INPUT -j ACCEPT -d

This seemed to have done the trick. There are some Twisted unit tests that still fail -- some of them are apparently due to the fact that raising string exceptions is now illegal in the Python trunk (2.6). Jean-Paul will investigate and I'll report on the findings -- after all, this type of issues is exactly why we set up the Pybots farm in the first place.

As usual, I end with a plea to people interested in running Pybots buidlslaves to either send a message to the mailing list, or contact me directly at grig at gheorghiu dot net.

Compiling mod_python on RHEL 64 bit

I just went through the fairly painful exercise of compiling mod_python 3.3.1 on a 64-bit RHEL 5 server. RHEL 5 ships with Python 2.4.3 and mod_python 3.2.8. I needed mod_python to be compiled against Python 2.5.1. I had already compiled and installed Python 2.5.1 from source into /usr/local/bin/python2.5. The version of Apache on that server is 2.2.3.

I first tried this:

# tar xvfz mod_python-3.3.1.tar.gz
# cd mod_python-3.3.1
# ./configure --with-apxs==/usr/sbin/apxs --with-python=/usr/local/bin/python2.5
# make which point I got this ugly error:

/usr/lib64/apr-1/build/libtool --silent --mode=link gcc -o \
-rpath /usr/lib64/httpd/modules -module -avoid-version finfoobject.lo \
hlistobject.lo hlist.lo filterobject.lo connobject.lo serverobject.lo util.lo \
tableobject.lo requestobject.lo _apachemodule.lo mod_python.lo\
-L/usr/local/lib/python2.5/config -Xlinker -export-dynamic -lm\
-lpython2.5 -lpthread -ldl -lutil -lm

/usr/bin/ld: /usr/local/lib/python2.5/config/libpython2.5.a(abstract.o):
relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object;
recompile with -fPIC

/usr/local/lib/python2.5/config/libpython2.5.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
apxs:Error: Command failed with rc=65536

I googled around for a bit, and I found this answer courtesy of Martin von Loewis. To quote:

It complains that some object file of Python wasn't compiled
with -fPIC (position-independent code). This is a problem only if
a) you are linking a static library into a shared one (mod_python, in this case), and
b) the object files in the static library weren't compiled with -fPIC, and
c) the system doesn't support position-dependent code in a shared library

As you may have guessed by now, it is really c) which I
blame. On all other modern systems, linking non-PIC objects
into a shared library is supported (albeit sometimes with a
performance loss on startup).

So your options are
a) don't build a static libpython, instead, build Python
with --enable-shared. This will give you
which can then be linked "into" mod_python
b) manually add -fPIC to the list of compiler options when
building Python, by editing the Makefile after configure has run
c) find a way to overcome the platform limitation. E.g. on
Solaris, the linker supports an impure-text option which
instructs it to accept relocations in a shared library.

You might wish that the Python build process supported
option b), i.e. automatically adds -fPIC on Linux/AMD64.
IMO, this would be a bad choice, since -fPIC itself usually
causes a performance loss, and isn't needed when we link
libpython24.a into the interpreter (which is an executable,
not a shared library).

Therefore, I'll close this as "won't fix", and recommend to
go with solution a).

So I proceeded to reconfigure Python 2.5 via './configure --enable-shared', then the usual 'make; make install'. However, I hit another snag right away when trying to run the new python2.5 binary:

# /usr/local/bin/python
python: error while loading shared libraries: cannot open shared object file: No such file or directory

I remembered from other issues I had similar to this that I have to include the path to (which is /usr/local/lib) in a ldconfig configuration file.

I created /etc/ with the contents '/usr/local/lib' and I ran

# ldconfig

At this point, I was able to run the python2.5 binary successfully.

I then re-configured and compiled mod_python with

# ./configure --with-apxs=/usr/sbin/apxs --with-python=/usr/local/bin/python2.5
# make

Finally, I copied from mod_python-3.3.1/src/.libs to /etc/httpd/modules and restarted Apache.

Not a lot of fun, that's all I can say.

Update 10/23/07

To actually use mod_python, I had to also copy the directory mod_python-3.3.1/lib/python/mod_python to /usr/local/lib/python2.5/site-packages. Otherwise I would get lines like these in the apache error_log when trying to hit a mod_python-enabled location:

[Mon Oct 22 19:41:20 2007] [error] make_obcallback: \
could not import mod_python.apache.\n \
ImportError: No module named mod_python.apache
[Mon Oct 22 19:41:20 2007] [error] make_obcallback:
Python path being used \
"['/usr/local/lib/python2.5/site-packages/setuptools-0.6c6-py2.5.egg', \
'/usr/local/lib/', '/usr/local/lib/python2.5', \
'/usr/local/lib/python2.5/plat-linux2', \
'/usr/local/lib/python2.5/lib-tk', \
'/usr/local/lib/python2.5/lib-dynload', '/usr/local/lib/python2.5/site-packages']".
[Mon Oct 22 19:41:20 2007] [error] get_interpreter: no interpreter callback found.

Update 01/29/08

I owe Graham Dumpleton (the creator of mod_python and mod_wsgi) an update to this post. As he added in the comments, instead of manually copying directories around, I could have simply said:

make install

and the installation would have properly updated the site-packages directory under the correct version of python (2.5 in my case) -- this is because I specified that version in the --with-python option of ./configure.

Another option for the installation, if you want to avoid copying the file in the Apache modules directory, and only want to copy the Python files in the site-packages directory, is:

make install_py_lib

Update 06/18/10

From Will Kessler:

"You might also want to add a little note though. The error message may actually be telling you that Python itself was not built with --enable-shared. To get mod_python-3.3.1 working you need to build Python with -fPIC (use enable-shared) as well."

Thursday, October 04, 2007

What's more important: TDD or acceptance testing?

The answer, as far as I'm concerned, is 'BOTH'. Read these entertaining blog posts to see why: Roy Osherove's JAOO conference writeup (his take on Martin Fowler's accent cracked me up), Martin Jul's take on the pull-no-punches discussions on TDD between Roy O. and Jim Coplien, and also Martin Jul's other blog post on why acceptance tests are important.

As I said before, holistic testing is the way to go.

Wednesday, September 26, 2007

Roy Osherove book on "The art of unit testing"

Just found out from Roy Osherove's blog that his book on "The Art of Unit Testing" is available for purchasing online -- well, the first 5 chapters are, but then you get the next as they're being published. Roy uses NUnit to illustrate unit testing concepts and techniques, but that shouldn't deter you from buying the book, because the principles are pretty much the same in all languages. I'm a long time reader of Roy's blog and I can say this is good stuff, judging by his past posts on unit testing and mock testing techniques.

Wednesday, September 19, 2007

Beware of timings in your tests

Finally I get to write a post about testing. Here's the scenario I had to troubleshoot yesterday: a client of ours has a Web app that uses a java applet for FTP transfers to a back-end server. The java applet presents a nice GUI to end-users, allowing them to drag and drop files from their local workstation to the server.

The problem was that some file transfers were failing in a mysterious way. We obviously looked at the network connectivity between the user reporting the problem initially and our data center, then we looked at the size of the files he was trying to transfer (he thought files over 10 MB were the culprit). We also looked at the number of files transferred, both multiple files in one operation and single files in consecutive operations. We tried transferring files using both a normal FTP client, and the java applet. Everything seemed to point in the direction of 'works for me' -- a stance well-known to testers around the world. All of a sudden, around an hour after I started using the java applet to transfer files, I got the error 'unable to upload one or more files', followed by the message 'software caused connection abort: software write error'. I thought OK, this may be due to web sessions timing out after an hour. I did some more testing, and the second time I got the error after half an hour. I also noticed that I let some time pass between transfers. This gave me the idea of investigating timeout setting on the FTP server side (which was running vsftpd). And lo and behold, here's what I found in the man page for vsftpd.conf:

The timeout, in seconds, which is the maximum time a remote client may spend between FTP commands. If the timeout triggers, the remote client is kicked off.

Default: 300

My next step was of course to wait 5 minutes between file transfers, and sure enough, I got the 'unable to upload one or more files' error.

Lesson learned: pay close attention to the timing of your tests. Also look for timeout settings both on the client and on the server side, and write corner test cases accordingly.

In the end, it was by luck that I discovered the cause of the problems we had, but as Louis Pasteur said, "Chance favors the prepared mind". I'll surely be better prepared next time, timing-wise.

Thursday, September 13, 2007

Barack Obama is now a connection

That's the message I see on my LinkedIn home page. How could this be possible, you ask? Well, yesterday I checked out my home page, and I noticed the 'featured question of the day' asked by Barack Obama himself (of course, the question was "how can the next president better help small businesses and entrepreneurs thrive".) A co-worker decided to send a LinkedIn invite to Barack. A little while later, he got the acceptance in his inbox. I followed his example, just for fun, and what do you know, I got back the acceptance in a matter of SECONDS, not even minutes! It seems that B.O. has set his LinkedIn account to accept each and every invite he gets. I guess when you're running for president, every little statistic counts. He already has 500+ connections, and I'm sure the time will come when he'll brag to the other candidates that his LinkedIn account is bigger than theirs.

The bottom line is that YOU TOO can have Barack as your connection, if only to brag to your friends about it.

Wednesday, September 12, 2007

Thursday, September 06, 2007

Security testing book review on Dr. Dobbs site

I wrote a review for "The Art of Security Testing" a while ago for Dr. Dobbs. I found out only now that it's online at the Dr. Dobbs's Portal site. Read it here.

Wednesday, September 05, 2007

Weinberg on Agile

A short but sweet PM Boulevard interview with Jerry Weinberg on Agile management/methods. Of course, he says we need to drop the A and actually drop 'agile' altogether at some point, and just talk about "normal, sensible, professional methods of developing software." Count me in.

Tuesday, September 04, 2007

Jakob Nielsen on fancy formatting and fancy words

Just received the latest Alertbox newsletter from Jakob Nielsen. The topic is "Fancy Formatting, Fancy Words = Ignored". I'd have put 2 equal signs in there, but anyway....The 'ignored' in question is your web site, if you're trying to draw attention to important facts/figures by using red bold letters and pompous language. Nielsen's case study in the article is the U.S. Census Bureau's homepage, which displayed the current population of the US in big red bold letters, and called it "Population clock". As a result, users were confused as to the meaning of that number, and what's more, they didn't bother to even read the full number, because they thought it's an ad of some sort. Interesting stuff.

Friday, August 24, 2007

Some notes from the August SoCal Piggies meeting

Read them here.

Put your Noonhat on

You may have seen this already, but here's another short blurb from me: Brian Dorsey, a familiar face to those of you who have been at the last 2 or 3 PyCon conferences, has launched a Django-based Web site he called Noonhat. The tagline says it all: "Have a great lunch conversation". It's a simple but original, fun and hopefully viral idea: you specify your location on a map, then you indicate your availability for lunch, and Noonhat puts you in touch with other users who have signed up and are up for lunch in your area at that time.

This has potential not only for single people trying to find a date, but also for anybody who's unafraid of stepping out of their comfort zone and strike interesting conversations over lunch. Brian and his site have already been featured on a variety of blogs and even in mainstream media around Seattle. Check out the Noonhat blog for more details.

Well done, Brian, and may your site prosper (of course, the ultimate in prosperity is being bought by Google :-)

Tuesday, August 21, 2007

Fuzzing in Python

I just bought "Fuzzing: Brute Force Vulnerability Discovery" and skimmed it a bit. I was pleasantly surprised to see that Python is the language of choice for many fuzzing tools, and clearly the favorite language of the authors, since they implemented many of their tools in Python. See the site/blog also, especially the Fuzzing software page. Sulley in particular seems a very powerful fuzzing framework. I need to look more into it (so much cool stuff, so little time.)

Update: got through the first 5-6 chapters of the book. Highly entertaining and educational so far.

Wednesday, August 08, 2007

Werner Vogels talk at QCon

Werner Vogels is the CTO of Amazon. You can watch a talk he gave at the QCon conference on the topics of Availability and Consistency. The bottom line is that, as systems scale (and for that means hundreds of thousands of systems), you have to pick 2 of the following 3: Consistency, Availability, Partitioning (actually the full name of the third one is "Tolerance to network partitioning.) This is called the CAP theorem, and Eric Brewer from Inktomi first came up with it.

Vogels pretty much equated partitioning with failure. Failure is inevitable, so you have to choose it out of those 3 properties. You're left with a choice between consistency and availability, or between ACID and BASE. According to Vogels, it turns out there's also a middle-of-the-road approach, where you choose a specific approach based on the needs of a particular service. He gave the example of the checkout process on When customers want to add items to their shopping cart, you ALWAYS want to honor that request (obviously because that's $$$ in the bank for you). So you choose high availability, and you hide errors from the customers in the hope that the system will sort out the errors at a later stage. When the customer hits the 'Submit order' button, you want high consistency for the next phase, because several sub-systems access that data at the same time (credit card processing, shipping and handling, reporting, etc.).

I also liked the approach Amazon takes when splitting people into teams. They have the 2-pizza rule: if it takes more than 2 pizzas to feed a team, it means the team is too large and needs to be split up. This equates to about 8 people per team. They actually make architectural decisions based on team size. If a feature is deemed to large to be comprehended by a team of 8 people, they split the feature into smaller pieces that can be digested more easily. Very agile approach :-)

Anyway, good presentation, highly recommended.

Tuesday, August 07, 2007

Automating tasks with pexpect

I started to use pexpect for some of the automation needs I have, especially for tasks that involve logging into a remote device and running commands there. I found the module extremely easy to use, and the documentation on the module's home page is very good. Basically, if you follow the recipe shown there for logging into an FTP server, you're set.

A couple of caveats I discovered so far:
  • make sure you specify correctly the text you expect back; even an extra space can be costly, and make your script wait forever; you can add '.*' to the beginning or to the end of the text you're expecting to make sure you're catching unexpected characters
  • if you want to print the output from the other side of the connection, use child.before (where child is the process spawned by pexpect)
Here's a complete script for logging into a load balancer and showing information about a load balanced server and its real servers:

#!/usr/bin/env python

import pexpect

def show_virtual(child, virtual):
child.sendline ('show server virtual %s' % virtual)
print child.before

def show_real(child, real):
child.sendline ('show server real %s' % real)
print child.before

virtuals = ['']
reals = ['web01', 'web02']

child = pexpect.spawn ('ssh myadmin@myloadbalancer')
child.expect ('.* password:')
child.sendline ('mypassword')
child.expect ('SSH@MyLoadBalancer>')

for virtual in virtuals:
show_virtual(child, virtual)

for real in reals:
show_real(child, real)

child.sendline ('exit')

Think twice before working from a Starbucks

Here's an eye-opening article talking about a tool called Hamster that sniffs wireless traffic and reveals plain-text cookies which can then be used to impersonate users. The guy running the tool was able to log in into some poor soul's Gmail account during a BlackHat presentation.

Pretty scary, and it makes me think twice before firing up my laptop in a public wireless hotspot. The people who wrote Hamster, from Errata Security, already released another tool called Ferret, which intercepts juicy bits of information -- they call it 'information seepage'. You can see a presentation on Ferret here. They're supposed to release Hamster into the wild any day now.

Update: If the above wasn't enough to scare you, here's another set of wireless hacking tools called Karma (see the presentation appropriately called "All your layers are belong to us".)

Thursday, August 02, 2007

That's what I call system testing

According to, the IT systems for the 2008 Olympics in Beijing will be put through rigorous testing which will take more than 1 year! The people at Atos Origin, the company in charge of setting up the IT for the 2008 Olympics, clearly know what they are doing.

It's also interesting that the article mentions insiders as a security threat -- namely, that insiders will try to print their own accreditation badges, or do it for their friends, etc. As always, the human factor is the hardest to deal with. They say they resort to extensive background checks for the 2,500 or so IT volunteers, but I somehow doubt that will be enough.

Tuesday, July 31, 2007

For your summer reading list: book on Continuous Integration

I found out about this book from the InfoQ blog -- the book is called Continuous Integration: Improving Software Quality and Reducing Risk and it is written by 3 guys from Stelligent, who also blog regularly on Seems like a very interesting and timely read for people interested in automated testing and obviously in continuous integration (which to me are the 2 first stepping stones on the path to 'agile testing'). You can also read a chapter from the book in PDF format: "Continuous testing".

Monday, July 30, 2007

Saturday, July 28, 2007

Dilbert, the PHB, and automated tests

Today's Dilbert cartoon shows that even the PHB can think "agile". He tells Dilbert to go write his own automated test software, instead of buying off-the-shelf. That's got to be agile, with small "a" :-) Of course, it's not recommended to call your team members "big babies" during the stand-up meeting.

Friday, July 27, 2007

Your purpose is the Python group

At our last SoCal Piggies meeting 2 days ago, Diane Trout showed us some Jabber bots, one of them based on PyAIML, an Eliza/AI kind of bot. When Diane asked this awfully intelligent little bot to smile for the Python group, this is what it replied:

How did it guess??? I used to not be a big believer in AI, but now I'm sold.

Pybots updates

After a long hibernation period, the Pybots project shows some signs of life -- I should probably say mixed with signs of death. Elliot Murphy from Canonical added the Storm ORM project to his AMD64 Ubuntu Gutsy buildslave, while Manuzhai and Jeff McNeil had to drop their buildslaves out of the mix, hopefully only temporarily. In Manuzhai's case though, the project he was testing -- Trac -- proved to have maintainers that were not interested in fixing their failing tests. In this case, there is no point in testing that project in Pybots. Hopefully Manuzhai will find a different, more test-infected project to run in Pybots.

Speaking of test-infected projects, it was nice to see unit testing topping the list of topics in the Django tutorial given at OSCON by Jeremy Dunck, Jacob Kaplan-Moss and Simon Willison. In fact, Titus is quoted too on this slide, which seems to be destined to be a classic (and I'm proud he uttered those words during the Testing Tools Panel that I moderated at PyCon07). Way to go, Titus, but I'd really like to see some T-shirts sporting that quote :-)

Saturday, July 07, 2007

Another Django success story

Pownce is yet another social networking site, but with the added twist that the creator of digg is one of its founders. Read about the technologies used to build it (Django included) here.

Thursday, June 21, 2007

Interested in a book on automated Web app testing?

Want to know more about twill and Selenium? In this case, I happen to know a very good book just published in the O'Reilly Short Cuts series :-) It's cheap too, $9.99, so please go buy it!

Wednesday, June 06, 2007

Brian Marick on "Four implementation styles for workflow tests"

Brian Marick just posted a very insightful article on various types of Web application tests that he calls 'workflow tests'. If you're in the Web application testing business, this should help you decide which type -- or better which TYPES (plural) -- of tests to use in your specific application scenario.

Friday, May 25, 2007

Consulting opportunities

If you're a Django or Selenium expert and live in the Los Angeles area (or even if you live in a different area, but can meet periodically with clients in Los Angeles), please send me an email at grig at I know of a couple of great consulting gigs.

Tuesday, May 15, 2007

Eliminating dependencies with regenerative build tools

This just in from Michael Feathers of "Working Effectively with Legacy Code" fame: a blog post on regenerative build tools. In the post, Michael describes an eye-opening practice related to continuous integration. Some smart people had the idea of running a script as part of a continuous build system that would comment out #include lines, one at a time, and then run the build. If the build succeeded, it meant that the include line in question was superfluous and thus could be deleted. Very interesting idea.

I bet this idea could be easily applied to Python projects, where you would comment out import statements and see if your unit test suite still passes. Of course, you can combine it with snakefood, a very interesting dependency graphing tool just released by Martin Blais. And you can also combine it with fault injection tools (aka fuzzers) such as Pester -- which belongs to the Jester family of tools also mentioned by Michael Feathers in his blog post.

Monday, May 14, 2007

Brian Marick has a new blog

If you're serious about testing, you need to read Brian Marick's blog, which he recently moved to from the old URL. Brian says:

" This is the another step in my multi-decade switch from to "Exampling", though not a verb, is a better description of what I do now. It includes testing, but has a larger scope."

And by the way, the style of technical writing that Brian describes in his latest post bugs me no end too...

Thursday, May 10, 2007

Resetting MySQL account passwords

I recently needed to reset the MySQL root account password. Here are the steps, for future reference:

1) Stop mysqld, for example via 'sudo /etc/init.d/mysqld stop'

2) Create text file /tmp/mysql-init with the following contents (note that the file needs to be in a location that is readable by user mysql, since it will be read by the mysqld process running as that user):

SET PASSWORD FOR 'root'@'localhost' = PASSWORD('newpassword');

3) Start mysqld_safe with following option, which will set the password to whatever was specified in /tmp/mysql-init:

$ sudo /usr/bin/mysqld_safe --init-file=/tmp/mysql-init &

4) Test connection to mysqld:

$ sudo mysql -uroot -pnewpassword

5) If connection is OK, restart mysqld server:

$ sudo /etc/init.d/mysqld restart

Also for future reference, here's how to reset a normal user account password in MySQL:

Connect to mysqld as root (I assume you know the root password):

$ mysql -uroot -prootpassword

Use the SET PASSWORD command:

mysql> SET PASSWORD for 'myuser'@'localhost' = PASSWORD('newuserpassword');

Wednesday, May 09, 2007

Apache virtual hosting with Tomcat and mod_jk

In a previous post I talked about "Configuring Apache 2 and Tomcat 5.5 with mod_jk". I'll revisit some of the topics in there, but within a slightly different scenario.

Let's say you want to configure virtual hosts in Apache, with each virtual host talking to a different Tomcat instance via the mod_jk connector. Each virtual host serves up a separate application via an URL such as This URL needs to be directly mapped to a Tomcat application. This is a fairly important requirement, because you don't want to go to a URL such as to see your application. This means that your application will need to be running in the ROOT of the Tomcat webapps directory.

You also want Apache to serve up some static content, such as images.

Running multiple instances of Tomcat has a couple of advantages: 1) you can start/stop your Tomcat applications independently of each other, and 2) if a Tomcat instance goes down in flames, it won't take with it the other ones.

Here is a recipe that worked for me. My setup is: CentOS 4.4, Apache 2.2 and Tomcat 5.5, with mod_jk tying Apache and Tomcat together (mod_jk2 has been deprecated).

Scenario: we want to go to a Tomcat instance running on port 8080, and to go to a Tomcat instance running on port 8081. Apache will serve up and

1) Install Apache and mod_jk. CentOS has the amazingly useful yum utility (similar to apt-get for you Debian/Ubuntu fans), which makes installing packages a snap:

# yum install httpd
# yum install mod_jk-ap20

2) Get the tar.gz for Tomcat 5.5 -- you can download it from the Apache Tomcat download site. The latest 5.5 version as of now is apache-tomcat-5.5.23.tar.gz.

3) Unpack apache-tomcat-5.5.23.tar.gz under /usr/local. Rename apache-tomcat-5.5.23 to tomcat8080. Unpack the tar.gz one more time, rename it to tomcat8081.

4) Change the ports tomcat is listening on for the instance that will run on port 8081.

# cd /usr/local/tomcat8081/conf
- edit server.xml and change following ports:
8005 (shutdown port) -> 8006
8080 (non-SSL HTTP/1.1 connector) -> 8081
8009 (AJP 1.3 connector) -> 8010

There are other ports in server.xml, but I found that just changing the 3 ports above does the trick.

I won't go into the details of getting the 2 Tomcat instances to run. You need to create a tomcat user, make sure you have a Java JDK or JRE installed, etc., etc.

The startup/shutdown scripts for Tomcat are /usr/local/tomcat808X/bin/|

I will assume that at this point you are able to start up the 2 Tomcat instances. The first one will listen on port 8080 and will have an AJP 1.3 connector (used by mod_jk) listening on port 8009. The second one will listen on port 8081 and will have the AJP 1.3 connector listening on port 8010.

5) Deploy your applications.

Let's say you have war files called app1.war for your first application and app2.war for your second application. As I mentioned in the beginning of this post, your goal is to serve up these applications directly under URLs such as, as opposed to One solution I found for this is to rename app1.war to ROOT.war and put it in /usr/local/tomcat8080/webapps. Same thing with app2.war: rename it to ROOT.war and put it in /usr/local/tomcat8081/webapps.

You may also need to add one line to the Tomcat server.xml file, which is located in /usr/local/tomcat808X/conf. The line in question is the one starting with Context, and you need to add it to the Host section similar to this one in server.xml. I say 'you may also need' because I've seen cases where it worked without it. But better safe than sorry. What the Context element does is it specifies ROOT as the docBase of your Web application (similar if you will to the Apache DocumentRoot directory).

<Host name="localhost" appBase="webapps"
unpackWARs="true" autoDeploy="true"
xmlValidation="false" xmlNamespaceAware="false">
<Context path="" docBase="ROOT" debug="0"/>
At this point, if you restart the 2 Tomcat instances, you should be able to go to and and see your 2 Web applications.

6) Create Apache virtual hosts for and and tie them to the 2 Tomcat instances via mod_jk.

Here is the general mod_jk section in httpd.conf -- note that it needs to be OUTSIDE of the virtual host sections:

# Mod_jk settings
# Load mod_jk module
LoadModule jk_module modules/
# Where to find
JkWorkersFile conf/
# Where to put jk logs
JkLogFile logs/mod_jk.log
# Set the jk log level [debug/error/info]
JkLogLevel emerg
# Select the log format
JkLogStampFormat "[%a %b %d %H:%M:%S %Y] "
# JkOptions indicate to send SSL KEY SIZE,
JkOptions +ForwardKeySize +ForwardURICompat -ForwardDirectories
# JkRequestLogFormat set the request format
JkRequestLogFormat "%w %V %T"

Note that the section above has an entry called JkWorkersFile, referring to a file called, which I put in /etc/httpd/conf. This file contains information about so-called workers, which correspond to the Tomcat instances we're running on that server. Here are the contents of my file:
# This file provides minimal jk configuration properties needed to
# connect to Tomcat.
# The workers that jk should create and work with

worker.list=app1, app2


The file declares 2 workers that I named app1 and app2. The first worker corresponds to the AJP 1.3 connector running on port 8009 (which is part of the Tomcat instance running on port 8080), and the second worker corresponds to the AJP 1.3 connector running on port 8010 (which is part of the Tomcat instance running on port 8081).

The way Apache ties into Tomcat is that each of the VirtualHost sections configured for and declares a specific worker. Here is the VirtualHost section I have in httpd.conf for

<VirtualHost *:80>
DocumentRoot "/usr/local/tomcat8080/webapps/ROOT"
<Directory "usr/local/tomcat8080/webapps/ROOT">
# Options Indexes FollowSymLinks MultiViews
Options None
AllowOverride None
Order allow,deny
allow from all
ErrorLog logs/app1-error.log
CustomLog logs/app1-access.log combined
# Send ROOT app. to worker named app1
JkMount /* app1
JkUnMount /images/* app1
RewriteEngine On
RewriteRule ^/(images/.+);jsessionid=\w+$ /$1

The 2 important lines as far as the Apache/mod_jk/Tomcat configuration is concerned are:

JkMount /* app1
JkUnMount /images/* app1

The line "JkMount /* app1" tells Apache to send everything to the worker app1, which then ties into the Tomcat instance on port 8080.

The line "JkUnMount /images/* app1" tells Apache to handle everything under /images itself -- which was one of our goals.

At this point, you need to restart Apache, for example via 'sudo service httpd restart'. If everything went well, you should be able to go to and and see your 2 Web applications running merrily.

You may have noticed a RewriteRule in each of the 2 VirtualHost sections in httpd.conf. What happens with many Java-based Web application is that when a user first visits a page, the application does not know yet if the user has cookies enabled or not, so the application will use a session ID mechanism fondly known as jsessionid. If the user does have cookies enabled, the application will not use jsessionid the second time a page is loaded. If cookies are not enabled, the application (Tomcat in our example) will continue generating URLs such as;jsessionid=0E45D13A0815A172BD1DC1D985793D02

In our example, we told Apache to process all URLs that start with 'images'. But those URLs have already been polluted by Tomcat with jsessionid the very first time they were hit. As a result, Apache was trying to process them, and was failing miserably, so images didn't get displayed the first time a user hit a page. If the user refreshed the page, images would get displayed properly (if the user had cookies enabled).

The solution I found for this issue was to use a RewriteRule that would get rid of the jsessionid in every URL that starts with 'images'. This seemed to do the trick.

That's about it. I hope this helps somebody. It's the result of some very intense googling :-)

If you have comments or questions, please leave them here and I'll try to answer them.

Monday, May 07, 2007

JRuby buzz

I wish I could put "Jython buzz" as the title of my post, but unfortunately I can't seem to detect any Jython buzz anywhere. JRuby though seems to generate a lot of it, judging by this InfoQ article on Mingle, a commercial application based on JRuby and created by ThoughtWorks Studios.

One thing I found very interesting in the InfoQ article was that ThoughtWorks preferred to develop Mingle with JRuby (which is the JVM-based version of Ruby) over writing it on top of Ruby on Rails. They cite ease of deployment as a factor in favor of JRuby:

"In particular, the deployment story for Ruby on Rails applications is still significantly more complex than it should be. This is fine for a hosted application where the deployment platform is in full control of a single company, but Mingle isn't going to be just hosted. Not only is it going to need to scale ‘up’ to the sizes of Twitter (okay, that's wishful thinking and maybe it won't need to scale that much) but it's also going to need to scale ‘down’ to maybe a simple Windows XP machine with just a gig of RAM. On top of that, it's going to be installed by someone who doesn't understand anything about Ruby on Rails deployment and, well, possibly not much about deployment either."

They continue by saying that their large commercial customers wanted to be able to deploy Mingle by dropping a Java .war file under any of the popular Java application servers.

So, for all the talk about Ruby on Rails and the similarly hot Python frameworks, Java and J2EE are far from dead.

Here's wishing that Jython will start generating the same amount of buzz.

Tuesday, May 01, 2007

Michael Dell uses Ubuntu on his home laptop

Found this on Planet Ubuntu, which is buzzing with the news that Dell will be offering laptops preloaded with Feisty Fawn. I still use Edgy on my Inspiron 6000, but I'll probably upgrade to Feisty soon. Or maybe I'll just wait for a Gutsy Gibbon to burst on the scene :-)

Saturday, April 28, 2007

Thursday, April 26, 2007

PyCon07 Testing Tools Tutorial slides up

Spurred by a request from David Brochu, I put the PyCon 07 Testing Tools Tutorial slides up on Titus's slides are named titus-tutorial-a.pdf through -e.pdf. My slides are only one PDF file, as my portion of the tutorial consisted mainly in Selenium and FitNesse demos. Enjoy!

Mounting local file systems using the 'bind' mount type

Sometimes paths are hardcoded in applications -- let's say you have the path to the Apache DocumentRoot directory hardcoded inside a web application to /home/apache/ You can't change the code of the web app, but you want to migrate it. You don't want to use the same path on the new server, for reasons of standardization across servers. Let's say you want to set DocumentRoot to /var/www/

But /home is NFS-mounted, so that all users can have their home directory kept in one place. One not-so-optimal solution would be to create an apache directory under /home on the NFS server. At that point, you can create a symlink to /var/www/ inside /home/apache. This is suboptimal because the production servers will come to depend on the NFS-mounted directory. You would like to keep things related to your web application local to each server running that application.

A better solution (suggested by my colleague Chris) is to mount a local directory, let's call it /opt/apache_home, as /home/apache. Since the servers are already using automount, this is a question of simply adding this line as the first line in /etc/auto.home:

apache -fstype=bind :/opt/apache_home

/etc/auto.home was already referenced in /etc/auto.master via this line:

/home /etc/auto.home

Note that we're using the neat trick of mounting a local file system via the 'bind' mount type. This can be very handy in situations where symbolic links don't help, because you want to reference a real directory, not a file pointing to a directory. See also this blog post for other details and scenarios where this trick is helpful.

Now all applications that reference /home/apache will actually use /opt/apache_home.

For the specific case of the DocumentRoot scenario above, all we needed to do at this point was to create a symlink inside /opt/apache_home, pointing to the real DocumentRoot of /var/www/

Thursday, March 29, 2007

Dell to offer pre-installed Linux on desktops

I knew about the Dell survey which asked people whether they'd like to see Linux pre-installed on Dell desktops and laptops. Looks like more than 70% of the respondents said yes (see this BBC story) -- so Dell is going for it. Now the question is which flavor(s) of Linux will be offered. From what I remember, the survey mentioned Fedora Core, Ubuntu and OpenSuse. Regardless, this is a pretty big win for Linux.

Wednesday, March 28, 2007

OLPC and the Romanian politicians

Interesting blog post from Jani Monoses on how the Romanian parliament rejected the country's participation in the OLPC program. All the arguments centered around cost and lack of applications such as....MS Word! As Jani says -- cluelessness abounds.

Having seen Ivan Krstic's keynote on OLPC at PyCon this year, I realize that the One Laptop Per Child program is mainly about re-introducing kids to their intuitive ways of learning, through play, peer activities and free exploration, as opposed to the centralized, one-to-many teaching method that is used in schools everywhere. The laptop becomes in this case just a tool for facilitating the new ways of learning -- or I should say the old ways, since this is what kids do naturally. But this is one of those disruptive ideas that is hard to grasp by serious grown-up people, especially politicians...

Thursday, March 22, 2007

File sharing with Apache and WebDAV

If you want to share files from a Linux box to Windows clients, Samba is a popular solution. However, it can also be done with Apache and WebDAV. Here is a short HOWTO.

1) Let's say we want to share files in a directory named /usr/share/myfiles. I created a sub-directory called dav in that directory, and then I ran:
# chmod 775 dav
# chgrp apache dav
2) Make sure httpd.conf loads the mod_dav modules:
LoadModule dav_module modules/
LoadModule dav_fs_module modules/
3) Create an Apache password file (if you want to use basic authentication) and a user -- let's call it webdav:
# htpasswd -c /etc/httpd/conf/.htpasswd webdav

4) Create a virtual host entry in httpd.conf, similar to this one:
<VirtualHost *>
DocumentRoot "/usr/share/myfiles"
<Directory "/usr/share/myfiles">
Options Indexes FollowSymLinks MultiViews
AllowOverride AuthConfig
Order allow,deny
allow from all
ErrorLog share-error.log
CustomLog share-access.log combined

DavLockDB /tmp/DavLock

<Location /dav>
Dav On
AuthType Basic
AuthName DAV
AuthUserFile /etc/httpd/conf/.htpasswd
Require valid-user

5) Restart httpd, verify that if you go to you are prompted for a user name and password, and that once you get past the security dialog you can see something like 'Index of /dav'.

Now it's time to configure your Windows client to see the shared WebDAV resource. On the Windows client, either:
  • go to "My network connections" and add a new connection, or
  • go to Windows Explorer->Tools->Map Network Drive, then click on "Signup for online storage or connect to a network server"
Either option will bring up the "Add Network Place Wizard".
  • Click Next, then select "Choose another network location", then click Next.
  • For "Internet or network address", set At this point you'll be prompted for a user name/password; specify the ones you defined above.
  • After mapping the resource, you should be able to read/write to it.

Sometimes the Windows dialog asking for a user name and password will say "connecting to" and will keep asking you for the user name/password. The dialog is supposed to show the text you set in AuthName (DAV in my case). If it doesn't, click Cancel, then try again. You can also try to force HTTP basic authentication (as opposed to Windows authentication, which is what Windows tries to do) by specifying as the URL. See also this entry on the WebDAV Wikipedia page.


Wednesday, March 21, 2007

Ubuntu "command not found" magic

Via Alan Pope's blog: Ubuntu Edgy and above includes a "command not found magic" -- a bash hook that intercepts 'command not found' errors and replaces them with more useful messages, such as what packages you need to install to get that command. I tried it on my Edgy laptop and what do you know, it actually works.

First you need to apt-get the command-not-found package:

$ sudo apt-get install command-not-found
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
The following NEW packages will be installed:
command-not-found command-not-found-data
0 upgraded, 2 newly installed, 0 to remove and 15 not upgraded.
Need to get 471kB/475kB of archives.
After unpacking 6263kB of additional disk space will be used.
Do you want to continue [Y/n]? y
Get:1 edgy/universe command-not-found-data 0.1.0 [471kB]
Fetched 300kB in 2s (109kB/s)
Selecting previously deselected package command-not-found-data.
(Reading database ... 92284 files and directories currently installed.)
Unpacking command-not-found-data (from .../command-not-found-data_0.1.0_i386.deb) ...
Selecting previously deselected package command-not-found.
Unpacking command-not-found (from .../command-not-found_0.1.0_all.deb) ...
Setting up command-not-found-data (0.1.0) ...
Setting up command-not-found (0.1.0) ...

Then you need to open a new shell window, so that the hook gets installed. In that window, try running some commands which are part of packages that you don't yet have installed. For example:

$ nmap
The program 'nmap' is currently not installed, you can install it by typing:
sudo apt-get install nmap

$ snort
The program 'snort' can be found in the following packages:
* snort-pgsql
* snort-mysql
* snort
Try: sudo apt-get install

Pretty cool, huh. And of course the bash hook is written in Python.

Monday, March 19, 2007

Founder of Debian joins Sun

Interesting announcement from Ian Murdoch, the founder of Debian: he's joining Sun as "Chief Operating Platforms Officer" (now that's a mouthful.) I've been really skeptical so far about Sun's embrace of Open Source, but this seems like a major step in the right direction.

CheeseRater - voting in the CheeseShop

Found via Joe Gregorio's blog: CheeseRater, a slick-looking Django app that lets you vote on CheeseShop packages. Interesting idea. Since tagging/redditing/social networking/web2.0ing/etc. are all the rage these days, this might prove more popular than the automatic scoring that Michał is doing with the Cheesecake Service. Maybe we can combine the two....that would be interesting.

Sunday, March 18, 2007

Stone soup as a cure for broken windows

Stones are used to break windows, but in this insightful blog post, Dave Nicolette shows how making stone soup (i.e. getting everybody to contribute a bit of something) can help deal with the broken windows syndrome. Getting everybody to contribute is an art I've been trying to master myself, but like any art, it's far from easy...

Wednesday, March 14, 2007

A few good agile men

Kumar sent me a link to a hilarious blog post: A few good managers. It's a gem. Excerpt:

"Marketing: "Did you cut the automated, edit sync [insert favorite feature here] feature?"

Development: "I did the job I was hired to do."

Marketing: "Did you cut the automated, edit sync feature?"
Development: "I delivered the release on time."

Marketing: "Did you cut the automated, edit sync feature?"
Development: "You're g%$#@*& right I did!""

Tuesday, February 27, 2007

testing-in-python mailing list

Titus created a mailing list dedicated to topics related to testing in Python. If you're interested, you can subscribe via its mailman interface here. We hope it will become a discussion forum for things such as:
  • testing tools that people have successfully used in their projects
  • testing techniques that help in certain situations (mocking for example)
  • real life scenarios where a specific type of testing (e.g. functional) helped more than another type of testing (e.g. unit)
  • etc. etc.
I think I will affectionately refer to this list as TIP from now on :-)

Update 02/28/07

What do you know, the name TIP struck a chord, so Titus created an alias for it. You can now send email to the list via tip at too.

William McVey's PyCon notes as mindmap

Via Elliot Murphy's blog, a very nice mindmap created by William McVey and showing his notes from PyCon07. It demonstrates the power of mindmaps: a lot of information condensed in one page, with links to more detailed information below.

Also from William, other PyCon07 notes.

Monday, February 26, 2007

Testing Tools Panel at PyCon

I'd like to thank the participants in the Testing Tools Panel at PyCon07 for sharing their insights into testing with the audience. Here they are, in alphabetical order of first name:
  • Benji York: zope.testbrowser
  • Brian Dorsey: py.test (representing Holger Krekel)
  • Chad Whitacre: testosterone (created) switched to nose
  • Ian Bicking: paste.test.fixture, minimock, FitLoader
  • Jeff Younker: PyMock
  • Kumar McMillan: fixture - module for loading and referencing test data
  • Martin Taylor: test framework within TI
  • Neal Norwitz: PyChecker
  • Tim Couper: WATSUP (Windows GUI Testing)
  • Titus Brown: twill, scotch, figleaf, pinocchio

Matt Harrison has a very good write-up on the discussions we had during the panel (actually I lifted the list above from his blog post, because he summarized it so well).

One thing that I think all the participants felt, and maybe the audience too, was that 45 minutes was totally not sufficient for this kind of panel. And I know I felt the same thing with the other two panels, for Python-dev and for the Web frameworks. So I'd like to ask people to leave some comments on this post, with ideas about turning these panels into discussions that would last longer, at least 1 hour and even more.

My gut feeling is that there would be a lot of interest in getting framework/library/tool creators together and having a discussion/Q&A with them in front of the audience, with audience participation of course. I'm not sure what the best format would be for this kind of thing -- maybe a round table? But if we get enough ideas, maybe we can fit something like this in next year's PyCon schedule, and allocate it a generous amount of time.


During the Testing Tools tutorial that Titus and I gave at PyCon, there was a short discussion on testability -- what makes software more testable? I mentioned a list put together by Michael Bolton, and summarized/enhanced by Adam Goucher in this blog post. Recommended reading, both for developers who want to add testing hooks into their software, and for testers who want to know what to ask for from developers so that their life gets easier (and if you're one of the unfortunate souls who have to deal with Java or .NET, this blog post by Roy Osherove talks about testability and pure OOP.)

Although our tutorial was focused on tools and techniques for implementing test automation, we also mentioned that you will never be able to get rid of manual testing. Even though the Google testing team says that 'Life is too short for manual testing' (and I couldn't agree more with them), they hasten to qualify this slogan by adding that automated testing frees you up to do more meaningful exploratory testing.

My experience as a tester shows that the nastiest bugs are often discovered by manual testing. But when you do discover them manually, the best strategy is to write automated tests for them, so that you'll check your application in that particular area from that moment on, via an automated test suite which runs in your continuous integration system.

You do have an automated test suite, right? And it does run periodically (daily or upon on every check-in) in a continuous integration system, right? And you have everything set up so that you're notified by email or RSS feeds when something fails, right? And you fix failures quickly so that everything turns back to green, because you know that too much red, too often, leads to broken windows and bit rot, right?

If you answered No to any of these questions, then you are not testing your application, period (but you already knew this if you took our tutorial -- it was on the last slide :-)

Friday, February 23, 2007

Photos from PyCon panels

Here are photos I took today at the python-dev panel and at the Web frameworks panel. Enjoy!

BTW, here are two very good write-ups on the Web framework panel: one from Matt Harrison, the other from James Bennett.

PyCon day 1

Gave the testing tutorial with Titus yesterday; went pretty well from the feedback we got. We'll publish the slides soon.

Just got out of the first keynote, Ivan Krstic's talk on the "One Laptop Per Child" project. Pretty interesting -- here are some tidbits I remember:
  • OLPC wants to change the way teaching and learning is done these days; they want to go back to the time when preschool kids interacted with each other by playing, and learned naturally peer-to-peer (as opposed to institutionalized teaching, which is one-to-many)
  • contrary to popular opinion, the laptop does not have a hand crank (it would wear down too fast if it had one); however, the laptop can be powered by a pull string that reacts to the puller's strength and powers the device accordingly
  • the 2 rabbit ears are used for wireless; the laptop can speak 802.11s, a new protocol that can be used for fully meshed networking; as soon as one laptop is connected to the internet, all the other ones in its mesh will be connected too
  • the CPU is an AMD Geode at 366 MHz (not 400 or 500, actually 366)
  • no hard drive, uses 512 MB of flash storage
  • OS is a stripped-down version of Fedora
  • runs Python wherever it can (including the init boot daemon); some exceptions are the windowing system, the mDNS daemon, and the bus communication; pretty much all other user-level software, including the file system, is written in Python
  • the laptop has a 'show source' button which obviously shows Python source code that can be edited, etc.
  • no adult has ever been able to open the laptop in less than 2 minutes
  • no child has ever needed more than 30 seconds to open the laptop
  • two fortunate souls got an OLPC XO laptop today: Guido (as the creator of Python), and a guy who was able to recognize a very complicated formula that Ivan showed on a slide (the BBP formula for computing the n-th decimal place of pi); the guy needed approx. 1 minute to open the laptop; Guido's was already open, in a sign of respect I guess
  • OLPC needs good Python developers; if you're interested, check out

Saturday, February 17, 2007


Via Grady Booch's blog, a site which looks very promising: Wikipatterns. It identifies patterns and anti-patterns for wiki adoption. Here's an excerpt:

"Any grassroots, or bottom-up, strategy is the best place to start since the success of a wiki depends on building active, sustainable participation and this only happens when people see that the software is simple enough to immediately be useful, and meets their needs without requiring them to spend lots of extra time.

A good first step is to identify a group or department who would likely benefit the most from using a wiki, and whose people are open to trying new tools. If you're looking to expand wiki use in another group, look for the thought leader in the group - someone who is very forward thinking, respected by peers, and willing to Champion a new idea and get others around them involved."

Seems like a very good complement to a book I read recently: "Fearless Change: Patterns for Introducing New Ideas" by Mary Lynn Manns and Linda Rising.

Friday, February 16, 2007

Anybody doing LDom on Solaris Sparc?

Lazy Web-type question: has anybody tried out the brand new LDom functionality available in Solaris 10? From some googling around I've done, it seems Sun hasn't yet shipped the full-fledged LDom functionality in Solaris 10 version 11/06.

LDom seems like a cool way to partition some big Solaris Sparc boxes, if you have them. I wonder if the logical domains/virtual machines created with LDom can have Ubuntu installed on top of them (because Ubuntu supports Solaris Sparc).

The Buildbot project has a Trac instance as its home

Brian Warner just announced on the buildbot-devel mailing list that his Buildbot project has a brand new home in the form of a Trac instance. Check it out here. Glad to see the move, as Trac is so much nicer to work with than the clunky SourceForge interface.

Tuesday, February 13, 2007

Ubuntu not to activate proprietary drivers by default

This just in via Jonathan Carter's blog on Planet Ubuntu: the Ubuntu Technical Board has decided that Ubuntu will not activate proprietary drivers by default. Proprietary drivers will be provided for convenience, and users will of course be free to install them if they so choose. This should settle a controversy which has been raging for a while in the Ubuntu community.

Friday, February 09, 2007

Cheesecake Service launched

The Cheesecake Service is a result of Michał Kwiatkowski's hard work during the Google Summer of Code 2006. The Web interface for the service is based on and it has been up and running since last August, but we're making it public now, to coincide with the release of Cheesecake 0.6.1.

The back-end of the Cheesecake Service talks directly to the PyPI repository using the PyPI API in order to find out about packages newly posted to PyPI. Then the service uses Cheesecake and tries to download, install, and score the package. If your package is not there, it might mean you haven't released a new version after August 10, 2006, date from which we started to score packages. Let us know and we can manually score it so that it appears in the list.

Michał just posted a blog entry on Cheesecake and its Service, so please read it and let us know how we can improve on the various things he describes. We are aware that scoring packages is controversial, and we've been called names before because of it, but as Michał also says, the Cheesecake score is meant to be used as a relative number by creators of Python packages who can try to improve it, and not as an absolute ranking among packages. Think of it as an Apgar score for your software.

Kudos to Michał for his continuing hard work on new Cheesecake features and improvements.

Thursday, February 08, 2007

Internal blogs as project tracking tools

I'm reading "The Corporate Blogging Book" by Debbie Weil. She talks about two different types of blogs -- internal (Intranet-type) and external (public access) -- and she mentions how some big name companies use internal blogs for all sorts of purposes: knowledge/information sharing, email replacement, and, the thing that caught my attention, project management. Apparently IBM is big on all these things when it comes to internal blogs.

So I had a mini-revelation: an internal blog is a very good tool for tracking time you spend on various projects. Take the example of a hosting company -- it can set up an internal blog and have categories corresponding to various projects/customers; employees can jot down a summary of what they worked on each day, and put it in the appropriate category. After a while, a timeline of work done on particular projects emerges. Because posts are automatically dated, it's easy to see what you were working on 3 weeks or 3 months ago. And each blog post can contain links to more detailed howtos that are kept in a wiki which serves as a knowledge base. Blogs and wikis make entering information a snap, as opposed to more complicated project management/tracking tools. Blogs and wikis are also searchable, so finding information is easy. To me, this is a lean/agile way of keeping track of your work.

Anyway, maybe this is an obvious use of blogs, but to me it's new, and of course I'm going to implement it :-)

Monday, February 05, 2007

New job

Today was my first day at RIS Technology, a Web hosting/managed services company which hosts and, among other sites. My responsibilities include selecting and helping implement various technologies that make up our offerings to our customers. Of course, if you follow my blog, you know that Python and automated testing are big in my book, so you can be sure that our offerings will include both :-) Stay tuned for more details.

Sunday, January 28, 2007

Connecting to people on LinkedIn

I wasn't such a big LinkedIn fan until a short time ago, when a post by Guy Kawasaki caught my attention. Then synchronicity kicked in and Tennessee Leeuwenburg sent a message to the python-advocacy mailing list, asking Python developers to connect to each other on LinkedIn; here's what Tennessee had to say:

"One way to help spread Python would be to have a strong presence of Python developers in various online networks. One that springs to mind is LinkedIn, a job related social networking site.

If we could encourage Python developers to start adding eachother to their LinkedIn network, then we shoud be able to create a well-connected developer network with business and industry contacts. This should benefit everyone -- both people looking for Python developers, and also people looking for work."

So in the past week or so I started to send LinkedIn invitations to people I know, either by having worked with them, or through the various forums, mailing lists and Open Source communities I have been part of. It's amazing how many people we all know, if we think about it.

LinkedIn has several nice features that can help when you're looking for people to hire, or when you're looking for a job. Perhaps the easiest way to find people is to click on 'Advanced search' (the small link next to the main search box) and type something in the Keywords field. Try it with 'python' for example -- you'll see that a lot of people whose blogs are aggregated on Planet Python have a LinkedIn profile. Your next step, if you are a Python developer yourself, is to send invitations to people you want to connect with. If enough of us Pythonistas do this, our networks will become more and more interconnected, to everybody's advantage. And you can replace 'Pythonistas' with 'agilistas', 'rubyistas' or whatever your interest is.

It's also interesting to see how LinkedIn displays the number of the degrees of separation between yourself and people you are searching for. Amazingly enough, that number is usually 2 or 3, if not 1. This makes me think of Malcolm Gladwell's theory about Connectors in 'The Tipping Point', namely that there is a small number of people that have a LOT of connections. If you are connected to one of these Connectors, then all of a sudden you have a huge number of people in your network, and you can potentially benefit by introducing yourself to them as someone only 2 or 3 degrees of separation apart from them. This is true in my own network, where I am only 2 degrees of separation away from Guy Kawasaki for example. Why? Because a long long time ago I accepted a LinkedIn invitation from one Paul Davis, who has 500+ LinkedIn connections.

If I made you curious about LinkedIn, I'd advise you one more time to read Guy Kawasaki's blog post on how to improve your LinkedIn profile.

Speaking of jobs and hiring, if you are a hardcore Python programmer looking for work, especially in the D.C. area, the Zope Corporation is hiring.

Saturday, January 20, 2007

Pybots updates

Since my last Pybots-related blog post, a lot has happened. We added 2 buildslaves, a Sparc Solaris 10 host running the Django unit test suite (courtesy of Matthew Flanagan) and a G5 OSX host running the SQLAlchemy test suite (courtesy of Skip Montanaro). So now we have 10 buildslaves altogether. I also added more test suites to my Ubuntu Breezy buildslave. In addition to the Cheesecake test suite, I'm now running the unit test suites for the py library, nose, twill and Testoob.

Email notifications finally started to work too, after I finally figured out I was passing the wrong builder names to the MailNotification class. And we also have RSS feeds available. If you want to be notified of failures from all builders, subscribe to: or

To be notified of failures from the trunk builders, subscribe to: or

To be notified of failures from the 2.5 branch builders, subscribe to: or

Matthew Flanagan also added functionality that allows you to subscribe to a feed for a particular builder. For example, to subscribe to the feed for the "x86 Ubuntu Breezy trunk" builder, use this URL:

If you are interested in contributing a buildslave to the Pybots project, please send a message to the Pybots mailing list, or to me (grig at gheorghiu dot net), or leave a comment here.

Tuesday, January 09, 2007

Steve Rowe on "Letting test drive the process"

Just came across a blog post by Microsoft' Steve Rowe called "Letting test drive the process". Steve quotes an article by Richard Collins -- "Test, test and test again" -- and adds his own observations on the practices of involving testers early in the development process, and of building testable interfaces into the product instead of heavy UIs.

According to Steve Rowe, Microsoft's development and testing process follows these recommended practices. I quote Steve:

"Also, at Microsoft, testing begins from day one. Every product I've ever been involved with at Microsoft has had daily builds from very early on. Every product has also had what we call BVTs (build verfication tests) that are run every day right after the build completes. If any of their tests fail, the product is held until they can be fixed."

Hmmm...I would expect Microsoft to have less problems with their products in this case. But I think a couple of problems that plague Microsoft in particular are backwards compatibility and the sheer amount of hardware/OS/service pack combinations that they need to test.

Speaking of Microsoft and testing, I found The Braidy Tester's blog very informative.

Monday, January 08, 2007

Testing tutorial at PyCon07

Titus and I will present a tutorial on "Testing Tools in Python" at PyCon07. It is scheduled for the afternoon session on Thursday Feb.22. It will be an improved version of our "Agile Development and Testing in Python" tutorial from last year.

Here is the tutorial outline (courtesy of Titus). If you have any suggestions, please leave a comment.

* Why test?
* What to test?
* Using testing to boost maintainability of code.

Setting up a project
* Source control management with Subversion.
* A brief introduction to using Trac for project
documentation and ticket management.
* Packaging with distutils
* Packaging with setuptools
* Registering your project with the Python Cheeseshop
* What else is out there? (distributed vs svn, roundup, ...)

Unit testing
* How to think about unit testing
* Using nose to run unit tests
* doctest-style unit tests
* What else is out there? (unittest, py.test, testosterone...)

Functional Web testing with twill
* Writing twill scripts
* Running twill scripts
* Using scotch to record actions
* Using wsgi_intercept to avoid network sockets
* What else is out there? (zope.testbrowser, mechanize, mechanoid)

Using code coverage in conjunction with unit/functional testing
* Basic code coverage with figleaf
* Monitoring code coverage in remote servers
* Combining figleaf code coverage analyses
* What else is out there (coverage)

** BREAK **

Acceptance testing with FitNesse/PyFit
* How FitNesse works
* Writing fixtures
* Running Python fixtures

Web application testing with Selenium
* How Selenium works
* Writing and recording Selenium tests
* Scripting Selenium tests remotely with SeleniumRC
* What else is out there? (Sahi, Watir)

Continuous integration with buildbot
* Introduction to buildbot
* Discussion of concepts, demonstration.
* Integrating tests into buildbot.
* GUI testing in buildbot.
* Using pybots to test your open source project

* Why test, revisited
* Maintainability and testing

Testing Tools Panel at PyCon07: questions needed

I will be moderating the Testing Tools Panel at PyCon07, currently scheduled from 11:40 AM to 12:25 PM on Sat. Feb. 24th (and immediately followed by Guido's keynote). I put together a Wiki page for the panel, with questions and topics that I thought would be interesting for the audience (and with some input from Ian Bicking.)

I'd be very grateful if people who plan on attending the panel could add more questions or topics of interest either by directly editing the Wiki page, or by leaving a comment here, or by sending me an email at grig at Thanks in advance!

Sunday, January 07, 2007

New Year's resolution

Here's one New Year resolution I'm trying to keep: each day, read the corresponding page for that date from "The Daily Drucker". I blogged about this book before, and I continue to be amazed at the insight and wisdom that Drucker manages to pack in almost every sentence he writes. Although Drucker writes about general practices of management and leadership, many of his ideas can be easily applied to software development in general, and testing in particular.

Here are some fragments from January 4th, on "Organizational inertia", which can be applied just as well to any software project ("bitrot" and "goldplating" come to mind):

"All organizations need to know that virtually no program or activity will perform effectively for a long time without modifications and redesign. Eventually every activity becomes obsolete."

"Businessmen are just as sentimental about yesterday as bureaucrats. They are just as likely to respond to the failure of a product or program by doubling the efforts invested in it. But they are, fortunately, unable to indulge freely in their predilections. They stand under an objective discipline, the discipline of the market. They have an objective outside measurement, profitability.And so they are forced to slough off the unsuccessful and unproductive sooner or later."

And how do you measure the efficiency of an organization? By testing, testing, testing:

"All organizations must be capable of change. We need concepts and measurements that give to other kinds of organizations what the market test and profitability yardstick give to business. Those tests and yardsticks will be quite different."

Friday, January 05, 2007

Cheesecake now including PEP8 checks

The inclusion actually happened a couple of weeks ago. I saw Johann Rocholl's message on conp.lang.python.announce where he talked about his module -- a tool which checks Python modules against some of the style conventions in PEP8.

Here's a sample output of running against one of the modules in the Cheesecake project. By default, pep8 reports only the first occurrence of the error or warning. The numbers after the file name represent the line and column where the error/warning occurred:

$ python E401 multiple imports on one line W291 trailing whitespace E301 expected 1 blank line, found 0 W602 deprecated form of raising exception E302 expected 2 blank lines, found 1 E501 line too long (85 characters)
If you want to see all occurrences, use the --repeat flag.

If you just want to see how many lines in a given file have PEP8-related errors/warnings, use the --statistics flag, along with -qq, which quiets the default output:

$ python --statistics -qq
3 E301 expected 1 blank line, found 0
4 E302 expected 2 blank lines, found 1
1 E401 multiple imports on one line
1 E501 line too long (85 characters)
40 W291 trailing whitespace
1 W602 deprecated form of raising exception
You can also pass multiple file and directory names to, and it will give you an overall line count when you use the --statistics flag.

So now includes a check for PEP8 compatibility as part of the 'code kwalitee' index. To compute the PEP8 score, it only looks at types of errors and warnings, not at the line count for each type. It subtracts 1 from the code kwalitee score for each warning type reported by pep8, and 2 for each error type reported. Johann told me he'll try to come up with a scoring scheme within the pep8 module, so when that's ready I'll just use it instead of my ad-hoc one. Kudos to Johann for creating a very useful module.

Tuesday, January 02, 2007

Martin Fowler's "Mocks Aren't Stubs" -- updated version

As the title says, Martin Fowler just announced a significant update -- indeed, a rewrite -- of his classic "Mocks Aren't Stubs". Most of the terminology he uses in the new version of the article is borrowed from Gerard Meszaros' xUnit Patterns jargon. Highly recommended. I'm not so vain as to believe that my post on mock testing influenced Martin's rewrite, so I'm just going to invoke synchronicity here :-)

Using AWS CloudWatch Logs and AWS ElasticSearch for log aggregation and visualization

If you run your infrastructure in AWS, then you can use CloudWatch Logs and AWS ElasticSearch + Kibana for log aggregation/searching/visuali...