Agile Testing

Monday, May 05, 2008

Ruby to Python bytecode compiler

Kumar beat me to it, but I'll mention it here too: Why the Lucky Stiff published a Ruby-to-Python-bytecode compiler, as well as tools to decompile the byte code into source code. According to the README file, he based his work on blog posts by Ned Batchelder related to dissecting Python bytecode. I wholeheartedly agree with Why's comment at the end of the README file:

  You know, it's crazy that Python
 and Ruby fans find themselves
 battling so much.  While syntax
 is different, this exercise
 proves how close they are to
 each other!  And, yes, I like
 Ruby's syntax and can think much
 better in it, but it would be
 nice to share libs with Python
 folk and not have to wait forever
 for a mythical VM that runs all
 possible languages.

Tuesday, April 29, 2008

Special guest for next SoCal Piggies meeting

We'll have the SoCal Piggies meeting this Thursday May 1st at the Gorilla Nation office in Culver City. Our special guest will be Ben Bangert, the creator of Pylons, who will give us an introduction to his framework. We'll also have a presentation from Pablo Noego from Gorilla Nation on a chat application he wrote using Google App Engine. We'll probably also have an informal discussion on Python mock testing tools and techniques.

BTW, I am putting together a Google code project for mock testing techniques in Python, in preparation for a presentation I would like to give to the group at some point. I called the project moctep, in honor of that ancient Egyptian deity, the protector of testers (or mockers, or maybe both). It doesn't have much so far, but there's some sample code you can browse through in the svn repository if you're curious. I'll be adding more meat to it soon.

Anyway, if you're a Pythonista who happens to be in the L.A. area on Thursday, please consider attending our meeting. It will be lots of fun, guaranteed.

Tuesday, April 22, 2008

"OLPC Automated Testing" project accepted for SoC

I'm happy to say that Zach Riggle's application for this year's Google Summer of Code, "OLPC Project Automated Testing", was accepted. I'm looking forward to mentoring Zach, and having Titus as a backup mentor. There's some very cool stuff that can be done in this area, and I hope that at the end of the summer we'll have some solid automated testing techniques and tools that can be applied to any Python project, not only to the OLPC Sugar environment. Stay tuned for more info on this project. BTW, here is the list of PSF-sponsored applications accepted for this years' SoC.

Thursday, April 17, 2008

Come work for RIS Technology

We just posted this on craigslist, but it never hurts to blog about it too. If you're interested, send an email to techjobs at ristech.net. You and I might get to work together on the same team!

Open Source Tech Top Guns Wanted

Are you a passionate Linux user? Are you running the latest Ubuntu alpha release on your laptop just because you can? Are you wired to the latest technologies -- things like Amazon EC2/S3 and Google AppEngine? Are you a virtuoso when it comes to virtualization (Xen/VMWare)?

Do you program in Python? Do you take hard problems as personal challenges and don't give up until you solve them?

RIS Technology Inc. is a rapidly growing Los Angeles-based premium managed hosting provider that hosts and manages internet applications for medium to large size organizations nationwide. We have grown consistently at 100% each of the past four years and are currently hiring for additional growth at our corporate operations center near LAX, in Los Angeles, CA. We have immediate openings for dedicated and knowledgeable technology engineers. If the answer to the questions above is YES, then we'd like to extend an invitation to interview with us.

We are an equal opportunity employer and have excellent benefits. We realize that one of the main things that makes us excellent are the people we choose to work with. We look for the best and brightest and our goal is to make work less "work" and more fun.

Wednesday, April 16, 2008

Google App Engine feels constrictive

I've been toying a bit with Google App Engine. I was lucky enough to score one of the 10,000 developer accounts. I first went through their tutorial, which was fine. Then I tried to port a simple application that I used to run from the command line, which queried a range of IP addresses for their reverse DNS names. No luck. I was using the dnspython module, which in turn uses the Python socket module -- and socket is not available within the Google App Engine sandbox environment.

Also, I was talking to Michał on rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... It seems that with everything I've tried with GAE I've run into a wall so far. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.

What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.

Friday, April 11, 2008

Ubuntu Gutsy woes with Intel 801 graphics card

I just upgraded my Dell Inspiron 6000 laptop to Ubuntu Gutsy last night. My graphics card is based on the Intel 810 chipset. After the upgrade, everything graphics-related was dog-slow. Scrolling in Firefox was choppy, IM-ing was choppy, even typing at the console was choppy. Surprisingly, I didn't find a lot of solutions to this problem. But many people on Ubuntu forums suggested disabling compiz/xgl, so that's what I ended up doing. In fact, I uninstalled all compiz and xgl-related packages, rebooted, and graphics became snappy again. Now back to trying to write an application to run on THE GOOGLE.

Thursday, April 10, 2008

Meme du jour: shell history

Here's mine from my Ubuntu laptop:


$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}' |sort -rn|head
121 cd
91 ssh
82 ls
46 vi
28 python
26 scp
16 dig
12 more
7 twistd
6 rm

Thursday, April 03, 2008

Steve Loughran on 'Farms, Fabrics and Clouds'

Yesterday I and my colleagues at RIS Technology had the pleasure of attending a remote presentation given to us by Steve Loughran, who works as a researcher at HP Labs and is also a committer on the Ant project. I had seen Steve's slides from a presentation he gave at the University of Bristol on 'Farms, Fabrics and Clouds' back in December 2007, and I have been pestering him via email ever since, hoping to have him release a screencast. After much back and forth, Steve offered to simply present for now directly to us via Skype. He did it out of the goodness of his heart, but both he and I realized that there's a nice little business opportunity in this type of presentation: you release the slides with no audio, then you get hired to present to interested parties in person, remotely, via Skype and a shared set of slides, with a Q&A session at the end. Everybody wins in this scenario. Filing it in the 'ideas worth trying' category.

To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.

Steve talked about how the infrastructure architectures have changed over the years from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running to their maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game by their S3 and EC2 offerings. Even though they're not quite ready for prime time in terms of production deployments, Amazon will soon get there. As a proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.

In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in our day and time:

it is expensive to create, deploy and duplicate a new system, running a Linux image of your choice (see Instalinux as a counter-example)
system failure is unusal and 100% availability can be achieved
databases are the best form of storage
you need physical access to the data center
a single server farm needs to scale to infinity

My other favorite part, which is not in the online slides yet, is the concept of 'agile infrastructure'. I haven't seen this concept before applied to server hosting, but Steve has a great point here. If you look at something like Amazon EC2, where you can pay as you go, you can test you application in a smaller environment and then scale it up, you can move your application between data centers -- this is indeed an agile environment that also imposes some new demands on your application.

I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.

Here are a couple more links that might prove useful:

Anubis: a tuple-space implementation that uses multicast to share information between hosts within a site
SmartFrog: a technology from HP used to distribute and manage applications (think puppet but geared towards application deployment); see also Google video

Thanks again to Steve for presenting to us. Now, as a server farmer, I need to go back to my plow and try to improve it (maybe buy a tractor?)

Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.

Wednesday, April 02, 2008

For you students interested in GSoC

If you're a student and you want to apply for a Python-related project for Google Summer of Code 2008, Matt Harrison has just the project for you. The project has to do with branch coverage analysis and reporting. Matt is willing to mentor too. It's a really good opportunity, so don't hesitate to apply. Hurry up though, the deadline is April 8th.

Tuesday, April 01, 2008

TurboGears and Pylons finally merging

This has been a long time coming, and fans of both projects have been eagerly waiting for it, but it's finally happened. Not sure if you've seen the announcements from Kevin Dangoor, Mark Ramm and Ben Bangert on their projects' mailing lists, but basically they boil down to "we feel like after the sprints at PyCon we made enough progress so that we can pull the trigger on merging the source code from the 2 projects in one common trunk." They make it sound like it was purely a technological problem, but I have my doubts about that. I think it was driven in part by the increasing popularity of Django. Unifying TurboGears and Pylons is a somewhat desperate measure to chip away at the Django market share. We'll see if it works or not. Check out the brand new page of the TurboPylons project.

Monday, March 31, 2008

ReviewBoard: open source code review tool

Via Marc Hedlund's post on O'Reilly Radar, here's an open source code review tool from VMWare: ReviewBoard. For all of us non-googlers out there, it's probably the next best thing to Guido's Mondrian (question: why has that tool not been released as open source?). Check out the sweet screenshots. The kicker though is that it uses Python and Django. Way to go, VMWare!

Python code complexity metrics and tools

There's a buzz in the air around code complexity, metrics, code coverage, etc. It started with Matt Harrison's PyCon presentation, then Ned Batchelder jumped in with a nice McCabe cyclomatic complexity computation/visualization tool, and now David Stanek posted about his pygenie tool -- which also measures the McCabe cyclomatic complexity of Python code. Now it's time to unify all these ideas in one powerful tool that computes not only complexity but also path or at least branch coverage. This would make a nice Google Summer of Code project. Too bad the deadline for 2008 GSoC applications is in 7 hours...Maybe for next year.

Update: David Goodger left a comment pointing me to Martin Blais's snakefood package, which computes and shows dependencies for your Python code. It's a good complement to the tools I mentioned above.

Friday, March 28, 2008

Recommended testing conference: CAST 2008

If you're a tester and are serious about learning and advancing in your trade, I warmly recommend the CAST 2008 conference which will be held in Toronto, July 14-16. The theme of the conference is "Beyond the Boundaries: Interdisciplinary Approaches to Software Testing" and the keynote speaker is none other than Jerry Weinberg. And it's REALLY hard to get Jerry Weinberg to speak at a conference, so you might as well take advantage of this opportunity. For more details on CAST 2008, download the PDF brochure.

It's a good time to be a Python programmer

We had the SoCal Piggies meeting at the Disney Animation Studios last night. It was a great meeting -- great presentations from Disney engineers on how they use Python at Disney (and they use it A LOT!), great food, great turnout, and great atmosphere. Let me tell you -- the Disney Animation Studios are *lush*. Thanks to Paul Hildebrandt for organizing the meeting.

I'll probably blog separately about the technical content of the presentations, but for now I just wanted to comment on the fact that everybody seems to be hiring Python programmers -- Gorilla Nation and Virgin Charter are just two companies in the L.A. area that are aggressively looking to hire Python talent. Another thing: we used to have difficulties in finding venues for our meetings. We used to meet at either USC or Caltech, and around 10-12 people max. would show up. Now companies are clamoring for organizing the meetings at their offices, and we have 20-30 people in the audience, with many new faces at every meeting. Even more: Ruby on Rails programmers are showing up at our meetings, looking for an opportunity to be more involved with Python!

I take that as a sign that Python has arrived. It's a good time to be a Python programmer (or tester, for that matter.)

Tuesday, March 25, 2008

Easy parsing with pyparsing

If you haven't used Paul McGuire's pyparsing module yet, you've been missing out on a great tool. Whenever you hit a wall trying to parse text with regular expressions or string operations, 'think pyparsing'.

I had the need to parse a load balancer configuration file and save certain values in a database. Most of the stuff I needed was fairly easily obtainable with regular expressions or Python string operations. However, I was stumped when I encountered a line such as:


bind http "Customer Server 1" http "Customer Server 2" http

This line 'binds' a 'virtual server' port to one or more 'real servers' and their ports (I'm using here this particular load balancer's jargon, but the concepts are the same for all load balancers.)

The syntax is 'bind' followed by a word denoting the virtual server port, followed by one or more pairs of real server names and ports. The kicker is that the real server names can be either a single word containing no whitespace, or multiple words enclosed in double quotes.

Splitting the line by spaces or double quotes is not the solution in this case. I started out by rolling my own little algorithm and keeping track of where I am inside the string, then I realized that I'm actually writing my own parser at this point. Time to reach for pyparsing.

I won't go into the details of how to use pyparsing, since there is great documentation available (see Paul's PyCon06 presentation, the examples on the pyparsing site, and also Paul's O'Reilly Shortcut book). Basically you need to define your grammar for the expression you need to parse, then translate it into pyparsing-specific constructs. Because pyparsing's API is so intuitive and powerful, the translation process is straightforward.

Here's how I ended up implementing my pyparsing grammar:


from pyparsing import *

def parse_bind_line(line):
   quoted_real_server = dblQuotedString.setParseAction(removeQuotes)
   real_server = Word(alphas, printables) | quoted_real_server
   port = Word(alphanums)
   real_server_port = Group(real_server + port)
   bind_expr = Suppress(Literal("bind")) + \
               port + \
               OneOrMore(real_server_port)
   return bind_expr.parseString(line)

That's all there is to it. You need to read it from the bottom up to see how the expression gets decomposed into elements, and elements get decomposed into sub-elements.

I'll explain each line, starting with the last one before the return:


   bind_expr = Suppress(Literal("bind")) + \
               port + \
               OneOrMore(real_server_port)

A bind expression starts with the literal "bind", followed by a port, followed by one or more real server/port pairs. That's pretty much what the line above actually says, isn't it. The Suppress construct tells pyparsing that we're not interested in returning the literal "bind" in the final token list.


   real_server_port = Group(real_server + port)

A real server/port pair is simply a real server name followed by a port. The Group construct tells pyparsing that we want to group these 2 tokens in a list inside the final token list.


   port = Word(alphanums)

A port is a word composed of alphanumeric characters. In general, word means 'a sequence of characters containing no whitespace'. The 'alphanums' variable is a special pyparsing variable already containing the list of alphanumeric characters.


   real_server = Word(alphas, printables) | quoted_real_server

A real server is either a single word, or an expression in quotes. Note that we can declare a pyparsing Word with 2 arguments; the 1st argument specifies the allowed characters for the initial character of the word, whereas the 2nd argument specified the allowed characters for the body of the word. In this case, we're saying that we want a real server name to start with an alphabetical character, but other than that it can contain any printable character.


   quoted_real_server = dblQuotedString.setParseAction(removeQuotes)

Here is where you can glimpse the power of pyparsing. With this single statement we're parsing a sequence of words enclosed in double quotes, and we're saying that we're not interested in the quotes. There's also a sglQuotedString class for words enclosed in single quotes. Thanks to Paul for bringing this to my attention. My clumsy attempt at manually declaring a sequence of words enclosed in double quotes ran something like this:


no_quote_word = Word(alphanums+"-.")
quoted_real_server = Suppress(Literal("\"")) + \
                      OneOrMore(no_quote_word) + \
                      Suppress(Literal("\""))
quoted_real_server.setParseAction(lambda tokens: " ".join(tokens))

The only useful thing you can take away from this mumbo-jumbo is that you can associate an action with each token. When pyparsing will encounter that token, it will apply the action (function or class) you specified on that token. This is useful for doing validation of your tokens, for example for a date. Very powerful stuff.

Now it's time to test my function on a few strings:


if __name__ == "__main__":
   tests = """\
bind http "Customer Server 1" http "Customer Server 2" http
bind http "Customer Server - 11" 81 "Customer Server  12" 82
bind http www.mywebsite.com-server1 http www.mywebsite.com-server2 http
bind ssl www.mywebsite.com-server1 ssl www.mywebsite.com-server2 ssl
bind http TEST-server http
bind http MY-cluster-web11 83 MY-cluster-web-12 83
bind http cust1-server1.site.com http cust1-server2.site.com http
""".splitlines()

   for t in tests:
       print parse_bind_line(t)

Running the code above produces this output:


$ ./parse_bind.py
['http', ['Customer Server 1', 'http'], ['Customer Server 2', 'http']]
['http', ['Customer Server - 11', '81'], ['Customer Server  12', '82']]
['http', ['www.mywebsite.com-server1', 'http'], ['www.mywebsite.com-server2', 'http']]
['ssl', ['www.mywebsite.com-server1', 'ssl'], ['www.mywebsite.com-server2', 'ssl']]
['http', ['TEST-server', 'http']]
['http', ['MY-cluster-web11', '83'], ['MY-cluster-web-12', '83']]
['http', ['cust1-server1.site.com', 'http'], ['cust1-server2.site.com', 'http']]

From here, I was able to quickly identify for a given virtual server everything I needed: a virtual server port, and all the real server/port pairs associated with it. Inserting all this into a database was just another step. The hard work had already been done by pyparsing.

Once more, kudos to Paul McGuire for creating such an useful and fun tool.

Sunday, March 23, 2008

PyCon08 gets great coverage

Reports on the death of the PyCon conference as a community experience have been greatly exaggerated. I personally have never seen any PyCon edition as well covered in the blogs aggregated in Planet Python as the 2008 PyCon. If you don't believe me, maybe you'll believe Google Blog Search. I think the Python community is alive and well, and ready to rock at PyCon conferences for the foreseeable future. I'm looking forward to PyCon09 in Chicago, and then probably in San Francisco for 2010/11.

Wednesday, March 19, 2008

PyCon presenters, unite!

If you gave a talk at PyCon and haven't uploaded your slides to the official PyCon website yet, but you have posted them online somewhere else, please leave a comment to this post with the location of your slides. I'm helping Doug Napoleone upload the slides, since some authors have experienced issues when trying to upload the slides using their PyCon account. Thanks!

Tuesday, March 18, 2008

Links to resources from PyCon talks

I took some notes at the PyCon talks I've been to, and I'm gathering links to resources referenced in these talks. Hopefully they'll be useful to somebody (I know they will be to me at least.)

"MPI Cluster Programming with Python and Amazon EC2" by Pete Skomoroch

* slides in PDF format
* Message Passing Interface (MPI) modules for Python: mpi4py, pympi
* ElasticWulf project (Beowulf-like setup on Amazon EC2)
* IPython1: parallel computing in Python
* EC2 gotchas

"Like Switching on the Light: Managing an Elastic Compute Cluster with Python" by George Belotsky

* S3FS: mount S3 as a local file system using Fuse (unstable)
* EC2UI: Firefox extension for managing EC2 clusters
* S3 Organizer: Firefox extension for managing S3 storage
* bundling an EC2 AMI and storing it to S3
* the boto library, which allows programmatic manipulation of Amazon Web services such as EC2, S3, SimpleDB etc. (a python-boto package is available for most Linux distributions too; for example 'yum install python-boto)

* UML image repository

"PyTriton: building a petabyte storage system" by Jonathan Ellis

* All this was done at Mozy (online remote backup, now owned by EMC, just like Avamar, the company I used to work for)
* They maxed out Foundry load balancers, so they ended up using LVS + ipvsadm
* They used erasure coding for data integrity -- rolled their own algorithm but Jonathan recommended that people use zfec developed by AllMyData
* An alternative to erasure coding would be to use RAID6, which is used by Carbonite

"Use Google Spreadsheets API to create a database in the cloud" by Jeffrey Scudder

* slides online
* APIs and documentation on google code

"Supervisor as a platform" by Chris McDonough and Mike Naberezny

* slides online
* supervisord home page

"Managing complexity (and testing)" by Matt Harrison

* slides online
* PyMetrics module for measuring the McCabe complexity of your code
* coverage module and figleaf module for measuring your code coverage

Resources from lightning talks

* bug.gd -- online repository of solutions to bugs, backtraces, exceptions etc (you can easy_install bug.gd, then call error_help() after you get a traceback to try to get a solution)
* geopy -- geocode package
* pvote.org -- Ka-Ping Yee's electronic voting software in 460 lines of Python (see also Ping's PhD dissertation on the topic of Building Reliable Voting Machine Software)
* bitsyblog -- a minimalist approach to blog software

JP posted nose talk slides and code

If you attended Jason Pellerin's talk on the nose test framework at PyCon08, you'll be glad to know he just posted his slides and the sample app that shows how he writes and runs unit and functional tests under nose. I'm advertising this here because he only sent a message to the nose mailing list.

Monday, March 17, 2008

Slides and links from the Testing Tools tutorial

Here are my slides from the Testing Tools tutorial in PDF format. Not very informative I'm afraid -- I didn't actually show them to the attendees, I just talked about those topics while demo-ing them. If you want to find out more about the state of the Selenium project, watch this YouTube video of the Selenium Meetup at Google.

Here are some random thoughts on Selenium testing which I mentioned during the tutorial:

composing Selenium tests, especially for Ajax functionality, is HARD; the Selenium IDE helps a bit, but you still have to figure out how to wait for certain HTML elements to either appear or disappear from the page under test
version 1.0 of the Sel. IDE, soon to be released, will record Ajax actions by default, so hopefully this will speed up Selenium test creation
if you already have a Selenium Core test in HTML format, an easy way to obtain a Selenium RC test in Python is to open the HTML file in the Selenium IDE, then export the test case as Python; however, to actually make the resulting code readable/reusable, you have to do some pretty major refactoring
identifying HTML elements (or locators, as Selenium calls them) by their XPath value is hard, but it's sometimes the only way to get to them in order to assert something about them; I found tools such as XPath Checker, XPather and Firebug invaluable (they all happen to be Mozilla add-ons, but you can use Firefox to compose your tests, then run them in any browser supported by Selenium; however, YMMV especially when it comes to evaluating XPath expressions in IE)
because XPath locators are brittle in the face of constant HTML changes, please use HTML ID tags to identify your elements; I know at least one company (hi, Terry) where testers do not even start writing Selenium tests until developers have identified all elements of interest with an HTML ID

For people interested in FitNesse and on various acceptance testing topics, please see my blog posts in the Acceptance Testing and Web Application Testing sections of this page. If you are interested in how Titus and I tested the MailOnnaStick application, we have a whole wiki dedicated to this topic. Another resource of interest might be the Python Testing Tools Taxonomy wiki, with links to a myriad of Python testing tools.

I'll post soon on some topics that were discussed during the tutorial, especially on how to test an application against external interfaces or resources that are not under your control (think 'mocking').

Back from PyCon

PyCon08 is over. It's been an enjoyable experience, but a crowded one, with more than 1,000 people in attendance. The testing tutorial that Titus and I gave on Thursday went well. We tried to have it a bit more interactive than in the last 2 years, so we asked for people to send us their Web apps so we can test them 'in real time'. As it turned out, we only got back a handful of replies and almost no apps, but Christian Long sent me an app that I used to show some nifty Ajax testing with Selenium. We took a lot of questions from the audience on real problems they were facing, and I think we came up with some satisfactory answers/solutions. We need to think of what format we'll choose for next year (if any). Steve Holden and Doug Napoleone were happy with what they got out of the tutorial, Mike Pirnat was not because he had seen the same material last year. I think we did show some new techniques with the same tools that we've been showing for the last 3 years. But the overall content of the tutorial hasn't changed much. If you have any ideas of what you'd like to see next year, drop us a note.

If you read Planet Python or comp.lang.python, you know there's been a whirlwind of discussions around Bruce Eckel's "Pycon disappointment" post. My take on the commercialization of PyCon is that it hasn't been as bad as Bruce makes it sound. The vendor area was very isolated, in a corner room, and you could have easily missed it if you didn't see the people pouring out of there with all kinds of swag. And everybody was hiring! This is a good thing, people! Getting paid to work with your favorite programming language is a privilege that not many people are enjoying.

The other controversial aspect this year was the vendor involvement in the lightning talks. That was indeed highly annoying, and hopefully it won't happen again. I think the solution the organizers found last year -- separating the vendor-sponsored lightning talks from the regular ones -- worked pretty well. I didn't go to all the lightning talk sessions this year, but the one on Sunday was very enjoyable (once the sponsored ones were over). I liked the one on bitsyblog (a minimalist approach to building blog software), and the one by Martin v. Loewis on using Roundup for various non-bugtrack-related projects such as homework assignments. I also found out that Larry Hastings from Facebook is leading an effort to switch Facebook from PHP to a more enlightened language (you know which one), and also that slide.com, which offers what is apparently one of the most (if not THE most) popular Facebook apps, is using Python throughout its development process.

The technical talks were a mixed bag, just like at every other PyCon I participated in. No big surprise here, you can't make everybody happy no matter how hard you try (and the PyCon commitee tried hard, believe me). The critics have a point though, in that we need more advanced topics. Maybe we need a beginner track, an intermediate track and an advanced track. Also, 4 tracks is too much, I think 3 is a better number. My top 3 favorite talks were, in chronological order: "Supervisor as a platform" by Chris McDonough and Mike Naberezny, "Managing complexity (and testing)" by Matt Harrison, and "Introducing agile testing techniques to the OLPC project" by Titus Brown (I hope Titus will make good on his promise of putting together a screencast of the demo he showed, since it's mighty cool).

But the best part of PyCon is always meeting people. It's great to put a face to a name, especially when you're familiar with that person's work or blog. It's great to meet with people you met the previous years too. If nothing else, the socializing alone makes it worth attending PyCon. Think about it: where else would you meet Zed Shaw and have a chance to experience his colorful language and personality, and also hear him rant a bit about the ways in which Python sucks (although I hasten to add that he likes it a lot and he's starting to hack seriously in it). Can't wait to use one of Zed's first contributions to the Python world, a nifty utility called easy_f__ing_uninstall (feel free to fill in the blanks :-)

Whatever the critics say, I know I'll be back in Chicago next year for sure. I just want better network connectivity (why is it so hard to ensure decent wireless connectivity at PyCon year after year? it's a mystery) and better food.

Tuesday, March 04, 2008

Bright days for Python

Yes, the title is related to the news that were all over Planet Python yesterday, that Sun hired two prominent pythonistas (Ted Leung and Frank Wierzbicki) to work on Jython. Great news indeed. Jython seemed to lose steam and momentum as compared to other dynamic JVM languages (JRuby mainly), so it's very good to see that Sun is putting its weight behind it. Congrats to Ted and Frank.

Also, in other news, Greg Wilson just announced on his blog that he signed a contract with the Pragmatic Programmers to co-author a textbook for CS 101-type classes using Python as the programming language. Can't wait to see it in print!

Monday, March 03, 2008

Notes from the SoCal Piggies meeting on Feb. 28th

Here are the notes that I posted on the 'Happenings in Python Usergroups' blog.

Tuesday, February 12, 2008

Getting Things Done via your Inbox

I've been Getting Things Done long before it was cool to GTD. My method is simple: I keep my list of things to get done in my email Inbox. Once I get a thing done, I move it to a different folder, and I forget about it. This forces me to deal with incoming email at a very rapid pace. I either keep it in the Inbox because I know I'll work on it in the next few hours (or days), or I reply to it and I delete it, or I file it away, or I simply delete it. There are of course cases when my Inbox grows, but then I take drastic measures to reduce its size. I try to have no more than 25 messages in my Inbox at all times. OK, sometimes I have up to 50, but that's my limit.

Just thought I'd throw this out there in case it might help somebody in organizing their workload.

MS SQL Server is brain dead

Another post in the series "Why does this have to be so painful???"

I had to insert some data in a MS SQL Server database (don't ask, I just had to.) The same data was inserted into a MySQL database using a simple statement of the form:

INSERT INTO table1 (col1, col2, col3)
VALUES
(a,b,c),
(d,e,f),
(x,y,z);

I tried to do the same in MS SQL Server, and I was getting mysterious 'syntax error' messages, with no other explanation. In desperation, I only left one row of values in the statement, and boom, the syntax errors disappeared. I thought -- wait a minute, there's no way I can only insert one row at a time in MS SQL Server! But yes, that turned out to be the sad truth. I finally found a slightly better way to deal with multiple rows at a time in this blog post -- something extremely convoluted like this:

INSERT INTO table1 (col1, col2, col3)
SELECT
a, b, c
UNION ALL
SELECT
d, e, f
UNION ALL
SELECT
x, y, z

(be very careful not to end your statement with a UNION ALL)

What can I say, my post title says it all. Another thing I noticed while desperately googling around: there is precious little information about SQL Server on forums etc., in comparison with a huge wealth of info on MySQL. And the best information on SQL Server is on pay-if-you-want-to-see-the-answers forums such as experts-exchange.com. Not surprising, but sad.

Monday, February 11, 2008

Installing the TinyMCE spellchecker in SugarCRM

I had to go through the exercise of installing and configuring the TinyMCE spellchecker in a SugarCRM 5.0 installation today, so I'm blogging it here for future reference and google karma.

Here's my exact scenario:

- OS: CentOS 4.4 32-bit
- SugarCRM version: SugarCE-Full-5.0.0a (under /var/www/html/SugarCRM)

I first downloaded the TinyMCE spellchecker plugin v. 1.0.5 by following the link from the TinyMCE download page.

I unzipped tinymce_spellchecker_php_1_0_5.zip and copied the resulting spellchecker directory under my SugarCRM installation (in my case under /var/www/html/SugarCRM/include/javascript/tiny_mce/plugins).

I edited spellchecker/config.php and specified:


require_once("classes/TinyPspellShell.class.php"); // Command line pspell
$spellCheckerConfig['enabled'] = true;

The command-line pspell uses the binary /usr/bin/aspell, which in my case was already installed, but YMMV. If you want the embedded PHP spellchecker, you need to recompile PHP with the pspell module.

The main configuration of the layout of TinyMCE happens in /var/www/html/SugarCRM/include/SugarTinyMCE.php. I chose to add the spellchecker button in the line containing the definition of buttonConfig3:


var $buttonConfig3 = "tablecontrols,spellchecker,separator,advhr,hr,removeformat,separator,insertdate,inserttime,separator,preview";

I also added the spellchecker plugin in the same file, in the line containing the plugins definitions inside the defaultConfig definition:


'plugins' => 'advhr,insertdatetime,table,preview,paste,searchreplace,directionality,spellchecker',

I restarted Apache, just to make sure all the include files are re-read (not sure if it was necessary, but better safe than sorry), and lo and behold, the spellcheck button was there when I went to the Compose Email form. After entering some misspelled text, I clicked the spellcheck button. I was then able to right click the misspelled words and was offered a list of words to choose from.

If it sounds painless, rest assured, it wasn't. The documentation helped a bit, but it was sufficiently different from what I was seeing that I had to do some serious forensics-type searches within the SugarCRM file hierarchy. But all is well that ends well.

Saturday, February 09, 2008

Dilbert should get a life

I'm subscribed to the RSS feed for the daily Dilbert comic. I'm still enjoying it, but lately I've started to think -- why on Earth doesn't Dilbert just quit? Why do all the characters in that comic need to be so passive-aggressive about everything? Yes, I know it wouldn't be Dilbert without it, and Scott Adams wouldn't put so much bacon on the table either. But I can't help thinking that there's a big lie in it too, as if there's no escape from that shitty workplace. Dilbert dude, just don't go back there tomorrow! Let's see how your life turns around.

Update 2/12/08

Thanks to all who commented on my post. Many people seem to think, as I do, that Scott Adams is hypocritical and profits quite handsomely from the corporate culture he apparently despises. I also decided in the mean time to walk the walk, not only talk the talk, so I unsubscribed from the RSS feed.

Friday, February 08, 2008

And then there were 10

...where 10 is the current number of people who signed up for the Practical Applications of Agile (Web) Testing Tools tutorial that Titus and I will present at PyCon08. What this means is that the tutorial is in good standing, since each tutorial needs at least 10 attendees in order to actually happen. If you haven't signed up for a tutorial yet, check out ours -- it'll be fun.

Thursday, February 07, 2008

Got OpenID?

After reading this post from O'Reilly Radar on several big-name companies joining the OpenID Foundation, I decided to give OpenID a try myself. One of the providers recommended by the foundation was VeriSign, which sounded reputable to me, so I went ahead and got one from their Personal Identity Provider (PIP) site. Easy enough -- you choose a user name, a password, put your email in, then you get a so-called "OpenID access URL" of the form username.pip.verisignlabs.com.

To test my brand new OpenID, I went to one of the sites referenced on the PIP page, stikis.com. I chose the OpenID authentication method, entered the id I got from VeriSign, then I was redirected to the VeriSign PIP page, which asked me to specify for how long I want my trust relationship with stikis.com to last. I chose 1 year, clicked 'Allow', then was redirected back to the stikis.com site, already logged in to my personal start page. When I clicked on 'My Account' inside stikis.com, I saw my VeriSign access URL page at http://username.pip.verisignlabs.com. Pretty clean process. I hope more and more Web sites will accept OpenIDs. Since Google, Yahoo and Microsoft all joined the OpenID foundation, I'm pretty sure you'll be able to use an account you have with any of them as your OpenID (and indeed all of the above companies have announced that they'll be offering OpenIDs).

Monday, February 04, 2008

Podcasts from PyCon 2007 are available

Some podcasts from PyCon2007 are starting to trickle in on this page. You can listen for example to the Web Frameworks Panel and to the Testing Tools Panel.

James Shore's litmus test for a new language

Implement FIT. Sounds reasonable to me. After all, self-testing and acceptance testing should be a big part of a new and cool language. I'll make sure I ask Chuck Esterbrook this question about his Cobra language during our next SoCal Piggies meeting :-)

Saturday, February 02, 2008

Next SoCal Piggies meeting: Feb. 28th

Steve Wagner from Gorilla Nation graciously offered to host the next SoCal Piggies meeting at his company's offices in Culver City. We're still waiting for the full line-up of talks, but Chuck Esterbrook, of Webware fame, already offered to present an introduction to a new programming language he created, Cobra. Should be a lot of fun. To whet your appetite, Cobra is, in Chuck's words, "a new programming language with a Python-inspired syntax and first class support for unit tests and design-by-contracts."

So if you're in the area, please consider attending our meeting on Thursday, Feb. 28th, starting at 7 PM. Check out our wiki or the mailing list for more details.

Monday, January 28, 2008

How modern Web design is conducted

Via the 40. (with egg) blog, a time breakdown on how Web design is conducted in our day and time. Hmmm...maybe there's room for allocating some slice of that pie to testing, using Firebug for example. UPDATE: I meant debugging with Firebug and testing with twill and Selenium of course :-)

Friday, January 25, 2008

Checklist automation and testing

This is a follow-up to my previous post on writing automated tests for sysadmin-related checklists. That post seems to have struck a chord, judging by the comments it generated.

Here's the scenario I'm thinking about: you need to deploy a standardized set of packages and configurations to a bunch of servers. You put together a checklist detailing the steps you need to take on each server -- kickstart the box, run some post-install scripts, do some configuration customization, etc. At this point, you're already ahead of the game, and you're not relying solely on human memory. However, if you rely on a human being going manually through each step of the checklist on each server, you're in for some surprises in the guise of missed steps. The answer of course is to automate as many steps as you can, ideally all of them.

Now we're getting to the main point of my post: assuming you did automate all the steps of the checklist, and you ran your scripts on each server, do you REALLY have that warm and fuzzy feeling that everything is OK? You don't, unless you also have a comprehensive automated test suite that runs on every server and actually checks that stuff happened the way you intended.

Here are some concrete examples of stuff I'm verifying after the deployment of a certain type of our servers (running Apache/Tomcat).

OS-specific tests

* does the sudoers file contain certain users that need those rights
* is the sshd process set to start at boot time
* is the ClientAliveInterval variable set correctly in /etc/sshd/sshd_config
* are certain NFS mount points defined in /etc/fstab, and do they actually exist on the server
* is sendmail set to start at boot time, and running
* are iptables and/or SELinux configured the way they should be
* ....and more

Apache-specific tests

* is httpd set to start at boot time
* do the virtual host configuration files in /etc/httpd/conf.d contain the expected information
* has mod_jk been installed and configured properly (mod_jk provides the glue between Apache and Tomcat)
* is SSL configured properly
* does the /etc/logrotate.d/httpd configuration file contain the correct options (for example keep the logs for N days and compress them)
* etc.

Tomcat-specific tests

* has a specific version of Java been installed
* has Tomcat been installed in the correct directory, with the correct permissions
* has Tomcat been set to start at boot time
* etc.

Media-specific tests

* has ImageMagick been installed in the correct location
* does ImageMagick support certain file formats (JPG, PNG, etc)
* can ImageMagick actually process certain types of files (JPG, PNG, etc.)

Some of these tests could be run from a monitoring system (one of the commenters on my previous post mentioned that their sysadmins use Zimbrix; an Open Source alternative is Nagios, and there are many others.) However, a monitoring system typically doesn't go into the level of detail I described, especially when it comes to configuration files and other more advanced customizations. That's why I think it's important to use a real test framework and a real scripting language for this type of automated tests.

In my case, each type of test resides in its own file -- for example test_os.py, test_apache.py, test_tomcat.py, test_media.py. I run the tests using the nose test framework.

Here are some examples of small test functions. I'm using sets for making sure that expected lines are in certain files or in the output of certain commands, since most of the time I don't care about the order in which those lines appear.

From test_os.py:


def test_sshd_on():
    stdout, stdin = popen2.popen2('chkconfig sshd --list')
    lines = stdout.readlines()
    assert "sshd           \t0:off\t1:off\t2:on\t3:on\t4:on\t5:on\t6:off\n" in lines

From test_apache.py:


def test_logrotate_httpd():
    lines = open('/etc/logrotate.d/httpd').readlines()
    lines = set(lines)
    expected = set([
        "    rotate 100\n",
        "    compress\n",
        ])
    assert lines.issuperset(expected)

From test_tomcat.py:


def test_homedir():
    target_dir = '/opt/target'
    assert os.path.isdir(target_dir)
    (st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid, st_size, st_atime, st_mtime, st_ctime) = os.stat(target_dir)
    assert st_uid == TARGET_UID, 'User wrong for %s' % pathname
    assert st_gid == TARGET_GID, 'Group wrong for %s' % pathname

From test_media.py:


def test_ImageMagick_listformat():
    stdout, stdin = popen2.popen2(''/usr/local/bin/identify --list format'')
    lines = stdout.readlines()
    lines = set(lines)
    expected = set([
    "      JNG* PNG       rw-   JPEG Network Graphics\n",
    "     JPEG* JPEG      rw-   Joint Photographic Experts Group JFIF format (62)\n",
    "      JPG* JPEG      rw-   Joint Photographic Experts Group JFIF format\n",
    "    PJPEG* JPEG      rw-   Progessive Joint Photographic Experts Group JFIF\n",
    "      JNG* PNG       rw-   JPEG Network Graphics\n",
    "      MNG* PNG       rw+   Multiple-image Network Graphics (libpng 1.2.10)\n",
    "      PNG* PNG       rw-   Portable Network Graphics (libpng 1.2.10)\n",
    "    PNG24* PNG       rw-   24-bit RGB PNG, opaque only (zlib 1.2.3)\n",
    "    PNG32* PNG       rw-   32-bit RGBA PNG, semitransparency OK\n",
    "     PNG8* PNG       rw-   8-bit indexed PNG, binary transparency only\n",
        ])
    assert lines.issuperset(expected)

As always, comments and suggestions are very welcome! Also see Titus's post for some sysadmin-related automated tests that he's running on a regular basis.

Tuesday, January 22, 2008

Stay away from the AT&T Tilt phone

Why? Because it's extremely fragile. I got one a couple of weeks ago (sponsored by my company, otherwise I wouldn't have shelled out $399) and after just 2 days I found the screen cracked in 2 places. It's true I carried it in my jacket's pocket while driving, and it probably jammed against my leg or something, but I've done that with other phones and didn't have this issue.

Of course, calls to the store, AT&T warranty and the manufacturer were all fruitless. The store won't accept it back because it's not in 'like new' state, and AT&T's warranty doesn't cover cracks in the screen. My best option at this point is to send it to the manufacturer for repairs, which will run me another $190. For now, I'm just using it as is, but I just want to tell whoever bothers to read this that I'm not happy -- not with the Tilt, and not with AT&T's customer service. There. Take that, AT&T.

Joel on checklists

Another entertaining blog post from Joel Spolsky, this time on some issues they had with servers and networking equipment hosted at a data center in Manhattan. It all comes down to a network switch which had its ports configured to automatically negotiate their speed. As a result, one port was misbehaving and brough their whole web site down. The conclusion reached by Joel and his sysadmin team: we need documentation, we need checklists. I concur, but as I said in a recent post, this is still not enough. Human beings are notoriously prone to skipping tests on checklists. What Joel and his team really need are AUTOMATED TESTS that run periodically and check every single thing on those checklists. You can easily automate the step which verifies that a port on the switch is set to 100 Mbps or 1 Gbps; you can either use SNMP, or some expect-like script.

In fact, at my own company I'm developing a pretty extensive automated test suite (written in Python of course, and using nose) that verifies all the steps we go through whenever we deploy a server or a network device. It's very satisfying to see those dots and have a total of N passed tests and 0 failed tests, with N increasing daily. Automated tests for sysadmin tasks is an area little explored, so there's lots of potential for cool stuff to happen. If you're doing something similar and have ideas to share, please leave a comment.

Wednesday, January 16, 2008

MySQL has been assimilated

...by Sun, for $1 billion. Bummer. I shudder whenever I see companies at the forefront of Open Source being gobbled up by giants such as Sun. I still don't know what Sun's Open Source strategy is -- they've been going back and forth with their support for Linux, and they seem to be pushing Open Solaris pretty heavily these days, although I personally don't know anybody in the OSS community who is using Open Solaris. UPDATE: Tim O'Reilly thinks this is a great fit for both Sun and MySQL, and says that "Sun has staked its future on open source, releasing its formerly proprietary crown jewels, including Solaris, Java, and the Ultra-Sparc processor design." Hmmm...maybe, but Sun has always struck me as being bent on world domination, just as bad as Microsoft.

Update 01/18/08: Here's a really good recap on the MySQL acquisition at InfoQ. Most people express a warm fuzzy feeling about this whole thing. I hope my apprehensions are unfounded.

In other acquisition-related news, Oracle agreed to buy BEA (the makers of WebLogic) for a paltry $7.85 billion.

Friday, January 11, 2008

Looking to hire a MySQL DBA

If you're based in the Los Angeles area and are looking for a job as a MySQL DBA, send me an email at grig at gheorghiu dot net. RIS Technology, the Web hosting company I'm working for, is looking for somebody to administer MySQL databases -- things such as database design, replication, data migration, SQL query analysis and optimization. The position can be either contract-based or full time. Experience with PostgreSQL is a plus. Experience with Python is a huge plus :-)

Wednesday, January 02, 2008

Testing Tutorial accepted at PyCon08

"Practical Applications of Agile (Web) Testing Tools", the tutorial that Titus and I proposed to the PyCon08 organizers, has been accepted -- cool! Should be a lot of fun. The list of accepted tutorials looks really good.

Here's the summary of our tutorial

Practical Applications of Agile (Web) Testing Tools
---------------------------------------------------

Have Web site?  Need testing?  Bring your tired (code), huddled (unit
tests), and cranky AJAX to us; we'll help you come up with tactics,
techniques, and infrastructure to help solve your problems.  This
includes integration with a unit test runner (nose); use of coverage
analysis (figleaf); straight HTTP driver Web testing (twill); Web
recording, examination, and playback (scotch); Selenium and Selenium
RC test script development; and continuous integration (buildbot).
We will focus on techniques for automating your Web testing for quick
turnaround, i.e. "agile" test automation.

If you have an application that needs automated tests, and if you're planning to attend our tutorial, drop us a line or leave a comment here with some details about your application.

10 technologies that will change your future

...according to an article in the Sydney Morning Herald (found via the O'Reilly Radar). Personally, I just want a chumby.

What's with the rants?

Some stars may have been aligned in a particularly nasty way on December 31, 2007, which might explain some rants that were published on various blogs. One of them at least has the quality of being humorous in a scatological sort of way: 'Rails is a Ghetto' by Zed Shaw, the creator of Mongrel -- although I'm fairly sure it doesn't seem humorous to the people he names in his post; and BTW, make sure there are no kids around when you read that post. I'd like to meet Zed one day, he's an interesting character who sure wears his heart on his sleeve.

I didn't find much humor though in James Bennett's rant against a blog post written by Noah Gift. I did find many gratuitous insults and uncalled for name-calling. As somebody said already -- chill, James! The comments on James's post are also revealing in their variety. Good thing the Python community also contains people like Ian Bicking who are trying to inject some civility and sanity into this.

For the record, I agree with Noah that documentation and marketing are two extremely important driving forces in the adoption of any framework. RoR's success is in no small part due to documentation, flashy screencasts and tireless marketing. And I also agree that Zope and its descendants would be so much better off with more marketing.

Saturday, December 22, 2007

The power of checklists (especially when automated)

Just stumbled on this post at InfoQ on the power of checklists. It talks about a low-tech approach to improving care in hospitals, by writing down the steps needed in various medical procedures and putting together a checklist for each case. I've seen the power of this approach at my own company -- until we put together checklists with things we have to do when setting up various servers or applications, we were guaranteed to skip one or more small but important steps.

I'd like to take this approach up a notch though: if you're in the software business, you actually need to AUTOMATE your checklists. Otherwise it's still very easy for a human being to skip a step. Scripts don't usually make that mistake. Yes, a human being still needs to run the script and to make intelligent decisions about the overall outcome of its execution. If you do take this approach, make sure your scripts also have checks and balances embedded in them -- also known as tests. For example, if your script retrieves a file over the network with wget, make sure the file actually gets on your local file system. A simple 'ls' of the file will convince you that the operation succeeded.

As somebody else once said, the goal here is to replace you (the sysadmin or the developer) with a small script. That will free you up to do more fun work.

Wednesday, December 05, 2007

GHOP students ROCK!

I've been involved in the GHOP project for the last couple of weeks (although not as much as I'd have liked, due to time constraints) and I've been constantly amazed by the high quality of the work produced by the GHOP participants, who, let's not forget, are all still in high-school! I think all mentors were surprised at the speed with which the tasks were claimed, and at the level of proficiency showed by the students.

This bodes very well for Open Source in general, and for the Python community in particular. I hope that the students will continue to contribute to existing Python projects and start their own.

Here are some examples from tasks that I've been involved with:

Michael Kremer profiled effbot's widefinder implementations and discussed the results superbly
Eren Turkay wrote unit tests for the pydigg module, obtaining almost 90% code coverage; he's currently tackling a 2nd task, writing unit tests for SimpleXMLRPC
iammisc (not sure what the real name is) is writing an API for interacting with MySpace

Other students have already submitted patches that were accepted and applied to projects such as the stdlib logging module.

If you want to witness all this for yourself, and maybe get some help for your project from some really smart students, send an email with your proposal for tasks to the GHOP discussion list.

Thursday, November 29, 2007

Interview with Jerry Weinberg at Citerus

Read it here. As usual, Jerry Weinberg has many thought-provoking things to say. My favorite:

"Q: If you're the J.K Rowling of software development, who's Harry P then?

A: Well, first of all, I'm not a billionaire, so it's probably not correct to say I'm the J.K. Rowling of software development. But if I were, I suspect my Harry Potter would be a test manager, expected to do magic but discounted by software developers because "he's only a tester." As for Voldemort, I think he's any project manager who can't say "no" or hear what Harry is telling him."

Testers are finally redeemed :-)

Vonnegut's last interview

Via Tim Ferriss's blog, an inspiring interview with Kurt Vonnegut. BTW, if you haven't read Tim's '4-hour workweek' book, I highly recommend it. It will make you green with envy, but it will also offer you some good ideas about organizing your work and your life a bit differently.

Monday, November 12, 2007

PyCon'08 Testing Tutorial proposal

It's that time of the year, when the PyCon organizers are asking for talk, panel, and tutorial proposals. Titus and I are thinking about doing a three-peat of our Testing tutorial, but this time....with a twist. Read about it on Titus's blog; then send us your code/application that you'd like to test, or problems you have related to testing. Should be lots of fun.

Friday, October 19, 2007

Pybots updates

Time for the periodical update on the Pybots project. Since my last post in July, John Hampton added a buildslave running Gentoo on x86 and testing Trac and SQLAlchemy. A belated thank you to John.

I also had to disable the tests for bzr dev on my RH 9 buildslave, because for some reason they were leaving a lot of orphaned/zombie processes around.

With help from Jean-Paul Calderone from the Twisted team, we managed to get the Twisted buildslave (running RH 9) past some annoying multicast-related failures. Jean-Paul had me add an explicit iptables rule to allow multicast traffic. The rule is:

iptables -A INPUT -j ACCEPT -d 225.0.0.0/24

This seemed to have done the trick. There are some Twisted unit tests that still fail -- some of them are apparently due to the fact that raising string exceptions is now illegal in the Python trunk (2.6). Jean-Paul will investigate and I'll report on the findings -- after all, this type of issues is exactly why we set up the Pybots farm in the first place.

As usual, I end with a plea to people interested in running Pybots buidlslaves to either send a message to the mailing list, or contact me directly at grig at gheorghiu dot net.

Compiling mod_python on RHEL 64 bit

I just went through the fairly painful exercise of compiling mod_python 3.3.1 on a 64-bit RHEL 5 server. RHEL 5 ships with Python 2.4.3 and mod_python 3.2.8. I needed mod_python to be compiled against Python 2.5.1. I had already compiled and installed Python 2.5.1 from source into /usr/local/bin/python2.5. The version of Apache on that server is 2.2.3.

I first tried this:


# tar xvfz mod_python-3.3.1.tar.gz
# cd mod_python-3.3.1
# ./configure --with-apxs==/usr/sbin/apxs --with-python=/usr/local/bin/python2.5
# make

...at which point I got this ugly error:


/usr/lib64/apr-1/build/libtool --silent --mode=link gcc -o mod_python.la \
-rpath /usr/lib64/httpd/modules -module -avoid-version    finfoobject.lo \
hlistobject.lo hlist.lo filterobject.lo connobject.lo serverobject.lo util.lo \
tableobject.lo requestobject.lo _apachemodule.lo mod_python.lo\
-L/usr/local/lib/python2.5/config -Xlinker -export-dynamic -lm\
-lpython2.5 -lpthread -ldl -lutil -lm
/usr/bin/ld: /usr/local/lib/python2.5/config/libpython2.5.a(abstract.o):
relocation R_X86_64_32 against `a local symbol' can not be used when making a shared object;
recompile with -fPIC
/usr/local/lib/python2.5/config/libpython2.5.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
apxs:Error: Command failed with rc=65536

I googled around for a bit, and I found this answer courtesy of Martin von Loewis. To quote:

It complains that some object file of Python wasn't compiled
with -fPIC (position-independent code). This is a problem only if
a) you are linking a static library into a shared one (mod_python, in this case), and
b) the object files in the static library weren't compiled with -fPIC, and
c) the system doesn't support position-dependent code in a shared library

As you may have guessed by now, it is really c) which I
blame. On all other modern systems, linking non-PIC objects
into a shared library is supported (albeit sometimes with a
performance loss on startup).

So your options are
a) don't build a static libpython, instead, build Python
with --enable-shared. This will give you libpython24.so
which can then be linked "into" mod_python
b) manually add -fPIC to the list of compiler options when
building Python, by editing the Makefile after configure has run
c) find a way to overcome the platform limitation. E.g. on
Solaris, the linker supports an impure-text option which
instructs it to accept relocations in a shared library.

You might wish that the Python build process supported
option b), i.e. automatically adds -fPIC on Linux/AMD64.
IMO, this would be a bad choice, since -fPIC itself usually
causes a performance loss, and isn't needed when we link
libpython24.a into the interpreter (which is an executable,
not a shared library).

Therefore, I'll close this as "won't fix", and recommend to
go with solution a).

So I proceeded to reconfigure Python 2.5 via './configure --enable-shared', then the usual 'make; make install'. However, I hit another snag right away when trying to run the new python2.5 binary:


# /usr/local/bin/python
python: error while loading shared libraries: libpython2.5.so.1.0: cannot open shared object file: No such file or directory

I remembered from other issues I had similar to this that I have to include the path to libpython2.5.so.1.0 (which is /usr/local/lib) in a ldconfig configuration file.

I created /etc/ld.so.conf.d/python2.5.conf with the contents '/usr/local/lib' and I ran


# ldconfig

At this point, I was able to run the python2.5 binary successfully.

I then re-configured and compiled mod_python with


# ./configure --with-apxs=/usr/sbin/apxs --with-python=/usr/local/bin/python2.5
# make

Finally, I copied mod_python.so from mod_python-3.3.1/src/.libs to /etc/httpd/modules and restarted Apache.

Not a lot of fun, that's all I can say.

Update 10/23/07

To actually use mod_python, I had to also copy the directory mod_python-3.3.1/lib/python/mod_python to /usr/local/lib/python2.5/site-packages. Otherwise I would get lines like these in the apache error_log when trying to hit a mod_python-enabled location:

[Mon Oct 22 19:41:20 2007] [error] make_obcallback: \
could not import mod_python.apache.\n \
ImportError: No module named mod_python.apache
[Mon Oct 22 19:41:20 2007] [error] make_obcallback:
Python path being used \
"['/usr/local/lib/python2.5/site-packages/setuptools-0.6c6-py2.5.egg', \
'/usr/local/lib/python25.zip', '/usr/local/lib/python2.5', \
'/usr/local/lib/python2.5/plat-linux2', \
'/usr/local/lib/python2.5/lib-tk', \
'/usr/local/lib/python2.5/lib-dynload', '/usr/local/lib/python2.5/site-packages']".
[Mon Oct 22 19:41:20 2007] [error] get_interpreter: no interpreter callback found.

Update 01/29/08

I owe Graham Dumpleton (the creator of mod_python and mod_wsgi) an update to this post. As he added in the comments, instead of manually copying directories around, I could have simply said:

make install

and the installation would have properly updated the site-packages directory under the correct version of python (2.5 in my case) -- this is because I specified that version in the --with-python option of ./configure.

Another option for the installation, if you want to avoid copying the mod_python.so file in the Apache modules directory, and only want to copy the Python files in the site-packages directory, is:

make install_py_lib

Update 06/18/10

From Will Kessler:

"You might also want to add a little note though. The error message may actually be telling you that Python itself was not built with --enable-shared. To get mod_python-3.3.1 working you need to build Python with -fPIC (use enable-shared) as well."

Thursday, October 04, 2007

What's more important: TDD or acceptance testing?

The answer, as far as I'm concerned, is 'BOTH'. Read these entertaining blog posts to see why: Roy Osherove's JAOO conference writeup (his take on Martin Fowler's accent cracked me up), Martin Jul's take on the pull-no-punches discussions on TDD between Roy O. and Jim Coplien, and also Martin Jul's other blog post on why acceptance tests are important.

As I said before, holistic testing is the way to go.

Wednesday, September 26, 2007

Roy Osherove book on "The art of unit testing"

Just found out from Roy Osherove's blog that his book on "The Art of Unit Testing" is available for purchasing online -- well, the first 5 chapters are, but then you get the next as they're being published. Roy uses NUnit to illustrate unit testing concepts and techniques, but that shouldn't deter you from buying the book, because the principles are pretty much the same in all languages. I'm a long time reader of Roy's blog and I can say this is good stuff, judging by his past posts on unit testing and mock testing techniques.

Wednesday, September 19, 2007

Beware of timings in your tests

Finally I get to write a post about testing. Here's the scenario I had to troubleshoot yesterday: a client of ours has a Web app that uses a java applet for FTP transfers to a back-end server. The java applet presents a nice GUI to end-users, allowing them to drag and drop files from their local workstation to the server.

The problem was that some file transfers were failing in a mysterious way. We obviously looked at the network connectivity between the user reporting the problem initially and our data center, then we looked at the size of the files he was trying to transfer (he thought files over 10 MB were the culprit). We also looked at the number of files transferred, both multiple files in one operation and single files in consecutive operations. We tried transferring files using both a normal FTP client, and the java applet. Everything seemed to point in the direction of 'works for me' -- a stance well-known to testers around the world. All of a sudden, around an hour after I started using the java applet to transfer files, I got the error 'unable to upload one or more files', followed by the message 'software caused connection abort: software write error'. I thought OK, this may be due to web sessions timing out after an hour. I did some more testing, and the second time I got the error after half an hour. I also noticed that I let some time pass between transfers. This gave me the idea of investigating timeout setting on the FTP server side (which was running vsftpd). And lo and behold, here's what I found in the man page for vsftpd.conf:

idle_session_timeout
The timeout, in seconds, which is the maximum time a remote client may spend between FTP commands. If the timeout triggers, the remote client is kicked off.

Default: 300

My next step was of course to wait 5 minutes between file transfers, and sure enough, I got the 'unable to upload one or more files' error.

Lesson learned: pay close attention to the timing of your tests. Also look for timeout settings both on the client and on the server side, and write corner test cases accordingly.

In the end, it was by luck that I discovered the cause of the problems we had, but as Louis Pasteur said, "Chance favors the prepared mind". I'll surely be better prepared next time, timing-wise.

Thursday, September 13, 2007

Barack Obama is now a connection

That's the message I see on my LinkedIn home page. How could this be possible, you ask? Well, yesterday I checked out my home page, and I noticed the 'featured question of the day' asked by Barack Obama himself (of course, the question was "how can the next president better help small businesses and entrepreneurs thrive".) A co-worker decided to send a LinkedIn invite to Barack. A little while later, he got the acceptance in his inbox. I followed his example, just for fun, and what do you know, I got back the acceptance in a matter of SECONDS, not even minutes! It seems that B.O. has set his LinkedIn account to accept each and every invite he gets. I guess when you're running for president, every little statistic counts. He already has 500+ connections, and I'm sure the time will come when he'll brag to the other candidates that his LinkedIn account is bigger than theirs.

The bottom line is that YOU TOO can have Barack as your connection, if only to brag to your friends about it.

Wednesday, September 12, 2007

I must be bored out of my mind

...otherwise why would I have taken the time to get my nerd score?

NerdTests.com says I'm an Uber Cool High Nerd. What are you? Click here!

Now YOU do it!

Thursday, September 06, 2007

Security testing book review on Dr. Dobbs site

I wrote a review for "The Art of Security Testing" a while ago for Dr. Dobbs. I found out only now that it's online at the Dr. Dobbs's Portal site. Read it here.

Wednesday, September 05, 2007

Weinberg on Agile

A short but sweet PM Boulevard interview with Jerry Weinberg on Agile management/methods. Of course, he says we need to drop the A and actually drop 'agile' altogether at some point, and just talk about "normal, sensible, professional methods of developing software." Count me in.

Tuesday, September 04, 2007

Jakob Nielsen on fancy formatting and fancy words

Just received the latest Alertbox newsletter from Jakob Nielsen. The topic is "Fancy Formatting, Fancy Words = Ignored". I'd have put 2 equal signs in there, but anyway....The 'ignored' in question is your web site, if you're trying to draw attention to important facts/figures by using red bold letters and pompous language. Nielsen's case study in the article is the U.S. Census Bureau's homepage, which displayed the current population of the US in big red bold letters, and called it "Population clock". As a result, users were confused as to the meaning of that number, and what's more, they didn't bother to even read the full number, because they thought it's an ad of some sort. Interesting stuff.

Friday, August 24, 2007

Some notes from the August SoCal Piggies meeting

Read them here.

Put your Noonhat on

You may have seen this already, but here's another short blurb from me: Brian Dorsey, a familiar face to those of you who have been at the last 2 or 3 PyCon conferences, has launched a Django-based Web site he called Noonhat. The tagline says it all: "Have a great lunch conversation". It's a simple but original, fun and hopefully viral idea: you specify your location on a map, then you indicate your availability for lunch, and Noonhat puts you in touch with other users who have signed up and are up for lunch in your area at that time.

This has potential not only for single people trying to find a date, but also for anybody who's unafraid of stepping out of their comfort zone and strike interesting conversations over lunch. Brian and his site have already been featured on a variety of blogs and even in mainstream media around Seattle. Check out the Noonhat blog for more details.

Well done, Brian, and may your site prosper (of course, the ultimate in prosperity is being bought by Google :-)

Tuesday, August 21, 2007

Fuzzing in Python

I just bought "Fuzzing: Brute Force Vulnerability Discovery" and skimmed it a bit. I was pleasantly surprised to see that Python is the language of choice for many fuzzing tools, and clearly the favorite language of the authors, since they implemented many of their tools in Python. See the fuzzing.org site/blog also, especially the Fuzzing software page. Sulley in particular seems a very powerful fuzzing framework. I need to look more into it (so much cool stuff, so little time.)

Update: got through the first 5-6 chapters of the book. Highly entertaining and educational so far.

Wednesday, August 08, 2007

Werner Vogels talk at QCon

Werner Vogels is the CTO of Amazon. You can watch a talk he gave at the QCon conference on the topics of Availability and Consistency. The bottom line is that, as systems scale (and for amazon.com that means hundreds of thousands of systems), you have to pick 2 of the following 3: Consistency, Availability, Partitioning (actually the full name of the third one is "Tolerance to network partitioning.) This is called the CAP theorem, and Eric Brewer from Inktomi first came up with it.

Vogels pretty much equated partitioning with failure. Failure is inevitable, so you have to choose it out of those 3 properties. You're left with a choice between consistency and availability, or between ACID and BASE. According to Vogels, it turns out there's also a middle-of-the-road approach, where you choose a specific approach based on the needs of a particular service. He gave the example of the checkout process on amazon.com. When customers want to add items to their shopping cart, you ALWAYS want to honor that request (obviously because that's $$$ in the bank for you). So you choose high availability, and you hide errors from the customers in the hope that the system will sort out the errors at a later stage. When the customer hits the 'Submit order' button, you want high consistency for the next phase, because several sub-systems access that data at the same time (credit card processing, shipping and handling, reporting, etc.).

I also liked the approach Amazon takes when splitting people into teams. They have the 2-pizza rule: if it takes more than 2 pizzas to feed a team, it means the team is too large and needs to be split up. This equates to about 8 people per team. They actually make architectural decisions based on team size. If a feature is deemed to large to be comprehended by a team of 8 people, they split the feature into smaller pieces that can be digested more easily. Very agile approach :-)

Anyway, good presentation, highly recommended.

Tuesday, August 07, 2007

Automating tasks with pexpect

I started to use pexpect for some of the automation needs I have, especially for tasks that involve logging into a remote device and running commands there. I found the module extremely easy to use, and the documentation on the module's home page is very good. Basically, if you follow the recipe shown there for logging into an FTP server, you're set.

A couple of caveats I discovered so far:

make sure you specify correctly the text you expect back; even an extra space can be costly, and make your script wait forever; you can add '.*' to the beginning or to the end of the text you're expecting to make sure you're catching unexpected characters
if you want to print the output from the other side of the connection, use child.before (where child is the process spawned by pexpect)

Here's a complete script for logging into a load balancer and showing information about a load balanced server and its real servers:


#!/usr/bin/env python

import pexpect

def show_virtual(child, virtual):
    child.sendline ('show server virtual %s' % virtual)
    child.expect('SSH@MyLoadBalancer>')
    print child.before

def show_real(child, real):
    child.sendline ('show server real %s' % real)
    child.expect('SSH@MyLoadBalancer>')
    print child.before

virtuals = ['www.mysite.com']
reals = ['web01', 'web02']

child = pexpect.spawn ('ssh myadmin@myloadbalancer')
child.expect ('.* password:')
child.sendline ('mypassword')
child.expect ('SSH@MyLoadBalancer>')

for virtual in virtuals:
    show_virtual(child, virtual)

for real in reals:
    show_real(child, real)

child.sendline ('exit')

Think twice before working from a Starbucks

Here's an eye-opening article talking about a tool called Hamster that sniffs wireless traffic and reveals plain-text cookies which can then be used to impersonate users. The guy running the tool was able to log in into some poor soul's Gmail account during a BlackHat presentation.

Pretty scary, and it makes me think twice before firing up my laptop in a public wireless hotspot. The people who wrote Hamster, from Errata Security, already released another tool called Ferret, which intercepts juicy bits of information -- they call it 'information seepage'. You can see a presentation on Ferret here. They're supposed to release Hamster into the wild any day now.

Update: If the above wasn't enough to scare you, here's another set of wireless hacking tools called Karma (see the presentation appropriately called "All your layers are belong to us".)

Thursday, August 02, 2007

That's what I call system testing

According to news.com, the IT systems for the 2008 Olympics in Beijing will be put through rigorous testing which will take more than 1 year! The people at Atos Origin, the company in charge of setting up the IT for the 2008 Olympics, clearly know what they are doing.

It's also interesting that the article mentions insiders as a security threat -- namely, that insiders will try to print their own accreditation badges, or do it for their friends, etc. As always, the human factor is the hardest to deal with. They say they resort to extensive background checks for the 2,500 or so IT volunteers, but I somehow doubt that will be enough.

Tuesday, July 31, 2007

For your summer reading list: book on Continuous Integration

I found out about this book from the InfoQ blog -- the book is called Continuous Integration: Improving Software Quality and Reducing Risk and it is written by 3 guys from Stelligent, who also blog regularly on testearly.com. Seems like a very interesting and timely read for people interested in automated testing and obviously in continuous integration (which to me are the 2 first stepping stones on the path to 'agile testing'). You can also read a chapter from the book in PDF format: "Continuous testing".

Monday, July 30, 2007

Notes from the SoCal Piggies meeting

Just published the notes from the SoCal Piggies meeting we had last week on the "Happenings in Python Usergroups" blog. Keywords: jabber, xmpppy, orbited, comet.

Saturday, July 28, 2007

Dilbert, the PHB, and automated tests

Today's Dilbert cartoon shows that even the PHB can think "agile". He tells Dilbert to go write his own automated test software, instead of buying off-the-shelf. That's got to be agile, with small "a" :-) Of course, it's not recommended to call your team members "big babies" during the stand-up meeting.

Friday, July 27, 2007

Your purpose is the Python group

At our last SoCal Piggies meeting 2 days ago, Diane Trout showed us some Jabber bots, one of them based on PyAIML, an Eliza/AI kind of bot. When Diane asked this awfully intelligent little bot to smile for the Python group, this is what it replied:

How did it guess??? I used to not be a big believer in AI, but now I'm sold.

Pybots updates

After a long hibernation period, the Pybots project shows some signs of life -- I should probably say mixed with signs of death. Elliot Murphy from Canonical added the Storm ORM project to his AMD64 Ubuntu Gutsy buildslave, while Manuzhai and Jeff McNeil had to drop their buildslaves out of the mix, hopefully only temporarily. In Manuzhai's case though, the project he was testing -- Trac -- proved to have maintainers that were not interested in fixing their failing tests. In this case, there is no point in testing that project in Pybots. Hopefully Manuzhai will find a different, more test-infected project to run in Pybots.

Speaking of test-infected projects, it was nice to see unit testing topping the list of topics in the Django tutorial given at OSCON by Jeremy Dunck, Jacob Kaplan-Moss and Simon Willison. In fact, Titus is quoted too on this slide, which seems to be destined to be a classic (and I'm proud he uttered those words during the Testing Tools Panel that I moderated at PyCon07). Way to go, Titus, but I'd really like to see some T-shirts sporting that quote :-)

Saturday, July 07, 2007

Another Django success story

Pownce is yet another social networking site, but with the added twist that the creator of digg is one of its founders. Read about the technologies used to build it (Django included) here.