Thursday, May 08, 2008
...have been posted to the "Happenings in Python User groups" blog.
Monday, May 05, 2008
Guido open sources Code Review app running on GAPE
Not sure why this wasn't publicized more, but Guido van Rossum announced today that he open sourced the code for Code Review, a Google AppEngine app he released last week. Code Review is based on Mondrian, the internal code review tool that Guido wrote for Google. The relationship between the two apps in terms of features is: Code Review < Mondrian.
The code for Code Review is part of a Google code project called Rietveld. I haven't looked at it yet, but I'll certainly do so soon, just to see the master's view on how to write a GAPE application.
The code for Code Review is part of a Google code project called Rietveld. I haven't looked at it yet, but I'll certainly do so soon, just to see the master's view on how to write a GAPE application.
Ruby to Python bytecode compiler
Kumar beat me to it, but I'll mention it here too: Why the Lucky Stiff published a Ruby-to-Python-bytecode compiler, as well as tools to decompile the byte code into source code. According to the README file, he based his work on blog posts by Ned Batchelder related to dissecting Python bytecode. I wholeheartedly agree with Why's comment at the end of the README file:
You know, it's crazy that Python
and Ruby fans find themselves
battling so much. While syntax
is different, this exercise
proves how close they are to
each other! And, yes, I like
Ruby's syntax and can think much
better in it, but it would be
nice to share libs with Python
folk and not have to wait forever
for a mythical VM that runs all
possible languages.
Tuesday, April 29, 2008
Special guest for next SoCal Piggies meeting
We'll have the SoCal Piggies meeting this Thursday May 1st at the Gorilla Nation office in Culver City. Our special guest will be Ben Bangert, the creator of Pylons, who will give us an introduction to his framework. We'll also have a presentation from Pablo Noego from Gorilla Nation on a chat application he wrote using Google App Engine. We'll probably also have an informal discussion on Python mock testing tools and techniques.
BTW, I am putting together a Google code project for mock testing techniques in Python, in preparation for a presentation I would like to give to the group at some point. I called the project moctep, in honor of that ancient Egyptian deity, the protector of testers (or mockers, or maybe both). It doesn't have much so far, but there's some sample code you can browse through in the svn repository if you're curious. I'll be adding more meat to it soon.
Anyway, if you're a Pythonista who happens to be in the L.A. area on Thursday, please consider attending our meeting. It will be lots of fun, guaranteed.
BTW, I am putting together a Google code project for mock testing techniques in Python, in preparation for a presentation I would like to give to the group at some point. I called the project moctep, in honor of that ancient Egyptian deity, the protector of testers (or mockers, or maybe both). It doesn't have much so far, but there's some sample code you can browse through in the svn repository if you're curious. I'll be adding more meat to it soon.
Anyway, if you're a Pythonista who happens to be in the L.A. area on Thursday, please consider attending our meeting. It will be lots of fun, guaranteed.
Tuesday, April 22, 2008
"OLPC Automated Testing" project accepted for SoC
I'm happy to say that Zach Riggle's application for this year's Google Summer of Code, "OLPC Project Automated Testing", was accepted. I'm looking forward to mentoring Zach, and having Titus as a backup mentor. There's some very cool stuff that can be done in this area, and I hope that at the end of the summer we'll have some solid automated testing techniques and tools that can be applied to any Python project, not only to the OLPC Sugar environment. Stay tuned for more info on this project. BTW, here is the list of PSF-sponsored applications accepted for this years' SoC.
Thursday, April 17, 2008
Come work for RIS Technology
We just posted this on craigslist, but it never hurts to blog about it too. If you're interested, send an email to techjobs at ristech.net. You and I might get to work together on the same team!
Open Source Tech Top Guns Wanted
Are you a passionate Linux user? Are you running the latest Ubuntu alpha release on your laptop just because you can? Are you wired to the latest technologies -- things like Amazon EC2/S3 and Google AppEngine? Are you a virtuoso when it comes to virtualization (Xen/VMWare)?
Do you program in Python? Do you take hard problems as personal challenges and don't give up until you solve them?
RIS Technology Inc. is a rapidly growing Los Angeles-based premium managed hosting provider that hosts and manages internet applications for medium to large size organizations nationwide. We have grown consistently at 100% each of the past four years and are currently hiring for additional growth at our corporate operations center near LAX, in Los Angeles, CA. We have immediate openings for dedicated and knowledgeable technology engineers. If the answer to the questions above is YES, then we'd like to extend an invitation to interview with us.
We are an equal opportunity employer and have excellent benefits. We realize that one of the main things that makes us excellent are the people we choose to work with. We look for the best and brightest and our goal is to make work less "work" and more fun.
Open Source Tech Top Guns Wanted
Are you a passionate Linux user? Are you running the latest Ubuntu alpha release on your laptop just because you can? Are you wired to the latest technologies -- things like Amazon EC2/S3 and Google AppEngine? Are you a virtuoso when it comes to virtualization (Xen/VMWare)?
Do you program in Python? Do you take hard problems as personal challenges and don't give up until you solve them?
RIS Technology Inc. is a rapidly growing Los Angeles-based premium managed hosting provider that hosts and manages internet applications for medium to large size organizations nationwide. We have grown consistently at 100% each of the past four years and are currently hiring for additional growth at our corporate operations center near LAX, in Los Angeles, CA. We have immediate openings for dedicated and knowledgeable technology engineers. If the answer to the questions above is YES, then we'd like to extend an invitation to interview with us.
We are an equal opportunity employer and have excellent benefits. We realize that one of the main things that makes us excellent are the people we choose to work with. We look for the best and brightest and our goal is to make work less "work" and more fun.
Wednesday, April 16, 2008
Google App Engine feels constrictive
I've been toying a bit with Google App Engine. I was lucky enough to score one of the 10,000 developer accounts. I first went through their tutorial, which was fine. Then I tried to port a simple application that I used to run from the command line, which queried a range of IP addresses for their reverse DNS names. No luck. I was using the dnspython module, which in turn uses the Python socket module -- and socket is not available within the Google App Engine sandbox environment.
Also, I was talking to MichaĆ on rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... It seems that with everything I've tried with GAE I've run into a wall so far. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.
What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.
Also, I was talking to MichaĆ on rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... It seems that with everything I've tried with GAE I've run into a wall so far. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.
What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.
Friday, April 11, 2008
Ubuntu Gutsy woes with Intel 801 graphics card
I just upgraded my Dell Inspiron 6000 laptop to Ubuntu Gutsy last night. My graphics card is based on the Intel 810 chipset. After the upgrade, everything graphics-related was dog-slow. Scrolling in Firefox was choppy, IM-ing was choppy, even typing at the console was choppy. Surprisingly, I didn't find a lot of solutions to this problem. But many people on Ubuntu forums suggested disabling compiz/xgl, so that's what I ended up doing. In fact, I uninstalled all compiz and xgl-related packages, rebooted, and graphics became snappy again. Now back to trying to write an application to run on THE GOOGLE.
Thursday, April 10, 2008
Meme du jour: shell history
Here's mine from my Ubuntu laptop:
$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}' |sort -rn|head
121 cd
91 ssh
82 ls
46 vi
28 python
26 scp
16 dig
12 more
7 twistd
6 rm
Thursday, April 03, 2008
Steve Loughran on 'Farms, Fabrics and Clouds'
Yesterday I and my colleagues at RIS Technology had the pleasure of attending a remote presentation given to us by Steve Loughran, who works as a researcher at HP Labs and is also a committer on the Ant project. I had seen Steve's slides from a presentation he gave at the University of Bristol on 'Farms, Fabrics and Clouds' back in December 2007, and I have been pestering him via email ever since, hoping to have him release a screencast. After much back and forth, Steve offered to simply present for now directly to us via Skype. He did it out of the goodness of his heart, but both he and I realized that there's a nice little business opportunity in this type of presentation: you release the slides with no audio, then you get hired to present to interested parties in person, remotely, via Skype and a shared set of slides, with a Q&A session at the end. Everybody wins in this scenario. Filing it in the 'ideas worth trying' category.
To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.
Steve talked about how the infrastructure architectures have changed over the years from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running to their maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game by their S3 and EC2 offerings. Even though they're not quite ready for prime time in terms of production deployments, Amazon will soon get there. As a proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.
In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in our day and time:
I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.
Here are a couple more links that might prove useful:
Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.
To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.
Steve talked about how the infrastructure architectures have changed over the years from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running to their maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game by their S3 and EC2 offerings. Even though they're not quite ready for prime time in terms of production deployments, Amazon will soon get there. As a proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.
In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in our day and time:
- it is expensive to create, deploy and duplicate a new system, running a Linux image of your choice (see Instalinux as a counter-example)
- system failure is unusal and 100% availability can be achieved
- databases are the best form of storage
- you need physical access to the data center
- a single server farm needs to scale to infinity
I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.
Here are a couple more links that might prove useful:
- Anubis: a tuple-space implementation that uses multicast to share information between hosts within a site
- SmartFrog: a technology from HP used to distribute and manage applications (think puppet but geared towards application deployment); see also Google video
Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.
Wednesday, April 02, 2008
For you students interested in GSoC
If you're a student and you want to apply for a Python-related project for Google Summer of Code 2008, Matt Harrison has just the project for you. The project has to do with branch coverage analysis and reporting. Matt is willing to mentor too. It's a really good opportunity, so don't hesitate to apply. Hurry up though, the deadline is April 8th.
Tuesday, April 01, 2008
TurboGears and Pylons finally merging
This has been a long time coming, and fans of both projects have been eagerly waiting for it, but it's finally happened. Not sure if you've seen the announcements from Kevin Dangoor, Mark Ramm and Ben Bangert on their projects' mailing lists, but basically they boil down to "we feel like after the sprints at PyCon we made enough progress so that we can pull the trigger on merging the source code from the 2 projects in one common trunk." They make it sound like it was purely a technological problem, but I have my doubts about that. I think it was driven in part by the increasing popularity of Django. Unifying TurboGears and Pylons is a somewhat desperate measure to chip away at the Django market share. We'll see if it works or not. Check out the brand new page of the TurboPylons project.
Monday, March 31, 2008
ReviewBoard: open source code review tool
Via Marc Hedlund's post on O'Reilly Radar, here's an open source code review tool from VMWare: ReviewBoard. For all of us non-googlers out there, it's probably the next best thing to Guido's Mondrian (question: why has that tool not been released as open source?). Check out the sweet screenshots. The kicker though is that it uses Python and Django. Way to go, VMWare!
Python code complexity metrics and tools
There's a buzz in the air around code complexity, metrics, code coverage, etc. It started with Matt Harrison's PyCon presentation, then Ned Batchelder jumped in with a nice McCabe cyclomatic complexity computation/visualization tool, and now David Stanek posted about his pygenie tool -- which also measures the McCabe cyclomatic complexity of Python code. Now it's time to unify all these ideas in one powerful tool that computes not only complexity but also path or at least branch coverage. This would make a nice Google Summer of Code project. Too bad the deadline for 2008 GSoC applications is in 7 hours...Maybe for next year.
Update: David Goodger left a comment pointing me to Martin Blais's snakefood package, which computes and shows dependencies for your Python code. It's a good complement to the tools I mentioned above.
Update: David Goodger left a comment pointing me to Martin Blais's snakefood package, which computes and shows dependencies for your Python code. It's a good complement to the tools I mentioned above.
Friday, March 28, 2008
Recommended testing conference: CAST 2008
If you're a tester and are serious about learning and advancing in your trade, I warmly recommend the CAST 2008 conference which will be held in Toronto, July 14-16. The theme of the conference is "Beyond the Boundaries: Interdisciplinary Approaches to Software Testing" and the keynote speaker is none other than Jerry Weinberg. And it's REALLY hard to get Jerry Weinberg to speak at a conference, so you might as well take advantage of this opportunity. For more details on CAST 2008, download the PDF brochure.
It's a good time to be a Python programmer
We had the SoCal Piggies meeting at the Disney Animation Studios last night. It was a great meeting -- great presentations from Disney engineers on how they use Python at Disney (and they use it A LOT!), great food, great turnout, and great atmosphere. Let me tell you -- the Disney Animation Studios are *lush*. Thanks to Paul Hildebrandt for organizing the meeting.
I'll probably blog separately about the technical content of the presentations, but for now I just wanted to comment on the fact that everybody seems to be hiring Python programmers -- Gorilla Nation and Virgin Charter are just two companies in the L.A. area that are aggressively looking to hire Python talent. Another thing: we used to have difficulties in finding venues for our meetings. We used to meet at either USC or Caltech, and around 10-12 people max. would show up. Now companies are clamoring for organizing the meetings at their offices, and we have 20-30 people in the audience, with many new faces at every meeting. Even more: Ruby on Rails programmers are showing up at our meetings, looking for an opportunity to be more involved with Python!
I take that as a sign that Python has arrived. It's a good time to be a Python programmer (or tester, for that matter.)
I'll probably blog separately about the technical content of the presentations, but for now I just wanted to comment on the fact that everybody seems to be hiring Python programmers -- Gorilla Nation and Virgin Charter are just two companies in the L.A. area that are aggressively looking to hire Python talent. Another thing: we used to have difficulties in finding venues for our meetings. We used to meet at either USC or Caltech, and around 10-12 people max. would show up. Now companies are clamoring for organizing the meetings at their offices, and we have 20-30 people in the audience, with many new faces at every meeting. Even more: Ruby on Rails programmers are showing up at our meetings, looking for an opportunity to be more involved with Python!
I take that as a sign that Python has arrived. It's a good time to be a Python programmer (or tester, for that matter.)
Tuesday, March 25, 2008
Easy parsing with pyparsing
If you haven't used Paul McGuire's pyparsing module yet, you've been missing out on a great tool. Whenever you hit a wall trying to parse text with regular expressions or string operations, 'think pyparsing'.
I had the need to parse a load balancer configuration file and save certain values in a database. Most of the stuff I needed was fairly easily obtainable with regular expressions or Python string operations. However, I was stumped when I encountered a line such as:
This line 'binds' a 'virtual server' port to one or more 'real servers' and their ports (I'm using here this particular load balancer's jargon, but the concepts are the same for all load balancers.)
The syntax is 'bind' followed by a word denoting the virtual server port, followed by one or more pairs of real server names and ports. The kicker is that the real server names can be either a single word containing no whitespace, or multiple words enclosed in double quotes.
Splitting the line by spaces or double quotes is not the solution in this case. I started out by rolling my own little algorithm and keeping track of where I am inside the string, then I realized that I'm actually writing my own parser at this point. Time to reach for pyparsing.
I won't go into the details of how to use pyparsing, since there is great documentation available (see Paul's PyCon06 presentation, the examples on the pyparsing site, and also Paul's O'Reilly Shortcut book). Basically you need to define your grammar for the expression you need to parse, then translate it into pyparsing-specific constructs. Because pyparsing's API is so intuitive and powerful, the translation process is straightforward.
Here's how I ended up implementing my pyparsing grammar:
That's all there is to it. You need to read it from the bottom up to see how the expression gets decomposed into elements, and elements get decomposed into sub-elements.
I'll explain each line, starting with the last one before the return:
A bind expression starts with the literal "bind", followed by a port, followed by one or more real server/port pairs. That's pretty much what the line above actually says, isn't it. The Suppress construct tells pyparsing that we're not interested in returning the literal "bind" in the final token list.
A real server/port pair is simply a real server name followed by a port. The Group construct tells pyparsing that we want to group these 2 tokens in a list inside the final token list.
A port is a word composed of alphanumeric characters. In general, word means 'a sequence of characters containing no whitespace'. The 'alphanums' variable is a special pyparsing variable already containing the list of alphanumeric characters.
A real server is either a single word, or an expression in quotes. Note that we can declare a pyparsing Word with 2 arguments; the 1st argument specifies the allowed characters for the initial character of the word, whereas the 2nd argument specified the allowed characters for the body of the word. In this case, we're saying that we want a real server name to start with an alphabetical character, but other than that it can contain any printable character.
Here is where you can glimpse the power of pyparsing. With this single statement we're parsing a sequence of words enclosed in double quotes, and we're saying that we're not interested in the quotes. There's also a sglQuotedString class for words enclosed in single quotes. Thanks to Paul for bringing this to my attention. My clumsy attempt at manually declaring a sequence of words enclosed in double quotes ran something like this:
The only useful thing you can take away from this mumbo-jumbo is that you can associate an action with each token. When pyparsing will encounter that token, it will apply the action (function or class) you specified on that token. This is useful for doing validation of your tokens, for example for a date. Very powerful stuff.
Now it's time to test my function on a few strings:
Running the code above produces this output:
From here, I was able to quickly identify for a given virtual server everything I needed: a virtual server port, and all the real server/port pairs associated with it. Inserting all this into a database was just another step. The hard work had already been done by pyparsing.
Once more, kudos to Paul McGuire for creating such an useful and fun tool.
I had the need to parse a load balancer configuration file and save certain values in a database. Most of the stuff I needed was fairly easily obtainable with regular expressions or Python string operations. However, I was stumped when I encountered a line such as:
bind http "Customer Server 1" http "Customer Server 2" http
This line 'binds' a 'virtual server' port to one or more 'real servers' and their ports (I'm using here this particular load balancer's jargon, but the concepts are the same for all load balancers.)
The syntax is 'bind' followed by a word denoting the virtual server port, followed by one or more pairs of real server names and ports. The kicker is that the real server names can be either a single word containing no whitespace, or multiple words enclosed in double quotes.
Splitting the line by spaces or double quotes is not the solution in this case. I started out by rolling my own little algorithm and keeping track of where I am inside the string, then I realized that I'm actually writing my own parser at this point. Time to reach for pyparsing.
I won't go into the details of how to use pyparsing, since there is great documentation available (see Paul's PyCon06 presentation, the examples on the pyparsing site, and also Paul's O'Reilly Shortcut book). Basically you need to define your grammar for the expression you need to parse, then translate it into pyparsing-specific constructs. Because pyparsing's API is so intuitive and powerful, the translation process is straightforward.
Here's how I ended up implementing my pyparsing grammar:
from pyparsing import *
def parse_bind_line(line):
quoted_real_server = dblQuotedString.setParseAction(removeQuotes)
real_server = Word(alphas, printables) | quoted_real_server
port = Word(alphanums)
real_server_port = Group(real_server + port)
bind_expr = Suppress(Literal("bind")) + \
port + \
OneOrMore(real_server_port)
return bind_expr.parseString(line)
That's all there is to it. You need to read it from the bottom up to see how the expression gets decomposed into elements, and elements get decomposed into sub-elements.
I'll explain each line, starting with the last one before the return:
bind_expr = Suppress(Literal("bind")) + \
port + \
OneOrMore(real_server_port)
A bind expression starts with the literal "bind", followed by a port, followed by one or more real server/port pairs. That's pretty much what the line above actually says, isn't it. The Suppress construct tells pyparsing that we're not interested in returning the literal "bind" in the final token list.
real_server_port = Group(real_server + port)
A real server/port pair is simply a real server name followed by a port. The Group construct tells pyparsing that we want to group these 2 tokens in a list inside the final token list.
port = Word(alphanums)
A port is a word composed of alphanumeric characters. In general, word means 'a sequence of characters containing no whitespace'. The 'alphanums' variable is a special pyparsing variable already containing the list of alphanumeric characters.
real_server = Word(alphas, printables) | quoted_real_server
A real server is either a single word, or an expression in quotes. Note that we can declare a pyparsing Word with 2 arguments; the 1st argument specifies the allowed characters for the initial character of the word, whereas the 2nd argument specified the allowed characters for the body of the word. In this case, we're saying that we want a real server name to start with an alphabetical character, but other than that it can contain any printable character.
quoted_real_server = dblQuotedString.setParseAction(removeQuotes)
Here is where you can glimpse the power of pyparsing. With this single statement we're parsing a sequence of words enclosed in double quotes, and we're saying that we're not interested in the quotes. There's also a sglQuotedString class for words enclosed in single quotes. Thanks to Paul for bringing this to my attention. My clumsy attempt at manually declaring a sequence of words enclosed in double quotes ran something like this:
no_quote_word = Word(alphanums+"-.")
quoted_real_server = Suppress(Literal("\"")) + \
OneOrMore(no_quote_word) + \
Suppress(Literal("\""))
quoted_real_server.setParseAction(lambda tokens: " ".join(tokens))
The only useful thing you can take away from this mumbo-jumbo is that you can associate an action with each token. When pyparsing will encounter that token, it will apply the action (function or class) you specified on that token. This is useful for doing validation of your tokens, for example for a date. Very powerful stuff.
Now it's time to test my function on a few strings:
if __name__ == "__main__":
tests = """\
bind http "Customer Server 1" http "Customer Server 2" http
bind http "Customer Server - 11" 81 "Customer Server 12" 82
bind http www.mywebsite.com-server1 http www.mywebsite.com-server2 http
bind ssl www.mywebsite.com-server1 ssl www.mywebsite.com-server2 ssl
bind http TEST-server http
bind http MY-cluster-web11 83 MY-cluster-web-12 83
bind http cust1-server1.site.com http cust1-server2.site.com http
""".splitlines()
for t in tests:
print parse_bind_line(t)
Running the code above produces this output:
$ ./parse_bind.py
['http', ['Customer Server 1', 'http'], ['Customer Server 2', 'http']]
['http', ['Customer Server - 11', '81'], ['Customer Server 12', '82']]
['http', ['www.mywebsite.com-server1', 'http'], ['www.mywebsite.com-server2', 'http']]
['ssl', ['www.mywebsite.com-server1', 'ssl'], ['www.mywebsite.com-server2', 'ssl']]
['http', ['TEST-server', 'http']]
['http', ['MY-cluster-web11', '83'], ['MY-cluster-web-12', '83']]
['http', ['cust1-server1.site.com', 'http'], ['cust1-server2.site.com', 'http']]
From here, I was able to quickly identify for a given virtual server everything I needed: a virtual server port, and all the real server/port pairs associated with it. Inserting all this into a database was just another step. The hard work had already been done by pyparsing.
Once more, kudos to Paul McGuire for creating such an useful and fun tool.
Sunday, March 23, 2008
PyCon08 gets great coverage
Reports on the death of the PyCon conference as a community experience have been greatly exaggerated. I personally have never seen any PyCon edition as well covered in the blogs aggregated in Planet Python as the 2008 PyCon. If you don't believe me, maybe you'll believe Google Blog Search. I think the Python community is alive and well, and ready to rock at PyCon conferences for the foreseeable future. I'm looking forward to PyCon09 in Chicago, and then probably in San Francisco for 2010/11.
Wednesday, March 19, 2008
PyCon presenters, unite!
If you gave a talk at PyCon and haven't uploaded your slides to the official PyCon website yet, but you have posted them online somewhere else, please leave a comment to this post with the location of your slides. I'm helping Doug Napoleone upload the slides, since some authors have experienced issues when trying to upload the slides using their PyCon account. Thanks!
Tuesday, March 18, 2008
Links to resources from PyCon talks
I took some notes at the PyCon talks I've been to, and I'm gathering links to resources referenced in these talks. Hopefully they'll be useful to somebody (I know they will be to me at least.)
"MPI Cluster Programming with Python and Amazon EC2" by Pete Skomoroch
* slides in PDF format
* Message Passing Interface (MPI) modules for Python: mpi4py, pympi
* ElasticWulf project (Beowulf-like setup on Amazon EC2)
* IPython1: parallel computing in Python
* EC2 gotchas
"Like Switching on the Light: Managing an Elastic Compute Cluster with Python" by George Belotsky
* S3FS: mount S3 as a local file system using Fuse (unstable)
* EC2UI: Firefox extension for managing EC2 clusters
* S3 Organizer: Firefox extension for managing S3 storage
* bundling an EC2 AMI and storing it to S3
* the boto library, which allows programmatic manipulation of Amazon Web services such as EC2, S3, SimpleDB etc. (a python-boto package is available for most Linux distributions too; for example 'yum install python-boto)
"PyTriton: building a petabyte storage system" by Jonathan Ellis
* All this was done at Mozy (online remote backup, now owned by EMC, just like Avamar, the company I used to work for)
* They maxed out Foundry load balancers, so they ended up using LVS + ipvsadm
* They used erasure coding for data integrity -- rolled their own algorithm but Jonathan recommended that people use zfec developed by AllMyData
* An alternative to erasure coding would be to use RAID6, which is used by Carbonite
"Use Google Spreadsheets API to create a database in the cloud" by Jeffrey Scudder
* slides online
* APIs and documentation on google code
"Supervisor as a platform" by Chris McDonough and Mike Naberezny
* slides online
* supervisord home page
"Managing complexity (and testing)" by Matt Harrison
* slides online
* PyMetrics module for measuring the McCabe complexity of your code
* coverage module and figleaf module for measuring your code coverage
Resources from lightning talks
* bug.gd -- online repository of solutions to bugs, backtraces, exceptions etc (you can easy_install bug.gd, then call error_help() after you get a traceback to try to get a solution)
* geopy -- geocode package
* pvote.org -- Ka-Ping Yee's electronic voting software in 460 lines of Python (see also Ping's PhD dissertation on the topic of Building Reliable Voting Machine Software)
* bitsyblog -- a minimalist approach to blog software
"MPI Cluster Programming with Python and Amazon EC2" by Pete Skomoroch
* slides in PDF format
* Message Passing Interface (MPI) modules for Python: mpi4py, pympi
* ElasticWulf project (Beowulf-like setup on Amazon EC2)
* IPython1: parallel computing in Python
* EC2 gotchas
"Like Switching on the Light: Managing an Elastic Compute Cluster with Python" by George Belotsky
* S3FS: mount S3 as a local file system using Fuse (unstable)
* EC2UI: Firefox extension for managing EC2 clusters
* S3 Organizer: Firefox extension for managing S3 storage
* bundling an EC2 AMI and storing it to S3
* the boto library, which allows programmatic manipulation of Amazon Web services such as EC2, S3, SimpleDB etc. (a python-boto package is available for most Linux distributions too; for example 'yum install python-boto)
"PyTriton: building a petabyte storage system" by Jonathan Ellis
* All this was done at Mozy (online remote backup, now owned by EMC, just like Avamar, the company I used to work for)
* They maxed out Foundry load balancers, so they ended up using LVS + ipvsadm
* They used erasure coding for data integrity -- rolled their own algorithm but Jonathan recommended that people use zfec developed by AllMyData
* An alternative to erasure coding would be to use RAID6, which is used by Carbonite
"Use Google Spreadsheets API to create a database in the cloud" by Jeffrey Scudder
* slides online
* APIs and documentation on google code
"Supervisor as a platform" by Chris McDonough and Mike Naberezny
* slides online
* supervisord home page
"Managing complexity (and testing)" by Matt Harrison
* slides online
* PyMetrics module for measuring the McCabe complexity of your code
* coverage module and figleaf module for measuring your code coverage
Resources from lightning talks
* bug.gd -- online repository of solutions to bugs, backtraces, exceptions etc (you can easy_install bug.gd, then call error_help() after you get a traceback to try to get a solution)
* geopy -- geocode package
* pvote.org -- Ka-Ping Yee's electronic voting software in 460 lines of Python (see also Ping's PhD dissertation on the topic of Building Reliable Voting Machine Software)
* bitsyblog -- a minimalist approach to blog software
JP posted nose talk slides and code
If you attended Jason Pellerin's talk on the nose test framework at PyCon08, you'll be glad to know he just posted his slides and the sample app that shows how he writes and runs unit and functional tests under nose. I'm advertising this here because he only sent a message to the nose mailing list.
Monday, March 17, 2008
Slides and links from the Testing Tools tutorial
Here are my slides from the Testing Tools tutorial in PDF format. Not very informative I'm afraid -- I didn't actually show them to the attendees, I just talked about those topics while demo-ing them. If you want to find out more about the state of the Selenium project, watch this YouTube video of the Selenium Meetup at Google.
Here are some random thoughts on Selenium testing which I mentioned during the tutorial:
For people interested in FitNesse and on various acceptance testing topics, please see my blog posts in the Acceptance Testing and Web Application Testing sections of this page. If you are interested in how Titus and I tested the MailOnnaStick application, we have a whole wiki dedicated to this topic. Another resource of interest might be the Python Testing Tools Taxonomy wiki, with links to a myriad of Python testing tools.
I'll post soon on some topics that were discussed during the tutorial, especially on how to test an application against external interfaces or resources that are not under your control (think 'mocking').
Here are some random thoughts on Selenium testing which I mentioned during the tutorial:
- composing Selenium tests, especially for Ajax functionality, is HARD; the Selenium IDE helps a bit, but you still have to figure out how to wait for certain HTML elements to either appear or disappear from the page under test
- version 1.0 of the Sel. IDE, soon to be released, will record Ajax actions by default, so hopefully this will speed up Selenium test creation
- if you already have a Selenium Core test in HTML format, an easy way to obtain a Selenium RC test in Python is to open the HTML file in the Selenium IDE, then export the test case as Python; however, to actually make the resulting code readable/reusable, you have to do some pretty major refactoring
- identifying HTML elements (or locators, as Selenium calls them) by their XPath value is hard, but it's sometimes the only way to get to them in order to assert something about them; I found tools such as XPath Checker, XPather and Firebug invaluable (they all happen to be Mozilla add-ons, but you can use Firefox to compose your tests, then run them in any browser supported by Selenium; however, YMMV especially when it comes to evaluating XPath expressions in IE)
- because XPath locators are brittle in the face of constant HTML changes, please use HTML ID tags to identify your elements; I know at least one company (hi, Terry) where testers do not even start writing Selenium tests until developers have identified all elements of interest with an HTML ID
For people interested in FitNesse and on various acceptance testing topics, please see my blog posts in the Acceptance Testing and Web Application Testing sections of this page. If you are interested in how Titus and I tested the MailOnnaStick application, we have a whole wiki dedicated to this topic. Another resource of interest might be the Python Testing Tools Taxonomy wiki, with links to a myriad of Python testing tools.
I'll post soon on some topics that were discussed during the tutorial, especially on how to test an application against external interfaces or resources that are not under your control (think 'mocking').
Back from PyCon
PyCon08 is over. It's been an enjoyable experience, but a crowded one, with more than 1,000 people in attendance. The testing tutorial that Titus and I gave on Thursday went well. We tried to have it a bit more interactive than in the last 2 years, so we asked for people to send us their Web apps so we can test them 'in real time'. As it turned out, we only got back a handful of replies and almost no apps, but Christian Long sent me an app that I used to show some nifty Ajax testing with Selenium. We took a lot of questions from the audience on real problems they were facing, and I think we came up with some satisfactory answers/solutions. We need to think of what format we'll choose for next year (if any). Steve Holden and Doug Napoleone were happy with what they got out of the tutorial, Mike Pirnat was not because he had seen the same material last year. I think we did show some new techniques with the same tools that we've been showing for the last 3 years. But the overall content of the tutorial hasn't changed much. If you have any ideas of what you'd like to see next year, drop us a note.
If you read Planet Python or comp.lang.python, you know there's been a whirlwind of discussions around Bruce Eckel's "Pycon disappointment" post. My take on the commercialization of PyCon is that it hasn't been as bad as Bruce makes it sound. The vendor area was very isolated, in a corner room, and you could have easily missed it if you didn't see the people pouring out of there with all kinds of swag. And everybody was hiring! This is a good thing, people! Getting paid to work with your favorite programming language is a privilege that not many people are enjoying.
The other controversial aspect this year was the vendor involvement in the lightning talks. That was indeed highly annoying, and hopefully it won't happen again. I think the solution the organizers found last year -- separating the vendor-sponsored lightning talks from the regular ones -- worked pretty well. I didn't go to all the lightning talk sessions this year, but the one on Sunday was very enjoyable (once the sponsored ones were over). I liked the one on bitsyblog (a minimalist approach to building blog software), and the one by Martin v. Loewis on using Roundup for various non-bugtrack-related projects such as homework assignments. I also found out that Larry Hastings from Facebook is leading an effort to switch Facebook from PHP to a more enlightened language (you know which one), and also that slide.com, which offers what is apparently one of the most (if not THE most) popular Facebook apps, is using Python throughout its development process.
The technical talks were a mixed bag, just like at every other PyCon I participated in. No big surprise here, you can't make everybody happy no matter how hard you try (and the PyCon commitee tried hard, believe me). The critics have a point though, in that we need more advanced topics. Maybe we need a beginner track, an intermediate track and an advanced track. Also, 4 tracks is too much, I think 3 is a better number. My top 3 favorite talks were, in chronological order: "Supervisor as a platform" by Chris McDonough and Mike Naberezny, "Managing complexity (and testing)" by Matt Harrison, and "Introducing agile testing techniques to the OLPC project" by Titus Brown (I hope Titus will make good on his promise of putting together a screencast of the demo he showed, since it's mighty cool).
But the best part of PyCon is always meeting people. It's great to put a face to a name, especially when you're familiar with that person's work or blog. It's great to meet with people you met the previous years too. If nothing else, the socializing alone makes it worth attending PyCon. Think about it: where else would you meet Zed Shaw and have a chance to experience his colorful language and personality, and also hear him rant a bit about the ways in which Python sucks (although I hasten to add that he likes it a lot and he's starting to hack seriously in it). Can't wait to use one of Zed's first contributions to the Python world, a nifty utility called easy_f__ing_uninstall (feel free to fill in the blanks :-)
Whatever the critics say, I know I'll be back in Chicago next year for sure. I just want better network connectivity (why is it so hard to ensure decent wireless connectivity at PyCon year after year? it's a mystery) and better food.
If you read Planet Python or comp.lang.python, you know there's been a whirlwind of discussions around Bruce Eckel's "Pycon disappointment" post. My take on the commercialization of PyCon is that it hasn't been as bad as Bruce makes it sound. The vendor area was very isolated, in a corner room, and you could have easily missed it if you didn't see the people pouring out of there with all kinds of swag. And everybody was hiring! This is a good thing, people! Getting paid to work with your favorite programming language is a privilege that not many people are enjoying.
The other controversial aspect this year was the vendor involvement in the lightning talks. That was indeed highly annoying, and hopefully it won't happen again. I think the solution the organizers found last year -- separating the vendor-sponsored lightning talks from the regular ones -- worked pretty well. I didn't go to all the lightning talk sessions this year, but the one on Sunday was very enjoyable (once the sponsored ones were over). I liked the one on bitsyblog (a minimalist approach to building blog software), and the one by Martin v. Loewis on using Roundup for various non-bugtrack-related projects such as homework assignments. I also found out that Larry Hastings from Facebook is leading an effort to switch Facebook from PHP to a more enlightened language (you know which one), and also that slide.com, which offers what is apparently one of the most (if not THE most) popular Facebook apps, is using Python throughout its development process.
The technical talks were a mixed bag, just like at every other PyCon I participated in. No big surprise here, you can't make everybody happy no matter how hard you try (and the PyCon commitee tried hard, believe me). The critics have a point though, in that we need more advanced topics. Maybe we need a beginner track, an intermediate track and an advanced track. Also, 4 tracks is too much, I think 3 is a better number. My top 3 favorite talks were, in chronological order: "Supervisor as a platform" by Chris McDonough and Mike Naberezny, "Managing complexity (and testing)" by Matt Harrison, and "Introducing agile testing techniques to the OLPC project" by Titus Brown (I hope Titus will make good on his promise of putting together a screencast of the demo he showed, since it's mighty cool).
But the best part of PyCon is always meeting people. It's great to put a face to a name, especially when you're familiar with that person's work or blog. It's great to meet with people you met the previous years too. If nothing else, the socializing alone makes it worth attending PyCon. Think about it: where else would you meet Zed Shaw and have a chance to experience his colorful language and personality, and also hear him rant a bit about the ways in which Python sucks (although I hasten to add that he likes it a lot and he's starting to hack seriously in it). Can't wait to use one of Zed's first contributions to the Python world, a nifty utility called easy_f__ing_uninstall (feel free to fill in the blanks :-)
Whatever the critics say, I know I'll be back in Chicago next year for sure. I just want better network connectivity (why is it so hard to ensure decent wireless connectivity at PyCon year after year? it's a mystery) and better food.
Tuesday, March 04, 2008
Bright days for Python
Yes, the title is related to the news that were all over Planet Python yesterday, that Sun hired two prominent pythonistas (Ted Leung and Frank Wierzbicki) to work on Jython. Great news indeed. Jython seemed to lose steam and momentum as compared to other dynamic JVM languages (JRuby mainly), so it's very good to see that Sun is putting its weight behind it. Congrats to Ted and Frank.
Also, in other news, Greg Wilson just announced on his blog that he signed a contract with the Pragmatic Programmers to co-author a textbook for CS 101-type classes using Python as the programming language. Can't wait to see it in print!
Also, in other news, Greg Wilson just announced on his blog that he signed a contract with the Pragmatic Programmers to co-author a textbook for CS 101-type classes using Python as the programming language. Can't wait to see it in print!
Monday, March 03, 2008
Notes from the SoCal Piggies meeting on Feb. 28th
Here are the notes that I posted on the 'Happenings in Python Usergroups' blog.
Tuesday, February 12, 2008
Getting Things Done via your Inbox
I've been Getting Things Done long before it was cool to GTD. My method is simple: I keep my list of things to get done in my email Inbox. Once I get a thing done, I move it to a different folder, and I forget about it. This forces me to deal with incoming email at a very rapid pace. I either keep it in the Inbox because I know I'll work on it in the next few hours (or days), or I reply to it and I delete it, or I file it away, or I simply delete it. There are of course cases when my Inbox grows, but then I take drastic measures to reduce its size. I try to have no more than 25 messages in my Inbox at all times. OK, sometimes I have up to 50, but that's my limit.
Just thought I'd throw this out there in case it might help somebody in organizing their workload.
Just thought I'd throw this out there in case it might help somebody in organizing their workload.
MS SQL Server is brain dead
Another post in the series "Why does this have to be so painful???"
I had to insert some data in a MS SQL Server database (don't ask, I just had to.) The same data was inserted into a MySQL database using a simple statement of the form:
INSERT INTO table1 (col1, col2, col3)
VALUES
(a,b,c),
(d,e,f),
(x,y,z);
I tried to do the same in MS SQL Server, and I was getting mysterious 'syntax error' messages, with no other explanation. In desperation, I only left one row of values in the statement, and boom, the syntax errors disappeared. I thought -- wait a minute, there's no way I can only insert one row at a time in MS SQL Server! But yes, that turned out to be the sad truth. I finally found a slightly better way to deal with multiple rows at a time in this blog post -- something extremely convoluted like this:
INSERT INTO table1 (col1, col2, col3)
SELECT
a, b, c
UNION ALL
SELECT
d, e, f
UNION ALL
SELECT
x, y, z
(be very careful not to end your statement with a UNION ALL)
What can I say, my post title says it all. Another thing I noticed while desperately googling around: there is precious little information about SQL Server on forums etc., in comparison with a huge wealth of info on MySQL. And the best information on SQL Server is on pay-if-you-want-to-see-the-answers forums such as experts-exchange.com. Not surprising, but sad.
I had to insert some data in a MS SQL Server database (don't ask, I just had to.) The same data was inserted into a MySQL database using a simple statement of the form:
INSERT INTO table1 (col1, col2, col3)
VALUES
(a,b,c),
(d,e,f),
(x,y,z);
I tried to do the same in MS SQL Server, and I was getting mysterious 'syntax error' messages, with no other explanation. In desperation, I only left one row of values in the statement, and boom, the syntax errors disappeared. I thought -- wait a minute, there's no way I can only insert one row at a time in MS SQL Server! But yes, that turned out to be the sad truth. I finally found a slightly better way to deal with multiple rows at a time in this blog post -- something extremely convoluted like this:
INSERT INTO table1 (col1, col2, col3)
SELECT
a, b, c
UNION ALL
SELECT
d, e, f
UNION ALL
SELECT
x, y, z
(be very careful not to end your statement with a UNION ALL)
What can I say, my post title says it all. Another thing I noticed while desperately googling around: there is precious little information about SQL Server on forums etc., in comparison with a huge wealth of info on MySQL. And the best information on SQL Server is on pay-if-you-want-to-see-the-answers forums such as experts-exchange.com. Not surprising, but sad.
Monday, February 11, 2008
Installing the TinyMCE spellchecker in SugarCRM
I had to go through the exercise of installing and configuring the TinyMCE spellchecker in a SugarCRM 5.0 installation today, so I'm blogging it here for future reference and google karma.
Here's my exact scenario:
- OS: CentOS 4.4 32-bit
- SugarCRM version: SugarCE-Full-5.0.0a (under /var/www/html/SugarCRM)
I first downloaded the TinyMCE spellchecker plugin v. 1.0.5 by following the link from the TinyMCE download page.
I unzipped tinymce_spellchecker_php_1_0_5.zip and copied the resulting spellchecker directory under my SugarCRM installation (in my case under /var/www/html/SugarCRM/include/javascript/tiny_mce/plugins).
I edited spellchecker/config.php and specified:
The command-line pspell uses the binary /usr/bin/aspell, which in my case was already installed, but YMMV. If you want the embedded PHP spellchecker, you need to recompile PHP with the pspell module.
The main configuration of the layout of TinyMCE happens in /var/www/html/SugarCRM/include/SugarTinyMCE.php. I chose to add the spellchecker button in the line containing the definition of buttonConfig3:
I also added the spellchecker plugin in the same file, in the line containing the plugins definitions inside the defaultConfig definition:
I restarted Apache, just to make sure all the include files are re-read (not sure if it was necessary, but better safe than sorry), and lo and behold, the spellcheck button was there when I went to the Compose Email form. After entering some misspelled text, I clicked the spellcheck button. I was then able to right click the misspelled words and was offered a list of words to choose from.
If it sounds painless, rest assured, it wasn't. The documentation helped a bit, but it was sufficiently different from what I was seeing that I had to do some serious forensics-type searches within the SugarCRM file hierarchy. But all is well that ends well.
Here's my exact scenario:
- OS: CentOS 4.4 32-bit
- SugarCRM version: SugarCE-Full-5.0.0a (under /var/www/html/SugarCRM)
I first downloaded the TinyMCE spellchecker plugin v. 1.0.5 by following the link from the TinyMCE download page.
I unzipped tinymce_spellchecker_php_1_0_5.zip and copied the resulting spellchecker directory under my SugarCRM installation (in my case under /var/www/html/SugarCRM/include/javascript/tiny_mce/plugins).
I edited spellchecker/config.php and specified:
require_once("classes/TinyPspellShell.class.php"); // Command line pspell
$spellCheckerConfig['enabled'] = true;
The command-line pspell uses the binary /usr/bin/aspell, which in my case was already installed, but YMMV. If you want the embedded PHP spellchecker, you need to recompile PHP with the pspell module.
The main configuration of the layout of TinyMCE happens in /var/www/html/SugarCRM/include/SugarTinyMCE.php. I chose to add the spellchecker button in the line containing the definition of buttonConfig3:
var $buttonConfig3 = "tablecontrols,spellchecker,separator,advhr,hr,removeformat,separator,insertdate,inserttime,separator,preview";
I also added the spellchecker plugin in the same file, in the line containing the plugins definitions inside the defaultConfig definition:
'plugins' => 'advhr,insertdatetime,table,preview,paste,searchreplace,directionality,spellchecker',
I restarted Apache, just to make sure all the include files are re-read (not sure if it was necessary, but better safe than sorry), and lo and behold, the spellcheck button was there when I went to the Compose Email form. After entering some misspelled text, I clicked the spellcheck button. I was then able to right click the misspelled words and was offered a list of words to choose from.
If it sounds painless, rest assured, it wasn't. The documentation helped a bit, but it was sufficiently different from what I was seeing that I had to do some serious forensics-type searches within the SugarCRM file hierarchy. But all is well that ends well.
Saturday, February 09, 2008
Dilbert should get a life
I'm subscribed to the RSS feed for the daily Dilbert comic. I'm still enjoying it, but lately I've started to think -- why on Earth doesn't Dilbert just quit? Why do all the characters in that comic need to be so passive-aggressive about everything? Yes, I know it wouldn't be Dilbert without it, and Scott Adams wouldn't put so much bacon on the table either. But I can't help thinking that there's a big lie in it too, as if there's no escape from that shitty workplace. Dilbert dude, just don't go back there tomorrow! Let's see how your life turns around.
Update 2/12/08
Thanks to all who commented on my post. Many people seem to think, as I do, that Scott Adams is hypocritical and profits quite handsomely from the corporate culture he apparently despises. I also decided in the mean time to walk the walk, not only talk the talk, so I unsubscribed from the RSS feed.
Update 2/12/08
Thanks to all who commented on my post. Many people seem to think, as I do, that Scott Adams is hypocritical and profits quite handsomely from the corporate culture he apparently despises. I also decided in the mean time to walk the walk, not only talk the talk, so I unsubscribed from the RSS feed.
Friday, February 08, 2008
And then there were 10
...where 10 is the current number of people who signed up for the Practical Applications of Agile (Web) Testing Tools tutorial that Titus and I will present at PyCon08. What this means is that the tutorial is in good standing, since each tutorial needs at least 10 attendees in order to actually happen. If you haven't signed up for a tutorial yet, check out ours -- it'll be fun.
Thursday, February 07, 2008
Got OpenID?
After reading this post from O'Reilly Radar on several big-name companies joining the OpenID Foundation, I decided to give OpenID a try myself. One of the providers recommended by the foundation was VeriSign, which sounded reputable to me, so I went ahead and got one from their Personal Identity Provider (PIP) site. Easy enough -- you choose a user name, a password, put your email in, then you get a so-called "OpenID access URL" of the form username.pip.verisignlabs.com.
To test my brand new OpenID, I went to one of the sites referenced on the PIP page, stikis.com. I chose the OpenID authentication method, entered the id I got from VeriSign, then I was redirected to the VeriSign PIP page, which asked me to specify for how long I want my trust relationship with stikis.com to last. I chose 1 year, clicked 'Allow', then was redirected back to the stikis.com site, already logged in to my personal start page. When I clicked on 'My Account' inside stikis.com, I saw my VeriSign access URL page at http://username.pip.verisignlabs.com. Pretty clean process. I hope more and more Web sites will accept OpenIDs. Since Google, Yahoo and Microsoft all joined the OpenID foundation, I'm pretty sure you'll be able to use an account you have with any of them as your OpenID (and indeed all of the above companies have announced that they'll be offering OpenIDs).
To test my brand new OpenID, I went to one of the sites referenced on the PIP page, stikis.com. I chose the OpenID authentication method, entered the id I got from VeriSign, then I was redirected to the VeriSign PIP page, which asked me to specify for how long I want my trust relationship with stikis.com to last. I chose 1 year, clicked 'Allow', then was redirected back to the stikis.com site, already logged in to my personal start page. When I clicked on 'My Account' inside stikis.com, I saw my VeriSign access URL page at http://username.pip.verisignlabs.com. Pretty clean process. I hope more and more Web sites will accept OpenIDs. Since Google, Yahoo and Microsoft all joined the OpenID foundation, I'm pretty sure you'll be able to use an account you have with any of them as your OpenID (and indeed all of the above companies have announced that they'll be offering OpenIDs).
Monday, February 04, 2008
Podcasts from PyCon 2007 are available
Some podcasts from PyCon2007 are starting to trickle in on this page. You can listen for example to the Web Frameworks Panel and to the Testing Tools Panel.
James Shore's litmus test for a new language
Implement FIT. Sounds reasonable to me. After all, self-testing and acceptance testing should be a big part of a new and cool language. I'll make sure I ask Chuck Esterbrook this question about his Cobra language during our next SoCal Piggies meeting :-)
Saturday, February 02, 2008
Next SoCal Piggies meeting: Feb. 28th
Steve Wagner from Gorilla Nation graciously offered to host the next SoCal Piggies meeting at his company's offices in Culver City. We're still waiting for the full line-up of talks, but Chuck Esterbrook, of Webware fame, already offered to present an introduction to a new programming language he created, Cobra. Should be a lot of fun. To whet your appetite, Cobra is, in Chuck's words, "a new programming language with a Python-inspired syntax and first class support for unit tests and design-by-contracts."
So if you're in the area, please consider attending our meeting on Thursday, Feb. 28th, starting at 7 PM. Check out our wiki or the mailing list for more details.
So if you're in the area, please consider attending our meeting on Thursday, Feb. 28th, starting at 7 PM. Check out our wiki or the mailing list for more details.
Monday, January 28, 2008
How modern Web design is conducted
Via the 40. (with egg) blog, a time breakdown on how Web design is conducted in our day and time. Hmmm...maybe there's room for allocating some slice of that pie to testing, using Firebug for example. UPDATE: I meant debugging with Firebug and testing with twill and Selenium of course :-)
Friday, January 25, 2008
Checklist automation and testing
This is a follow-up to my previous post on writing automated tests for sysadmin-related checklists. That post seems to have struck a chord, judging by the comments it generated.
Here's the scenario I'm thinking about: you need to deploy a standardized set of packages and configurations to a bunch of servers. You put together a checklist detailing the steps you need to take on each server -- kickstart the box, run some post-install scripts, do some configuration customization, etc. At this point, you're already ahead of the game, and you're not relying solely on human memory. However, if you rely on a human being going manually through each step of the checklist on each server, you're in for some surprises in the guise of missed steps. The answer of course is to automate as many steps as you can, ideally all of them.
Now we're getting to the main point of my post: assuming you did automate all the steps of the checklist, and you ran your scripts on each server, do you REALLY have that warm and fuzzy feeling that everything is OK? You don't, unless you also have a comprehensive automated test suite that runs on every server and actually checks that stuff happened the way you intended.
Here are some concrete examples of stuff I'm verifying after the deployment of a certain type of our servers (running Apache/Tomcat).
OS-specific tests
* does the sudoers file contain certain users that need those rights
* is the sshd process set to start at boot time
* is the ClientAliveInterval variable set correctly in /etc/sshd/sshd_config
* are certain NFS mount points defined in /etc/fstab, and do they actually exist on the server
* is sendmail set to start at boot time, and running
* are iptables and/or SELinux configured the way they should be
* ....and more
Apache-specific tests
* is httpd set to start at boot time
* do the virtual host configuration files in /etc/httpd/conf.d contain the expected information
* has mod_jk been installed and configured properly (mod_jk provides the glue between Apache and Tomcat)
* is SSL configured properly
* does the /etc/logrotate.d/httpd configuration file contain the correct options (for example keep the logs for N days and compress them)
* etc.
Tomcat-specific tests
* has a specific version of Java been installed
* has Tomcat been installed in the correct directory, with the correct permissions
* has Tomcat been set to start at boot time
* etc.
Media-specific tests
* has ImageMagick been installed in the correct location
* does ImageMagick support certain file formats (JPG, PNG, etc)
* can ImageMagick actually process certain types of files (JPG, PNG, etc.)
Some of these tests could be run from a monitoring system (one of the commenters on my previous post mentioned that their sysadmins use Zimbrix; an Open Source alternative is Nagios, and there are many others.) However, a monitoring system typically doesn't go into the level of detail I described, especially when it comes to configuration files and other more advanced customizations. That's why I think it's important to use a real test framework and a real scripting language for this type of automated tests.
In my case, each type of test resides in its own file -- for example test_os.py, test_apache.py, test_tomcat.py, test_media.py. I run the tests using the nose test framework.
Here are some examples of small test functions. I'm using sets for making sure that expected lines are in certain files or in the output of certain commands, since most of the time I don't care about the order in which those lines appear.
From test_os.py:
From test_apache.py:
From test_tomcat.py:
From test_media.py:
As always, comments and suggestions are very welcome! Also see Titus's post for some sysadmin-related automated tests that he's running on a regular basis.
Here's the scenario I'm thinking about: you need to deploy a standardized set of packages and configurations to a bunch of servers. You put together a checklist detailing the steps you need to take on each server -- kickstart the box, run some post-install scripts, do some configuration customization, etc. At this point, you're already ahead of the game, and you're not relying solely on human memory. However, if you rely on a human being going manually through each step of the checklist on each server, you're in for some surprises in the guise of missed steps. The answer of course is to automate as many steps as you can, ideally all of them.
Now we're getting to the main point of my post: assuming you did automate all the steps of the checklist, and you ran your scripts on each server, do you REALLY have that warm and fuzzy feeling that everything is OK? You don't, unless you also have a comprehensive automated test suite that runs on every server and actually checks that stuff happened the way you intended.
Here are some concrete examples of stuff I'm verifying after the deployment of a certain type of our servers (running Apache/Tomcat).
OS-specific tests
* does the sudoers file contain certain users that need those rights
* is the sshd process set to start at boot time
* is the ClientAliveInterval variable set correctly in /etc/sshd/sshd_config
* are certain NFS mount points defined in /etc/fstab, and do they actually exist on the server
* is sendmail set to start at boot time, and running
* are iptables and/or SELinux configured the way they should be
* ....and more
Apache-specific tests
* is httpd set to start at boot time
* do the virtual host configuration files in /etc/httpd/conf.d contain the expected information
* has mod_jk been installed and configured properly (mod_jk provides the glue between Apache and Tomcat)
* is SSL configured properly
* does the /etc/logrotate.d/httpd configuration file contain the correct options (for example keep the logs for N days and compress them)
* etc.
Tomcat-specific tests
* has a specific version of Java been installed
* has Tomcat been installed in the correct directory, with the correct permissions
* has Tomcat been set to start at boot time
* etc.
Media-specific tests
* has ImageMagick been installed in the correct location
* does ImageMagick support certain file formats (JPG, PNG, etc)
* can ImageMagick actually process certain types of files (JPG, PNG, etc.)
Some of these tests could be run from a monitoring system (one of the commenters on my previous post mentioned that their sysadmins use Zimbrix; an Open Source alternative is Nagios, and there are many others.) However, a monitoring system typically doesn't go into the level of detail I described, especially when it comes to configuration files and other more advanced customizations. That's why I think it's important to use a real test framework and a real scripting language for this type of automated tests.
In my case, each type of test resides in its own file -- for example test_os.py, test_apache.py, test_tomcat.py, test_media.py. I run the tests using the nose test framework.
Here are some examples of small test functions. I'm using sets for making sure that expected lines are in certain files or in the output of certain commands, since most of the time I don't care about the order in which those lines appear.
From test_os.py:
def test_sshd_on():
stdout, stdin = popen2.popen2('chkconfig sshd --list')
lines = stdout.readlines()
assert "sshd \t0:off\t1:off\t2:on\t3:on\t4:on\t5:on\t6:off\n" in lines
From test_apache.py:
def test_logrotate_httpd():
lines = open('/etc/logrotate.d/httpd').readlines()
lines = set(lines)
expected = set([
" rotate 100\n",
" compress\n",
])
assert lines.issuperset(expected)
From test_tomcat.py:
def test_homedir():
target_dir = '/opt/target'
assert os.path.isdir(target_dir)
(st_mode, st_ino, st_dev, st_nlink, st_uid, st_gid, st_size, st_atime, st_mtime, st_ctime) = os.stat(target_dir)
assert st_uid == TARGET_UID, 'User wrong for %s' % pathname
assert st_gid == TARGET_GID, 'Group wrong for %s' % pathname
From test_media.py:
def test_ImageMagick_listformat():
stdout, stdin = popen2.popen2(''/usr/local/bin/identify --list format'')
lines = stdout.readlines()
lines = set(lines)
expected = set([
" JNG* PNG rw- JPEG Network Graphics\n",
" JPEG* JPEG rw- Joint Photographic Experts Group JFIF format (62)\n",
" JPG* JPEG rw- Joint Photographic Experts Group JFIF format\n",
" PJPEG* JPEG rw- Progessive Joint Photographic Experts Group JFIF\n",
" JNG* PNG rw- JPEG Network Graphics\n",
" MNG* PNG rw+ Multiple-image Network Graphics (libpng 1.2.10)\n",
" PNG* PNG rw- Portable Network Graphics (libpng 1.2.10)\n",
" PNG24* PNG rw- 24-bit RGB PNG, opaque only (zlib 1.2.3)\n",
" PNG32* PNG rw- 32-bit RGBA PNG, semitransparency OK\n",
" PNG8* PNG rw- 8-bit indexed PNG, binary transparency only\n",
])
assert lines.issuperset(expected)
As always, comments and suggestions are very welcome! Also see Titus's post for some sysadmin-related automated tests that he's running on a regular basis.
Tuesday, January 22, 2008
Stay away from the AT&T Tilt phone
Why? Because it's extremely fragile. I got one a couple of weeks ago (sponsored by my company, otherwise I wouldn't have shelled out $399) and after just 2 days I found the screen cracked in 2 places. It's true I carried it in my jacket's pocket while driving, and it probably jammed against my leg or something, but I've done that with other phones and didn't have this issue.
Of course, calls to the store, AT&T warranty and the manufacturer were all fruitless. The store won't accept it back because it's not in 'like new' state, and AT&T's warranty doesn't cover cracks in the screen. My best option at this point is to send it to the manufacturer for repairs, which will run me another $190. For now, I'm just using it as is, but I just want to tell whoever bothers to read this that I'm not happy -- not with the Tilt, and not with AT&T's customer service. There. Take that, AT&T.
Of course, calls to the store, AT&T warranty and the manufacturer were all fruitless. The store won't accept it back because it's not in 'like new' state, and AT&T's warranty doesn't cover cracks in the screen. My best option at this point is to send it to the manufacturer for repairs, which will run me another $190. For now, I'm just using it as is, but I just want to tell whoever bothers to read this that I'm not happy -- not with the Tilt, and not with AT&T's customer service. There. Take that, AT&T.
Joel on checklists
Another entertaining blog post from Joel Spolsky, this time on some issues they had with servers and networking equipment hosted at a data center in Manhattan. It all comes down to a network switch which had its ports configured to automatically negotiate their speed. As a result, one port was misbehaving and brough their whole web site down. The conclusion reached by Joel and his sysadmin team: we need documentation, we need checklists. I concur, but as I said in a recent post, this is still not enough. Human beings are notoriously prone to skipping tests on checklists. What Joel and his team really need are AUTOMATED TESTS that run periodically and check every single thing on those checklists. You can easily automate the step which verifies that a port on the switch is set to 100 Mbps or 1 Gbps; you can either use SNMP, or some expect-like script.
In fact, at my own company I'm developing a pretty extensive automated test suite (written in Python of course, and using nose) that verifies all the steps we go through whenever we deploy a server or a network device. It's very satisfying to see those dots and have a total of N passed tests and 0 failed tests, with N increasing daily. Automated tests for sysadmin tasks is an area little explored, so there's lots of potential for cool stuff to happen. If you're doing something similar and have ideas to share, please leave a comment.
In fact, at my own company I'm developing a pretty extensive automated test suite (written in Python of course, and using nose) that verifies all the steps we go through whenever we deploy a server or a network device. It's very satisfying to see those dots and have a total of N passed tests and 0 failed tests, with N increasing daily. Automated tests for sysadmin tasks is an area little explored, so there's lots of potential for cool stuff to happen. If you're doing something similar and have ideas to share, please leave a comment.
Wednesday, January 16, 2008
MySQL has been assimilated
...by Sun, for $1 billion. Bummer. I shudder whenever I see companies at the forefront of Open Source being gobbled up by giants such as Sun. I still don't know what Sun's Open Source strategy is -- they've been going back and forth with their support for Linux, and they seem to be pushing Open Solaris pretty heavily these days, although I personally don't know anybody in the OSS community who is using Open Solaris. UPDATE: Tim O'Reilly thinks this is a great fit for both Sun and MySQL, and says that "Sun has staked its future on open source, releasing its formerly proprietary crown jewels, including Solaris, Java, and the Ultra-Sparc processor design." Hmmm...maybe, but Sun has always struck me as being bent on world domination, just as bad as Microsoft.
Update 01/18/08: Here's a really good recap on the MySQL acquisition at InfoQ. Most people express a warm fuzzy feeling about this whole thing. I hope my apprehensions are unfounded.
In other acquisition-related news, Oracle agreed to buy BEA (the makers of WebLogic) for a paltry $7.85 billion.
Update 01/18/08: Here's a really good recap on the MySQL acquisition at InfoQ. Most people express a warm fuzzy feeling about this whole thing. I hope my apprehensions are unfounded.
In other acquisition-related news, Oracle agreed to buy BEA (the makers of WebLogic) for a paltry $7.85 billion.