Tuesday, May 30, 2006
Sandboxes everywhere
To me, 2006 seems to be the year of the virtual machine. Here's the equation:
commodity hardware + solid open-source virtualization technologies such as Xen = great opportunities for testers
Many companies have already started to capitalize on this equation (Autoriginate/HostedQA is just one example). The beauty of Open Source, however, is that it makes this opportunity available to anybody who possesses a medium-to-high amount of Linux hacking skillz :-).
I'll post more about this topic as work on Cheesecake/SoC progresses. The way I see it, we'll offer a way for people to post their packages to one of our servers, and we'll compute all the dynamic Cheesecake scores (such as code coverage obtained by running unit tests) in a dedicated virtual machine. These dynamic scores will also be computed when a request comes from the PyPI interface. This is all on the drawing board right now, but that's the general idea.
Another related project that hasn't been started yet, based on an idea that Titus had last year, would be to automatically apply patches to Python core, compile, build and run all unit tests, all of this in a safe sandbox environment. This will hopefully lower the barrier to accepting patches into Python core.
Michael Feathers on refactoring and continuous integration
M.F. discusses the situation of a project with a large code base, where several teams work on different releases/branches of the code at the same time. How do you refactor with confidence in this case? Another scenario discussed in the post is a project with dependencies on 3rd party code. How do you refactor with confidence, when you know that the 3rd party code keeps changing and needs to be patched every time it gets updated?
Michael's answer is: integrate frequently. I say: buildbot to the rescue! :-)
Wednesday, May 24, 2006
Cheesecake and the Summer of Code
Cheesecake is an application designed to evaluate and estimate the overall quality (or so-called 'kwalitee') of a given software package written in Python. It emphasizes the need for well-written documentation and unit tests, encouraging good programming practices and penalizing sloppy design and careless distribution. Using Cheesecake to check your code gives you confidence that your software doesn't merely run, but is usable and easy to test and modify as well.
Because Python is very easy to learn and use, there is a vast variety of software written in it, most of which was scattered until PyPI was created. Now that new packages are being indexed on the Cheese Shop every day, an effort can be made to spread the spirit of good software design and code reuse among the Python community. This can be achieved by combining the power of Cheesecake and the Cheese Shop. Every time a new version of a package is uploaded to the Cheese Shop, its Cheesecake index will be calculated and published on the web. Having a way to measure the quality of a package relative to other existing packages will be an invaluable help for all developers. It will promote well-built packages and, in the long run, raise the overall quality of Python software.
Adding Cheesecake functionality to PyPI has already been mentioned by Phillip J. Eby on the catalog-sig mailing list. Together with Cheesecake maintainer Grig Gheorghiu, we've discussed the modifications needed to make the Cheesecake code reliable enough to be incorporated into the PyPI service. A working copy of our ideas is accessible on the project wiki. It includes enhancing Cheesecake's code scoring techniques to take into account a package's unit tests, running tests in a secure environment, extending the supported archive formats and fixing all known bugs. Development of Cheesecake will adhere to best practices such as unit testing, continuous integration (via buildbot), pylint verification, etc.
The next part of this project will include collaboration with Richard Jones, the PyPI maintainer, and merging Cheesecake into the PyPI service. Upon completion, all PyPI uploads will be automatically scored by Cheesecake. It will be possible to browse the package archive by Cheesecake index, sorting results by installability, documentation and code kwalitee index. Statistics in numeric and graphical form will also be made available. This part of the project will involve writing server-side code, with emphasis on security and robustness.
The remaining time will be spent on resolving any problems that occur during usage of Cheesecake and PyPI. Along with fixing bugs, I will develop a simple Hello world package that can be taken as an example of good development practices for all Python developers. It should also score 100% in the Cheesecake test, of course. It will be what hello is for the GNU Project.
Here are a few thoughts I had regarding the value of this project:
This project will have 2 very important contributions: first of all, it will integrate with PyPI and help rank the Cheeseshop packages according to various quality criteria. People learn better by example -- and what better examples than tools that score high on a scale that looks at different quality indicators such as documentation, installability, and code 'kwalitee'? Cheesecake will provide a way to identify the best-of-breed packages in those areas.
Second, the project will investigate ways to dynamically assess packages by executing their code in a sandbox environment. This will help mainly with getting code coverage numbers by running a project's unit tests, but one can easily envision many other applications -- one idea that Titus Brown had was to automatically apply and verify patches to Python core, without the fear that the host machine will crash and burn. This will hopefully streamline the process of accepting patches into Python core (a famously complicated process currently).
Michał and I will use Trac to manage this project. The idea is to have short iterations represented as milestones in Trac, with tickets of type 'enhancement' that represent the stories to be done in each iteration. Each story will be split into short tasks that can be accomplished in a matter of hours, and each task will be represented as a ticket of type....'task', what else? This will give us a nice way of watching the progress of the project over the summer. Of course, the criterion for the completion of a given story is: all unit/acceptance/functional tests should pass for that story.
I'm very excited to have Michał work on this project and I'm very hopeful that at the end of this summer we'll have a solid application that will benefit the Python community.
Here is the list of the 25 applications accepted to the Summer of Code under the PSF umbrella.
Wednesday, May 10, 2006
Dynamically updating buildbot status text
Note that if you just want to customize the build step status with some text that is known in advance by the master (e.g. "client install" or "twill functional tests"), all you need to do is subclass ShellCommand and set the descriptionDone class variable to the desired custom text. See this post for more details on how to do this.
For dynamically updating the status text, the solution I found was to override some of the methods in the ShellCommand class.
My particular scenario is this: the build slave installs some package and identifies its version number. I want to be able to display that version number in the status for that build step.
I defined the following subclass of ShellCommand:
    import re

    # import paths as laid out in the buildbot version described in this post
    from buildbot.process.step import ShellCommand
    from buildbot.status.builder import FAILURE, WARNINGS

    class ClientInstall(ShellCommand):
        name = "client install"
        description = ["running %s" % name]
        descriptionDone = [name]

        def __init__(self, **kwargs):
            ShellCommand.__init__(self, **kwargs)
            # filled in by createSummary once the slave's log is available
            self.version = None

        def createSummary(self, log):
            # the slave prints "--version=..." into its log; capture the value
            log_text = log.getText()
            s = re.search("--version=(.*)", log_text)
            if s:
                self.version = s.group(1)

        def getText(self, cmd, results):
            # work on a copy: the list returned by describe() is shared class-wide
            text = self.describe(True)[:]
            if results == WARNINGS:
                text.append("warnings")
            if results == FAILURE:
                text.append("failed")
            if self.version:
                text.append("version=" + self.version)
            return text
The two most important methods in this case are createSummary and getText. I chose createSummary for overriding because it has access to the slave's log. In my case, that log contained the version number computed by the slave, so I just introduced a new variable, self.version, and set it to the result of a regular expression search for "--version=(.*)".
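The version extraction in createSummary can be tried on its own, outside buildbot. This is a minimal sketch: the helper name extract_version and the sample log text are made up for illustration, and only the regular expression comes from the class above.

```python
import re


def extract_version(log_text):
    """Return whatever follows '--version=' on its line, or None if absent."""
    match = re.search("--version=(.*)", log_text)
    return match.group(1) if match else None


# Hypothetical slave log; the real log format depends on your install script.
sample_log = "installing client\nrunning setup --version=1.2.3\ndone\n"
version = extract_version(sample_log)

# '.' does not match a newline, so the capture stops at the end of the line,
# but it does include anything else that follows on the same line.
greedy = extract_version("setup --version=1.2.3 --prefix=/opt")
```

If the version can be followed by other options on the same line, a tighter pattern such as "--version=(\S+)" may be a better fit.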
The getText method is called by ShellCommand inside the setStatus method, like this (the ShellCommand class lives in the process/step.py file installed under the buildbot root installation directory, in my case /usr/local/lib/python2.4/site-packages/buildbot):
    def setStatus(self, cmd, results):
        # this is good enough for most steps, but it can be overridden to
        # get more control over the displayed text
        self.step_status.setColor(self.getColor(cmd, results))
        self.step_status.setText(self.getText(cmd, results))
        self.step_status.setText2(self.maybeGetText2(cmd, results))
My overridden version of getText (shown above) checks to see if self.version is non-empty, and if this is the case, it appends it to the variable text, which is a copy of the list returned by self.describe(True). Copying the list into a variable instead of modifying it in place is very important. Initially, I did something like:
    text = self.describe(True)
    if results == WARNINGS:
        text += ["warnings"]
The net effect of this was that each of my build slaves was updating this particular build step status with version information from all the other build slaves. It felt like a global or class-wide variable was being trampled underfoot by all the build slaves, and indeed this was the case, as I found out when I sent a message to the buildbot-devel list and Brian Warner explained what was going on: the list of strings returned by self.describe() is a class-wide value that's not supposed to be mutated. Brian suggested modifying the above snippet of code to:
    text = self.describe(True)
    if results == WARNINGS:
        text = text + ["warnings"]
Neal Norwitz suggested the solution I finally adopted, which is to first make a copy of the list returned by self.describe, then append to it. This is more efficient, because it only allocates the list one time, then resizes it if necessary:
    text = self.describe(True)[:]
    if results == WARNINGS:
        text.append("warnings")
Once again, buildbot proved to be very flexible and customizable -- but not without jumping through some hoops in this particular scenario. In any case, I hope this post will be useful for buildbot users out there who want to display more customized information in their build steps.
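The shared-list problem can be reproduced without buildbot. In the sketch below, Step is a stand-in class written for illustration, not buildbot's ShellCommand; it only assumes, as described above, that describe() hands back a class-level list rather than a copy.

```python
class Step:
    # Class-wide list, shared by every instance (plays the role of descriptionDone).
    descriptionDone = ["client install"]

    def describe(self, done=False):
        # Returns the shared list itself, not a copy.
        return self.descriptionDone


def status_in_place(step, version):
    text = step.describe(True)
    text += ["version=" + version]  # += extends the shared class-level list
    return text


def status_with_copy(step, version):
    text = step.describe(True)[:]  # shallow copy first, then modify the copy
    text.append("version=" + version)
    return text


# Copying keeps each step's status independent.
safe_a = status_with_copy(Step(), "1.0")
safe_b = status_with_copy(Step(), "2.0")
shared_after_copies = list(Step.descriptionDone)

# Mutating in place leaks one step's version into every other step.
leaky_a = status_in_place(Step(), "1.0")
leaky_b = status_in_place(Step(), "2.0")
```

After the last two calls, leaky_a and leaky_b are the same list object and it carries both version strings, which is the cross-slave contamination described in the post.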
Tuesday, May 09, 2006
SSH tunnelling with Putty
Here's David's howto, almost verbatim:
What we'll do is forward port 9080 on the PC to 8000 on 192.168.2.200 (the host/port for Trac). I'm using Putty version 0.54.
1. Start Putty (so you're looking at the PuTTY Configuration screen.)
2. Enter 192.168.2.100 (the IP of the box you can ssh into) in the Host name / IP address box.
3. Check SSH as the protocol (port number should change to 22.)
4. Enter 'trac-tunnel' as the Saved Sessions name, and click Save.
5. Open the Connection list in the left pane.
6. Open the SSH list in the left pane, Click Tunnels.
7. Check X11 Forwarding (in case you need to run X-based applications.)
8. Back on the right side, at the bottom, enter 9080 for source port (there's nothing special about port 9080, it can be any non-used port on your local machine.)
9. Enter 192.168.2.200:8000 as the Destination, leave Local checked.
10. Click Add.
11. Important, easy to forget: Click Session on the left pane, Click Save.
Now your 'trac-tunnel' session will not only connect you to the .100 box, but when you're logged into the .100, it will mediate a tunnel between your PC's port 9080 and port 8000 on 192.168.2.200.
So, let's try it out:
1. Use Putty to open the 'trac-tunnel' connection, and log in as yourself
2. Point your browser to http://127.0.0.1:9080/ and you'll get right in.
You'd repeat Steps 8-10 to add more local port forwardings. Step 11 is easy to forget, so be warned...
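For comparison, the same tunnel can be expressed as an OpenSSH client command line. This is a rough sketch that assumes an OpenSSH-style ssh client on the local machine; build_tunnel_command is a hypothetical helper that only assembles the argument list and does not open a connection.

```python
def build_tunnel_command(gateway, local_port, dest_host, dest_port,
                         user=None, x11=False):
    """Build an ssh argument list that forwards local_port to dest_host:dest_port."""
    target = gateway if user is None else "%s@%s" % (user, gateway)
    argv = ["ssh"]
    if x11:
        argv.append("-X")  # X11 forwarding, like step 7
    # -L local_port:dest_host:dest_port is a local port forwarding (steps 8-10)
    argv += ["-L", "%d:%s:%d" % (local_port, dest_host, dest_port), target]
    return argv


# Same addresses and ports as in the howto above.
command = build_tunnel_command("192.168.2.100", 9080, "192.168.2.200", 8000, x11=True)
```

The resulting list can be handed to subprocess.call(command); additional -L options can be added for more forwardings, just as Steps 8-10 are repeated in Putty.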
Zen of Unicode