Agile Testing

Wednesday, June 18, 2008

Celtics use Ubuntu to beat Lakers

Excerpt from the Associated Press article about the Celtics-Lakers game last night:

"It was a group effort by this gang in green, which bonded behind Rivers, who borrowed an African word ubuntu (pronounced Ooh-BOON-too) and roughly means "I am, because we are" in English, as the Celtics' unifying team motto.

The Celtics gave the Lakers a 12-minute crash course of ubuntu in the second quarter.

Boston outscored Los Angeles 34-19, getting 11 field goals on 11 assists. The Celtics toyed with the Lakers, outworking the Western Conference's best inside and out and showing the same kind of heart that made Boston the center of pro basketball's universe in the '60s. "

It's not what you thought, but it's still nice to see that the ubuntu concept is used successfully in sports too. I wonder what parallel we can make between the Lakers' game last night and an operating system. The Windows Blue Screen of Death comes to mind.

Tuesday, June 17, 2008

Security testing for agile testers

I've been asked by Lisa Crispin to contribute a few paragraphs on security testing to an upcoming book on agile testing that she and Janet Gregory are co-authoring. Here's what I came up with:

Security testing is a broad topic that cannot be possibly covered in a few paragraphs. Whole books have been devoted to this subject. Here we will try to at least provide some guidelines and pointers to books and tools that might prove useful to agile teams interested in security testing.

Just like functional testing, security testing can be viewed and conducted from two perspectives: from the inside out (white-box testing) and from the outside in (black-box testing).

Inside-out security testing assumes that the source code for the application under test is available to the testers. The code can be analyzed statically with a variety of tools that try to discover common coding errors which can make the application vulnerable to attacks such as buffer overflows or format string attacks. (Resources:
http://en.wikipedia.org/wiki/Buffer_overflow and
http://en.wikipedia.org/wiki/Format_string_vulnerabilities)

A list of tools that can be used for static code analysis can be found here:
http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis

The fact that the testers have access to the source code of the application also means that they can map what some books call "the attack surface" of the application, which is the list of all the inputs and resources used by the program under test. Armed with a knowledge of the attack surface, testers can then apply a variety of techniques that attempt to break the security of the application. A very effective class of such techniques is called fuzzing and is based on fault injection. Using this technique, the testers try to make the application fail by feeding it various types of inputs (hence the term fault injection). These inputs range from carefully crafted strings used in SQL Injection attacks, to random byte changes in given input files, to random strings fed as command line arguments. (Resources:
http://www.fuzzing.org/category/fuzzing-book/ and
http://www.fuzzing.org/fuzzing-software)

The outside-in approach is the one mostly used by attackers that try to penetrate into the servers or the network hosting your application. As a security tester, you need to have the same mindset that attackers do, which means that you have to use your creativity in discovering and exploiting vulnerabilities in your own application. You also need to stay up to date with the latest security news and updates related to the platform/operating system your application runs on. These tasks are by no means easy, they require extensive knowledge, and as such are mostly outsourced to third parties that specialize in security testing.

So what are agile testers to do when faced with the apparently insurmountable task of testing the security of their application? Here are some practical, pragmatic steps that anybody can follow:

1. Adopt a continuous integration (CI) process that periodically runs a suite of automated tests against your application.

2. Learn how to use one or more open source static code analysis tools. Add a step to your CI process which consists of running these tools against your application code. Mark the step as failed it the tools find any critical vulnerabilities.

3. Install an automated security vulnerability scanner such as Nessus
(http://www.nessus.org/nessus/). Nessus can be run in a command-line, non-GUI mode, which makes it suitable for inclusion in a CI tool. Add a step to your CI process which consists of running Nessus against your application. Capture the Nessus output in a file and parse that file for any high importance security holes found by the scanner. Mark the step as FAIL when any such holes are found.

4. Learn how to use one or more open source fuzzing tools. Add a step to your CI process which consists of running these tools against your application code. Mark the step as failed it the tools find any critical vulnerabilities.

As with any automated testing effort, running these tools is no guarantee that your code and your application will be free of security defects. However, running these tools will go a long way towards improving the quality of your application in terms of security. As always, the 80/20 rule applies. These tools will probably find the 80% most common security bugs out there while requiring 20% of your security budget.

To find the remaining 20% security defects, you're well advised to spend the other 80% of your security budget on high quality security experts. They will be able to test your application security thoroughly by the use of techniques such as SQL injection, code injection, remote code inclusion and cross-site scripting. While there are some tools that try to automate some of these techniques, they are no match for a trained professional who takes the time to understand the inner workings of your application in order to craft the perfect attack against it.

Tools for troubleshooting Web app performance

I came across this blog post which talks about 15 tools that can make your life easier when you need to troubleshoot the performance of your Web application. I knew about most of them, but a new addition to my arsenal is definitely wbox -- think of it as an HTTP-based ping. Very simple, but extremely useful.

Thursday, June 12, 2008

What does your Wordle look like?

The meme du jour seems to be Wordle tag clouds. I couldn't resist generating one out of the text on the first page of my blog. Here it is, in all its splendor:

You would think I'm very self-centered, since my first and last names appear so prominently. But I think it's because every blog post ends with "posted by Grig Gheorghiu at ". The next biggest word is Python, so there's some redemption for me right there :-)

Friday, May 23, 2008

Incremental backups to Amazon S3

Based on this great blog post by Tim McCormack, I managed to write some scripts that back up files to Amazon S3. The files are encrypted with GnuPG and rsync-ed to S3 using a Python-based tool called duplicity.

Here's what I did in order to get all this going on a CentOS 5.1 server running Python 2.5.

1) Signed up for Amazon S3 and got the AWS_ACCESS_KEY_ID and the AWS_SECRET_ACCESS_KEY.

2) Downloaded and installed the following packages: boto, GnuPGInterface, librsync, duplicity. All of them except librsync are Python-based, so they can be installed via 'python setup.py install'. For librsync you need to use './configure; make; make install'.

3) Generated a GPG key pair using "gpg --gen-key". Made a note of the hex fingerprint of the key (you can list the fingerprints of your keys via "gpg --fingerprint").

4) Wrote a simple boto-based Python script to create and list S3 buckets (the equivalent of directories in S3 parlance). Note that boto uses SSL, so your Python installation needs to have SSL enabled.

Here's how the script looks:


#!/usr/bin/env python

ACCESS_KEY_ID = 'theaccesskeyid'
SECRET_ACCESS_KEY = 'thesecretaccesskey'

from boto.s3.connection import S3Connection
conn = S3Connection(ACCESS_KEY_ID, SECRET_ACCESS_KEY)
buckets = [
         'mybuckets_myserver_mysqldump',
         'mybuckets_myserver_full',
     ]
for bucket in buckets:
 conn.create_bucket(bucket)
rs = conn.get_all_buckets()
print 'Bucket listing:'
for b in rs:
 print b.name

5) Wrote a bash script (heavily influenced by Tim McCormack's post) that runs duplicity and backs up the root partition of my Linux server (minus some directories) to S3. The nice thing about duplicity is that it uses rsync, so it only transfers the diffs over the wire. Here's how my script looks like:


export myEncryptionKeyFingerprint=somehexnumber
export mySigningKeyFingerprint=somehexnumber
export AWS_ACCESS_KEY_ID=accesskeyid
export AWS_SECRET_ACCESS_KEY=secretaccesskey
export PASSPHRASE=mypassphrase

/usr/local/bin/duplicity --encrypt-key=$myEncryptionKeyFingerprint
--sign-key=$mySigningKeyFingerprint --exclude=/sys --exclude=/dev
--exclude=/proc --exclude=/tmp --exclude=/mnt --exclude=/media /
s3+http://mybuckets_myserver_full

export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export PASSPHRASE=

NOTE: duplicity will interactively prompt you for your GPG key's passphrase, unless you have a variable called PASSPHRASE that contains the passphrase. Since I wanted to run this script as a cron job, I chose the less secure way of specifying the passphrase in clear inside the script. YMMV.

That's about it. Running the script produces an output such as this:


--------------[ Backup Statistics ]--------------
StartTime 1211482825.55 (Thu May 22 12:00:25 2008)
EndTime 1211488426.17 (Thu May 22 13:33:46 2008)
ElapsedTime 5600.62 (1 hour 33 minutes 20.62 seconds)
SourceFiles 174531
SourceFileSize 5080402735 (4.73 GB)
NewFiles 174531
NewFileSize 5080402735 (4.73 GB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 174531
RawDeltaSize 1200920038 (1.12 GB)
TotalDestinationSizeChange 2702953170 (2.52 GB)
Errors 0
-------------------------------------------------

The first time you run the script it will take a while, but subsequent runs will only back up the files that were changed since the last run. For example, my second run transferred only 19.3 MB:

--------------[ Backup Statistics ]--------------
StartTime 1211529638.99 (Fri May 23 01:00:38 2008)
EndTime 1211529784.18 (Fri May 23 01:03:04 2008)
ElapsedTime 145.19 (2 minutes 25.19 seconds)
SourceFiles 174522
SourceFileSize 5084478500 (4.74 GB)
NewFiles 64
NewFileSize 2280357 (2.17 MB)
DeletedFiles 28
ChangedFiles 418
ChangedFileSize 217974696 (208 MB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 510
RawDeltaSize 2465010 (2.35 MB)
TotalDestinationSizeChange 20211663 (19.3 MB)
Errors 0

ASas
-------------------------------------------------

To restore files from S3, you use duplicity and specify the source as s3+http://mybuckets_myserver_full and the destination as a local directory.

Thanks to Tim McCormack for his detailed blog post, it made things so much easier than digging all this info by Google Fu.

Monday, May 19, 2008

Compiling Python 2.5 with SSL support

If you compile Python 2.5.x from source, you need to jump through some hoops so that SSL support is enabled. Googling around, I found Patrick Altman's excellent blog post talking about this very issue.

In my case, I needed to enable SSL support for Python 2.5.2 on CentOS 5.1. I already had the openssl development libraries installed:

# yum list installed | grep ssl
mod_ssl.i386 1:2.2.3-11.el5_1.cento installed
openssl.i686 0.9.8b-8.3.el5_0.2 installed
openssl-devel.i386 0.9.8b-8.3.el5_0.2 installed

Here's what I did next, following Patrick's post:

1) edited Modules/Setup.dist from the Python 2.5.2 source distribution and made sure the correct lines were put back in (they were commented out by default):

_socket socketmodule.c

# Socket module helper for SSL support; you must comment out the other
# socket line above, and possibly edit the SSL variable:
#SSL=/usr/local/ssl
_ssl _ssl.c \
-DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
-L$(SSL)/lib -lssl -lcrypto

2) ran ./configure; make; make install

3) verified that I can access socket.ssl:

# python2.5
Python 2.5.2 (r252:60911, May 19 2008, 14:23:27)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.ssl
function ssl at 0xb7ef410c>

That's it. Not sure why it's so non-intuitive though.

Thursday, May 15, 2008

Encrypting a Linux root partition with LUKS and DM-CRYPT

One of our customers needed to have his Linux laptop's root partition encrypted. We found a HOWTO on achieving this with RHEL5, and we adapted it for CentOS 5. The technique is based on LUKS and DM-CRYPT. Kudos to my colleague Chris Evans for going through the exercise of getting this to work on CentOS 5 and for producing the documentation that follows, which I'm posting here hoping that it will benefit somebody at some point.

* Boot off of a Live CD, I used Fedora Core 9 Preview
* Find out which disk is which; for me /dev/sda was the external usb, and /dev/sdb was the internal

sfdisk -d /dev/sdb | sfdisk /dev/sda
pvcreate --verbose /dev/sda2
vgextend --verbose VolGroup00 /dev/sda2
pvmove --verbose /dev/sdb2 /dev/sda2 # This takes ages
vgreduce --verbose VolGroup00 /dev/sdb2
pvremove --verbose /dev/sdb2
fdisk /dev/sdb

* Change the partition type to 83 for /dev/sdb2
* Here is when you get to choose the password that will protect your partition:

cryptsetup --verify-passphrase --key-size 256 luksFormat /dev/sdb2

cryptsetup luksOpen /dev/sdb2 cryptroot
pvcreate --verbose /dev/mapper/cryptroot
vgextend --verbose VolGroup00 /dev/mapper/cryptroot
pvmove --verbose /dev/sda2 /dev/mapper/cryptroot # This takes ages
vgreduce --verbose VolGroup00 /dev/sda2
pvremove --verbose /dev/sda2
mkdir /mnt/tmp
mount /dev/VolGroup00/LogVol00 /mnt/tmp
cp -ax /dev/* /mnt/tmp/dev # I said no to overwriting any files
chroot /mnt/tmp/
(chroot) # mount -t proc proc /proc
(chroot) # mount -t sysfs sysfs /sys
(chroot) # mount /boot
(chroot) # swapon -a
(chroot) # vgcfgbackup

For the initrd, the blog mentions /etc/sysconfig/mkinitrd as a file. CentOS had a directory, I tried doing their suggestion as a file in there, moving the directory out, and making the file as they suggested. Both failed. So I ran the following command:

(chroot) # mkinitrd -v /boot/initrd-2.6.18-53.el5.crypt.img --with=aes --with=sha256 --with=dm-crypt 2.6.18-53.el5

Now we need to modify the initrd so that it will decrypt the partition at boot time

(chroot) # cd /boot
(chroot) # mkdir /boot/initrd-2.6.18-53.el5.crypt.dir
(chroot) # cd /boot/initrd-2.6.18-53.el5.crypt.dir
(chroot) # gunzip < ../initrd-2.6.18-53.el5.crypt.img | cpio -ivd

Now, we need to modify init by adding the following lines after the line which reads “mkblkdevs” and before “echo Scanning and configuring dmraid supported devices.”:

echo Decrypting root device
cryptsetup luksOpen /dev/sda2 cryptroot
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure vg00

Copy cryptsetup and lvm to be put into the initrd, the blog doesn't mention it, but I'm sure it needs it.

cp /sbin/cryptsetup bin/
cp /sbin/lvm bin/

Compress the new initrd

find ./ | cpio -H newc -o | gzip -9 > /boot/initrd-2.6.18-53.el5.crypt.img

Modify the grub.conf. Copy the grub entry for the current kernel, and change as follows

title Centos Encrypted Server (2.6.18-53.1.4.el5)
initrd /initrd-2.6.18-53.el5.crypt.img

Unmount the fs's in the chroot, and exit

cd /
umount /boot
umount /proc
umount /sys
exit

NOTE: Don't upgrade the kernel without upgrading the initrd and grub.conf.

Reboot and test :)

At this point you have an encrypted root partition. You should be prompted for a password during the boot process (the boot partition is not encrypted). If somebody steals your laptop, they won't be able to mount the root partition without knowing the password.

After you have crypto setup, you can find out information about it (such as the crypto algorithm used) via this command:

# cryptsetup luksDump /dev/sda2
LUKS header information for /dev/sda2

Version:        1
Cipher name:    aes
Cipher mode:    cbc-essiv:sha256
Hash spec:      sha1
Payload offset: 2056
MK bits:        256
MK digest:      af 2e e6 39 3e 79 60 bb 4a 2b 33 05 1c 86 3a 83 bc a0 ef c1
MK salt:        79 b2 13 53 6f 52 72 a1 b5 3d dc d3 72 cd d6 f4
               e3 25 3c 6e 08 00 f3 1d 44 1e 90 47 bc 43 e7 07
MK iterations:  10
UUID:           721abe52-5122-447b-8ed0-5ca3b2b32366

Key Slot 0: ENABLED
       Iterations:             247223
       Salt:                   86 c7 53 6a 13 a9 77 81 89 ec 90 b3 e5 6a ea 8d
                               da 0c 6f ad ec 3e 3c 47 2d 6e 5f 59 28 4e 7c 63
       Key material offset:    8
       AF stripes:             4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED

Thursday, May 08, 2008

Notes from the latest SoCal Piggies meeting

...have been posted to the "Happenings in Python User groups" blog.

Update: Ben Bangert sent me the slides he used. You can download or view the PDF from here.

Monday, May 05, 2008

Guido open sources Code Review app running on GAPE

Not sure why this wasn't publicized more, but Guido van Rossum announced today that he open sourced the code for Code Review, a Google AppEngine app he released last week. Code Review is based on Mondrian, the internal code review tool that Guido wrote for Google. The relationship between the two apps in terms of features is: Code Review < Mondrian.

The code for Code Review is part of a Google code project called Rietveld. I haven't looked at it yet, but I'll certainly do so soon, just to see the master's view on how to write a GAPE application.

Ruby to Python bytecode compiler

Kumar beat me to it, but I'll mention it here too: Why the Lucky Stiff published a Ruby-to-Python-bytecode compiler, as well as tools to decompile the byte code into source code. According to the README file, he based his work on blog posts by Ned Batchelder related to dissecting Python bytecode. I wholeheartedly agree with Why's comment at the end of the README file:

  You know, it's crazy that Python
 and Ruby fans find themselves
 battling so much.  While syntax
 is different, this exercise
 proves how close they are to
 each other!  And, yes, I like
 Ruby's syntax and can think much
 better in it, but it would be
 nice to share libs with Python
 folk and not have to wait forever
 for a mythical VM that runs all
 possible languages.

Tuesday, April 29, 2008

Special guest for next SoCal Piggies meeting

We'll have the SoCal Piggies meeting this Thursday May 1st at the Gorilla Nation office in Culver City. Our special guest will be Ben Bangert, the creator of Pylons, who will give us an introduction to his framework. We'll also have a presentation from Pablo Noego from Gorilla Nation on a chat application he wrote using Google App Engine. We'll probably also have an informal discussion on Python mock testing tools and techniques.

BTW, I am putting together a Google code project for mock testing techniques in Python, in preparation for a presentation I would like to give to the group at some point. I called the project moctep, in honor of that ancient Egyptian deity, the protector of testers (or mockers, or maybe both). It doesn't have much so far, but there's some sample code you can browse through in the svn repository if you're curious. I'll be adding more meat to it soon.

Anyway, if you're a Pythonista who happens to be in the L.A. area on Thursday, please consider attending our meeting. It will be lots of fun, guaranteed.

Tuesday, April 22, 2008

"OLPC Automated Testing" project accepted for SoC

I'm happy to say that Zach Riggle's application for this year's Google Summer of Code, "OLPC Project Automated Testing", was accepted. I'm looking forward to mentoring Zach, and having Titus as a backup mentor. There's some very cool stuff that can be done in this area, and I hope that at the end of the summer we'll have some solid automated testing techniques and tools that can be applied to any Python project, not only to the OLPC Sugar environment. Stay tuned for more info on this project. BTW, here is the list of PSF-sponsored applications accepted for this years' SoC.

Thursday, April 17, 2008

Come work for RIS Technology

We just posted this on craigslist, but it never hurts to blog about it too. If you're interested, send an email to techjobs at ristech.net. You and I might get to work together on the same team!

Open Source Tech Top Guns Wanted

Are you a passionate Linux user? Are you running the latest Ubuntu alpha release on your laptop just because you can? Are you wired to the latest technologies -- things like Amazon EC2/S3 and Google AppEngine? Are you a virtuoso when it comes to virtualization (Xen/VMWare)?

Do you program in Python? Do you take hard problems as personal challenges and don't give up until you solve them?

RIS Technology Inc. is a rapidly growing Los Angeles-based premium managed hosting provider that hosts and manages internet applications for medium to large size organizations nationwide. We have grown consistently at 100% each of the past four years and are currently hiring for additional growth at our corporate operations center near LAX, in Los Angeles, CA. We have immediate openings for dedicated and knowledgeable technology engineers. If the answer to the questions above is YES, then we'd like to extend an invitation to interview with us.

We are an equal opportunity employer and have excellent benefits. We realize that one of the main things that makes us excellent are the people we choose to work with. We look for the best and brightest and our goal is to make work less "work" and more fun.

Wednesday, April 16, 2008

Google App Engine feels constrictive

I've been toying a bit with Google App Engine. I was lucky enough to score one of the 10,000 developer accounts. I first went through their tutorial, which was fine. Then I tried to port a simple application that I used to run from the command line, which queried a range of IP addresses for their reverse DNS names. No luck. I was using the dnspython module, which in turn uses the Python socket module -- and socket is not available within the Google App Engine sandbox environment.

Also, I was talking to Michał on rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... It seems that with everything I've tried with GAE I've run into a wall so far. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.

What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.

Friday, April 11, 2008

Ubuntu Gutsy woes with Intel 801 graphics card

I just upgraded my Dell Inspiron 6000 laptop to Ubuntu Gutsy last night. My graphics card is based on the Intel 810 chipset. After the upgrade, everything graphics-related was dog-slow. Scrolling in Firefox was choppy, IM-ing was choppy, even typing at the console was choppy. Surprisingly, I didn't find a lot of solutions to this problem. But many people on Ubuntu forums suggested disabling compiz/xgl, so that's what I ended up doing. In fact, I uninstalled all compiz and xgl-related packages, rebooted, and graphics became snappy again. Now back to trying to write an application to run on THE GOOGLE.

Thursday, April 10, 2008

Meme du jour: shell history

Here's mine from my Ubuntu laptop:


$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}' |sort -rn|head
121 cd
91 ssh
82 ls
46 vi
28 python
26 scp
16 dig
12 more
7 twistd
6 rm

Thursday, April 03, 2008

Steve Loughran on 'Farms, Fabrics and Clouds'

Yesterday I and my colleagues at RIS Technology had the pleasure of attending a remote presentation given to us by Steve Loughran, who works as a researcher at HP Labs and is also a committer on the Ant project. I had seen Steve's slides from a presentation he gave at the University of Bristol on 'Farms, Fabrics and Clouds' back in December 2007, and I have been pestering him via email ever since, hoping to have him release a screencast. After much back and forth, Steve offered to simply present for now directly to us via Skype. He did it out of the goodness of his heart, but both he and I realized that there's a nice little business opportunity in this type of presentation: you release the slides with no audio, then you get hired to present to interested parties in person, remotely, via Skype and a shared set of slides, with a Q&A session at the end. Everybody wins in this scenario. Filing it in the 'ideas worth trying' category.

To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.

Steve talked about how the infrastructure architectures have changed over the years from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running to their maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game by their S3 and EC2 offerings. Even though they're not quite ready for prime time in terms of production deployments, Amazon will soon get there. As a proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.

In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in our day and time:

it is expensive to create, deploy and duplicate a new system, running a Linux image of your choice (see Instalinux as a counter-example)
system failure is unusal and 100% availability can be achieved
databases are the best form of storage
you need physical access to the data center
a single server farm needs to scale to infinity

My other favorite part, which is not in the online slides yet, is the concept of 'agile infrastructure'. I haven't seen this concept before applied to server hosting, but Steve has a great point here. If you look at something like Amazon EC2, where you can pay as you go, you can test you application in a smaller environment and then scale it up, you can move your application between data centers -- this is indeed an agile environment that also imposes some new demands on your application.

I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.

Here are a couple more links that might prove useful:

Anubis: a tuple-space implementation that uses multicast to share information between hosts within a site
SmartFrog: a technology from HP used to distribute and manage applications (think puppet but geared towards application deployment); see also Google video

Thanks again to Steve for presenting to us. Now, as a server farmer, I need to go back to my plow and try to improve it (maybe buy a tractor?)

Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.

Wednesday, April 02, 2008

For you students interested in GSoC

If you're a student and you want to apply for a Python-related project for Google Summer of Code 2008, Matt Harrison has just the project for you. The project has to do with branch coverage analysis and reporting. Matt is willing to mentor too. It's a really good opportunity, so don't hesitate to apply. Hurry up though, the deadline is April 8th.

Tuesday, April 01, 2008

TurboGears and Pylons finally merging

This has been a long time coming, and fans of both projects have been eagerly waiting for it, but it's finally happened. Not sure if you've seen the announcements from Kevin Dangoor, Mark Ramm and Ben Bangert on their projects' mailing lists, but basically they boil down to "we feel like after the sprints at PyCon we made enough progress so that we can pull the trigger on merging the source code from the 2 projects in one common trunk." They make it sound like it was purely a technological problem, but I have my doubts about that. I think it was driven in part by the increasing popularity of Django. Unifying TurboGears and Pylons is a somewhat desperate measure to chip away at the Django market share. We'll see if it works or not. Check out the brand new page of the TurboPylons project.

Monday, March 31, 2008

ReviewBoard: open source code review tool

Via Marc Hedlund's post on O'Reilly Radar, here's an open source code review tool from VMWare: ReviewBoard. For all of us non-googlers out there, it's probably the next best thing to Guido's Mondrian (question: why has that tool not been released as open source?). Check out the sweet screenshots. The kicker though is that it uses Python and Django. Way to go, VMWare!