Tuesday, March 18, 2008

Links to resources from PyCon talks

I took notes at the PyCon talks I attended, and I'm gathering links to the resources referenced in them. Hopefully they'll be useful to somebody (I know they will be to me, at least).

"MPI Cluster Programming with Python and Amazon EC2" by Pete Skomoroch

* slides in PDF format
* Message Passing Interface (MPI) modules for Python: mpi4py, pympi
* ElasticWulf project (Beowulf-like setup on Amazon EC2)
* IPython1: parallel computing in Python
* EC2 gotchas

"Like Switching on the Light: Managing an Elastic Compute Cluster with Python" by George Belotsky

* S3FS: mount S3 as a local file system using Fuse (unstable)
* EC2UI: Firefox extension for managing EC2 clusters
* S3 Organizer: Firefox extension for managing S3 storage
* bundling an EC2 AMI and storing it to S3
* the boto library, which allows programmatic manipulation of Amazon Web Services such as EC2, S3, SimpleDB etc. (a python-boto package is available for most Linux distributions too; for example, 'yum install python-boto')

"PyTriton: building a petabyte storage system" by Jonathan Ellis

* All this was done at Mozy (online remote backup, now owned by EMC, just like Avamar, the company I used to work for)
* They maxed out Foundry load balancers, so they ended up using LVS + ipvsadm
* They used erasure coding for data integrity -- they rolled their own algorithm, but Jonathan recommended that people use zfec, developed by AllMyData
* An alternative to erasure coding would be to use RAID6, which is used by Carbonite
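
To make the erasure coding idea concrete, here is a minimal sketch of the simplest possible scheme: a single XOR parity block that can rebuild any one lost block. This is not Mozy's algorithm and not how zfec works internally (real erasure codes can survive several simultaneous losses); the function names are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the data blocks followed by one parity block (hypothetical helper)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(blocks):
    """Rebuild at most one missing block, marked as None (hypothetical helper)."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("a single parity block can repair only one lost block")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        # XOR of all surviving blocks (data + parity) equals the lost block.
        repaired[missing[0]] = xor_blocks(survivors)
    return repaired


stored = encode([b"abcd", b"efgh", b"ijkl"])
damaged = list(stored)
damaged[1] = None  # simulate losing one block
assert recover(damaged) == stored
```

Losing two blocks at once is beyond what this toy scheme can repair, which is the gap that schemes like RAID6 and zfec address.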

"Use Google Spreadsheets API to create a database in the cloud" by Jeffrey Scudder

* slides online
* APIs and documentation on Google Code

"Supervisor as a platform" by Chris McDonough and Mike Naberezny

* slides online
* supervisord home page

"Managing complexity (and testing)" by Matt Harrison

* slides online
* PyMetrics module for measuring the McCabe complexity of your code
* coverage module and figleaf module for measuring your code coverage
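
For a feel of what McCabe (cyclomatic) complexity measures, here is a rough sketch using only the standard library's ast module. It is an approximation written for illustration, not PyMetrics' implementation: it starts at 1 and adds one for each decision point it recognizes, and the set of node types counted is my own assumption.

```python
import ast

# Node types treated as decision points (an assumed, simplified set).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source):
    """Approximate the McCabe complexity of a chunk of Python source."""
    complexity = 1  # one path through straight-line code
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # 'a and b and c' adds two extra short-circuit paths
            complexity += len(node.values) - 1
    return complexity


sample = """
def f(x):
    if x > 0 and x < 10:
        return 1
    for i in range(x):
        pass
    return 0
"""
print(cyclomatic_complexity(sample))
```

The sample scores 4: the base path, the if, the 'and', and the for loop. The higher the number, the more test cases you need to cover every path, which is where the coverage tools come in.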

Resources from lightning talks

* bug.gd -- online repository of solutions to bugs, backtraces, exceptions etc (you can easy_install bug.gd, then call error_help() after you get a traceback to try to get a solution)
* geopy -- geocode package
* pvote.org -- Ka-Ping Yee's electronic voting software in 460 lines of Python (see also Ping's PhD dissertation on the topic of Building Reliable Voting Machine Software)
* bitsyblog -- a minimalist approach to blog software

6 comments:

Anonymous said...

For all online backup, file sharing and storage related info, I recommend this website:

http://www.BackupReview.info

Doug Napoleone said...

Grig,

There is a bug in the scheduling app where talk data cannot be uploaded to tutorials (but regular talks are working).

Would you be interested in helping me collect the talk slides that have been posted online and get them into the app for central storage on the site?

Grig Gheorghiu said...

Doug -- sure, I can help you with that. Send me an email to grig at gheorghiu.net to coordinate.

Grig

AJ said...

Thanks for taking the time to compile this list of helpful info on PyCon 2008. I wasn't able to attend this year, and it's nice to see some of the content.

just-another.net said...

You didn't happen to dig up any resources from "Eggs and Buildout Deployment in Python" by Jeff Rush, did you? He said he'd email us the PowerPoint, but I've yet to see it.

Grig Gheorghiu said...

just-another.net -- I haven't found Jeff's slides yet. If you want to email him personally, his email is jeff at taupro.com.

Grig