Wednesday, May 24, 2006

Cheesecake and the Summer of Code

I'm very pleased to announce that Michał Kwiatkowski's project "Cheesecake enhancements and its integration with PyPI" was accepted as a Google Summer of Code project under the Python Software Foundation umbrella. Here's a summary of Michał's application:

Cheesecake is an application designed to evaluate and estimate the overall quality (the so-called 'kwalitee') of a given software package written in Python. It emphasizes the need for well-written documentation and unit tests, encouraging good programming practices and penalizing sloppy design and careless distribution. Using Cheesecake to check your code gives you confidence that your software doesn't merely run, but is also usable and easy to test and modify.

Because Python is very easy to learn and use, a vast variety of software has been written in it, most of which was scattered until PyPI was created. Now that new packages are being indexed on the Cheese Shop every day, an effort can be made to spread the spirit of good software design and code reuse among the Python community. This can be achieved by combining the power of Cheesecake and the Cheese Shop. Every time a new version of a package is uploaded to the Cheese Shop, its Cheesecake index will be calculated and published on the web. Having a way to measure the quality of a package relative to other existing packages will be of invaluable help to all developers. It will promote well-built packages and, in the long run, raise the overall quality of Python software.

Adding Cheesecake functionality to PyPI has already been mentioned by Phillip J. Eby on the catalog-sig mailing list. Together with Cheesecake maintainer Grig Gheorghiu, I've discussed the modifications Cheesecake needs in order to be reliable enough to be incorporated into the PyPI service. A working copy of our ideas is accessible on the project wiki. It includes enhancing Cheesecake's code scoring techniques to take into account a package's unit tests, running tests in a secure environment, extending the supported archive formats, and fixing all known bugs. Development of Cheesecake will adhere to best practices such as unit testing, continuous integration (via buildbot), pylint verification, etc.

The next part of this project will include collaboration with Richard Jones, the PyPI maintainer, and merging Cheesecake into the PyPI service. Upon completion, all PyPI uploads will be automatically scored by Cheesecake. It will be possible to browse the package archive by Cheesecake index, sorting results by installability, documentation, and code kwalitee index. Statistics in numeric and graphical form will also be made available. This part of the project will involve writing server-side code, with emphasis on security and robustness.

The remaining time will be spent on resolving any problems that occur during usage of Cheesecake and PyPI. Along with fixing bugs, I will develop a simple Hello world package that can be taken as an example of good development practices for all Python developers. It should also score 100% in the Cheesecake test, of course. ;-) It will be what hello is for the GNU Project.

If you're interested in details, this Cheesecake wiki page contains a lot of ideas which will start being turned into reality as of today :-) Please feel free to edit the page and add your own wishlist-type items.

Here are a few thoughts I had regarding the value of this project:

This project will make two very important contributions. First, it will integrate with PyPI and help rank the Cheese Shop packages according to various quality criteria. People learn better by example -- and what better examples than tools that score high on a scale that looks at different quality indicators such as documentation, installability, and code 'kwalitee'? Cheesecake will provide a way to identify the best-of-breed packages in those areas.
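To make the ranking idea a bit more concrete, here is a minimal sketch of sorting packages by one of the three indexes. Everything in it is an assumption made for illustration: the `PackageScore` class, the `rank` helper, the sample package names and numbers, and especially the unweighted average used for the overall score. It does not describe how Cheesecake or PyPI actually compute or store scores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageScore:
    """Hypothetical per-package scores, each expressed as a percentage."""
    name: str
    installability: float
    documentation: float
    code_kwalitee: float

    @property
    def overall(self) -> float:
        # Assumption: a plain unweighted mean of the three indexes.
        return (self.installability + self.documentation + self.code_kwalitee) / 3


def rank(scores, key="overall"):
    """Return the scores sorted from best to worst on the given index."""
    return sorted(scores, key=lambda s: getattr(s, key), reverse=True)


# Made-up sample data, not real measurements.
packages = [
    PackageScore("alpha", 90.0, 40.0, 70.0),
    PackageScore("beta", 60.0, 95.0, 80.0),
    PackageScore("gamma", 75.0, 75.0, 30.0),
]

for pkg in rank(packages, key="documentation"):
    print(f"{pkg.name}: documentation={pkg.documentation:.0f}% overall={pkg.overall:.1f}%")
```

Changing the `key` argument to `"installability"` or `"code_kwalitee"` gives the other orderings.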

Second, the project will investigate ways to dynamically assess packages by executing their code in a sandbox environment. This will help mainly with getting code coverage numbers by running a project's unit tests, but one can easily envision many other applications -- one idea that Titus Brown had was to automatically apply and verify patches to Python core, without the fear that the host machine will crash and burn. This will hopefully streamline the process of accepting patches into Python core (a famously complicated process currently).

Michał and I will use Trac to manage this project. The idea is to have short iterations represented as milestones in Trac, with tickets of type 'enhancement' that represent the stories to be done in each iteration. Each story will be split into short tasks that can be accomplished in a matter of hours, and each task will be represented as a ticket of type... 'task', what else? This will give us a nice way of watching the progress of the project over the summer. Of course, the criterion for the completion of a given story is that all unit, acceptance, and functional tests pass for that story.

I'm very excited to have Michał work on this project and I'm very hopeful that at the end of this summer we'll have a solid application that will benefit the Python community.

Here is the list of the 25 applications accepted to the Summer of Code under the PSF umbrella.

2 comments:

will said...

That's awesome! If Michael needs help testing things or wants a guinea pig, let me know. PyBlosxom still needs a lot of help and so it'd be a good project to test with.

I joined the cheesecake-devel mailing list, but you're more than welcome to email me directly.

Rock on Grig!

/will

Grig Gheorghiu said...

Will -- thanks for the comment and for offering PyBlosxom as a guinea pig :-) We'll surely take you up on your offer!

Grig