Cheesecake: how tasty is your code?

Update 3/20/06: I'm republishing this post in order to fix this blog's atom.xml index file by getting rid of some malformed XML.

Our friends in the Perl community came up with the concept of KWALITEE: "It looks like quality, it sounds like quality, but it's not quite quality". Kwalitee is an empirical measure of how good a specific body of code is: it defines a set of quality indicators and measures the code along them. It is currently used by the CPANTS Testing Service to evaluate the 'goodness' of CPAN packages. Here are some of the quality indicators that go into kwalitee:
  • extractable: does the package use a known packaging format?
  • has_version: does the package name contain a version number?
  • has_readme: does the package contain a README file?
  • has_buildtool: does the package contain a Makefile.PL or Build.PL?
  • has_tests: does the package contain tests?
I think it would be worth having a similar quality indicator for Python modules. Since the Python CPAN equivalent is the PyPI hosted at the Cheese Shop, it stands to reason that the quality indicator of a PyPI package should be called the Cheesecake index, and I hereby declare that I'm starting the Cheesecake project. The goal of the project is to produce a tool that emits a Cheesecake index for a given Python distribution.
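To make this concrete, here is a minimal sketch of what a few kwalitee-style checks could look like for a Python distribution. The file and directory names below (README, setup.py, a test/ directory) are just common conventions, not anything the Cheesecake project has settled on:

```python
import os
import re

def kwalitee_checks(package_path):
    """Run a few kwalitee-style checks on an unpacked distribution directory."""
    files = os.listdir(package_path)
    return {
        # has_version: does the distribution name end in a version number?
        "has_version": bool(re.search(r"\d+(\.\d+)+$",
                                      os.path.basename(package_path))),
        # has_readme: is there a README file of some kind?
        "has_readme": any(f.upper().startswith("README") for f in files),
        # has_buildtool: Makefile.PL on CPAN; setup.py is the Python analogue
        "has_buildtool": "setup.py" in files or "Makefile" in files,
        # has_tests: a test/tests directory or test_* modules at the top level
        "has_tests": any(f in ("test", "tests") or f.startswith("test_")
                         for f in files),
    }
```

Each check returns a boolean, so the results can be summed or weighted later into an overall score.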

Here are some metrics and tools that I think could be used in computing the Cheesecake index, in addition to some of the CPAN kwalitee metrics:
  • unit test coverage: how many methods/functions are exercised in the unit tests?
  • docstring coverage: how many methods/functions have docstrings?
  • PyFlakes/PyLint validation
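Docstring coverage, for instance, is easy to compute with the standard inspect module; here is a minimal sketch:

```python
import inspect

def docstring_coverage(module):
    """Fraction of a module's functions and classes that carry a docstring."""
    members = [obj for name, obj in inspect.getmembers(module)
               if inspect.isfunction(obj) or inspect.isclass(obj)]
    if not members:
        return 1.0  # nothing to document, nothing missing
    documented = sum(1 for obj in members if inspect.getdoc(obj))
    return documented / len(members)
```

A fuller version would recurse into classes to check their methods as well, but the idea is the same.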
As synchronicity would have it, today I came across a post that refers to well-written Python code. Here are some ideas that Micah Elliott shared about what constitutes a "Pythonic" distribution:

  • Has modules grouped into packages, all cohesive, loosely coupled, and of reasonable length
  • Largely follows PEP conventions
  • Avoids reinventing any wheels by using as many Python-provided modules as possible
  • Well documented for users (manpages or other) and developers (docstrings), yet self-documenting with minimal inline commenting
  • Uses distutils for ease of distribution
  • Contains standard informational files such as: BUGS.txt COPYING.txt FAQ.txt HISTORY.txt README.txt THANKS.txt
  • Contains standard directory structure such as: doc/ tools/ (or scripts/ or bin/) packageX/ packageY/ test/
  • Clean UI, easy to use, probably relying on optparse or getopt
  • Has many unit tests that are trivial to run, and code is structured to facilitate building of tests
  • The first example of a pythonic package that comes to my mind is docutils
Checking for some of these things can be automated. Some properties, such as 'clean UI' or 'reasonable length', are more subjective and harder to automate, but in any case they're all very good ideas and a good starting point for computing the Cheesecake index.
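Once the individual checks produce scores, combining them into a single Cheesecake index could be as simple as a weighted average. The indicator names and weights below are purely illustrative; the project would have to settle on its own set:

```python
def cheesecake_index(scores, weights=None):
    """Combine per-indicator scores (each 0.0-1.0) into a single 0-100 index.

    scores:  dict mapping indicator name -> score in [0.0, 1.0]
    weights: optional dict mapping indicator name -> relative weight
    """
    weights = weights or {name: 1 for name in scores}
    total = sum(weights[name] for name in scores)
    raw = sum(scores[name] * weights[name] for name in scores)
    return round(100 * raw / total)
```

For example, a package with a README but no tests, with both indicators weighted equally, would land at 50.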

Any other ideas? Anybody interested in participating in such a project? Leave a comment with your email address or send me email at grig at gheorghiu dot net.


Alex A. Naanou said…
this indeed would be fun :)

It would also be interesting to add a subjective score, on various subjects, that users of the package could contribute.

I would be interested in both such a software (especially if it would be configurable to other styles) and working on the task...
Grig Gheorghiu said…
Micah Elliott adds in an email:

Grig, I think you're onto something here; good idea. I have no
experience with CPANTS, and I'm not sure how many of my ideals could be
checked programmatically. But if your Cheesecake tool comes to
fruition, here are some things that I would personally find useful:

* A command-line version that I could easily run on my projects.

* An output that gives more than just an index/score; maybe a bunch of
stats/indicators like pylint. I.e., it would be, say, "pypkglint" or
"pydistchecker", a higher-level lint that operates on packages
rather than just source files.

* Some checks that might be useful

- Module and package naming conventions. (PEP-8 describes
module-naming, but I see this broken more often than followed in
practice. And it is silent on package names, but the tutorial uses
capitalized names.) Some consistency here would be nice.
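A naming check along these lines could be as simple as a regex over filenames. The pattern below is one reading of the PEP 8 wording ("short, all-lowercase names", with underscores allowed if they improve readability):

```python
import re

# One possible reading of the PEP 8 module-naming convention:
# all-lowercase, starting with a letter, underscores and digits allowed.
MODULE_NAME_RE = re.compile(r"^[a-z][a-z0-9_]*$")

def check_module_name(filename):
    """Return True if a .py filename follows the PEP 8 naming convention."""
    name = filename[:-3] if filename.endswith(".py") else filename
    return bool(MODULE_NAME_RE.match(name))
```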

- Existence of standard files. ESR goes into detail on this in his
"Art of UNIX Programming" book (pp 452).

- Existence of standard directories (those I mentioned before).

- Output of checkee "--help" should satisfy some standards. I
check my own tools by running "help2man" which forces me to setup
optparse to follow a strict format. I have some active RFEs on
optik (optparse) to address this.

- Use of distutils. Maybe just a check for ?

- Consistency of module length. Not sure about this one, but you
might lower the score if some package modules are 10 lines while
others are 10KLOC.

- Number of modules per package. Maybe 4..20 is a good amount?
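The module-length consistency idea could be scored with something as crude as a min/max ratio over per-module line counts. This is purely a sketch; any real scoring curve would need tuning:

```python
def length_consistency(line_counts):
    """Score 1.0 when all modules are similar in length, approaching 0.0
    as the spread grows (e.g. a 10-line module next to a 10KLOC one)."""
    if not line_counts or max(line_counts) == 0:
        return 1.0
    return min(line_counts) / max(line_counts)
```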

- Extra points for existence of something like "api.html", which
indicates that epydoc/pydoc generated API info.

- Extra points for .svn/CVS/RCS directories indicating that version
control is in place. Maybe even glarking of version numbers where
high numbers indicate that code is checked in frequently.

- Use of ReST in documentation, or even in docstrings.

- Count of unit tests. Do module names map to test_modulename in
test directory? How many testXXX functions exist?

- A summary calculation of pylint/pychecker scores for each module.

- Point deduction (or fail!) if any .doc/.xls, etc. files included

- Extra points for use of modules that indicate extra usability was
incorporated, such as: gettext (multi-language), optparse (clean
UI), configparser (fine control), etc.

* A PEP describing the conventions (though some will argue that PEPs
should be enforceable by the compiler, so maybe just a "Cheesecake
Convention" document).

* And of course anything that CPANTS offers :-)

I'm sure people here have more ideas for quality indicators...
jason said…
another excellent thing that could be ripped off from the Perl world is a built-in test target for distutils. I am constantly shocked that there is no python test target that does anything helpful. :(
