As I mentioned in my previous post, there's an ongoing discussion on the agile-testing mailing list on the merits of white-box vs. black-box testing. I had a lively exchange of opinions on this theme with Ron Jeffries. If you read my "Quick black-box testing example" post, you'll see the example of an application under test posted by Ron, as well as a list of black-box test activities and scenarios that I posted in reply. Ron questioned most of these black-box test scenarios, on the grounds that they provide little value to the overall testing process. In fact, I came away with the conclusion that Ron values black-box testing very little. He is of the opinion that white-box testing in the form of TDD is pretty much sufficient for the application to be rock-solid and as bug-free as any piece of software can hope to be.
I have never had the chance to work on an agile team, so I can't really tell if Ron's assertion is true or not. But my personal opinion is that developers doing TDD cannot catch several classes of bugs that lie outside their code-only realm. I'm thinking most of all about the various quality criteria categories, also known as 'ilities', popularized by James Bach. Here are some of them: usability, installability, compatibility, supportability, maintainability, portability, localizability. All these are qualities that are very hard to test in a white-box way. They all involve interactions with the operating system, with the hardware, and with the other applications running on the machine hosting the application under test (AUT). To this list I would add performance/stress/load testing, security testing, and error recoverability testing. I don't see how you can properly test all these things if you don't do black-box testing in addition to white-box testing.
In fact, there's an important psychological distinction between developers doing TDD and 'traditional' testers doing mostly black-box testing. A developer thinks "This is my code. It works fine. In fact, I'm really proud of it.", while a tester is more likely to think "This code has some really nasty bugs. Let me discover them before our customer does." These two approaches are complementary. You can't perform just one at the expense of the other, or else your overall code quality will suffer. You need to build code with pride before you try to break it in various devious ways.
Here's one more argument from Ron as to why white-box testing is more valuable than black-box testing:
To try to simplify: the search method in question has been augmented with an integer "hint" that is used to say where in the large table we should start our search. The idea is that by giving a hint, it might speed up the search, but the search must always work even if the hint is bad.
The question I was asking was how we would test the hinting aspect.
I expect questions to arise such as those Michael Bolton would suggest, including perhaps:
What if the hint is negative?
What if the hint is after the match?
What if the hint is bigger than the size of the table?
What if integers are actually made of cheese?
What if there are more records in the table than a 32-bit int?
Then, I propose to display the code, which will include, at the front, some lines like this:
if (hint < 1) hint = 0;
if (hint > table.size) hint = 0;
Then, I propose to point out that if we know that code is there, there are a couple of tests we can save. Therefore white box testing can help make testing more efficient, QED.
My counter-argument was this: what if you mistakenly build a new release of your software out of some old revision of the source code, a revision which doesn't contain the first two lines of the search method? Presumably the old version of the code was TDD-ed, but since the two lines weren't there, it had no unit tests for them either. So if you didn't have black-box tests exercising those values of the hint argument, you'd let an important bug escape into the wild. I don't think it's that expensive to create automated tests that verify the behavior of the search method with various well-chosen values of the hint argument. Having such a test harness in place goes a long way toward protecting against admittedly weird situations such as the 'old code revision' I described.
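To make the idea concrete, here is a minimal sketch of such a harness in Python. It is not Ron's code: the names search, search_old_revision, find_hint_failures, TABLE, KEYS and HINTS are all hypothetical, and I'm assuming a search that returns the index of the matching record, or None when there is no match. The harness knows nothing about the guard clauses; it only checks that the result is the same no matter what hint is passed in.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def search(table: Sequence[str], key: str, hint: int = 0) -> Optional[int]:
    """Hypothetical stand-in for the search method under discussion."""
    # Guard clauses modeled on the two lines quoted above.
    if hint < 1:
        hint = 0
    if hint > len(table):
        hint = 0
    n = len(table)
    # Start scanning at the hint and wrap around, so a bad hint
    # can slow the search down but never change its result.
    for offset in range(n):
        i = (hint + offset) % n
        if table[i] == key:
            return i
    return None


def search_old_revision(table: Sequence[str], key: str, hint: int = 0) -> Optional[int]:
    """Hypothetical old revision: same idea, but the guard clauses are missing."""
    for i in range(hint, len(table)):
        if table[i] == key:
            return i
    for i in range(0, hint):
        if table[i] == key:
            return i
    return None


def find_hint_failures(
    search_fn: Callable[[Sequence[str], str, int], Optional[int]],
    table: Sequence[str],
    keys: Sequence[str],
    hints: Sequence[int],
) -> List[Tuple[str, int, object]]:
    """Black-box check: for every key and hint, the hinted search must agree
    with a simple oracle that ignores hints entirely."""
    failures: List[Tuple[str, int, object]] = []
    for key in keys:
        expected = list(table).index(key) if key in table else None
        for hint in hints:
            try:
                actual = search_fn(table, key, hint)
            except Exception as exc:
                # Record the exception type instead of aborting the whole run.
                failures.append((key, hint, type(exc).__name__))
                continue
            if actual != expected:
                failures.append((key, hint, actual))
    return failures


# Illustrative data: unique records, plus one key that is not in the table.
TABLE = ["ant", "bee", "cat", "dog", "eel"]
KEYS = ["ant", "cat", "eel", "zebra"]
# Negative, zero, in range, last index, table size, past the end, huge.
HINTS = [-1, 0, 1, 4, 5, 6, 2**31]
```

Calling find_hint_failures(search, TABLE, KEYS, HINTS) returns an empty list for the guarded version. Pointed at search_old_revision, the same call reports failures for the negative hint and for hints past the end of the table, which is exactly the kind of regression the 'old code revision' scenario would otherwise let through.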
In fact, as Amir Kolsky remarked on the agile-testing list, TDD itself can be seen as black-box testing, since when we unit test some functionality, we usually test the behavior of that piece of code and not its implementation, thus we're not really doing white-box testing. To this, Ron Jeffries and Ilja Preuss replied that in TDD, you write the next test with an eye on the existing code. In fact, you write the next test so that it fails against the existing code, exposing the next piece of missing functionality. Then you make it pass, and so on. So you're really looking at both the internal implementation of the code and at its interfaces, as exposed in your unit tests. At this point, it seems to me that we're splitting hairs. Maybe we should talk about developer/code producer testing vs. non-developer/code consumer testing. In fact, just this morning I read a very nice blog post from Jonathan Kohl on a similar topic: "Testing an application in layers". Jonathan talks about manual vs. automated testing (another hotly debated topic on the agile-testing mailing list), but many of the ideas in his post can be applied to the white-box vs. black-box discussion.