Monday, August 01, 2005

White-box vs. black-box testing

As I mentioned in my previous post, there's an ongoing discussion on the agile-testing mailing list on the merits of white-box vs. black-box testing. I had a lively exchange of opinions on this theme with Ron Jeffries. If you read my "Quick black-box testing example" post, you'll see the example of an application under test posted by Ron, as well as a list of black-box test activities and scenarios that I posted in reply. Ron questioned most of these black-box test scenarios, on the grounds that they provide little value to the overall testing process. In fact, I came away with the conclusion that Ron values black-box testing very little. He is of the opinion that white-box testing in the form of TDD is pretty much sufficient for the application to be rock-solid and as bug-free as any piece of software can hope to be.

I never had the chance to work on an agile team, so I can't really tell if Ron's assertion is true or not. But my personal opinion is that there is no way developers doing TDD can catch several classes of bugs that are outside of their code-only realm. I'm thinking most of all about the various quality criteria categories, also known as 'ilities', popularized by James Bach. Here are some of them: usability, installability, compatibility, supportability, maintainability, portability, localizability. All these are qualities that are very hard to test in a white-box way. They all involve interactions with the operating system, with the hardware, with the other applications running on the machine hosting the AUT. To this list I would add performance/stress/load testing, security testing, error recoverability testing. I don't see how you can properly test all these things if you don't do black-box testing in addition to white-box type testing.

In fact, there's an important psychological distinction between developers doing TDD and 'traditional' testers doing mostly black-box testing. A developer thinks "This is my code. It works fine. In fact, I'm really proud of it.", while a tester is more likely to think "This code has some really nasty bugs. Let me discover them before our customer does." These two approaches are complementary. You can't perform just one at the expense of the other, or else your overall code quality will suffer. You need to build code with pride before you try to break it in various devious ways.

Here's one more argument from Ron as to why white-box testing is more valuable than black-box testing:

To try to simplify: the search method in question has been augmented with an integer "hint" that is used to say where in the large table we should start our search. The idea is that by giving a hint, it might speed up the search, but the search must always work even if the hint is bad.

The question I was asking was how we would test the hinting aspect.

I expect questions to arise such as those Michael Bolton would suggest, including perhaps:

What if the hint is negative?
What if the hint is after the match?
What if the hint is bigger than the size of the table?
What if integers are actually made of cheese?
What if there are more records in the table than a 32-bit int?

Then, I propose to display the code, which will include, at the front, some lines like this:

if (hint < 1) hint = 0;
if (hint > table.size) hint = 0;

Then, I propose to point out that if we know that code is there, there are a couple of tests we can save. Therefore white box testing can help make testing more efficient, QED.

My counter-argument was this: what if you mistakenly build a new release of your software out of some old revision of the source code, a revision which doesn't contain the first 2 lines of the search method? Presumably the old version of the code was TDD-ed, but since the 2 lines weren't there, we didn't have unit tests for them either. So if you didn't have black-box tests exercising those values of the hint argument, you'd let an important bug escape out in the wild. I don't think it's that expensive to create automated tests that verify the behavior of the search method with various well-chosen values of the hint argument. Having such a test harness in place goes a long way in protecting against admittedly weird situations such as the 'old code revision' I described.
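To make the counter-argument concrete, here is a minimal sketch in Python of what such a harness might look like. Everything in it is an assumption made for illustration: `search_with_hint` stands in for the guarded search method, `search_old_revision` stands in for a build made from an old revision without the guard, and `failing_hints` is a hypothetical helper that only looks at inputs and outputs. None of this is Ron's actual code.

```python
def search_with_hint(table, target, hint=0):
    """Hypothetical search: return the index of target in table, or -1.

    The hint only picks where scanning starts; an out-of-range hint is reset to 0.
    """
    if hint < 0 or hint >= len(table):
        hint = 0
    n = len(table)
    # Scan every position once, starting at the hint and wrapping around.
    for offset in range(n):
        i = (hint + offset) % n
        if table[i] == target:
            return i
    return -1


def search_old_revision(table, target, hint=0):
    """Hypothetical older build that lacks the guard on the hint."""
    # list.index raises ValueError when target is not found at or after hint.
    return table.index(target, hint)


def failing_hints(search, table, target, expected):
    """Black-box check: return the hints for which search gives a wrong result."""
    size = len(table)
    hints = [-1, 0, 1, expected, expected + 1, size - 1, size, size + 1, 2**31]
    bad = []
    for hint in hints:
        try:
            result = search(table, target, hint)
        except Exception:
            result = None  # a crash counts as a failure
        if result != expected:
            bad.append(hint)
    return bad


table = ["ant", "bee", "cat", "dog", "eel"]
print(failing_hints(search_with_hint, table, "cat", 2))     # guarded version
print(failing_hints(search_old_revision, table, "cat", 2))  # old revision
```

The guarded version should report no failing hints, while the old revision should report several (the negative hint and the hints past the match or past the end of the table). The harness never looks inside the search function, so it keeps working no matter which revision ends up in the build.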

In fact, as Amir Kolsky remarked on the agile-testing list, TDD itself can be seen as black-box testing, since when we unit test some functionality, we usually test the behavior of that piece of code and not its implementation, thus we're not really doing white-box testing. To this, Ron Jeffries and Ilja Preuss replied that in TDD, you write the next test with an eye on the existing code. In fact, you write the next test so that the next piece of functionality for the existing code fails. Then you make it pass, and so on. So you're really looking at both the internal implementation of the code and at its interfaces, as exposed in your unit tests. At this point, it seems to me that we're splitting hairs. Maybe we should talk about developer/code producer testing vs. non-developer/code consumer testing. In fact, I just read this morning a very nice blog post from Jonathan Kohl on a similar topic: "Testing an application in layers". Jonathan talks about manual vs. automated testing (another hotly debated topic on the agile-testing mailing list), but many of the ideas in his post can be applied to the white-box vs. black-box discussion.


Anonymous said...

I'd like to share an analogy I picked up in a Masters level Advanced Software Testing class.

Assume you are flying a plane looking for people stranded in the water below. You are not aware of how much range you need to cover, so you might want to fly high to try to cover the most area quickly. This is analogous to Black Box testing. Now let's assume you cover a wide area, but you are too high to get a good view of individuals in certain areas. You would lower your plane and focus on this area. This is analogous to White Box testing. If you want to get really close to the water, you could, but it's the most expensive as it would take longer to cover the complete search area. This is analogous to Formal Methods.
The overview of testing from a high level is that Black Box will cover a fairly large area, but lack the details that can be achieved by White Box testing. Since Black Box is the cheapest method, it is good to use it first to cover a wide area. As testing is not free, pick the critical areas that are the most important as far as requirements go and develop White Box tests for them. It makes sense to spend more on critical requirements. For example, will it make a lot of people mad if level 10 of the free tetris game on the cell phone has a bug? Or would it make people mad if calls frequently dropped or adding new entries in the address book didn't work well?

My vote is that using Black Box and White Box together strategically will achieve the most cost-effective solution to uncovering the most bugs. Sure, one could argue that White Box or Formal Methods can achieve the best coverage, but how much more time and cost does it add? Many companies undervalue the testing domain, and it truly shows in their products. I see many situations where better testing could have stopped major bugs and led to increases in reliability, quality, usability...

Hope this helps somebody out when trying to see the big picture....

Grig Gheorghiu said...

Jason -- thanks for your comment. You'll excuse me if I'm a bit skeptical of the values taught in a class such as 'Advanced Software Testing' which doesn't seem to mention automated testing (or at least you don't). In my book, writing unit tests for your software (and it can't get more white-box than unit tests) and running them in an automated fashion, inside a continuous integration process, is the single most important thing you can do to maintain your sanity, and improve the quality of your app at the same time. Unfortunately, pompously academic classes like the ones you took don't bother mentioning this, but instead focus on the cost incurred to implement white box testing. Well, if you don't incur that cost upfront by writing unit tests, you'll surely incur it N times over later on, when the customers discover bugs in the field....


Anonymous said...

I don't understand the whole white box versus black box argument.
It is akin to the whole Quantum Mechanics versus Relativity debate in physics. There was a time when the physics world was separated by who thought which one was the most valid representation of the physical world until it turned out that they were both valid.
Quantum Mechanics is best used for representing the atomic scale of the physical world (or in the case of white box testing, the atomic scale of the application), with Relativity being applied to problems on the scale of the planets (or in the case of black box testing, that of the large components that make up the application).

In other words, it isn't a crime to use both!

Anonymous said...

I want to know: since we can apply randomized techniques to black box testing, can we apply the same to white box testing? If yes, how can we do that?

Anonymous said...

Your counter-argument about the missing 2 lines describes this issue perfectly. After long research I found excellent tools for black box testing, and I can't see myself working without them.

Roshan Kumar said...

Great effort....

This clearly describes the differences between black box and white box testing

Anonymous said...

Isn't this blog entitled Agile testing? And then you say:
"I never had the chance to work on an agile team." And you're arguing with Michael Bolton?????

Lost credibility here!

(ah...moderated, bet this doesn't show up!)

Grig Gheorghiu said...

Anonymous -- I wish you were not so cowardly as to hide your own name. I did allow your comment to go through; I only moderate spam. Do note that my post was written in Aug. 2005. I've worked in many agile shops since, and the thoughts expressed in this blog post stand unchanged.

The irony here is that I was actually agreeing with the Context-Driven testers, James Bach and Michael Bolton in particular, although in general it's true I tend not to agree with some of their philosophies.

So dear Anonymous Coward, please read my posts a bit more carefully before you comment. I don't particularly care if I lost credibility in your eyes either, FWIW.

Anonymous said...

The idea that one doesn't need black box testing for apps (generally speaking) is frankly absurd. Black box testing is by definition testing the functionality of an app. Doesn't Ron believe you should test drive a car before letting it out of the factory either? Yikes

Anonymous said...

@Anonymous - Pretty sure most cars are *not* driven before they are let off the factory assembly line.
