Why Most Unit Testing is a Waste

I just finished reading “Why Most Unit Testing is a Waste”, a paper (PDF) by James O Coplien.

I spent several years working as a “Quality Lead” at two companies that were writing operating system code in C++. Unit testing was “in” at the time: the engineers would write their own tests, and that would make the product better and the process more efficient! Given that background, you might assume (correctly) that this topic is of interest to me.

Not only is this an interesting topic, Coplien’s paper is also well written and a delight to read. It’s unusual to find such approachable and relatable writing on this topic. I recommend this paper to anyone who writes code or manages teams that write code.

Into Modern Times

Unit testing was a staple of the FORTRAN days, when a function was a function and was sometimes worthy of functional testing. Computers computed, and functions and procedures represented units of computation. In those days the dominant design process composed complex external functionality from smaller chunks, which in turn orchestrated yet smaller chunks, and so on down to the level of well-understood primitives. Each layer supported the layers above it. You actually stood a good chance that you could trace the functionality of the things at the bottom, called functions and procedures, to the requirements that gave rise to them out at the human interface. There was hope that a good designer could understand a given function’s business purpose. And it was possible, at least in well-structured code, to reason about the calling tree. You could mentally simulate code execution in a code review.

Object orientation slowly took the world by storm, and it turned the design world upside-down. First, the design units changed from things-that-computed to small heterogeneous composites called objects that combine several programming artifacts, including functions and data, together inside one wrapper. The object paradigm used classes to wrap several functions together with the specifications of the data global to those functions. The class became a cookie cutter from which objects were created at run time. In a given computing context, the exact function to be called is determined at run-time and cannot be deduced from the source code as it could in FORTRAN. That made it impossible to reason about run-time behaviour of code by inspection alone. You had to run the program to get the faintest idea of what was going on.

So, testing became in again. And it was unit testing with a vengeance. The object community had discovered the value of early feedback, propelled by the increasing speed of machines and by the rise in the number of personal computers. Design became much more data-focused because objects were shaped more by their data structure than by any properties of their methods. The lack of any explicit calling structure made it difficult to place any single function execution in the context of its execution. What little chance there might have been to do so was taken away by polymorphism. So integration testing was out; unit testing was in. System testing was still somewhere there in the background but seemed either to become someone else’s problem or, more dangerously, was run by the same people who wrote the code as kind of a grown-up version of unit testing.

— Why Most Unit Testing is a Waste, James O Coplien

Coplien builds his case carefully, with plenty of examples and recommendations, through seven “chapters”:

The Cure is Worse than the Disease

Tests for their Own Sake and Designed Tests

The Belief that Tests are Smarter than Code Telegraphs Latent Fear or a Bad Process

Low-Risk Tests Have Low (even potentially negative) Payoff

Complex Things are Complicated

Less is More, or: You are Not Schizophrenic

You Pay for Tests in Maintenance — and Quality!

The paper was not only enjoyable to read, it brought back a few memories for me:

Coplien mentions a client who informed him that “they had written their tests in such a way that they didn’t have to change the tests when the functionality changed”. I was reminded of a story from a co-worker many years ago.
My co-worker asked a client how they knew their code was good. The reply was “Our engineers are very sincere.”

I’ve retained that story ever since. Are your engineers “very sincere”?
Regarding tautological testing, information from failed tests, and more, I recall having a fruitless discussion with some of our testing team at one job. They were insisting that the code was good because all of the tests passed. I was trying (and failing) to explain that all this really meant was that all of the tests passed.
At the same job, management was concerned because we weren’t finding the number of “bugs per line of code” that certain articles and studies they had read told them we should be finding. We (the Quality Leads and the Engineering Leads) kept trying to explain that we were using OOP (C++) and that the articles they referred to were based on C code. To no avail.
So, as any smart engineer can, we found ways to report more bugs per line of code. Never underestimate the power of an engineer to game the system to provide management with the results it thinks it wants.
Regarding code coverage… I was responsible for the code coverage metrics at one job. We were working on an OS upgrade; my job was to take the output of the code coverage tool, run it through some analysis routines, and produce reports. One of the teams consistently had low “coverage” scores for a large piece of code. This was the I/O team and the code in question was essentially an enormous case statement of the form:
```
case driver A ) do this;
case driver B ) do that;
case driver C ) do something_else;
```
Easy to desk check; difficult (impossible in practice) to hit every line of code. They could, of course, have written a thousand unit tests, but would that have truly accomplished anything of value?
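
To make that question concrete, here is a minimal sketch in Python. It is not the original I/O code: the driver names, the return values, and the functions `handle` and `coverage_only_tests` are all made up for illustration, and three branches stand in for what was an enormous case statement. It shows what "one test per branch" tends to look like when each branch is trivial.

```python
# Illustrative sketch only: the drivers and return values are hypothetical
# stand-ins, not taken from the original I/O code.

def handle(driver):
    """Pick the action for a driver, mirroring a flat case statement."""
    if driver == "A":
        return "this"
    elif driver == "B":
        return "that"
    elif driver == "C":
        return "something else"
    raise ValueError(f"unknown driver: {driver}")


def coverage_only_tests():
    """One assertion per branch; each one restates a line of handle()."""
    expected = {"A": "this", "B": "that", "C": "something else"}
    for driver, result in expected.items():
        assert handle(driver) == result
    return len(expected)  # number of branches exercised


if __name__ == "__main__":
    print(coverage_only_tests(), "branches exercised")
```

The `expected` table is just the body of `handle` written a second time. These tests push the coverage number up, but they can only fail if someone edits one copy and forgets the other, which is roughly what a desk check would catch anyway.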

Recommended reading: Why Most Unit Testing is a Waste, by James O Coplien.

Also recommended, two follow-on posts:
“TDD is dead. Long live testing.” by DHH,
and “Driven”, by Mark Bernstein.