Thursday, June 13, 2013

Automated Tests as Documentation

One of the arguments for writing automated tests is that tests can act as useful documentation for a system. But what do tests document? And who will find this documentation useful?

Most developers don’t rely on system documentation because there isn't enough documentation to give them a complete idea of how the system works, or because there’s too much of it to read, or because it’s not well written, or because it’s not up to date, or because they don’t believe it’s up to date.

But a good set of automated tests can tell you how the system really works today – not how somebody thought it works or how they thought it was supposed to work or how it used to work, if anybody bothered to write any of this down.

“Tests are no more and no less than executable documentation. When writing new code, it makes sense that it should be documented by the person closest to it: the author. That’s my rationale for developer testing.

Comment “doc blocks” and wiki pages are always found to be inaccurate to some extent. By contrast, automated tests fail noisily whenever they go out of date. So with regard to documentation, automated tests are uniquely advantageous in that (given a CI server) you at least have the option of keeping your documentation up to date whenever it starts to drift away from reality.”
Noah Sussman

You might be able to use tests as documentation, if…

To be useful as documentation, tests have to be:

  1. comprehensive – they have to cover all of the important areas of the code and functions of the system;
  2. run often and work – run on every check-in, or at least often enough to give everyone confidence that they are up to date; and the tests have to pass – you can’t leave tests broken or failing;
  3. written to be read – writing a test that works, and writing a test that can be used as documentation, are two different things;
  4. at the right level of abstraction – when most people talk about automated tests, they mean unit tests…
Unit tests constitute design documentation that evolves naturally with a system. Read that again. This is the Holy Grail of software development, documentation that evolves naturally with a system. What better way to document a class than to provide a coded set of use cases. That's what these unit tests are: a set of coded use cases that document what a class does, given a controlled set of inputs. As such, this design document is always up-to-date because the unit tests always have to pass.
Jeff Canna, Testing, fun? Really?
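Point 3 above is the crux: a test that passes and a test that reads as documentation are different artifacts. A minimal sketch of the difference in Python's unittest, using a hypothetical ShoppingCart class (all names here are illustrative, not from any real codebase):

```python
import unittest


class ShoppingCart:
    """Hypothetical class under test, assumed for illustration."""

    def __init__(self):
        self._items = []

    def add(self, name, price):
        if price < 0:
            raise ValueError("price must be non-negative")
        self._items.append((name, price))

    def total(self):
        return sum(price for _, price in self._items)


# A test that "works" - it covers the code, but tells a reader nothing:
class TestCart1(unittest.TestCase):
    def test1(self):
        c = ShoppingCart()
        c.add("a", 2)
        c.add("b", 3)
        self.assertEqual(c.total(), 5)


# The same checks written to be read: the class groups one area of
# behaviour, each method name states a rule, and each body follows
# an arrange/act/assert shape.
class ShoppingCartTotals(unittest.TestCase):
    def test_total_is_the_sum_of_all_item_prices(self):
        cart = ShoppingCart()
        cart.add("apple", price=2)
        cart.add("bread", price=3)
        self.assertEqual(cart.total(), 5)

    def test_an_item_with_a_negative_price_is_rejected(self):
        cart = ShoppingCart()
        with self.assertRaises(ValueError):
            cart.add("bogus", price=-1)


if __name__ == "__main__":
    unittest.main()
```

Both classes exercise identical behaviour; only the second one could plausibly serve as documentation.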

Of course even tests that run frequently and pass in Continuous Integration or Continuous Delivery can still contain mistakes. There will be tests that pass but shouldn't, or tests that tell you what the system does, but what the system does is not what it is supposed to do (one of the risks of having developers write the tests is that if they misunderstood the requirement and got the code wrong, they got the tests wrong too). But on the whole, tests that run often should be more accurate than some document that may or may not have been right in the first place and that probably hasn't been kept up to date since. And through code coverage analysis, you can at least understand what parts of the system are described by the tests, and what parts aren't.

Tests have to be written to be read

One problem with using tests as documentation is that tests are only a means to an end – even tests written up front in TDD. Tests are just another part of the supporting infrastructure and tooling that developers rely on to write code (and it's all about the code). Tests are tools to help developers think about what code they need to write (if they are following TDD), to prove that the code does what it was supposed to do, and to catch mistakes when people make changes in the future.

As a result, developers don’t put the same amount of attention and discipline into designing and implementing tests as they do into the code itself. As long as the tests run quickly and they cover the code (or at least the important parts of the code), they’ve done the job. Nobody has to care that much about how the tests are named or what they look like inside. Few teams peer review unit tests, and even then, most reviewers only check to see that somebody wrote a test – not that every test is correct, or that each test has a meaningful, consistent name and is easy to understand. There aren't a lot of developers who understand xUnit Test Patterns or spend extra time – or can afford to spend extra time – refactoring tests. So, a lot of automated tests aren't clean, consistent or easy to follow.

Another thing that makes unit tests hard to follow as documentation is the way that tests are usually organized. The common convention for structuring and naming unit tests is to write a test class/module for every code class/module in the system, with test methods that assert specific behaviour in that code module.
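A sketch of this convention – a test module mirroring each code module, one test class per class under test (every name below is hypothetical):

```python
# Conventional unit test layout, mirroring the code structure:
#
#   src/
#       account.py         # class Account
#       transfer.py        # class Transfer
#   tests/
#       test_account.py    # class TestAccount - one method per behaviour
#       test_transfer.py   # class TestTransfer
#
# tests/test_account.py might look like this:
import unittest


class Account:
    """Stand-in for the class in src/account.py, for illustration."""

    def __init__(self, balance=0):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount


class TestAccount(unittest.TestCase):
    def test_a_new_account_has_a_zero_balance(self):
        self.assertEqual(Account().balance, 0)

    def test_deposits_increase_the_balance(self):
        account = Account()
        account.deposit(50)
        self.assertEqual(account.balance, 50)
```

The mirroring makes tests easy to *find* from the code – which is exactly the problem the next paragraph raises: you have to know the code's structure before the tests' structure means anything to you.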

Assuming that whoever wrote the code wrote a comprehensive set of tests and followed a consistent, well-defined structure and naming approach, and that they came up with good, descriptive names for each test class and test method, and that everyone who worked on the code over time understood all of this and took the trouble to keep all of it up to date as they changed and refactored the code and moved responsibilities around and wrote their own tests (which is assuming a lot), then you should be able to get a good idea of what’s happening inside each piece of code by following the tests.

Can you really understand a system from Unit Tests?

But you’re still not going to be able to read a nice story about how a system is designed or what the system does or how it does it by reading unit tests.

Even if unit tests are complete and up-to-date and well written and well organized and have good names (if, if, if, if, if), the accumulation of details built up from looking at all of these tests is overwhelming. Unit tests are too close to the metal, sometimes obscured by fixtures and other test plumbing, and too far removed from the important aspects of the business logic or the design.

UnitTests include a lot of testing of lower level processes that have no direct connection to the stories.
Steve Jorgensen, comment in Unit Test as Documentation

Tests follow the code. You can’t understand the tests without understanding the code. So why not read the code instead? You’ll have to do this eventually to make sure that you really know what’s going on, and because without reading the code, you can’t know if the tests are well-written in the first place.

One place where low-level developer tests may be useful as documentation is describing how to use an API – provided that tests are comprehensive, expressive and “named in such a way that the behavior they validate is evident”.

A good set of unit tests like this can act as a reference implementation, showing how the API is supposed to be used, documenting common usage details:

If you are looking for authoritative answers on how to use the Rails API, look no further than the Rails unit tests. Those tests provide excellent documentation on what's supported by the API and what isn't, how the API is intended to be used, the kind of use cases and domain specific problems that drove the API, and also what API usage is most likely to work in future versions of Rails.
Peter Marklund, Rails Tip: Use the Unit Tests as Documentation
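The same idea carries outside Rails. As an illustration, here is a sketch of how unit tests against Python's standard json module could read as API reference material – each test name records one fact about what the API supports (the module and its behaviour are real; the "tests as reference" framing is the point being illustrated):

```python
import json
import unittest


class JsonDumpsUsage(unittest.TestCase):
    """Tests that double as a usage reference for json.dumps."""

    def test_dicts_serialize_to_json_objects(self):
        self.assertEqual(json.dumps({"a": 1}), '{"a": 1}')

    def test_non_string_keys_are_coerced_to_strings(self):
        self.assertEqual(json.dumps({1: "x"}), '{"1": "x"}')

    def test_sets_are_not_serializable_by_default(self):
        with self.assertRaises(TypeError):
            json.dumps({1, 2, 3})
```

A reader who scans only the method names learns what is supported, what is coerced, and what fails – which is exactly the "reference implementation" role Marklund describes.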

But again, this can only work if you take the time to design, organize and write the tests to do “Double Duty” as tests and as documentation (Double Duty: How to repurpose the unit tests you’re doing to help create the documentation you’re not, by Brian Button), and make your tests “as understandable as humanly possible for as many different readers as possible”.

Tests as documentation?

I like the idea that automated tests can serve as documentation – we’d all save time and money this way. But who is this documentation for, and how is it supposed to be used?

I don’t know any developers who would start by reading test cases in order to understand the design of a system. A good developer knows that they can’t trust documents or pictures, or tests, or even what other programmers tell them, or comments in the code. The only thing that they can trust is the code.

Tests might be more useful as documentation to testers. After all, it’s a tester’s job to understand the tests in order to maintain them or add to them. But most testers aren't going to learn much from most tests. When it comes to unit tests, it is the same for testers as it is for developers: unit tests aren't useful unless you understand the code – and if you understand the code, then you should read it instead. Higher-level acceptance tests are easier to understand and more useful to look at, especially for non-technical people. It should be easier to tell a story, and to follow a story, about what a system does through high-level functional and integration scenarios (big fat tests), or acceptance tests captured in a tool like FitNesse. Rather than asserting detailed implementation-specific conditions, these tests describe technical scenarios, or business rules and business workflows and other requirements that somebody thought were important enough to test for.
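A sketch of what such a story-shaped test might look like – asserting a business rule rather than implementation details, with the scenario spelled out in given/when/then comments (plain Python stands in for a FitNesse-style tool here, and the Account domain is hypothetical):

```python
class InsufficientFunds(Exception):
    """Raised when a withdrawal would overdraw the account."""


class Account:
    """Minimal domain stub, assumed for illustration."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise InsufficientFunds()
        self.balance -= amount


def test_a_withdrawal_larger_than_the_balance_is_refused():
    # Given a customer with 100 in their account
    account = Account(balance=100)
    # When they try to withdraw 150,
    # then the withdrawal is refused
    try:
        account.withdraw(150)
        assert False, "expected the withdrawal to be refused"
    except InsufficientFunds:
        pass
    # And the balance is unchanged
    assert account.balance == 100
```

Nothing in the test body mentions data structures, fixtures or internal APIs – someone who knows the business rule can check the scenario against their own understanding of it.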

But even if you can follow these tests, there’s no way to know how well they describe what the system does without spending time talking to people, learning about the domain, testing the system yourself, and… reading the code.

I asked Jonathan Kohl, one of the smartest testers I know, about his experience using automated tests as documentation:

Back in '03 and '04, Brian Marick and I were looking into this. We held an experimental workshop at XP/Agile Universe in 2004 with testers, developers and other project stakeholders to see what would happen if people who were unfamiliar with the program code (both the code and the automated unit tests) could get something useful from the automated tests as documentation. It was a complete and utter failure. No one really got any value whatsoever from the automated unit tests from a doc perspective. We had to explain what was going on afterwards…

Marick and I essentially tossed the idea aside after that experience. I gave up on the whole double duty thing. Documentation is really good at documenting and explaining, while code is good at program creation and executing. I go for the right tool for the job now, rather than try to overload concepts.

Bottom line, I do not find tests useful as documentation at all. When done well, I do find them useful as examples of implementation of interfaces, etc. when I am new to a system, but nothing replaces a good written doc, especially when coupled with some face-to-face brainstorming and explanations.

There are a lot of benefits to automated testing, especially once you have a good automated suite in place. But documentation is not one of them.


Gareth Waterhouse said...

Is this not where bdd comes into its own as a tool for automating tests? By writing it in gwt format you have a good set of documentation in the feature files?

I do agree that unit tests probably don't offer much towards documentation, but I do think that bdd can be the solution, especially if the tests are run as part of a nightly build.

What are your thoughts on bdd as a form of documentation?

Jonathan Kohl said...

If I was writing a chapter in a book about your software, the code and automated tests would be figures and illustrations, not the text.

Jim Bird said...

Gareth, that's the promise of BDD, but does it really work? I don't have enough experience with it to say one way or the other. Have you tried it on a big project and seen the results? And what happens over time, when the software is in place for years and has been changed in maintenance - do teams stick with BDD for long enough?
