“Reviews catch more than half of a product’s defects regardless of the domain, level of maturity of the organization, or lifecycle phase during which they were applied” (What We Have Learned About Fighting Defects).

Recent research into code review practices and advances in tools make reviews more effective and less expensive, and can change the way that we think about code reviews and the way that we do them.
Lightweight code reviews work
A lot of the literature on code reviews, including Code Complete, focuses on formal, team-based code inspections, using the model defined by Michael Fagan at IBM in the 1970s. Heavyweight and high-ceremony, Fagan inspections typically involve at least four people: the programmer, a reader, a moderator, and one or more reviewers. Each review meeting requires extensive preparation and lots of documentation, and can last several hours.
More people now are having success with lightweight but still disciplined code reviews: a middle ground between informal “hey pal, can you take a look at this?” over-the-shoulder walkthroughs and expensive, full-on Fagan inspections. In the Modern Code Review chapter of Making Software, Jason Cohen explains that lightweight reviews can be almost as effective as, and much less expensive than, formal inspections. The formal review meeting in a Fagan inspection, which is expensive to run and difficult to arrange, is mostly a waste of time – 95% of defects are found before the meeting. The real purpose of the meeting is to communicate the findings and to make sure that the programmer understands them. If that’s the case, the meeting can be held in a much less formal way, or not at all.
Tool-assisted reviews
Reviewers today can rely on tools to make their work more effective and efficient. Static analysis bug checkers like FindBugs, tools from Coverity and Klocwork, and even the checkers built into IDEs like IntelliJ IDEA make the job of code reviewers easier and more valuable. These tools catch subtle bugs and common coding mistakes that get past compilers, and that can get past reviewers too. They highlight crappy or questionable code, and can enforce style conventions and coding best practices. This means that reviewers can focus on more fundamental issues like correctness, error handling, and safety and security (and these tools, along with others like Fortify 360, help find security vulnerabilities too).
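To make this concrete, here is a contrived Java sketch (the class and method names are made up for illustration) of the kind of mistakes these tools catch: both methods compile cleanly and may even pass casual testing, but a checker like FindBugs will flag both – comparing strings with == instead of equals(), and throwing away the return value of a method on an immutable String.

```java
public class SubtleBugs {

    // Bug: == compares object identity, not string contents. This can
    // "work" in simple tests (interned literals) and then fail on
    // strings read from user input or a file.
    static boolean isAdmin(String role) {
        return role == "admin";          // should be "admin".equals(role)
    }

    // Bug: String is immutable, so trim() returns a new string.
    // The trimmed copy is silently discarded and the input is
    // returned unchanged.
    static String clean(String input) {
        input.trim();                    // should be input = input.trim();
        return input;
    }
}
```

With machines catching this class of mistake, a human reviewer's attention is freed up for the questions a tool can't answer: is the logic right, and does the error handling hold up?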
Formal review meetings are expensive and difficult to set up. A lot of smaller teams do reviews offline instead, through email, with the occasional one-on-one meeting when the programmer or reviewer feels that it is necessary. Bigger, distributed teams can use peer review collaboration tools like Crucible, Review Board, SmartBear’s CodeCollaborator, or Google Code Reviews (aka Rietveld) to share code review findings. These tools integrate with your version control system (and sometimes your bug tracker) and are especially useful if you are involving multiple reviewers in different locations and timezones. The reviewers and the programmer can add annotations or comments directly in with the changes, cutting back on the need for review meetings.
Some teams use these tools to make sure that code reviews are done, and to prove that they are being done for compliance: the tools can enforce workflows (ensuring that reviews happen before code is committed to the main code line), and managers (and auditors) can see the results of code reviews. New programmers can look at the review history for code that they are working on to learn more about the code and about how to do an effective review, and programmers and managers can go back over the results of reviews to find ways to improve them. Some of these tools collect code review metrics, and others, like Klocwork Inspect, integrate reviewer findings and static analysis findings into the same review space.
Another code review tool, this time to help with security reviews, is Agnitio from David Rook, the Security Ninja. Agnitio serves a different purpose: rather than helping developers collaborate on a code review, it structures and guides a reviewer through a security review, following a detailed code and design review checklist, and then records the results of each review.
Reviews find more than functional defects
Reviews can find important and fundamental design and implementation problems, including subtle logic problems and operational safety weaknesses that testers may not find because they are hard to test for. Reviewers have the “white box advantage”: they can see problems or omissions directly in the code. And code reviews are important for finding security vulnerabilities – often the only way to find them, short of exhaustive and expensive pen testing. This is why code reviews are a fundamental part of secure SDLCs like Microsoft’s SDL.
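Here is a minimal Java sketch of the classic case (the class and table names are hypothetical): a SQL injection hole that behaves correctly in every functional test, but that a reviewer reading the code can spot in seconds, along with the parameterized fix they would ask for.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class AccountLookup {

    // Vulnerable: user input is concatenated straight into the SQL.
    // Functional tests pass, but a name like  ' OR '1'='1  dumps the
    // whole table. A tester probing through the UI may never try it;
    // a reviewer sees it immediately.
    static ResultSet findVulnerable(Connection db, String name) throws SQLException {
        Statement stmt = db.createStatement();
        return stmt.executeQuery(
            "SELECT * FROM accounts WHERE name = '" + name + "'");
    }

    // Fixed: a parameterized query keeps the input out of the SQL
    // grammar entirely.
    static ResultSet findSafe(Connection db, String name) throws SQLException {
        PreparedStatement stmt =
            db.prepareStatement("SELECT * FROM accounts WHERE name = ?");
        stmt.setString(1, name);
        return stmt.executeQuery();
    }
}
```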
But there’s even more to reviews than finding bugs and security vulnerabilities. An interesting 2009 study by Mantyla and Lassenius, “What types of defects are really discovered in code reviews?”, builds on earlier research by Siy and Votta at Bell Labs in the late 1990s to show that the majority of problems found by reviewers are not functional mistakes, but what the researchers call evolvability defects: issues that make code harder to understand and maintain, more fragile, and more difficult to modify and fix.
On average, between 60% and 75% of the defects found in code reviews fall into this class. Of these, around 1/3 are simple code clarity issues: improving element naming and comments, making sure that the code makes sense. Most of the rest are organizational problems, where the code is poorly structured or contains duplicate or unused code, or approach problems, where the reviewer can see a much simpler and cleaner implementation, sometimes replacing hand-rolled code with built-in language features or library calls. Reviewers also find changes that don’t belong or aren’t required, copy-and-paste mistakes, and inconsistencies.
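As a hypothetical before-and-after in Java, here is what an “approach” finding often looks like: the hand-rolled version works, so testing will never flag it, but a reviewer would point out the built-in library call that says the same thing in one line.

```java
import java.util.List;

public class JoinNames {

    // Before: hand-rolled joining with a trailing-separator
    // workaround. Functionally correct, but more code to read,
    // maintain, and get subtly wrong the next time it's edited.
    static String joinHandRolled(List<String> names) {
        String result = "";
        for (int i = 0; i < names.size(); i++) {
            result = result + names.get(i);
            if (i != names.size() - 1) {
                result = result + ", ";
            }
        }
        return result;
    }

    // After: the library call a reviewer would suggest instead.
    static String joinIdiomatic(List<String> names) {
        return String.join(", ", names);
    }
}
```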
These findings feed back into refactoring and are important for the future maintenance of the software, reducing complexity and making the code easier to change or fix later. But it’s more than this: many of these changes also reduce the technical risk of implementation, offering simpler and safer ways to solve the problem, and isolating changes or reducing their scope, which in turn reduces the number of defects that will be found in testing or escape into the wild.
So why aren’t more people reviewing code?
A study in 2005 found that fewer than half of development teams were doing code reviews.
WTF?
Sure, reviews are hard. One of my colleagues who spends a lot of his time reviewing code finds it exhausting – much harder than writing code himself. To do a good review, you have to spend the time to understand the code and what the change was supposed to do, and then put yourself into the mind of the programmer and work through the solution approach that they decided on. It takes time and it takes discipline. But the return on investment is clear.
I’ll give a pass to XP-style teams that already follow disciplined pair programming: although the focus of pairing is on finding a good solution rather than on looking for problems, a lot of mistakes and structural issues will still be caught. Other problems can be caught by a good static analysis tool. And most of these teams are also writing disciplined unit tests, probably following TDD. They’re not writing sloppy code.
Solo programmers have an obvious problem: if you’re coding on your own, who’s going to review your code? You’re limited to self-reviews: setting the code aside for a while and then reviewing it on your own. This is more effective than you would expect. Jason Cohen says that programmers reviewing their own code can find as many as half of the issues that another reviewer would find, just by taking a second look. And solo programmers can also rely on static analysis tools to find some problems.
Reviews don’t need to be a big deal, and you don’t need formal review meetings. There are tools to help make reviews cheaper, easier, and more effective. So, what about the rest of you? Why aren’t you doing code reviews? What’s your excuse?