Thursday, June 27, 2013

Design Patterns after Design is Done

Design Patterns are a useful tool when you are designing a system: an effective shorthand for communicating and sharing design ideas, and a way to build consistency into the code – if people understand them and follow them properly.

I'm not interested in arguments over whether design patterns are good or not, or which patterns are good and which ones aren't - although these are all important questions.

What I want to understand is how useful design patterns are over time. Do they still matter after the initial design is done?

Looking for Patterns in Code

The first thing to ask is whether developers working on the code can recognize the design patterns that were used, and how useful it is to them when they do.

In Design Patterns for Maintenance, Joshua Engel makes a case that when someone maintaining code recognizes a pattern, they instantly get more context, which means they can move faster and with more confidence:

“When the maintainer recognizes a pattern in a piece of code being maintained, the maintainer grasps a bit more about the code. The maintainer taps into a wealth of background material: personal experience as well as what's read in textbooks like Design Patterns. That background can clue you in to potential pitfalls, limitations, and the intended way for the code to evolve. Or, if the code fails to completely match the patterns, you have a guide to what needs to be added to gain the benefits of that pattern.”

This is backed up by research. In Making Software, chapter 22 “The Evidence for Design Patterns” reviews two studies by Prof. Walter Tichy showing that design patterns can be useful to developers maintaining code, provided that the people maintaining the code recognize and understand the patterns. One study of computer science students who had some training in design patterns found that students made fewer mistakes and were faster at changing code if it followed well-known and easily-understood design patterns, and if the design patterns being used in the code were clearly documented in the comments. In another study, experienced programmers were also able to make changes quicker and with fewer bugs if the code followed design patterns that they were familiar with.

But another study on design patterns in legacy code explains how difficult it is to “mine for design patterns” in code:

  • recognizing design patterns in code requires a good understanding of the code as well as common patterns;
  • some patterns are easier to recognize than others – and some patterns probably won’t be recognized at all.

Understanding and Recognizing Patterns in Code

Patterns are only valuable if they are immediately recognizable by whoever is working on the code and can be easily followed. Which means that they have to be implemented properly in the first place, and sustained over time.

Besides the canonical GoF design pattern catalog, which most developers have at least heard of, and maybe Martin Fowler’s Patterns of Enterprise Application Architecture, there are lots of other less well-known pattern collections, never mind proprietary patterns that were invented by the team that wrote the software.

You can’t expect a developer to recognize these patterns, never mind understand them, from looking at code, unless they are otherwise made explicit in the code through naming conventions and comments (including, for more obscure patterns, live links to the pattern definition). The studies above show that this kind of documentation is especially important for less experienced developers.

But just because something has “Factory” or “Strategy” in the name, or comments explaining that the code is following a pattern, doesn’t mean that it actually follows that pattern properly, at least not any more.

Refactoring to Patterns

Another place where patterns come into play is in refactoring. When cleaning up the structure of code, it’s natural (for some developers at least) to think about where patterns can be applied.

Refactoring to Patterns takes refactoring to a higher level, not just correcting obvious problems and inconsistencies in the code. It describes how to bring the design in line with common patterns (not all of them from the GoF book) using multiple refactoring steps.

Some patterns are simple to understand, simple to apply, don’t require a lot of changes to implement, and result in simpler code: Factories and Prototypes. Refactoring to other patterns requires a lot more work to understand and change the code, and may not be worth the effort: Strategies and State, Observer, Visitor.
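As an illustration, here is a minimal sketch of the simple-Factory end of that spectrum, in Java (the Shape and ShapeFactory names are made up for this example, not taken from any of the books mentioned):

    // A minimal Factory sketch: object creation is pulled out of client
    // code into one place, so adding a new Shape only changes the factory.
    interface Shape {
        void draw();
    }

    class Circle implements Shape {
        public void draw() { System.out.println("circle"); }
    }

    class Square implements Shape {
        public void draw() { System.out.println("square"); }
    }

    class ShapeFactory {
        static Shape create(String kind) {
            switch (kind) {
                case "circle": return new Circle();
                case "square": return new Square();
                default: throw new IllegalArgumentException("unknown shape: " + kind);
            }
        }
    }

Refactoring towards something like this is cheap and local; refactoring to Strategy or Visitor means re-dividing responsibilities across many classes and callers, which is where the cost starts to outweigh the benefit.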

What’s the real payback for refactoring or rewriting code to patterns for the sake of patterns? There often isn't one.

You don’t want to refactor to patterns unless:

  1. you have a good reason to refactor the code in the first place – the code is difficult to understand and change; and
  2. you know how to do refactoring properly and safely; and
  3. you need the extra flexibility that most patterns offer; and
  4. you have the experience and judgement to know what patterns are needed and how to use them properly; and
  5. the people who you work with also understand patterns well enough to keep up with the changes that you want to make.

As it says in the GoF book:

Design patterns should not be applied indiscriminately. Often they achieve flexibility and variability by introducing additional levels of indirection, and that can complicate a design and/or cost you some performance. A design pattern should only be applied when the flexibility it affords is actually needed.

The Value of Patterns over Time

Refactoring to Patterns encourages more ambitious, larger-scale refactoring – which can be dangerous, because the more you refactor, the more chances there are of making mistakes and introducing bugs – and implementing patterns doesn't always make code more maintainable and easier to understand, which defeats the purpose of refactoring.

A study on design patterns and software quality at the University of Montreal (2008) found that design patterns in practice do not always improve code quality, reusability and expandability, and often make code harder to understand. Some patterns are better than others: Composite makes code easier to follow and easier to change. Abstract Factory makes code more modular and reusable, but at the expense of understandability. Flyweight makes code less expandable and reusable, and much harder to follow. Most developers don’t recognize or understand the Visitor pattern. Observer can be difficult to understand as well, although it does make the code more flexible and extendible. Chain of Responsibility makes code harder to follow, and harder to change or fix safely. And Singleton, of course, while simple to recognize and understand, can make code much harder to change.

For maintainability and understandability, it’s more important to recognize and sustain coding conventions so that the code base is consistent than it is to implement patterns. And to understand common refactorings and how to use your IDE’s refactoring tools, as well as Michael Feathers’ patterns for cleaning up legacy code.

Whether you’re designing and writing new code, or changing code, or refactoring code, the best advice is:

  • Don’t use patterns unless you need to.
  • Don’t use patterns that you don’t fully understand.
  • Don’t expect whoever works on the code in the future to recognize and understand the patterns that you used – stick to common patterns, and make them explicit in comments where you think it is important or necessary.
  • When you’re changing code, take some time to look for and understand the patterns that might be in place, and decide whether it is worth preserving (or restoring) them: whether doing this will really make the code better and more understandable.

Thursday, June 20, 2013

What is Important in Secure Software Design?

There are many basic architectural and design mistakes that can compromise the security of a system:

  1. Missing something important in security features like access control or auditing, privacy and compliance requirements;
  2. Technical mistakes in understanding and implementing defence-against-the-dark-arts security stuff like crypto, managing secrets and session management (you didn’t know enough to do something or to do it right);
  3. Misunderstanding architectural responsibilities and trust zones, like relying on client-side validation, or “I thought that the data was already sanitized”;
  4. Leaving the attack surface bigger than it has to be – because most developers don’t understand what a system’s attack surface is, or know that they need to watch out when they change it;
  5. Allowing access by default, so when an error happens or somebody forgets to add the right check in the right place, the doors and windows are left open and the bad guys can walk right in (see the sketch after this list);
  6. Choosing an insecure development platform or technology stack or framework or API and inheriting somebody else’s design and coding mistakes;
  7. Making stupid mistakes in business workflows that allow attackers to bypass checks and limits and steal money or steal information.
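Mistake #5 is cheap to avoid if access decisions fail closed instead of open. A minimal sketch in Java, assuming a simple role-to-actions rule table (AccessPolicy and the rule names are hypothetical):

    // Deny-by-default: access is refused unless an explicit rule grants it,
    // so a missing or forgotten rule fails closed, not open.
    import java.util.Map;
    import java.util.Set;

    class AccessPolicy {
        private final Map<String, Set<String>> grants; // role -> permitted actions

        AccessPolicy(Map<String, Set<String>> grants) {
            this.grants = grants;
        }

        boolean isAllowed(String role, String action) {
            Set<String> permitted = grants.get(role);
            // No entry for the role, or action not listed: deny.
            return permitted != null && permitted.contains(action);
        }
    }

If somebody forgets to register a rule for a new function, the worst case is that a legitimate user gets blocked and complains – not that an attacker walks in.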

Learning about Secure Software Design

If you want to build a secure system, you need to understand secure design. Hopefully you won’t start by reading Secure Software Design by Richardson and Thies. While it does describe many of the major issues in application security and IT security in general, and some common threats and vulnerabilities, it (ironically, given the title) doesn't explain how to do secure software design. And too much of the “practical information” in the book is dangerously almost but not quite right: the section on XSS for example, which does mention output escaping, but doesn't explain how to do it properly or that it is much more important than “Scrubbing input for unnecessary characters and altering necessary but possibly dangerous characters” (however you would go about doing that safely). Or mostly wrong: the section on secure database design – no, “One of the simplest ways to protect a web application from an [sic] SQL injection attack is to validate all input parameters” is not correct, and “You should also avoid dynamic SQL and use parameterized stored procedures” is not close enough to being correct to be understood or followed properly. The book does raise awareness of application security issues, and early on the authors do point readers to CERT, SANS and OWASP, so there is hope that students will find and use those resources instead of relying on this book.
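For contrast, the real fix for SQL injection is to keep user-supplied data out of the SQL text entirely by using bind variables. A minimal JDBC sketch (the table and DAO names are made up):

    // Parameterized query: the user-supplied value is bound as data and
    // never concatenated into the SQL string, so it can't change the query.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;

    class CustomerDao {
        ResultSet findByEmail(Connection conn, String email) throws SQLException {
            PreparedStatement stmt = conn.prepareStatement(
                    "SELECT id, name FROM customers WHERE email = ?");
            stmt.setString(1, email); // bound as a value, not as SQL text
            return stmt.executeQuery();
        }
    }

Input validation is still worth doing, but as defence in depth – not as the primary protection.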

Principles – Motherhood and Apple Pie, or Goodness and Rightness and So What?

Every book that takes on secure software design, even a good book like Secure and Resilient Software Development by Merkow and Raghavan, spends time going through basic secure design principles: The importance of C and I and maybe A. Modularity and compartmentalization, separation of responsibilities, economy of mechanism (an unsimple way to say simplicity), least privilege, defence in depth certainly but not security through obscurity, complete mediation (uh huh), and psychological acceptability, and whatever else Saltzer and Schroeder wrote up 40 years ago.

All good and true and wise and right ideas to live by, but you can read this stuff all day (if you can stay awake through it) and it won’t help you to design a more secure system. There’s nothing clear or actionable here – it’s preaching and high-level hand waving. You can’t tell if you have done enough of it; you’ll never know if you got it right or what you missed, or what’s really important and what isn't.

Threats and Attacks and Risks – Learning to be Afraid of …something…

The rest of secure design is mostly about threats and attacks and exploits – risk-focused threat modeling exercises. Developers design something nice, and then a security expert comes in and attacks their design, looks for weaknesses and oversights, enumerates threats and walks through attack trees and vulnerabilities and tells the developers what they missed and what some theoretical attacker might be able to take advantage of to compromise the system.

This is difficult stuff for developers to understand, and difficult for them to get excited about: you’re asking developers – concrete problem solvers – to think about problems that will “probably never” happen. And to do it properly requires that you not only understand how the system works (and the technology that it works on), but also what kind of attacks are possible in what contexts (and how likely they really are), which means you need specialized experience and knowledge that most developers don’t have and can’t get easily.

But even if you know this stuff and follow a structured approach like STRIDE or maybe Trike, there’s no way to know if you’ve done a good job of threat modeling, if you've done enough of it and if you've identified all the important problems, or if you’ve missed some important attack vector or critical vulnerability and you're gonna be pwned anyway.

Threat modeling, at least the way it is commonly understood, with expensive meetings where architects and developers and testers and security experts and project managers get together to methodically walk through design documents, and then write up CYA paperwork afterwards, doesn’t fit with the way that most developers actually work – especially developers on Agile teams who do most of their design work incrementally and iteratively, constantly refining and filling the design in as they go. Or developers maintaining legacy systems under constant pressure to fix or change something that is already there as fast and cheaply as they can. There isn’t time or space to fit in threat modeling meetings or all that documentation and paperwork, and it’s probably not the best use of time if they could find some.

Even lighter-weight threat modeling hasn't made many inroads in development shops, and I am not convinced that it will.

Secure Design Checklists, Cheat Sheets and Patterns

When developers are designing and building a system, they want to look forward: towards understanding the problem they are trying to solve and what they need to build and how they can get it built quickly. Rather than looking back at what people missed or did wrong, it’s more valuable and practical and cost-effective to focus on what they should and can do upfront as part of the design – the practices and patterns and tools that they should use and what they shouldn’t, the problems that they have to look out for when they are making design decisions and trade-offs.

I've talked before about how important and useful checklists can be in software security: simple steps and things to think about when working on different design problems, to make sure that you aren't missing something important or doing something stupid.

Microsoft’s Patterns and Practices site includes an (unfortunately “retired”) secure architecture and design review checklist which covers most of the things that you need to think about when designing a secure app. In case this checklist disappears some day, a full copy of it is included in Merkow and Raghavan's book on Secure and Resilient Software Development.

OWASP has a secure design checklist, but it is not targeted to developers – it’s a tool to help an auditor run security design reviews in a document-heavy waterfall environment. There is an OWASP Application Architecture Cheat Sheet (currently in draft), which includes some good questions to ask in initial architecture and high-level design. The rest of the OWASP Cheat Sheets can be used to help designers and coders with specific application security problems – as long as you know what problems you need to solve.

There’s also been some work on secure patterns, which could be useful for developers who take a pattern-based approach to software design. The SEI’s Secure Design Pattern catalog is an attempt to include security in some common software design patterns (secure versions of Factory, Strategy, Builder, Chain of Responsibility…), or to apply patterns to some common software security problems. And there are a couple of books like Core Security Patterns (an intimidating 1000+ page list of security patterns for big standards-based J2EE apps) and Security Patterns in Practice (which has just been published). However, these patterns have not made it to the mainstream – I don’t know many real-life developers who are even aware of these patterns, never mind tried to apply them.

One of the most useful tools I've come across in the secure design space is SD Elements, an online software service which helps development teams make application security decisions. You start by describing your project and its security and compliance requirements and the language(s)/platform that you’re using, and SD Elements guides you through a set of questions and options on how to deal with important security aspects of the design, implementation and testing of the system. It helps you to understand the decisions that you need to make, and holds your hand as you make them.

Security in Design, not Secure Design

Secure design shouldn't be about things that you don’t understand or can’t do anything about. Secure design should be about understanding the problems that you can and should take care of on your own and the problems you shouldn't.

Understanding what your system’s attack surface looks like and what to look out for when you change it.

How trust zones work.

Where and why and how you should use proven application security frameworks and libraries like Shiro or ESAPI, or how to properly leverage the security capabilities of your application framework (Rails or Play or Spring or whatever…).
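As a sketch of what leaning on a framework buys you, here is roughly what an access check looks like with Apache Shiro (the service class and permission string are made up for illustration):

    // Using Shiro's Subject instead of hand-rolled access control.
    import org.apache.shiro.SecurityUtils;
    import org.apache.shiro.subject.Subject;

    class TransferService {
        void transfer(String fromAccount, String toAccount, long amountCents) {
            Subject user = SecurityUtils.getSubject();
            // Throws an AuthorizationException if the permission isn't granted,
            // so the transfer below can't run for an unauthorized user.
            user.checkPermission("account:transfer");
            // ... perform the transfer ...
        }
    }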

The first step – and the most important step – is to get software designers and architects to think about security when they think about design, in the same way that they think about time-to-market and developer convenience, or performance or reliability or future proofing or technical elegance. Not just the security features that they should have stories for, but security as a continuous thread in architecture and design.

When they select tools and languages and frameworks and platforms.

When they think about architectural responsibilities and layering and patterns.

And when they work with data: identifying and tracing and protecting confidential and private information and secrets, taking care of data validation properly, and thinking about safe data access and data storage.

Secure design has to fit into design and how design is done. It has to be part of decisions as design decisions are being made, not bolted on afterwards in audits and reviews.

Thursday, June 13, 2013

Automated Tests as Documentation

One of the arguments for writing automated tests is that tests can act as useful documentation for a system. But what do tests document? And who will find this documentation useful?

Most developers don’t rely on system documentation because there isn't enough documentation to give them a complete idea of how the system works, or because there’s too much of it to read, or because it’s not well written, or because it’s not up to date, or because they don’t believe it’s up to date.

But a good set of automated tests can tell you how the system really works today – not how somebody thought it works or how they thought it was supposed to work or how it used to work, if anybody bothered to write any of this down.

“Tests are no more and no less than executable documentation. When writing new code, it makes sense that it should be documented by the person closest to it: the author. That’s my rationale for developer testing.

Comment “doc blocks” and wiki pages, are always found to be inaccurate to some extent. By contrast automated tests fail noisily whenever they go out of date. So with regard to documentation, automated tests are uniquely advantageous in that (given a CI server) you at least have the option of keeping your documentation up to date whenever it starts to drift away from reality.”
Noah Sussman

You might be able to use tests as documentation, if…

To be useful as documentation, tests have to be:

  1. comprehensive – they have to cover all of the important areas of the code and functions of the system;
  2. run often and work – run on every check-in, or at least often enough to give everyone confidence that they are up to date; and the tests have to pass – you can’t leave tests broken or failing;
  3. written to be read – writing a test that works, and writing a test that can be used as documentation, are two different things (see the sketch after this list);
  4. at the right level of abstraction – most people when they talk about automated tests mean unit tests. …
Unit tests constitute design documentation that evolves naturally with a system. Read that again. This is the Holy Grail of software development, documentation that evolves naturally with a system. What better way to document a class than to provide a coded set of use cases. That's what these unit tests are: a set of coded use cases that document what a class does, given a controlled set of inputs. As such, this design document is always up-to-date because the unit tests always have to pass.
Jeff Canna Testing, fun? Really?
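To make “written to be read” concrete, here is a minimal sketch of a test whose name and body state the behaviour being asserted (JUnit 4; the Account class and exception are hypothetical):

    // The test name reads as a statement of the rule being verified,
    // and the body shows the setup and the action in two lines.
    import org.junit.Test;

    public class AccountTest {
        @Test(expected = InsufficientFundsException.class)
        public void withdrawalLargerThanBalanceIsRejected() {
            Account account = new Account(100); // opening balance of 100
            account.withdraw(150);
        }
    }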

Of course even tests that run frequently and pass in Continuous Integration or Continuous Delivery can still contain mistakes. There will be tests that pass but shouldn't, or tests that tell you what the system does, but what the system does is not what it is supposed to do (one of the risks of having developers write tests is that if they misunderstood the requirement and got the code wrong, they got the tests wrong too). But on the whole, tests that run often should be more accurate than some document that may or may not have been right in the first place and that probably hasn't been kept up to date since. And through code coverage analysis, you can at least understand what parts of the system are described by the tests, and what parts aren't.

Tests have to be written to be read

One problem with using tests as documentation, is that tests are only a means to an end – even tests written up front in TDD. Tests are just another part of the supporting infrastructure and tooling that developers rely on to write code (and it's all about the code). Tests are tools to help developers think about what code they need to write (if they are following TDD), to prove that the code does what it was supposed to do, and to catch mistakes when people make changes in the future.

As a result, developers don’t put the same amount of attention and discipline into designing and implementing tests as they do with the code itself. As long as the tests run quickly and they cover the code (or at least the important parts of the code), they’ve done the job. Nobody has to care that much about how the tests are named or what they look like inside. Few teams peer review unit tests, and even then most reviewers only check to see that somebody wrote a test, and not that every test is correct, or that each test has a nice meaningful and consistent name and is easy to understand. There aren't a lot of developers who understand xUnit Test Patterns or spend extra time – or can afford to spend extra time – refactoring tests. So, a lot of automated tests aren't clean, consistent or easy to follow.

Another thing that makes unit tests hard to follow as documentation is the way that tests are usually organized. The common convention for structuring and naming unit tests is that developers write a test class/module for every code class/module in the system, with test methods to assert specific behaviour in that code module.

Assuming that whoever wrote the code wrote a comprehensive set of tests and followed a consistent, well-defined structure and naming approach, and that they came up with good, descriptive names for each test class and test method and that everyone who worked on the code over time understood all of this and took the trouble to keep all of it up as they changed and refactored the code and moved responsibilities around and wrote their own tests (which is assuming a lot), then you should be able to get a good idea of what’s happening inside each piece of code by following the tests.

Can you really understand a system from Unit Tests?

But you’re still not going to be able to read a nice story about how a system is designed or what the system does or how it does it by reading unit tests.

Even if unit tests are complete and up-to-date and well written and well organized and have good names (if, if, if, if, if), the accumulation of details built up from looking at all of these tests is overwhelming. Unit tests are too close to the metal, sometimes obscured by fixtures and other test plumbing, and too far removed from the important aspects of the business logic or the design.

UnitTests include a lot of testing of lower level processes that have no direct connection to the stories.
Steve Jorgensen, comment in Unit Test as Documentation

Tests follow the code. You can’t understand the tests without understanding the code. So why not read the code instead? You’ll have to do this eventually to make sure that you really know what’s going on, and because without reading the code, you can’t know if the tests are well-written in the first place.

One place where low-level developer tests may be useful as documentation is describing how to use an API – provided that tests are comprehensive, expressive and “named in such a way that the behavior they validate is evident”.

A good set of unit tests like this can act as a reference implementation, showing how the API is supposed to be used, documenting common usage details:

If you are looking for authoritative answers on how to use the Rails API, look no further than the Rails unit tests. Those tests provide excellent documentation on what's supported by the API and what isn't, how the API is intended to be used, the kind of use cases and domain specific problems that drove the API, and also what API usage is most likely to work in future versions of Rails.
Peter Marklund, Rails Tip: Use the Unit Tests as Documentation
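A sketch of what an API-documenting test can look like (the CsvParser API here is invented for illustration):

    // A test doing double duty as documentation: it shows the intended
    // call sequence and the expected result for a typical use case.
    import static org.junit.Assert.assertEquals;
    import org.junit.Test;

    public class CsvParserUsageTest {
        @Test
        public void parsesQuotedFieldsContainingCommas() {
            CsvParser parser = new CsvParser();
            String[] fields = parser.parseLine("\"Smith, John\",42");
            assertEquals("Smith, John", fields[0]);
            assertEquals("42", fields[1]);
        }
    }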

But again, this can only work if you take the time to design, organize and write the tests to do “Double Duty” as tests and as documentation (Double Duty: How to repurpose the unit tests you’re doing to help create the documentation you’re not, by Brian Button), and make your tests “as understandable as humanly possible for as many different readers as possible”.

Tests as documentation?

I like the idea that automated tests can serve as documentation – we’d all save time and money this way. But who is this documentation for, and how is it supposed to be used?

I don’t know any developers who would start by reading tests cases in order to understand the design of a system. A good developer knows that they can’t trust documents or pictures, or tests, or even what other programmers tell them, or comments in the code. The only thing that they can trust is the code.

Tests might be more useful as documentation to testers. After all, it’s a tester’s job to understand the tests in order to maintain them or add to them. But most testers aren't going to learn much from most tests. When it comes to unit tests, it is the same for testers as it is for developers: unit tests aren't useful unless you understand the code – and if you understand the code, then you should read it instead. Higher-level acceptance tests are easier to understand and more useful to look at, especially for non-technical people. It should be easier to tell a story and to follow a story about what a system does through high-level functional and integration scenarios (big fat tests), or acceptance tests captured in a tool like FitNesse. Rather than asserting detailed implementation-specific conditions, these tests describe technical scenarios, or business rules and business workflows and other requirements that somebody thought were important enough to test for.

But even if you can follow these tests, there’s no way to know how well they describe what the system does without spending time talking to people, learning about the domain, testing the system yourself, and… reading the code.

I asked Jonathan Kohl, one of the smartest testers I know, about his experience using automated tests as documentation:

Back in '03 and '04, Brian Marick and I were looking into this. We held an experimental workshop at XP/Agile Universe in 2004 with testers, developers and other project stakeholders to see what would happen if people who were unfamiliar with the program code (both the code and the automated unit tests) could get something useful from the automated tests as documentation. It was a complete and utter failure. No one really got any value whatsoever from the automated unit tests from a doc perspective. We had to explain what was going on afterwards…

Marick and I essentially tossed the idea aside after that experience. I gave up on the whole double duty thing. Documentation is really good at documenting and explaining, while code is good at program creation and executing. I go for the right tool for the job now, rather than try to overload concepts.

Bottom line, I do not find tests useful as documentation at all. When done well, I do find them useful as examples of implementation of interfaces, etc. when I am new to a system, but nothing replaces a good written doc, especially when coupled with some face-to-face brainstorming and explanations.

There are a lot of benefits to automated testing, especially once you have a good automated suite in place. But documentation is not one of them.

Thursday, June 6, 2013

Choosing between a Pen Test and a Secure Code Review

Secure Code Reviews (bringing someone in from outside of the team to review/audit the code for security vulnerabilities) and application Pen Tests (again, bringing a security specialist in from outside the team to test the system) are both important practices in a secure software development program. But if you could only do one of them, if you had limited time or limited budget, which should you choose? Which approach will find more problems and tell you more about the security of your app and your team? What will give you more bang for your buck?

Pen testing and code reviews are very different things – they require different work on your part, they find different problems and give you different information. And the cost can be quite different too.

White Box / Black Box

We all know the difference between white box and black box.

Because they can look inside the box, code reviewers can zero in on high-risk code: public interfaces, session management and password management and access control and crypto and other security plumbing, code that handles confidential data, error handling, auditing. By scanning through the code they can check if the app is vulnerable to common injection attacks (SQL injection, XSS, …), and they can look for time bombs and back doors (which are practically impossible to test for from outside) and other suspicious code. They may find problems with concurrency and timing and other code quality issues that aren't exploitable but should be fixed anyway. And a good reviewer, as they work to understand the system and its design and ask questions, can also point out design mistakes, incorrect assumptions and inconsistencies – not just coding bugs.

Pen Testers rely on scanners and attack proxies and other tools to help them look for many of the same common application vulnerabilities (SQL injection, XSS, …) as well as run-time configuration problems. They will find information disclosure and error handling problems as they hack into the system. And they can test for problems in session management and password handling and user management, authentication and authorization bypass weaknesses, and even find business logic flaws especially in familiar workflows like online shopping and banking functions. But because they can’t see inside the box, they – and you – won’t know if they've covered all of the high-risk parts of the system.

The kind of security testing that you are already doing on your own can influence whether a pen test or a code review is more useful. Are you testing your web app regularly with a black box dynamic vulnerability scanning tool or service? Or running static analysis checks as part of Continuous Integration?

A manual pen test will find many of the same kinds of problems that an automated dynamic scanner will, and more. A good static analysis tool will find at least some of the same bugs that a manual code review will – a lot of reviewers use static analysis source code scanning tools to look for low hanging fruit (common coding mistakes, unsafe functions, hard-coded passwords, simple SQL injection, ...). Superficial tests or reviews may not involve much more than someone running one of these automated scanning tools and reviewing and qualifying the results for you.
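As a made-up illustration, this is the kind of low hanging fruit that a scanning tool (or a reviewer skimming the code) flags on sight:

    // Two findings a static analysis scan would report immediately:
    // a hard-coded secret, and dynamic SQL built by string concatenation.
    class ReportDao {
        private static final String DB_PASSWORD = "hunter2"; // hard-coded secret

        String buildQuery(String userSuppliedName) {
            // Classic SQL injection risk: input concatenated into the query.
            return "SELECT * FROM reports WHERE owner = '" + userSuppliedName + "'";
        }
    }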

So, if you’ve been relying on dynamic analysis testing, it makes sense to get a code review to look for problems that you haven’t already tested for yourself. And if you’ve been scanning code with static analysis tools, then a pen test may have a better chance of finding different problems.

Costs and Hassle

A pen test is easy to setup and manage. It should not require a lot of time and hand holding from your team, even if you do it right and make sure to explain the main functions of the application to the pen test team and walk them through the architecture, and give them all the access they need.

Code reviews are generally more expensive than pen tests, and will require more time and effort on your part – you can’t just give an outsider a copy of the code and expect them to figure it all out on their own. There is more hand holding needed both ways. You holding their hand and explaining the architecture and how the code is structured and how the system works and the compliance and risk drivers, answering questions about the design and the technology as they go along; and them holding your hand, patiently explaining what they found and how to fix it, and working with your team to understand whether each finding is worth fixing, weeding out false positives and other misunderstandings.

This hand holding is important. You want to get maximum value out of a reviewer’s time – you want them to focus on high-risk code and not get lost on tangents. And you want to make sure that your team understands what the reviewer found and how important each bug is and how they should be fixed. So not only do you need to have people helping the reviewer – they should be your best people.

Intellectual Property and Confidentiality and other legal concerns are important, especially for code reviews – you’re letting an outsider look at the code, and while you want to be transparent in order to ensure that the review is comprehensive, you may also be risking your secret sauce. Solid contracting and working with reputable firms will minimize some of these concerns, but you may also need to strictly limit what code the reviewer will get to see.

Other Factors in Choosing between Pen Tests and Code Reviews

The type of system and its architecture can also impact your decision.

It’s easy to find pen testers who have lots of experience in testing web portals and online stores – they’ll be familiar with the general architecture and recognize common functions and workflows, and can rely on out-of-the-box scanning and fuzzing tools to help them test. This has become a commodity-based service, where you can expect a good job done for a reasonable price.

But if you’re building an app with proprietary system-to-system APIs or proprietary clients, or you are working in a highly-specialized technical domain, it’s harder to find qualified pen testers, and they will cost more. They’ll need more time and help to understand the architecture and the app, how everything fits together and what they should focus on in testing. And they won’t be able to leverage standard tools, so they’ll have to roll something on their own, which will take longer and may not work as well.

A code review could tell you more in these cases. But the reviewer has to be competent in the language(s) that your app is written in – and, to do a thorough job, they should also be familiar with the frameworks and libraries that you are using. Since it is not always possible to find someone with the right knowledge and experience, you may end up paying them to learn on the job – and relying a lot on how quickly they learn. And of course if you’re using a lot of third party code for which you don’t have source, then a pen test is really your only choice.

Are you in a late stage of development, getting ready to release? What you care about most at this point is validating the security of the running system including the run-time configuration and, if you’re really late in development, finding any high-risk exploitable vulnerabilities because that’s all you will have time to fix. This is where a lot of pen testing is done.

If you’re in the early stages of development, it’s better to choose a code review. Pen testing doesn’t make a lot of sense (you don’t have enough of the system to do real system testing), and a code review can help set the team on the right path for the rest of the code that they have to write.

Learning from and using the results

Besides finding vulnerabilities and helping you assess risk, a code review or a pen test both provide learning opportunities – a chance for the development team to understand and improve how they write and test software.

Pen tests tell you what is broken and exploitable – developers can’t argue that a problem isn’t real, because an outside attacker found it, and that attacker can explain how easy or hard it was for them to find the bug, what the real risk is. Developers know that they have to fix something – but it’s not clear where and how to fix it. And it’s not clear how they can check that they’ve fixed it right. Unlike most bugs, there are no simple steps for the developer to reproduce the bug themselves: they have to rely on the pen tester to come back and re-test. It’s inefficient, and there isn’t a nice tight feedback loop to reinforce understanding.

Another disadvantage with pen tests is that they are done late in development, often very late. The team may not have time to do anything except triage the results and fix whatever has to be fixed before the system goes live. There’s no time for developers to reflect and learn and incorporate what they’ve learned.

There can also be a communication gap between pen testers and developers. Most pen testers think and talk like hackers, in terms of exploits and attacks. Or they talk like auditors, compliance-focused, mapping their findings to vulnerability taxonomies and risk management frameworks, which don’t mean anything to developers.

Code reviewers think and talk like programmers, which makes code reviews much easier to learn from – provided that the reviewer and the developers on your team make the time to work together and understand the findings. A code reviewer can walk the developer through what is wrong, explain why and how to fix it, and answer the developer’s questions immediately, in terms that a developer will understand, which means that problems can get fixed faster and fixed right.

You won’t find all of the security vulnerabilities in an app through a code review or a pen test – or even from doing both of them (although you’d have a better chance). If I could only do one or the other, all other factors aside, I would choose a code review. A review will take more work, and probably cost more, and it might not even find as many security bugs. But you will get more value in the long term from a code review. Developers will learn more and quicker, hopefully enough to understand how to look for and fix security problems on their own, and even more important, to avoid them in the first place.
