Building Real Software: Choosing between a Pen Test and a Secure Code Review

Secure Code Reviews (bringing someone in from outside of the team to review/audit the code for security vulnerabilities) and application Pen Tests (again, bringing a security specialist in from outside the team to test the system) are both important practices in a secure software development program. But if you could only do one of them, if you had limited time or limited budget, which should you choose? Which approach will find more problems and tell you more about the security of your app and your team? What will give you more bang for your buck?

Pen testing and code reviews are very different things – they require different work on your part, they find different problems and give you different information. And the cost can be quite different too.

White Box / Black Box

We all know the difference between white box and black box.

Because they can look inside the box, code reviewers can zero in on high-risk code: public interfaces, session management and password management and access control and crypto and other security plumbing, code that handles confidential data, error handling, auditing. By scanning through the code they can check if the app is vulnerable to common injection attacks (SQL injection, XSS, …),and they can look for time bombs and back doors (which are practically impossible to test for from outside) and other suspicious code. They may find problems with concurrency and timing and other code quality issues that aren't exploitable but should be fixed any ways. And a good reviewer, as they work to understand the system and its design and ask questions, can also point out design mistakes, incorrect assumptions and inconsistencies – not just coding bugs.

Pen Testers rely on scanners and attack proxies and other tools to help them look for many of the same common application vulnerabilities (SQL injection, XSS, …) as well as run-time configuration problems. They will find information disclosure and error handling problems as they hack into the system. And they can test for problems in session management and password handling and user management, authentication and authorization bypass weaknesses, and even find business logic flaws especially in familiar workflows like online shopping and banking functions. But because they can’t see inside the box, they – and you – won’t know if they've covered all of the high-risk parts of the system.

The kind of security testing that you are already doing on your own can influence whether a pen test or a code review is more useful. Are you testing your web app regularly with a black box dynamic vulnerability scanning tool or service? Or running static analysis checks as part of Continuous Integration?

A manual pen test will find many of the same kinds of problems that an automated dynamic scanner will, and more. A good static analysis tool will find at least some of the same bugs that a manual code review will – a lot of reviewers use static analysis source code scanning tools to look for low hanging fruit (common coding mistakes, unsafe functions, hard-coded passwords, simple SQL injection, ...). Superficial tests or reviews may not involve much more than someone running one of these automated scanning tools and reviewing and qualifying the results for you.

So, if you’ve been relying on dynamic analysis testing, it makes sense to get a code review to look for problems that you haven’t already tested for yourself. And if you’ve been scanning code with static analysis tools, then a pen test may have a better chance of finding different problems.

Costs and Hassle

A pen test is easy to setup and manage. It should not require a lot of time and hand holding from your team, even if you do it right and make sure to explain the main functions of the application to the pen test team and walk them through the architecture, and give them all the access they need.

Code reviews are generally more expensive than pen tests, and will require more time and effort on your part – you can’t just give an outsider a copy of the code and expect them to figure it all out on their own. There is more hand holding needed both ways. You holding their hand and explaining the architecture and how the code is structured and how the system works and the compliance and risk drivers, answering questions about the design and the technology as they go along; and them holding your hand, patiently explaining what they found and how to fix it, and working with your team to understand whether each finding is worth fixing, weeding out false positives and other misunderstandings.

This hand holding is important. You want to get maximum value out of a reviewer’s time – you want them to focus on high-risk code and not get lost on tangents. And you want to make sure that your team understands what the reviewer found and how important each bug is and how they should be fixed. So not only do you need to have people helping the reviewer – they should be your best people.

Intellectual Property and Confidentiality and other legal concerns are important, especially for code reviews – you’re letting an outsider look at the code, and while you want to be transparent in order to ensure that the review is comprehensive, you may also be risking your secret sauce. Solid contracting and working with reputable firms will minimize some of these concerns, but you may also need to strictly limit what code the reviewer will get to see.

Other Factors in Choosing between Pen Tests and Code Reviews

The type of system and its architecture can also impact your decision.

It’s easy to find pen testers who have lots of experience in testing web portals and online stores – they’ll be familiar with the general architecture and recognize common functions and workflows, and can rely on out-of-the-box scanning and fuzzing tools to help them test. This has become a commodity-based service, where you can expect a good job done for a reasonable price.

But if you’re building an app with proprietary system-to-system APIs or proprietary clients, or you are working in a highly-specialized technical domain, it’s harder to find qualified pen testers, and they will cost more. They’ll need more time and help to understand the architecture and the app, how everything fits together and what they should focus on in testing. And they won’t be able to leverage standard tools, so they’ll have to roll something on their own, which will take longer and may not work as well.

A code review could tell you more in these cases. But the reviewer has to be competent in the language(s) that your app is written in – and, to do a thorough job, they should also be familiar with the frameworks and libraries that you are using. Since it is not always possible to find someone with the right knowledge and experience, you may end up paying them to learn on the job – and relying a lot on how quickly they learn. And of course if you’re using a lot of third party code for which you don’t have source, then a pen test is really your only choice.

Are you in a late stage of development, getting ready to release? What you care about most at this point is validating the security of the running system including the run-time configuration and, if you’re really late in development, finding any high-risk exploitable vulnerabilities because that’s all you will have time to fix. This is where a lot of pen testing is done.

If you’re in the early stages of development, it’s better to choose a code review. Pen testing doesn’t make a lot sense (you don’t have enough of the system to do real system testing) and a code review can help set the team on the right path for the rest of the code that they have to write.

Learning from and using the results

Besides finding vulnerabilities and helping you assess risk, a code review or a pen test both provide learning opportunities – a chance for the development team to understand and improve how they write and test software.

Pen tests tell you what is broken and exploitable – developers can’t argue that a problem isn’t real, because an outside attacker found it, and that attacker can explain how easy or hard it was for them to find the bug, what the real risk is. Developers know that they have to fix something – but it’s not clear where and how to fix it. And it’s not clear how they can check that they’ve fixed it right. Unlike most bugs, there are no simple steps for the developer to reproduce the bug themselves: they have to rely on the pen tester to come back and re-test. It’s inefficient, and there isn’t a nice tight feedback loop to reinforce understanding.

Another disadvantage with pen tests is that they are done late in development, often very late. The team may not have time to do anything except triage the results and fix whatever has to be fixed before the system goes live. There’s no time for developers to reflect and learn and incorporate what they’ve learned.

There can also be a communication gap between pen testers and developers. Most pen testers think and talk like hackers, in terms of exploits and attacks. Or they talk like auditors, compliance-focused, mapping their findings to vulnerability taxonomies and risk management frameworks, which don’t mean anything to developers.

Code reviewers think and talk like programmers, which makes code reviews much easier to learn from – provided that the reviewer and the developers on your team make the time to work together and understand the findings. A code reviewer can walk the developer through what is wrong, explain why and how to fix it, and answer the developer’s questions immediately, in terms that a developer will understand, which means that problems can get fixed faster and fixed right.

You won’t find all of the security vulnerabilities in an app through a code review or a pen test – or even from doing both of them (although you’d have a better chance). If I could only do one or the other, all other factors aside, I would choose a code review. A review will take more work, and probably cost more, and it might not even find as many security bugs. But you will get more value in the long term from a code review. Developers will learn more and quicker, hopefully enough to understand how to look for and fix security problems on their own, and even more important, to avoid them in the first place.

3 comments:

Joe said...: "Code reviews are generally more expensive than pen tests"

Really? That might be true in some scenarios, but I suspect the opposite is true in many shops.

I know in my shop, code reviews are for all intents and purposes "free" (just consuming a few hours of time from a few people), where pen tests are costly, since they are outsourced.; June 6, 2013 at 9:23 AM
Jim Bird said...: @Joe, I should have been much clearer. When I am talking about a security code review, I mean bringing in an appsec specialist to review/audit the code, usually someone from outside (a consultant from Cigital for example). Not a peer review by another developer - which is more likely to find correctness, style or maintainability issues than security vulnerabilities.; June 6, 2013 at 10:00 AM
Anonymous said...: Why does one have to choose between dynamic and static analysis? Code reviews and pen tests? Whatever and whatever?

With the advent of shared-library insertion, we've always been able to choose how we approach flow sensitivity and tracing direction. It's all code and all code can be elicited from code. This isn't about code review or pen tests anymore. We have solutions like point-cuts as well as compile-time and dynamic-binary instrumentation.

The choice, especially when outsourcing, is a lie. A vendor will try to sell one solution and then point to the fact that the other solution may also be reliable/efficient later on.

No source code? Put yourself inside the app somehow, by way of system memory if you have to. Worried you don't know enough about runtime? Build the app with CTI and exercise even the minimal functionality.

With solutions in 2013 and on like ContrastSecurity, it makes me sad to hear that people still think there is a trade-off between blackbox and whitebox. It's all graybox to me!; July 5, 2013 at 6:47 AM