Building Real Software: March 2012

Wednesday, March 28, 2012

What's the point of application pen testing?

Penetration testing is one of the bulwarks of an application security program: get an expert tester to simulate an attack on your system, and see if they can hack their way in. But how effective is application penetration testing, and what should you expect from it?

Read my latest post at the SANS AppSec Street Fighter blog on What's the point of application pen testing?

Thursday, March 22, 2012

Is Copy and Paste Programming really a problem?

Copy and Paste Programming – taking a copy of existing code in your project and repurposing it – violates coding best practices like Don’t Repeat Yourself (DRY). It’s one of the most cited examples of technical debt, a lazy way of working, sloppy and short-sighted: an antipattern that adds to the long term cost of keeping a code base alive.

But it’s also a natural way to get stuff done – find something that already works, something that looks close to what you want to do, take a copy and use it as a starting point. Almost everybody has done in at some point. This is because there are times when copy and paste programming is not only convenient, but it might also be the right thing to do.

First of all, let’s be clear what I mean by copy and paste. This is not copying code examples off of the Internet, a practice that comes with its own advantages and problems. By copy and paste I mean when programmers take a shortcut in reuse – when they need to solve a problem that is similar to another problem in the system, they’ll start by taking a copy of existing code and changing what they need to.

Early in design and development, copy and paste programming has no real advantage. The code and design are still plastic, this is your chance to come up with the right set of abstractions, routines and libraries to do what the system needs to do. And there’s not a lot to copy from anyways. It’s late in development when you already have a lot of code in place, and especially when you are maintaining large, long-lived systems, that the copy and paste argument gets much more complicated.

Why Copy and Paste?

Programmers copy and paste because it saves time. First, you have a starting point, code that you know works. All you have to do is figure out what needs to be changed or added. You can focus on the problem you are trying to solve, on what is different, and you only need to understand what you are going to actually use. You are more free to iterate and make changes to fit the problem in front of you – you can cleanup code when you need to, delete code that you don’t need. All of this is important, because you may not know what you will need to keep, what you need to change, and what you don’t need at all until you are deeper into solving the problem.

Copy and paste programming also reduces risk. If you have to go back and change and extend existing code to do what it does today as well as to solve your new problem, you run the risk of breaking something that is already working. It is usually safer and less expensive (in the short term at least) to take a copy and work from there.

What if you are building a new B2B customer interface that will be used by a new set of customers? It probably makes sense to take an existing interface as a starting point, reuse the scaffolding and plumbing and wiring at least and as much of the the business code as makes sense, and then see what you need to change. In the end, there will be common code used by both interfaces (after all, that’s why you are taking a copy), but it could take a while before you know what this code is.

Finding a common design, the right abstractions and variations to support different implementations and to handle exceptions can be difficult and time consuming. You may end up with code that is harder to understand and harder to maintain and change in the future – because the original design didn’t anticipate the different exceptions and extensions, and refactoring can only take you so far. You may need a new design and implementation.

Changing the existing code, refactoring or rewriting some of it to be general-purpose, shared and extendable, will add cost and risk to the work in front of you. You can’t afford to create problems for existing customers and partners just because you want to bring some new customers online. You’ll need to be extra careful, and you’ll have to understand not only the details of what you are trying to do now (the new interface), but all of the details of the existing interface, its behavior and assumptions, so that you can preserve all of it.

It’s naïve to think that all of this behavior will be captured in your automated tests – assuming that you have a good set of automated tests. You’ll need to go back and redo integration testing on the existing interface. Getting customers and partners who may have already spent weeks or months to test the software to retest it is difficult and expensive. They (justifiably) won’t see the need to go through this time and expense because what they have is already working fine.

Copying and pasting now, and making a plan to come back later to refactor or even redesign if necessary towards a common solution, is the right approach here.

When Copy and Paste makes sense

In Making Software’s chapter on “Copy-Paste as a Principled Engineering Tool”, Michael Godfrey and Cory Kapser explore the costs of copy and paste programming, and the cases where copy and paste make sense:

Forking – purposely creating variants for hardware or platform variation, or for exploratory reasons.
Templating –some languages don’t support libraries and shared functions well so it may be necessary to copy and paste to share code. Somewhere back in the beginning of time, the first COBOL programmer wrote a complete COBOL program – everybody else after that copied and pasted from each other.
Customizing – creating temporary workarounds – as long as it is temporary.
Microsoft’s practice of “clone and own” to solve problems in big development organizations. One team takes code from another group and customizes it or adapts it to their own purposes – now they own their copy. This is a common approach with open source code that is used as a foundation and needs to be extended to solve a proprietary problem.

When Copy and Paste becomes a Problem

When to copy and paste, and how much of a problem it will become over time, depends on a few important factors.

First, the quality of what you are copying – how understandable the code is, how stable it is, how many bugs it has in it. You don’t want to start off by inheriting somebody else’s problems.

How many copies have been made. A common rule of thumb from Fowler and Beck`s Refactoring book is “three strikes and you refactor”. This rule comes from recognizing that by making a copy of something that is already working and changing the copy, you’ve created a small maintenance problem. It may not be clear what this maintenance problem is yet or how best to clean it up, because only two cases are not always enough to understand what is common and what is special.

But the more copies that you make, the more of a maintenance problem that you create – the cost of making changes and fixes to multiple copies, and the risk of missing making a change or fix to all of the copies increases. By the time that you make a third copy, you should be able to see patterns – what’s common between the code, and what isn’t. And if you have to do something in three similar but different ways, there is a good chance that there will be a fourth implementation, and a fifth. By the third time, it’s worthwhile to go back and restructure the code and come up with a more general-purpose solution.

How often you have to change the copied code and keep it in sync – particularly, how often you have to change or fix the same code in more than one place.

How well you know the code, do you know that there are clones and where to find them? How long it takes to find the copies, and how sure you are that you found them all. Tools can help with this. Source code analysis tools like clone detectors can help you find copy and paste code – outright copies and code that is not the same but similar (fuzzier matching with fuzzier results). Copied code is often fiddled with over time by different programmers, which makes it harder for tools to find all of the copies. Some programmers recommend leaving comments as markers in the code when you make a copy, highlighting where the copy was taken from, so that a maintenance programmer in the future making a fix will know to look for and check the other code.

Copy and Paste programming doesn’t come for free. But like a lot of other ideas and practices in software development, copy and paste programming isn’t right or wrong. It’s a tool that can be used properly, or abused.

Brian Foote, one of the people who first recognized the Big Ball of Mud problem in software design, says that copy and paste programming is the one form of reuse that programmers actually follow, because it works.

It’s important to recognize this. If we’re going to Copy and Paste, let's do a responsible job of it.

Wednesday, March 14, 2012

Defensive Programming: Being Just-Enough Paranoid

Hey, let’s be careful out there.
Sergeant Esterhaus, daily briefing to the force of Hill Street Blues

When developers run into an unexpected bug and can’t fix it, they’ll “add some defensive code” to make the code safer and to make it easier to find the problem. Sometimes just doing this will make the problem go away. They’ll tighten up data validation – making sure to check input and output fields and return values. Review and improve error handling – maybe add some checking around “impossible” conditions. Add some helpful logging and diagnostics. In other words, the kind of code that should have been there in the first place.

Expect the Unexpected

The whole point of defensive programming is guarding against errors you don’t expect.
Steve McConnell, Code Complete

The few basic rules of defensive programming are explained in a short chapter in Steve McConnell’s classic book on programming, Code Complete:

Protect your code from invalid data coming from “outside”, wherever you decide “outside” is. Data from an external system or the user or a file, or any data from outside of the module/component. Establish “barricades” or “safe zones” or “trust boundaries” – everything outside of the boundary is dangerous, everything inside of the boundary is safe. In the barricade code, validate all input data: check all input parameters for the correct type, length, and range of values. Double check for limits and bounds.
After you have checked for bad data, decide how to handle it. Defensive Programming is NOT about swallowing errors or hiding bugs. It’s about deciding on the trade-off between robustness (keep running if there is a problem you can deal with) and correctness (never return inaccurate results). Choose a strategy to deal with bad data: return an error and stop right away (fast fail), return a neutral value, substitute data values, … Make sure that the strategy is clear and consistent.
Don’t assume that a function call or method call outside of your code will work as advertised. Make sure that you understand and test error handling around external APIs and libraries.
Use assertions to document assumptions and to highlight “impossible” conditions, at least in development and testing. This is especially important in large systems that have been maintained by different people over time, or in high-reliability code.
Add diagnostic code, logging and tracing intelligently to help explain what’s going on at run-time, especially if you run into a problem.
Standardize error handling. Decide how to handle “normal errors” or “expected errors” and warnings, and do all of this consistently.
Use exception handling only when you need to, and make sure that you understand the language’s exception handler inside out.

Programs that use exceptions as part of their normal processing suffer from all the readability and maintainability problems of classic spaghetti code.
The Pragmatic Programmer

I would add a couple of other rules. From Michael Nygard’s Release It! never ever wait forever on an external call, especially a remote call. Forever can be a long time when something goes wrong. Use time-out/retry logic and his Circuit Breaker stability pattern to deal with remote failures.

And for languages like C and C++, defensive programming also includes using safe function calls to avoid buffer overflows and common coding mistakes.

Different Kinds of Paranoia

The Pragmatic Programmer describes defensive programming as “Pragmatic Paranoia”. Protect your code from other people’s mistakes, and your own mistakes. If in doubt, validate. Check for data consistency and integrity. You can’t test for every error, so use assertions and exception handlers for things that “can’t happen”. Learn from failures in test and production – if this failed, look for what else can fail. Focus on critical sections of code – the core, the code that runs the business.

Healthy Paranoid Programming is the right kind of programming. But paranoia can be taken too far. In the Error Handling chapter of Clean Code, Michael Feathers cautions that

“many code bases are dominated by error handling”

Too much error handling code not only obscures the main path of the code (what the code is actually trying to do), but it also obscures the error handling logic itself – so that it is harder to get it right, harder to review and test, and harder to change without making mistakes. Instead of making the code more resilient and safer, it can actually make the code more error-prone and brittle.

There’s healthy paranoia, then there’s over-the-top-error-checking, and then there’s bat shit crazy crippling paranoia – where defensive programming takes over and turns in on itself.

The first real world system I worked on was a “Store and Forward” network control system for servers (they were called minicomputers back then) across the US and Canada. It shared data between distributed systems, scheduled jobs, and coordinated reporting across the network. It was designed to be resilient to network problems and automatically recover and restart from operational failures. This was ground breaking stuff at the time, and a hell of a technical challenge.

The original programmer on this system didn’t trust the network, didn’t trust the O/S, didn’t trust Operations, didn’t trust other people’s code, and didn’t trust his own code – for good reason. He was a chemical engineer turned self-taught system programmer who drank a lot while coding late at night and wrote thousands of lines of unstructured FORTRAN and Assembler under the influence. The code was full of error checking and self diagnostics and error-correcting code, the files and data packets had their own checksums and file-level passwords and hidden control labels, and there was lots of code to handle sequence accounting exceptions and timing-related problems – code that mostly worked most of the time. If something went wrong that it couldn’t recover from, programs would crash and report a “label of exit” and dump the contents of variables – like today’s stack traces. You could theoretically use this information to walk back through the code to figure out what the hell happened. None of this looked anything like anything that I learned about in school. Reading and working with this code was like programming your way out of Arkham Asylum.

If the programmer ran into bugs and couldn’t fix them, that wouldn’t stop him. He would find a way to work around the bugs and make the system keep running. Then later after he left the company, I would find and fix a bug and congratulate myself until it broke some “error-correcting” code somewhere else in the network that now depended on the bug being there. So after I finally figured out what was going on, I took out as much of this “protection” as I could safely remove, and cleaned up the error handling so that I could actually maintain the system without losing what was left of my mind. I setup trust boundaries for the code – although I didn’t know that’s what it was called then – deciding what data couldn’t be trusted and what could. Once this was done I was able to simplify the defensive code so that I could make changes without the system falling over itself, and still protect the core code from bad data, mistakes in the rest of the code, and operational problems.

Making code safer is simple

The point of defensive coding is to make the code safer and to help whoever is going to maintain and support the code – not make their job harder. Defensive code is code – all code has bugs, and, because defensive code is dealing with exceptions, it is especially hard to test and to be sure that it will work when it has to. Understanding what conditions to check for and how much defensive coding is needed takes experience, working with code in production and seeing what can go wrong in the real world.

A lot of the work involved in designing and building secure, resilient systems is technically difficult or expensive. Defensive programming is neither – like defensive driving, it’s something that everyone can understand and do. It requires discipline and awareness and attention to detail, but it’s something that we all need to do if we want to make the world safe.

Thursday, March 8, 2012

AppSec at RSA 2012 Conference

I spent last week at the RSA conference in San Francisco - my first time at a global IT security event like this. A lot to learn and think about, and a lot of problems to be solved.

Read my latest post at the SANS AppSec Street Fighter blog on my experience at this year's RSA conference.