Building Real Software: Essential Attack Surface Management

To attack your system, to steal something or do something else nasty, the bad guys need to find a way in, and usually a way out as well. This is what Attack Surface Analysis is all about: mapping the ways in and out of your system, looking at the system from an attacker’s perspective, understanding what parts of the system are most vulnerable, where you need to focus testing and reviews. It’s part of design and it’s also part of risk management.

Attack Surface Analysis is simple in concept. It’s like walking around your house, counting all of the doors and windows, and checking to see if they are open, or easy to force open. The fewer doors and windows you have, and the harder they are to open, the safer you are. The bigger a system’s attack surface, the bigger a security problem you have, and the more work that you have to put into your security program.

For enterprise systems and web apps, the doors and windows include web URLs (every form, input field – including hidden fields, URL parameters and scripts), cookies, files and databases shared outside the app, open ports and sockets, external system calls and application APIs, admin user ids and functions. And any support backdoors into the app, if you allow that kind of thing.

I’m not going to deal with minimizing the attack surface by turning off features or deleting code. It’s important to do this when you can of course, but most developers are paid to add new features and write more forms and other interfaces – to open up the Attack Surface. So it’s important to understand what this means in terms of security risk.

Measuring the System’s Attack Surface

Michael Howard at Microsoft and other researchers have developed a method for measuring the attack surface of an application, and to track changes to the attack surface over time, called the Relative Attack Surface Quotient (RSQ).

Using this method you calculate an overall attack surface score for the system, and measure this score as changes are made to the system and to how it is deployed. Researchers at Carnegie Mellon built on this work to develop a formal way to calculate an Attack Surface Metric for large systems like SAP. They calculate the Attack Surface as the sum of all entry and exit points, channels (the different ways that clients or external systems connect to the system, including TCP/UDP ports, RPC end points, named pipes...) and untrusted data elements. Then they apply a damage potential/effort ratio to these Attack Surface elements to identify high-risk areas.

Smaller teams building and maintaining smaller systems (which is most of us) and Agile teams trying to move fast don’t need to go this far. Managing a system’s attack surface can be done through a few straightforward steps that developers can understand and take ownership of.

Attack Surface: Where to Start?

Start with some kind of baseline if you can – at least a basic understanding of the system’s attack surface. Spend a few hours reviewing design and architecture documents from an attack surface perspective. For web apps you can use a tool like Arachni or Skipfish or w3af or one of the many commercial dynamic testing and vulnerability scanning tools or services to crawl your app and map the attack surface – at least the part of the system that is accessible over the web. Or better, get an appsec expert to review the application and pen test it so that you understand the attack surface and real vulnerabilities.

Once you have a map of the attack surface, identify the high risk areas. Focus on remote entry points – interfaces with outside systems and to the Internet – and especially where the system allows anonymous, public access. This is where you are most exposed to attack. Then understand what compensating controls you have in place, operational controls like network firewalls and application firewalls,and intrusion detection or prevention systems to help protect your app.

The attack surface model will be rough and incomplete to start, especially if you haven’t done any security work on the system before. Use what you have and fill in the holes as the team makes changes to the attack surface. But how do you know when you are changing the attack surface?

When are you Changing the Attack Surface?

According to The Official Guide to the CSSLP

“… it is important to understand that the moment a single line of code is written, the attack surface has increased.”

But this over states the risk of making code changes – there are lots of code changes (for example to behind-the-scenes reporting and analytics and changes to business logic) that don’t make the system more vulnerable to attack. Remember, the attack surface is the sum of entry and exit points and untrusted data elements in the system.

Adding a new system interface, a new channel into the system, a new connection type, a new API, a new type of client, a new mobile app, or a new file or database table shared with the outside – these changes directly affect the Attack Surface and change the risk profile of your app. The first web page that you create opens up the system’s attack surface significantly and introduces all kinds of new risks. If you add another field to that page, or another web page like it, while technically you have made the attack surface bigger, you haven’t increased the risk profile of the application in a meaningful way. Each of these incremental changes is more of the same, unless you follow a new design or use a new framework.

Changes to session management and authentication and password management also affect the attack surface. So do changes to authorization and access control logic, especially adding or changing role definitions, adding admin users or admin functions with high privileges. Changes to the code that handles encryption and secrets. Changes to how data validation is done. And major architectural changes to layering and trust relationships, or fundamental changes in technical architecture – swapping out your web server or database platform, or changing the run-time OS.

Use a Risk-Based and Opportunity-Based Approach

Attack Surface management can be done in an opportunistic way, driven by your ongoing development requirements. As they work on a piece of the system, the team reviews whether and how the changes affect the attack surface, what the risks are, and raise flags for deeper review. These red flags drive threat modeling and secure code reviews and additional testing.

This means that developers can stay focused on delivering features, while still taking responsibility for security. Attack surface reviews become a part of design and QA and risk management, burned in to how the team works, done when needed in each stage or phase or sprint.

The first time that you touch a piece of the system it may take longer to finish the change because you need to go through more risk assessment. But over time, as you work on the same parts of the system or the same problems, and as you learn more about the application and more about security risks, it will get simpler and faster.

Your understanding of the system’s attack surface will probably never be perfect or complete – but you will always be updating it and improving it. New risks and vulnerabilities will keep coming up. When this happens you add new items to your risk checklists, new red flags to watch for. As long as the system is being maintained, you’re never finished doing attack surface management.

3 comments:

rksethi said...: Hey Jim,

Great post. I think you have the right idea about reducing the scope of defining attack surface to entry & exit points. My gut reaction is that very few development shops will be willing to add another attack surface gating process if they’ve already adopted a couple of security activities such as static analysis, penetration testing, and/or threat modeling. I really like the way you positioned it as a trigger for further security review, such as threat modeling. I’ve seen a couple of places that say things along the lines of “if we’re accepting new input then this release will undergo a risk assessment”.

I think that if you position it as “evaluation criteria for security review” rather than “attack surface analysis” and you ask a small number of very simple-to-answer questions (e.g. “Have you modified the way you handle non-public financial data?”) then you’ll be able to get development groups on-board. If developers understand this as a 5-10 minute activity that might save them hours of unnecessary security reviews then they’ll be much more apt to take it on.; January 15, 2012 at 11:27 AM
Jim Bird said...: Thanks Rohit. For us, attack surface analysis isn't a separate gate, it's an ongoing part of how we design and build software, something that we do all the time, iteratively and incrementally. You're right: if you do it this way, it doesn't have to be expensive.; January 16, 2012 at 5:34 AM
Building energy management system said...: The most basic form of energy management consists of a simple time clock and thermostat. And in fact these systems are still the best choice of control in certain buildings today.; August 30, 2012 at 4:28 AM