Thursday, March 19, 2015

Making Refactoring Work

A recent academic study raises some questions about how useful and how important refactoring really is.

The researchers found that refactoring didn’t seem to make code measurably easier to understand or change, or even measurably cleaner (measured by cyclomatic complexity, depth of inheritance, class coupling or lines of code).

But as other people have discussed, this study is deeply flawed. It appears to have been designed by people who didn’t understand how to do refactoring properly:

  1. The researchers chose 10 “high impact” refactoring techniques (from a 2011 study by Shatnawi and Li) based on a model of OO code quality which measures reusability, flexibility, extendibility and effectiveness (“the degree to which a design is able to achieve the desired functionality and behavior using OO design concepts and techniques” – whatever that means), but which specifically did not include understandability. And then they found that the refactored code was not measurably easier to understand or fix. Umm, should this have been a surprise?

    The refactorings were intended to make the code more extensible and reusable and flexible. In many cases this would have actually made the code less simple and harder to understand. Flexibility and extendibility and reusability often come at the expense of simplicity, requiring additional scaffolding and abstraction. These are long-term investments that are intended to pay back over the life of a system – something that could not be measured in the couple of hours that the study allowed.

    The list of techniques did not include common and obviously useful refactorings which would have made the code simpler and easier to understand, such as Extract Class and Extract Method (which are the two most impactful refactorings, according to research by Alshehri and Benedicenti, 2014), Extract Variable, Move Method, Change Method Signature, Rename anything, … [insert your own shortlist of other useful refactorings here].

  2. There is no evidence – and no reason to believe – that the refactoring work was done properly. Presumably somebody entered some refactoring commands in Visual Studio and the code was “refactored properly”.

  3. The study set out to measure whether refactoring made code easier to change. But they attempted to do this by assessing whether students were able to find and fix bugs that had been inserted into the code – which is much more about understanding the code than it is about changing it.

  4. The code base (4500 lines) and the study size (two groups of 10 students) were both too small to be meaningful, and students were not given enough time to do meaningful work: 5 minutes to read the code, 30 minutes to answer some questions about it, 90 minutes to try to find and fix some bugs.

  5. And as the researchers point out, the developers who were trying to understand the code were inexperienced. It’s not clear that they would have been able to understand the code and work with it even if it had been refactored properly.

But the study does point to some important limitations to refactoring and how it needs to be done.

Good Refactoring takes Time

Refactoring code properly takes experience and time. Time to understand the code. Time to understand which refactorings should be used in what context. Time to learn how to use the refactoring tools properly. Time to learn how much refactoring is enough. And of course time to get the job done right.

Someone who isn’t familiar with the language or the design and the problem domain, and who hasn’t worked through refactoring before won’t do a good job of it.

Refactoring is Selfish

When you refactor, it’s all about you. You refactor the code in ways to make it easier for YOU to understand and that should make it easier for YOU to change in the future. But this doesn’t necessarily mean that the code will be easier for someone else to understand and change.

It’s hard to go wrong doing some basic, practical refactoring. But deeper and wider structural changes, like Refactoring to Patterns or other “Big Refactoring” or “Large Scale Refactoring” changes that make some programmers happy, can also make the code much harder for other programmers to understand and work with – especially if the work only gets done part way (which often happens with well-intentioned, ambitious root canal refactoring work).

In the study, the researchers thought that they were making the code better, by trying to make it more extensible, reusable and flexible. But they didn’t take the needs of the students into consideration. And they didn’t follow the prime directive of refactoring:

Always start by refactoring to understand. If you aren’t making the code simpler and easier to understand, you’re doing it wrong.

Ironically, what the students in the study should have done – with the original code, as well as the “refactored code” – was to refactor it on their own first so that they could understand it. That would have made for a more interesting, and much more useful, study.
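
As a small illustration of refactoring to understand, here is a minimal sketch of what Rename, Extract Variable and Extract Method might do to a short pricing routine. The classes, the discount rule and the tax rate are all invented for this example and do not come from the study; the point is only that the behavior stays the same while the intent becomes readable.

```java
// Before: short names and inline arithmetic hide the intent.
class InvoiceBefore {
    static long calc(long[] p, int[] q, boolean m) {
        long t = 0;
        for (int i = 0; i < p.length; i++) {
            t += p[i] * q[i];
        }
        if (m && t > 10000) {
            t = t - t * 10 / 100;
        }
        return t + t * 5 / 100;
    }
}

// After: same behavior, with Rename, Extract Variable and Extract Method applied.
class InvoiceAfter {
    // All amounts are in cents. The threshold and rates are hypothetical values.
    static final long MEMBER_DISCOUNT_THRESHOLD_CENTS = 10000;
    static final long MEMBER_DISCOUNT_PERCENT = 10;
    static final long TAX_PERCENT = 5;

    static long totalInCents(long[] unitPricesInCents, int[] quantities, boolean isMember) {
        long subtotal = sumLineItems(unitPricesInCents, quantities);
        long discounted = applyMemberDiscount(subtotal, isMember);
        return addTax(discounted);
    }

    private static long sumLineItems(long[] unitPricesInCents, int[] quantities) {
        long sum = 0;
        for (int i = 0; i < unitPricesInCents.length; i++) {
            sum += unitPricesInCents[i] * quantities[i];
        }
        return sum;
    }

    private static long applyMemberDiscount(long amount, boolean isMember) {
        boolean qualifiesForDiscount = isMember && amount > MEMBER_DISCOUNT_THRESHOLD_CENTS;
        if (!qualifiesForDiscount) {
            return amount;
        }
        long discount = amount * MEMBER_DISCOUNT_PERCENT / 100;
        return amount - discount;
    }

    private static long addTax(long amount) {
        long tax = amount * TAX_PERCENT / 100;
        return amount + tax;
    }
}
```

Nothing here adds flexibility or extensibility. The second version only names what the first one already did, which is the kind of change that helps the next reader, and it is the kind of change the study left out.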

Refactoring Works

There’s no doubt that refactoring – done properly – will make code more understandable, more maintainable, and easier to change. But you need to do it right.

Wednesday, March 4, 2015

Putting Security into Sprints

To build a secure app, you can’t wait until the end and hope to “test security in”. For teams who follow Agile methods like Scrum, this means you have to find a way to add security into Sprints. Here’s how to do it:

Sprint Zero

A few basic security steps need to be included upfront in Sprint Zero:

  1. Platform selection – when you are choosing your language and application framework, take some time to understand the security functions they provide. Then look around for security libraries like Apache Shiro (a framework for authentication, session management and access control), Google KeyCzar (crypto), and the OWASP Java Encoder (XSS protection) to fill in any blanks.
  2. Data privacy and compliance requirements – make sure that you understand what data needs to be protected and audited for compliance purposes (including PII), and what you will need to prove to compliance auditors.
  3. Secure development training – check the skill level of the team, fill in as needed with training on secure coding. If you can’t afford training, buy a couple of copies of Iron-Clad Java, and check out SAFECode’s free seminars on secure coding.
  4. Coding guidelines and code review guidelines – consider where security fits in. Take a look at CERT’s Secure Java Coding Guidelines.
  5. Testing approach – plan for security unit testing in your Continuous Integration pipeline. And choose a static analysis tool and wire it into Continuous Integration too. Plan for pen testing or other security stage gates/reviews later in development.
  6. Assigning a security lead - someone on the team who has experience and training in secure development (or who will get extra training in secure development) or someone from infosec, who will act as the point person on risk assessments, lead threat modeling sessions, coordinate pen testing and scanning, triage the vulnerabilities found, and bring new developers up to speed.
  7. Incident Response - think about how the team will help ops respond to outages and to security incidents.
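
To make the security unit testing in step 5 more concrete, here is a minimal sketch of a negative test that could run on every build. The `UsernamePolicy` class and its allowlist rule (3 to 32 ASCII letters, digits or underscores) are hypothetical, chosen only for illustration, and the checks are written as plain Java so the example has no dependencies; in a real project they would likely live in whatever test framework the team already uses.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical allowlist rule, invented for this example:
// 3 to 32 ASCII letters, digits or underscores.
final class UsernamePolicy {
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9_]{3,32}");

    static boolean isValid(String candidate) {
        // matches() must consume the whole input, so trailing junk is rejected.
        return candidate != null && ALLOWED.matcher(candidate).matches();
    }
}

// Security unit tests: mostly negative cases that a functional test would skip.
final class UsernamePolicySecurityTest {
    static void run() {
        check(UsernamePolicy.isValid("alice_01"), "ordinary username is accepted");
        check(!UsernamePolicy.isValid(null), "null is rejected");
        check(!UsernamePolicy.isValid("<script>alert(1)</script>"), "markup is rejected");
        check(!UsernamePolicy.isValid("bob' OR '1'='1"), "quote characters are rejected");
        check(!UsernamePolicy.isValid("alice\n"), "trailing newline is rejected");

        char[] tooLong = new char[33];
        Arrays.fill(tooLong, 'a');
        check(!UsernamePolicy.isValid(new String(tooLong)), "over-long input is rejected");
    }

    private static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Security check failed: " + description);
        }
    }

    public static void main(String[] args) {
        run();
        System.out.println("All security checks passed");
    }
}
```

A failing check throws, which fails the build, so a regression in input handling shows up in Continuous Integration instead of in a later pen test.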

Early Sprints

The first few Sprints, where you start to work out the design and build out the platform and the first-ofs for key interfaces and integration points, are when the application’s attack surface expands quickly.

You need to do threat modeling to understand security risks and make sure that you are handling them properly.

Start with Adam Shostack’s 4 basic threat modeling questions:

  1. What are you building?
  2. What can go wrong?
  3. What are you going to do about it?
  4. Did you do an acceptable job at 1-3?

Delivering Features (Securely)

A lot of development work is business as usual, delivering features that are a lot like the other features that you’ve already done: another screen, another API call, another report or another table. There are a few basic security concerns that you need to keep in mind when you are doing this work. Make sure that problems caught by your static analysis tool or security tests are reviewed and fixed. Watch out in code reviews for proper use of frameworks and libraries, and for error and exception handling and defensive coding.

Take some extra time when a security story comes up (a new security feature or a change to security or privacy requirements), and think about abuser stories whenever you are working on a feature that deals with something important like money, or confidential data, or secrets, or command-and-control functions.
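
Abuser stories translate naturally into tests. The sketch below assumes a hypothetical money transfer feature (`Account` and `TransferService` are invented for this example, with amounts held in cents) and writes a few "as an attacker, I try to..." stories as checks that the attempt is refused and that no balance changes.

```java
// Hypothetical domain classes, invented for this example.
final class Account {
    final String id;
    long balanceInCents;

    Account(String id, long balanceInCents) {
        this.id = id;
        this.balanceInCents = balanceInCents;
    }
}

final class TransferService {
    static void transfer(Account from, Account to, long amountInCents) {
        if (amountInCents <= 0) {
            throw new IllegalArgumentException("amount must be positive");
        }
        if (from.id.equals(to.id)) {
            throw new IllegalArgumentException("cannot transfer to the same account");
        }
        if (amountInCents > from.balanceInCents) {
            throw new IllegalStateException("insufficient funds");
        }
        // addExact throws ArithmeticException instead of silently overflowing.
        long newTargetBalance = Math.addExact(to.balanceInCents, amountInCents);
        from.balanceInCents -= amountInCents;
        to.balanceInCents = newTargetBalance;
    }
}

final class TransferAbuserStories {
    static boolean isRejected(Runnable attempt) {
        try {
            attempt.run();
            return false;
        } catch (RuntimeException expected) {
            return true;
        }
    }

    static void run() {
        Account victim = new Account("victim", 50_000);
        Account attacker = new Account("attacker", 1_000);

        // As an attacker, I send a negative amount to pull money out of the target.
        check(isRejected(() -> TransferService.transfer(attacker, victim, -10_000)),
              "negative amount is refused");
        // As an attacker, I send more than I have.
        check(isRejected(() -> TransferService.transfer(attacker, victim, 1_001)),
              "overdraft is refused");
        // As an attacker, I transfer to myself to probe for double crediting.
        check(isRejected(() -> TransferService.transfer(attacker, attacker, 500)),
              "self-transfer is refused");
        // None of the refused attempts may have moved any money.
        check(victim.balanceInCents == 50_000 && attacker.balanceInCents == 1_000,
              "balances are unchanged after refused attempts");

        // The legitimate path still works.
        TransferService.transfer(attacker, victim, 1_000);
        check(victim.balanceInCents == 51_000 && attacker.balanceInCents == 0,
              "valid transfer moves the money");
    }

    private static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("Abuser story failed: " + description);
        }
    }
}
```

Calling `TransferAbuserStories.run()` from the build exercises all of the stories at once. The particular rules are assumptions; the useful habit is writing down what an attacker would try next to the feature that makes it possible.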

Heavy Lifting

You need to think about security any time you are doing heavy lifting: large-scale refactoring, upgrading framework code or security plumbing or the run-time platform, introducing a new API or integrating with a new system. Just like when you are first building out the app, spend extra time threat modeling, and be more careful in testing and in reviews.

Security Sprints

At some point later in development you may need to run a security Sprint or hardening Sprint – to get the app ready for release to production, or to deal with the results of a pen test or vulnerability scan or security audit, or to clean up after a security breach.

This could involve all or only some of the team. It might include reviewing and fixing vulnerabilities found in pen testing or scanning; checking for vulnerabilities in third party and Open Source components and patching them; working with ops to review and harden the run-time configuration; updating and checking your incident response plan; improving your code review or threat modeling practices; or reviewing and improving your security tests. Or all of the above.

Adding Security into Sprints. Just Do It.

Adding security into Sprints doesn’t have to be hard or cost a lot. A stripped down approach like this will take you a long way to building secure software. And if you want to dig deeper into how security can fit into Sprints, you can try out Microsoft’s SDL for Agile. Just do it.
