Thursday, October 11, 2012

Bad Things Happen to Good Code

We need to understand what happens to code over time and why, and what a healthy, long-lived code base looks like. What architectural decisions have the most lasting impact, and what decisions made early will make the most difference over the life of a system.

Forces of Compromise

Most of the discussion around technical debt assumes that code degrades over time because of sloppiness and lazy coding practices and poor management decisions, by programmers who don’t know or don’t care about what they are doing or who are forced to take short-cuts under pressure. But it’s not that simple. Code is subject to all kinds of pressures and design compromises, big and small, in the real world.

Performance optimization trade-offs can force you to bend the design and code in ways that were never expected. Dealing with operational dependencies and platform quirks and run-time edge cases also adds complexity. Then there are regulatory requirements – things that don’t fit the design and don’t necessarily make sense but you have to do anyways. And customization: customer-specific interfaces and options and custom workflow variants and custom rules and custom reporting, all to make someone important happy.

Integration with other systems and API lock-in and especially maintaining backwards compatibility with previous versions can all make for ugly code. Michael Feathers, who I think is doing the most interesting and valuable work today in understanding what happens to code and what should happen to code over time, has found that code around APIs and other hard boundaries becomes especially messy – because some interfaces are so hard to change, this forces programmers to do extra work (and workarounds) behind the scenes.

All of these forces contribute to making a system more complex, harder to understand, harder to change and harder to test over time – and harder to love.

Iterative Development is Erosive

In Technical Debt, Process and Culture, Feathers explains that “generalized entropy in software systems” is inevitable, the result of constant and normal wear and tear in an organization. As more people work on the same code, the design will naturally deteriorate as each person interprets the design in their own way and makes their own decisions on how to do something. What’s interesting is that the people working with this code can’t see how much of the design has been lost because their familiarity with the code makes it appear to be simpler and clearer than it really is. It’s only when somebody new joins the team that it becomes apparent how bad things have become.

Feathers also suggests that highly iterative development accelerates entropy, and that code which is written iteratively is qualitatively different than code in systems where the team spent more time in upfront design. Iterative development and maintenance tend to bias towards the existing structure of the system, meaning that more compromises will end up being made.

Iterative design and development involves making a lot of small mistakes, detours and false starts as you work towards the right solution. Testing out new ideas in production through A/B split testing amplifies this effect, creating more options and complexity. As you work this way some of the mistakes and decisions that you make won’t get unmade – you either don’t notice them, or it’s not worth the cost. So you end up with dead abstractions and dead ends, design ideas that aren't meaningful any more or are harder to work with than they should be. Some of this will be caught and corrected later in the course of refactoring, but the rest of it becomes too pervasive and expensive to justify ripping out.

Dealing with Software Sprawl

Software, at least software that gets used, gets bigger and more complicated over time – it has to, as you add more features and interfaces and deal with more exceptions and alternatives and respond to changing laws and regulations. Capers Jones analysis shows that the size of the code base for a system under maintenance will increase between 5-10% per year. Our own experience bears this out - the code base for our systems has doubled in size in the last 5 years.

As the code gets bigger it also gets more complex – code complexity tends to increase an average of between 1% and 3% per year. Some of this is real, essential complexity – not something that you can wish your way out of. But the rest is due to how changes and fixes are done.

Feathers has confirmed by mining code check-in history (Discovering Startling Things from your Version Control System) that most systems have a common shape or “power curve”. Most code is changed only infrequently or not at all, but the small percentage of methods and classes in the system that are changed a lot tend to get bigger and more complex over time. This is because it is

easier to add code to an existing method than to add a new method and easier to add another method to an existing class than to add a new class.
The key to keeping a code base healthy is disciplined refactoring of this code, taking the time to come up with new and better abstractions, and preventing the code from festering and becoming malignant.

There is also one decision upfront that has a critical impact on the future health of a code base. Capers Jones has found that the most important factor in how well a system ages is, not surprisingly, how complex the design was in the beginning:

The rate of entropy increase, or the increase in cyclomatic complexity, seems to be proportional to the starting value. Applications that are high in complexity when released will experience much faster rates or entropy or structural decay than applications put into production with low complexity levels. The Economics of Software Quality
Systems that were poorly designed only get worse – but Jones has found that systems that were well-designed can actually get better over time.

1 comment:

Anonymous said...

I would add one, but very important factor - how familiar with the source code and the design the developers are.
As the system gets bigger, you hire more people - if they are not trained properly by more experienced colleagues and are not given the enough amount of time to get more familiar with the code they will start to write code that is not compliant with the design.
Hopefully, the code will fulfill business requirements, but you get different solutions to the problems that, probably, were already solved in the other part of the system. As the times go by the initial design, best practices get lost in the mess of custom solutions.

LP

Site Meter