Building Real Software: You can’t Refactor your way out of every Problem

Refactoring is a disciplined way to clarify, retain or restore the design of a system as you make changes, and to help cleanup and correct the mistakes and mess that we all make as we work, to clear away the evidence of false starts and changes in direction and back tracking and to help fill in gaps and misunderstandings.

As a colleague of mine has pointed out, you can get a lot out of even the most simple and obvious refactoring changes: eliminating duplication, changing variable and method names to be more meaningful, extracting methods, simplifying conditional logic, replacing a magic number with a named constant. These are easy things to do, and will give you a big return in understandability and maintainability.

But refactoring has limitations – there are some problems that refactoring won’t solve.

Refactoring can’t help you if the design is fundamentally wrong

Some people naively believe that you can refactor your way out of any design mistake or misunderstanding – and that you can use refactoring as a substitute for upfront design. This assumes that you will be able to immediately recognize mistakes and gaps from customer feedback and correct the design as you are developing.

But it can take a long time, usually only once the system is being used in the real world by real customers to do real things, before you learn how wrong you actually were, how much you missed and misunderstood, exceptions and edge cases and defects piling up before you finally understand (or accept) that no, the design doesn't hold up, you can’t just keep on extending it and patching what you have – you need a different set of abstractions or a different architecture entirely.

Refactoring helps you make course corrections. But what if you find out that you've been driving the entire time in the wrong direction, or in circles?

Barry Boehm, in Balancing Agility and Discipline, explains that starting simple and refactoring your way to the right answer sometimes falls down:

“Experience to date also indicates that low-cost refactoring cannot be depended upon as projects scale up. The most serious problems that arise with simple design are problems known as “architecture breakers”. These highly expensive problems can occur when early, simple design decisions result in forseeable changes that cause breakage in design beyond the ability of refactoring to handle.”

This is another argument in the “Refactor or Design” holy war over how much design should be / needs to be done upfront and how much can be filled in as you go through incremental change and refactoring.

Deep Decisions

Many design ideas can be refined, elaborated, iterated and improved over time, and refactoring will help you with this. But some early decisions on approach, packaging, architecture, and technology platform are too fundamental and too deep to change or correct with refactoring.

You can use refactoring to replace in-house code with standard library calls, or to swap one library for another – doing the same thing in a different way. Making small design changes and cleaning things up as you go with refactoring can be used to extend or fill in gaps in the design and to implement cross-cutting features like logging and auditing, even access control and internationalization – this is what the XP approach to incremental design is all about.

But making small-scale design changes and improvements to code structure, extracting and moving methods, simplifying conditional logic and getting rid of case statements isn’t going to help you if your architecture won’t scale, or if you chose the wrong approach (like SOA) or the wrong application framework (J2EE with Enterprise Java Beans, any multi-platform UI framework or any of the early O/R mapping frameworks – remember the first release of TopLink?, or something that you rolled yourself before you understood how the language actually worked), or the wrong language (if you found out that Ruby or PHP won’t scale), or a core platform middleware technology that proves to be unreliable or that doesn't hold up under load or that has been abandoned, or if you designed the system for the wrong kind of customer and need to change pretty much everything.

Refactoring to Patterns and Large Refactorings

Joshua Kerievsky’s work on Refactoring to Patterns provides higher-level composite refactorings to improve – or introduce – structure in a system, by properly implementing well-understood design patterns such as factories and composites and observers, replacing conditional logic with strategies and so on.

Refactoring to Patterns helps with cleaning up and correcting problems like

“duplicated code, long methods, conditional complexity, primitive obsession, indecent exposure, solution sprawl, alternative classes with different interfaces, lazy classes, large classes, combinatorial explosions and oddball solutions”.

Lippert and Roock’s work on Large Refactorings explains how to take care of common architectural problems in and between classes, packages, subsystems and layers, doing makeovers of ugly inheritance hierarchies and reducing coupling between modules and cleaning up dependency tangles and correcting violations between architectural layers – the kind of things that tools like Structure 101 help you to see and understand.

They have identified a set of architectural smells and refactorings to correct them:

Smells in dependency graphs: Visible dependency graphs, tree-like dependency graphs, cycles between classes, unused classes
Smells in inheritance hierarchies: Parallel inheritance hierarchies, list-like inheritance hierarchy, inheritance hierarchy without polymorphic assignments, inheritance hierarchy too deep, subclasses without redefinitions
Smells in packages: Unused packages, cycles between packages, too small/large packages, packages unclearly named, packages too deep or nesting unbalanced
Smells in subsystems: Subsystem overgeneralized, subsystem API bypassed, subsystem too small/large, too many subsystems, no subsystems, subsystem API too large
Smells in layers: Too many layers, no layers, strict layers violated, references between vertically separate layers, upward references in layers, inheritance between protocol-oriented layers (coupling).

Composite refactorings and large refactorings raise refactoring to higher levels of abstraction and usefulness, and show you how to identify problems on your own and how to come up with your own refactoring patterns and strategies.

But refactoring to patterns or even large-scale refactoring still isn't enough to unmake or remake deep decisions or change the assumptions underlying the design and architecture of the system. Or to salvage code that isn't safe to refactor, or worth refactoring.

Sometimes you need to rewrite, not refactor

There is no end of argument over how bad code has to be before you should give up and rewrite it rather than trying to refactor your way through it.

The best answer seems to be that refactoring should always be your first choice, even for legacy code that you didn’t write and don’t understand and can’t test (there is an entire book written on how and where to start refactoring legacy spps).

But if the code isn’t working, or is so unstable and so dangerous that trying to refactor it only introduces more problems, if you can’t refactor or even patch it without creating new bugs, or if you need to refactor too much of the code to get it into acceptable shape (I’ve read somewhere than 20% is a good cut-off, but I can’t find the reference), then it’s time to declare technical bankruptcy and start again. Rewriting the code from scratch is sometimes your only choice. Some code shouldn't be – or can’t be – saved.

"Sometimes code doesn't need small changes—it needs to be tossed out so that you can start over. If you find yourself in a major refactoring session, ask yourself whether instead you should be redesigning and reimplementing that section of code from the ground up." Steve McConnell, Code Complete

You can use refactoring to restore, repair, cleanup or adapt the design or even the architecture of a system. Refactoring can help you to go back and make corrections, reduce complexity, and help you fill in gaps. It will pay dividends in reducing the cost and risk of ongoing development and support.

But refactoring isn’t enough if you have to reframe the system – if you need to do something fundamentally different, or in a fundamentally different way – or if the code isn’t worth salvaging. Don’t get stuck believing that refactoring is always the right thing to do, or that you can refactor yourself out of every problem.