Tuesday, November 15, 2011

Diminishing Returns in software development and maintenance

Everyone knows from reading The Mythical Man-Month that as you add more people to a software development project you will see diminishing marginal returns.

When you add a person to a team, there’s a short-term hit as the rest of the team slows down to bring the new team member up to speed and adjusts to working with another person, making sure that they fit in and can contribute. There’s also a long-term cost. More people means more pairs of people who need to talk to each other (n × (n − 1) / 2 for a team of n), which means more opportunities for misunderstandings and mistakes and misdirections and missed handoffs, more chances for disagreements and conflicts, more bottleneck points.
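
To make that growth concrete, here is a minimal sketch in Python. It only evaluates the pairwise formula from the paragraph above, n × (n − 1) / 2; the function name and the team sizes are illustrative choices, not anything taken from Brooks.

```python
def communication_paths(team_size: int) -> int:
    """Number of distinct pairs of people who may need to talk to each other."""
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


if __name__ == "__main__":
    # Illustrative team sizes only.
    for n in (2, 5, 10, 20):
        print(f"{n:>2} people -> {communication_paths(n):>3} paths")
```

Doubling a team from 10 to 20 people takes it from 45 possible paths to 190, roughly four times as many.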

As you continue to add people, the team needs to spend more time getting each new person up to speed and more time keeping everyone on the team in sync. Each additional person speeds the team up less and less, while people costs and communications costs and overhead costs keep going up. At some point negative returns set in – if you add more people, the team’s performance will decline and you will get less work done, not more.

Diminishing Returns from any One Practice

But adding too many people to a project isn’t the only case of diminishing returns in software development. If you work on a big enough project, or if you work in maintenance for long enough, you will run into problems of diminishing returns everywhere that you look.

Pushing too hard in one direction, depending too much on any tool or practice, will eventually yield diminishing returns. This applies to:
- Manual functional and acceptance testing
- Test automation
- Any single testing technique
- Code reviews
- Static analysis bug finding tools
- Penetration tests and other security reviews

Aiming for 100% code coverage on unit tests is a good example. Building a good automated regression safety net is important – as you wire in tests for key areas of the system, programmers get more confidence and can make more changes faster.

How many tests are enough? In Continuous Delivery, Jez Humble and David Farley set 80% coverage as a target for each of automated unit testing, functional testing and acceptance testing. You could get by with lower coverage in many areas, higher coverage in core areas. You need enough tests to catch common and important mistakes. But beyond this point, more tests get more difficult to write, and find fewer problems.

Unit testing can only find so many problems in the first place. In Code Complete, Steve McConnell explains that unit testing can only find between 15% and 50% (on average 30%) of the defects in your code. Rather than writing more unit tests, people’s time would be better spent on other approaches like exploratory system testing and code reviews or stress testing or fuzzing to find different kinds of errors.
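
A rough way to see why mixing techniques can pay off better than doing more of one: the sketch below assumes, purely for illustration, that each technique catches a fixed fraction of the defects and that the techniques miss defects independently of each other. Neither assumption holds exactly in practice, and apart from the 30% average for unit testing quoted above, the rates are made-up placeholders.

```python
from math import prod


def combined_detection_rate(rates):
    """Fraction of defects caught by at least one technique,
    assuming each technique misses defects independently (a simplification)."""
    for r in rates:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each rate must be between 0 and 1")
    return 1.0 - prod(1.0 - r for r in rates)


UNIT_TESTING = 0.30      # average figure cited above from Code Complete
OTHER_TECHNIQUE = 0.30   # hypothetical placeholder, not a measured value

unit_only = combined_detection_rate([UNIT_TESTING])
mixed = combined_detection_rate([UNIT_TESTING, OTHER_TECHNIQUE])
print(f"unit testing alone: {unit_only:.0%}")
print(f"plus one other technique: {mixed:.0%}")
```

Under these assumptions a second, different technique lifts the combined rate from 30% to about 51%, which more unit tests alone cannot reach if unit testing tops out where McConnell says it does.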

“Too much of anything is bad, but too much whiskey is enough.”
Mark Twain, as quoted in Code Complete

Refactoring is important for maintaining and improving the structure and readability of code over time. It is intended to be a supporting practice – to help make changes and fixes simpler and clearer and safer. When refactoring becomes an end in itself or turns into Obsessive Refactoring Disorder, it not only adds unnecessary costs as programmers waste time over trivial details and style issues, it can also add unnecessary risks and create conflict in a team.

Make sure that refactoring is done in a disciplined way, and focus refactoring on those areas that need it the most: on code that is frequently changed, routines that are too big, too hard to read, too complex and error-prone. Putting most of your attention on refactoring (or if necessary rewriting) this code will get you the highest returns.

Less and Less over Time

Diminishing returns also set in over time. The longer that you spend working the same way and with the same tools, the fewer benefits you will see. Even core practices that you’ve grown to depend on pay back less and less over time, and at some point may cost more than they are worth.

It’s time again for New Year’s resolutions – time to sign up at a gym and start lifting weights. If you stick with the same routine for a couple of months, you will start to see good results. But after a while your body will get used to the work – if you keep doing the same things the same way your performance will plateau and you will stop seeing gains. You will get bored and stop going to the gym, which will leave more room for people like me. If you do keep going, trying to push harder for returns, you will overtrain and injure yourself.

The same thing happens to software teams following the same practices, using the same tools. Some of this is due to inertia. Teams and organizations reach an equilibrium point and they want to stay there, because it is comfortable, and it works – or at least they understand it. And because the better the team is working, the harder it is to get better – all the low-hanging fruit has been picked. People keep doing what worked for them in the past. They stop looking beyond their established routines, stop looking for new ideas. Competence and control lead to complacency and acceptance. Instead of trying to be as good as possible, they settle for being good enough.

This is the point of inspect-and-adapt in Scrum and other time-boxed methods – asking the team to regularly re-evaluate what they are doing and how they are doing it, what’s going well and what isn’t, what they should do more of or less of, challenging the status quo and finding new ways to move forward. But even the act of assessing and improving is subject to diminishing returns. If you are building software in 2-week time boxes, and you’ve been doing this for 3, 4 or 5 years, then how much meaningful feedback should you really expect from so many superficial reviews? After a while the team finds itself going over the same issues and problems and coming up with the same results. Reviews become an unnecessary and empty ritual, another waste of time.

The same thing happens with tools. When you first start using a static analysis bug checking tool for example, there’s a good chance that you will find some interesting problems that you didn’t know were in the code – maybe even more problems than you can deal with. But once you triage this and fix up the code and use the tool for a while, the tool will find fewer and fewer problems until it gets to the point where you are paying for insurance – it isn’t finding problems any more, but it might someday.

In “Has secure software development reached its limits?” William Jackson argues that SDLCs – all of them – eventually reach a point of diminishing returns from a quality and security standpoint, and that Microsoft and Oracle and other big shops are already seeing diminishing returns from their SDLCs. Their software won’t get any better – all they can do is to keep spending time and money to stay where they are. The same thing happens with Agile methods like Scrum or XP – at some point you’ve squeezed everything that you can from this way of working, and the team’s performance will plateau.

What can you do about diminishing returns?

First, understand and expect returns to diminish over time. Watch for the signs, and factor this into your expectations – that even if you maintain discipline and keep spending on tools, you will get less and less return for your time and money. Watch for the team’s velocity to plateau or decline.
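
One simple way to watch for a plateau is to compare the team’s recent velocity against the stretch just before it. The sketch below is only an illustration: the sprint numbers are invented, and the window size and 5% tolerance are arbitrary assumptions you would tune for your own team.

```python
from statistics import mean


def velocity_trend(velocities, window=3):
    """Relative change between the mean of the last `window` sprints
    and the mean of the `window` sprints before them."""
    if window < 1 or len(velocities) < 2 * window:
        raise ValueError("need at least 2 * window sprints of data")
    recent = mean(velocities[-window:])
    previous = mean(velocities[-2 * window:-window])
    if previous == 0:
        raise ValueError("previous window has zero velocity")
    return (recent - previous) / previous


def classify(trend, tolerance=0.05):
    """Label a trend; the 5% tolerance is an arbitrary illustrative threshold."""
    if trend > tolerance:
        return "improving"
    if trend < -tolerance:
        return "declining"
    return "plateau"


# Hypothetical story points completed per sprint, oldest first.
sprints = [21, 26, 30, 33, 34, 34, 35, 34, 33]
print(classify(velocity_trend(sprints)))  # prints "plateau"
```

Velocity is a crude signal on its own, so treat a "plateau" result as a prompt to look closer rather than as a verdict.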

Expect this to happen and be prepared to make changes, even force fundamental changes on the team. If the tools that you are using aren’t giving returns any more, then find new ones, or stop using them and see what happens.

Keep reviewing how the team is working, but do these reviews differently: review less often, make the reviews more focused on specific problems, involve different people from inside and outside of the team. Use problems or mistakes as an opportunity to shake things up and challenge the status quo. Dig deep using Root Cause Analysis, challenge the team’s way of thinking and working, and look for something better. Don’t settle for simple answers or incremental improvements.

Remember the 80/20 rule. Most of your problems will happen in the same small number of areas, from a small number of common causes. And most of your gains will come from a few initiatives.
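
If you have defect counts per area of the system, finding that "small number of areas" is a few lines of code. This is a minimal sketch: the module names and counts are hypothetical, and the 80% cut-off is just the rule of thumb, not a law.

```python
def pareto_hotspots(defects_by_area, share=0.8):
    """Return the smallest set of areas, taken in descending order of defect
    count, whose defects add up to at least `share` of the total."""
    total = sum(defects_by_area.values())
    if total == 0:
        return []
    hotspots, running = [], 0
    ranked = sorted(defects_by_area.items(), key=lambda kv: kv[1], reverse=True)
    for area, count in ranked:
        hotspots.append(area)
        running += count
        if running >= share * total:
            break
    return hotspots


# Hypothetical defect counts per module, e.g. exported from a bug tracker.
defects = {"billing": 42, "auth": 31, "reports": 9, "search": 6,
           "admin": 5, "export": 4, "help": 3}
print(pareto_hotspots(defects))  # prints ['billing', 'auth', 'reports']
```

In this made-up data, three of the seven modules account for 82 of the 100 defects, which is where reviews, refactoring and extra testing would get the highest return.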

Change the team’s driving focus and key metrics, set new bars. Use Lean methods and Lean Thinking to identify and eliminate bottlenecks, delays and inefficiencies. Look at the controls and tests and checks that you have added over time, question whether you still need them, or find steps and checks that can be combined or automated or simplified. Focus on reducing cycle time and eliminating waste until you have squeezed out what you can. Then change your focus to quality and eliminating bugs, or to simplifying the release and deployment pipeline, or some other new focus that will push the team to improve in a meaningful way. And keep doing this and pushing until you see the team slowing down and results declining. Then start again, and push the team to improve again along another dimension. Keep watching, keep changing, keep moving ahead.
