Building Real Software: Hardening Sprints. What are they? Do you need them?

For anyone who is developing software using Scrum, XP or another incremental development approach, the idea of a “hardening sprint” or a “release iteration” is bound to come up. But people disagree about what a “hardening sprint” should include, when you need to do one, and whether you should do them at all. There is a deep divide between people who recognize that spending some time on hardening is needed for many environments, and people who are adamant that allocating some time for hardening is a sign that you are doing some things – or everything – wrong.

Hardening to make sure that Done means Done

In a hardening sprint, the team stops focusing on delivering new features or architecture, and instead spends their time on stabilizing the system and getting it ready to be released.

For some people, hardening sprints are for completing testing and fixing work that couldn't be done – or didn't get done – earlier. This might include UAT or other final acceptance testing if this is built into a contract or governance model.

Mike Cohn recognizes that teams may need a “release sprint” at the end of each release cycle, because the team’s definition of “done” may not be enough – that a "potentially shippable product" and a system that is actually “shippable” or ready for production aren't the same thing. He suggests that after every 3-5 feature iterations, the team may want to schedule a release sprint to do work like expensive manual system and integration testing and extra reviews, whatever is needed to make sure that what they think is done, is actually done.

Anand Vishwanath, in “The end of regression, stabilisation, hardening or release sprints”, describes a common approach where teams schedule 1 or 2 stabilization sprints every 4-6 iterations to do regression testing and system testing in a staging environment, and then fix whatever bugs are found. As he points out, it’s hard to predict how much testing might be required and how long it will take to fix whatever problems are found, so the idea is to time box this work and then triage the results.

Because this can be an expensive and risky and stressful way to work, Vishwanath recommends following Continuous Delivery to build an automated test pipeline through to staging in order to catch as many problems as early as possible. This is a good idea, but most large projects, especially projects starting from a legacy code base, will still probably need some kind of hardening or integration testing phase at regular points regardless of what kind of continuous testing they are doing.

Some testing, like interoperability testing with other systems and operational testing, can’t be done effectively until later, when there is enough of a working system to do end-to-end testing, and some of this testing can only be done in staging (if you have a staging environment), or in production. For some systems, load testing, stress testing and soak testing also need to be left until later, because these teams don’t have access to a big enough test system to run high load scenarios before they get to production.

Is Hardening a sign that you aren't doing things right?

Not everyone thinks that scheduling a hardening sprint for testing and fixing like this is a good idea:

“[a hardening sprint] might take the cake for stupid things invented that has lead to institutionalized delusion and ‘Agile’ dysfunction.” Janelle Klein, Who Came up with the “Hardening Sprint”?

For many people, a hardening sprint or release sprint is a bad “process smell”: a sign that the team isn't working properly or thinking clearly:

“The problem with “hardening sprints” is that you are lying. You make believe your imaginary burndown during the initial sprints shows that you are approaching Done. But it’s a lie--you aren't getting any closer to being ready for Production until you begin your Test phase. You wrote a pile of code that you didn't test adequately. You don’t know how good it is, you don’t know how much work you have left to do, and you don’t know how much longer it will take, until you are deep into your Test phase.” Richard Kasperowski, Hardening sprints? Sorry, you’re not Agile

Ron Jeffries says that a hardening sprint for testing and fixing is a clear anti-pattern. I agree: if you need a separate sprint to fix bugs, then you’re doing something wrong. But that doesn't mean that you won’t need extra time to fix things before the system goes live – knowing that it is wrong doesn't make the bugs go away, you still have to fix them. As somebody else on this same discussion thread points out, there is a risk that your “definition of done” could fall short of what is actually needed by the customer, so you should plan for 1 or more hardening sprints before release, to double-check and stabilize things, just in case.

In these cases, the need for hardening sprints is a sign of a team’s immaturity (from a post by Paul Beavers):

A beginning agile team will prefer to schedule 6 hardening iterations after a 12 iteration development plan. This is “agile” to the hard core “waterfall guy”.
As time goes by, the team will mature a bit and you will see the seasoned agile team will shrink the number of required hardening iterations at the end, just because they understand they need to “fix” the high severity bugs as they go and QA understands they need to test closer and better early up in the release cycle.
Further down the road the team will notice that by adding a hardening iteration in the middle of the development cycle (and flushing out even lesser priority bugs earlier on in the process), it will help them to maintain cadence later on.
The final step of maturity is there when the team starts understanding “hardening is not required any more”, because they made fixing bugs part of their daily routines.

Hardening is whatever you need to do to Make the System Ready for Production

Another way of looking at hardening is that this is when you stop thinking about features and focus all of your time on the detailed steps of deploying, installing and configuring the system and making sure that everything is working from end to end. In a hardening sprint, your most important customers are operations and support, the people who are going to make sure that the system is running, rather than the end users.

For some teams, this kind of hardening can come as an ugly and expensive surprise, once they realize that what they need to do is take a working functional prototype and make it ready for the real world:

“All those things that got skipped in the first phase - error handling, monitoring, administration - need to get put into the product.” Catherine Powell, The "Hardening Myth"

But a hardening sprint can also be when you take care of what operations calls hardening: reviewing and preparing the production environment and securing the run-time, tightening up access to production data, double-checking system and application configs, making sure that auditing is enabled properly, wiring the system in to operations monitoring and metrics collection, checking system dependencies like platform software versions and patch levels (and making sure that all of the systems are consistent, that there aren't any snowflakes), completing final security reviews and other review and release gates, and making sure that the people installing and running the software have the correct instructions. This is also when you need to prepare your roll-back plan or recovery plan in case something bad happens with the release, and test your roll-back and recovery steps. Walk through and rehearse the release process and checklists, and make sure that everyone is prepared to roll out patches quickly after the release is done.
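One of the checks above, making sure that there aren't any snowflake servers, is easy to script. What follows is only a minimal sketch in Python, not a description of any particular tool: the host names, package names, versions and the find_snowflakes function are all hypothetical, and it assumes you have already collected an inventory of installed versions per host. It also assumes that the most common version of each package is the one you want, which is a simplification. A real check would more likely compare each host against a declared baseline.

```python
from collections import Counter


def find_snowflakes(inventory):
    """Return {host: {package: (found, expected)}} for every host whose
    version of a package differs from the most common version in the fleet.

    `inventory` maps host name -> {package name: version string}.
    A package that is missing on a host is reported as found=None.
    """
    packages = {pkg for versions in inventory.values() for pkg in versions}
    drift = {}
    for pkg in sorted(packages):
        # Assumption: the version installed on most hosts is the expected one.
        counts = Counter(versions.get(pkg) for versions in inventory.values())
        expected, _ = counts.most_common(1)[0]
        for host, versions in inventory.items():
            found = versions.get(pkg)
            if found != expected:
                drift.setdefault(host, {})[pkg] = (found, expected)
    return drift


# Hypothetical inventory, as it might be gathered from each server.
inventory = {
    "app-01": {"openssl": "3.0.13", "jdk": "17.0.10"},
    "app-02": {"openssl": "3.0.13", "jdk": "17.0.10"},
    "app-03": {"openssl": "3.0.11", "jdk": "17.0.10"},
}

for host, packages in find_snowflakes(inventory).items():
    for pkg, (found, expected) in packages.items():
        print(f"{host}: {pkg} is {found}, expected {expected}")
```

Run before every release, a check like this turns "are all of the systems consistent?" from a manual review item into something that can fail loudly and early.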

Hardening is something that you have to do

Some people see an obvious need for hardening sprints. For example, Dean Leffingwell includes hardening sprints in his “Scaled Agile Framework”, because there is some work that can only really be done in a final hardening phase:

Final exploratory and field testing
Checklist validation against release, QA and standards governance
Release signoffs if you need them
Ops documentation
Deployment package
Communicate release to everyone (hard to do in big companies)
Traceability etc for high assurance and regulatory compliance

Leffingwell makes it clear that hardening shouldn't include system integration, fixing high priority bugs, automating test scripts, user documentation, regression testing and code cleanup. There is other work that should be done earlier – but in the first year or so, will probably need to be done in a late hardening phase:

Cross-component integration, integration with third-party/customer
Integrated system-level testing
Final QA sign-offs
User doc finalization
Localization

Dan Rawsthorne explains that teams need at least one release sprint at first to get ready for release to production, because until you've actually done it, you don’t really know what you need to do. Release sprints include tasks like:

Exploratory testing to double check that key features are working properly
Stress testing/load testing/performance testing – testing that is expensive to setup and do
Interoperability testing with other production systems
Fix whatever comes out of this testing
Review and finish off any documentation
Train support and sales and customers on new features
Help with press releases and other marketing material

The Software Project Manager’s Bridge to Agility anticipates that teams will need at least a short hardening iteration before the system is ready for release, even if they frontload as much testing as possible. A release iteration is not a test-fix phase – it’s when you prepare for the release: capturing screenshots for marketing materials, final tests, small tweaks, finish documentation for whoever needs it, training. The authors suggest however that if some developers have any time left over in the release iteration, they can do some refactoring and other cleanup – which I think is bad advice, given that at this point you don’t want to be introducing any new variables or risks.

Disciplined Agile Delivery, a method that was developed by Scott Ambler at IBM to scale Agile practices to large organizations and large projects, includes a Transition Phase before each release to take care of:

Transition planning and coordination
End-of-lifecycle testing and fixing
Testing and rehearsing deployment
Data setup and migration
Pilots and beta testing (short UAT if necessary)
Reviewing and finalizing documentation
Preparing operations and support
Stakeholder training

This kind of transition can take almost no time, or it can take several weeks, depending on the situation.

Hardening – taking some time to make sure that the system is really ready to be released – can’t be avoided. The longer your release cycles, the further away development is from day-to-day production, the more hardening you need. Even if you've been doing disciplined testing and reviews in stream, you’re going to find some problems at the end. Even if you planned ahead for transition, you’re going to run into operational details that you didn't know about or didn't understand until the end.

When we first launched our platform from startup, we had to do hardening and stabilization work before going live to get the system ready, and some more work afterwards to deal with operational issues and requirements that we weren't prepared for. We included time at the end of subsequent releases for extra testing, deployment and roll back planning, and release coordination.

But as we shortened our release cycle, releasing less but more often, and as we built more fail-safes into the system, learned more about what we needed to do in ops, and invested more in simplifying and automating deployment and everything else that we could, we found that we didn't need any time outside of our regular iterations for hardening. We’re still doing hardening – but now this is part of the day-to-day job of building and releasing software.

8 comments:

Anonymous said...: I think that this concept - hardening sprint - probably will be used by organizations like "testing sprint", transforming it in a dysfunction of scrum (creating another mode of waterfall).; January 11, 2013 at 9:53 AM
Jim Bird said...: @Anonymous, I don't necessarily see a hardening sprint as a form of waterfall. Unless you're deploying every release to production, you need to schedule some time before deployment to take care of release-related issues. If this is the first time that you've released to production, or if it's been a long time because your production release cycles are several months, then you'll need to spend a good amount of time getting ready - so a hardening sprint, or more than one, is going to be needed.; January 11, 2013 at 10:47 AM
Rina Noronha said...: Hi Jim, my name's Rina, I'm the content manager at iMasters, one of the largest brazilian communities for developers. I'd like to talk to you about translating your articles to Portuguese and re-publishing them. Can you please contact me at rina.noronha@imasters.com.br? Thanks!; January 14, 2013 at 5:51 AM
Tommy Norman said...: Jim,
Love the article. I agree that a hardening Sprint is an anti-pattern but one that companies early in their Agile transition may have to implement. Some may always need some element of it such as a client I had with a huge amount of legacy code, very complex interactions between multiple systems, and a hardware element that would never allow for full automation. As long as you recognize why you are having to do it and have a plan to address as many of those underlying causes as you can, then I think it an acceptable (albeit hopefully temporary) practice. Those who paint it with a broad brush as a complete anti-pattern and say you should never do it usually have very flippant answers for what you should do instead. "You should automate all your tests!" Then I would like to borrow your test automation magic wand because that is not something that happens over night at most places.

Thanks for a very in-depth article that explored both sides of this hotly debated issue.; January 20, 2014 at 8:54 PM
RJ said...: Highly detailed and a useful article!

I find people who are Anal about NOT using Hardening (no pun) sprints either -
(a) Have never successfully delivered software themselves, or
(b) Are so far up there with the whole Agility elitism that they fail to understand that in the end - it's not about creating quality processes but delivering quality product and business value to PO and end-users.; February 20, 2014 at 11:15 PM
Alejandro Teruel said...: ¡Excellent article! Perhaps as the agile development team matures, it starts moving towards Devops -which can also help defuse the dangers of hazardous release sprints...; November 10, 2014 at 1:01 PM
Prashant said...: I think it should be used only in the scrum of scrum scenario, though I have seen a situation where as a vendor we were developing webservices and customers in house team was working on UI. However this team had most of the individuals on contract and therefore it was not possible to have them available throughout the release. Workaround was to have them work on the webservices after certain functionality is available. In this situation it was necessary to have iteration H. Any other suggestions would be welcome.; December 22, 2014 at 9:32 PM
Susmita said...: I agree with the Hardening sprint and it really is very dependent on the complexity of the product, where the final RC build is validated along with other artifacts, like licensing and covers a end to end testing from downloading to activation. A challenge which we face is in case of encryption changes the export compliance approval is time consuming and getting that clearance take more time than to release in production. That is a legal requirement. Looking forward for suggestions.; February 1, 2016 at 11:08 AM