Thursday, January 27, 2011

There's more to Managing Software Debt

If you work with a big system over time, you have to learn how to recognize and deal with technical debt, and how to keep debt under control. Like a married couple with 2 kids in college, paying off a mortgage and a couple of car loans, debt is a fact of life. You have to understand it, come to terms with it.

Debt can be a problem, but it is also a tool. As Steve McConnell explains in Managing Technical Debt, there are perfectly valid reasons to take debt on intentionally and strategically. Some things should be done, but they don’t need to be done right now, and as long as you understand the costs and risks involved, you can make a trade-off decision. But you will also take on debt unintentionally through honest mistakes (you didn’t or couldn’t know any better) and carelessness (you did know better, or you should have). This is inevitable.

There is a lot to successfully managing technical debt. I have to give props to Chris Sterling for trying to take this problem on in his book Managing Software Debt.

Sterling looks at different aspects of what he calls software debt:
  • Technical Debt: taking short cuts, folding under pressure, deferring work, duplication – the copy-and-paste problem

  • Quality Debt: not enough attention to quality, not enough tests, not taking responsibility for fixing bugs

  • Configuration Management Debt: mistakes and inefficient practices in version management, build, deployment and release practices

  • Design Debt: under-designing and over-designing – expect to get the design right the third time

  • Platform Experience Debt: not sharing information, over-specialization

I like the idea of broadening the discussion of debt from code and design considerations to include testing and configuration management, and team considerations as well.

Sterling is an Agile Coach and Trainer, so he predictably looks for answers to managing software debt problems in the context of Agile development. The book is 220 pages long, but only the first 30 pages take on software debt directly; the rest of the book is about how to follow good practices in Agile development, and how this will help you minimize the amount of debt that you take on. It’s more of a primer on Agile development management and design than an analysis of software debt.

If you are building new software and want to take a responsible approach to managing debt for the future, and if you want to know more about how Agile practices and ideas can help you do this, then you may want to read this book. If you want to just get an overview of the book, most of the ideas are introduced here.

I wanted more than a review of good Agile practices. I want more on how to responsibly deal with debt. I want to know more about how to measure and assess the direct and indirect costs of debt, and how to recognize when a team has taken on too much – how to qualify and quantify it. And I want to understand more about how and when to make the case for paying off debt.

Technical debt is about making tradeoffs in cost and risk. I expected more numbers, more analysis on the cost of decisions, the time value of money. And I also expected more war stories, to see more evidence from real life. Everyone is dealing with software debt. There must be more success stories and stories of failures, concrete examples to learn from, more data to chew on and spit out. Instead, there are a couple of short case studies, and a handful of dubious graphs that attempt to quantify the cost of debt, and the return on paying debt back. But on closer review these aren’t graphs, they are pictures: there’s no data to back them up.

In the beginning, Sterling introduces a case study of a company that appears to have run into a wall. They are maintaining a 15-year-old C application, and while they are “doing everything right”, they are facing serious quality problems and can’t meet their delivery commitments. Sterling implies that they have taken on too much technical debt, and now they have to pay the price. But the case isn’t convincing. They are certainly doing something wrong. Yes their code base sounds like it is a mess, the kind of mess you might expect in a big 15-year-old C application, so it’s clear that they have some technical debt problems. But it’s also clear that their planning and technical practices, and their risk management controls aren’t good enough either – or they wouldn’t be in the mess that they are, and they wouldn’t be surprised by it.

There’s not enough to this story to make the case. And the story is left unfinished. The team had a problem and asked for some training. What happened next? Did they find out how much technical debt was holding them back, where the costs were coming from? Did they find a way to pay down their debt and get back on track? What action did they take, what worked, what didn’t? That’s what this book should have been about.

There’s another story later on about a team that had too many open issues (thousands of them), and were now having to spend too much on fixing bugs. Again, not a lot of context, but the lesson was clearer this time – leaving thousands of bugs and problems open for years is going to bite you in the ass eventually. That’s obvious. But there’s not enough here to draw good conclusions from or learn much from.

My other concern is that Sterling’s analysis focuses on a narrow part of the debt problem: preventing yourself from taking on too much unintentional debt by following good practices and disciplined development. It doesn’t explore how to make intentional cost/risk trade-offs, how to use debt tactically or strategically. It doesn’t help you deal with debt that you have created, or inherited. It doesn’t provide guidance on how to pay debt back, where to start, where you can get the most return, or the fastest return.

There’s more to managing software debt than good clean living, even good clean Agile living. Than doing it right the third time. If you are dealing with technical debt problems, if you have inherited a lot of debt and need to find a way to pay it off, if you want to understand what to do with debt and where to start paying it off, you will get a lot more out of Michael Feathers’ book Working Effectively with Legacy Code.

Monday, January 24, 2011

SWEBOK V3 and Software Security

Steve Tockey at Construx Software gave me some good news recently: the new revision of the IEEE’s Software Engineering Body of Knowledge (SWEBOK) will include software security as a fundamental concern in software engineering.

The SWEBOK is intended to document a common understanding of software engineering, and to act as a map to everything that anybody who designs, builds and tests software should know and understand. This is part of the IEEE’s attempt to establish basic accreditation for software engineers: a Certified Software Development Professional (CSDP) designation, similar to certified project managers (PMP) and certified IT security professionals (CISSP) and so on.

The CSDP has not gained much traction in the software development community, although the accreditation initiative has the support of companies like Boeing and Lockheed Martin. Most developers that I know haven't heard of the SWEBOK and don't know about CSDP certification. The only real-life CSDPs I have met have been instructors from Construx, which also sponsored some of the SWEBOK work.

But the SWEBOK has become a reference for some university programs, and a handful of universities now offer entry-level CSDA certification training as part of their software engineering programs - similar to PMI's CAPM associate certification. So the SWEBOK has the potential to influence future software development.

The security updates in V3 of the SWEBOK look like they will wire security into requirements, design, construction, testing, maintenance, configuration management, software engineering management and processes, tools and methods, and software quality. Everywhere really. This is exactly right.

We have to be realistic. It will take a while for these and other changes to be made and reviewed and approved, and for the new SWEBOK to be published. (It looks like the latest revision is already some months behind schedule). And it will take years after that before the new version will be adopted. Adding software security in the SWEBOK isn’t going to change how people design and build software in the real world soon. But I am still glad to see it. It’s a small step, in the right direction.

Thursday, January 13, 2011

What I like (and don't like) about DevOps

I’ve spent a lot of time in my career working on problems that cross the lines between development and operations. That’s why I am interested in the emerging DevOps community: a bunch of smart people who are trying to bring development and operations closer together, and applying ideas and practices from Agile development to improving system and application operations. I try to keep up as much as I can with what DevOps is about and what people are doing and thinking, and to understand what it means to me. I went to Velocity last year, I’ve been reading the DevOps blogs and following the DevOps discussion group, and I read and enjoyed the Web Operations book.

So, from a software development manager and CTO perspective, this is what I like (and don’t like) about DevOps so far.

What I like

The DevOps community is real: it's about people solving real problems, hard problems that a lot of us face in managing large systems. It's about people working together to come up with new ideas on how to do a better job. It's about taking ownership of problems and the solutions to these problems.

It is not an idea that has been crafted outside and that is being sold to people because it is good for them – not like Rugged Software for example. DevOps has authentic buy-in, it is being built from the ground up, and I can see things actually happening.

And there is no right or wrong, there’s no Manifesto (please please keep it that way), there’s no fundamentalism or religious wars – so far at least. DevOps is still in that early place and time where things are fluid and searching. People can disagree, and as long as they are smart and pragmatic and fair, they are listened to. Nobody is worried about contradicting one of the Creators (unlike in the Scrum community, or the software Kanban community). Most of the people involved in DevOps today seem to be practical and open minded, and focused on finding answers. I hope it stays this way, forever and ever and ever.

DevOps has strong support from vendors. There’s configuration management and deployment automation technology like Puppet and Chef of course, and companies selling Continuous Delivery services and technology, the stuff needed for Infrastructure as Code, for effectively integrating application and infrastructure changes. And there is an important need in DevOps for highly scalable but lightweight data management technology, network management and monitoring platforms, tools to collect and analyze system and application metrics and logs. And then there's the Cloud.

Yes, there has to be a role for vendors: somebody has to act as a sponsor, to help sustain momentum, and people need technology to solve DevOps problems, to make deployment and management and operations of large-scale systems possible. But DevOps is not vendor-led, or consultant-led. Big Blue and HP haven’t figured out what DevOps means to them, or how to turn it to their advantage. Most of the technology is Open Source, making it easier for the community to get involved in solving their own problems.

The people in DevOps who are worth listening to are techies or hands-on managers who have real day-to-day responsibilities and solve real problems and who are sharing what has worked for them, and who are actively trying to find ways to do their jobs better and are interested in answers. Most of the thinking and writing in DevOps is being done by the doers, not by people trying to sell something, trying to convince you that you have a problem and that you need their help to solve it.

DevOps is about solving some interesting and challenging problems, in some cases extreme problems. I don’t have to manage the technology for a platform with thousands or tens of thousands of servers or millions of online customers, but I can learn from people who do. It’s fun and it’s worth paying attention to.

What I don’t like

There’s a lot of enthusiasm in the DevOps community, people excited because they are onto something that is working, creating new tools, finding new solutions to problems. There are times when those of us who have been around for a few years recognize old ideas, even when they’ve been given cool new names. Listening to someone excitedly describing “dark launching” and “feature flippers” and pretending that this is a new idea is tiresome… But you can get over it, you can still enjoy the enthusiasm and wait to see if there are any new lessons to be learned.

Not surprisingly, there is much talk by DevOpsers about Values and Culture and Collaboration and Trust and other Capitalized Important People Stuff. This is all necessary when you are trying to get people to work together in new ways, but it has the potential to slide into pop psychology and New Age philosophizing, like too much of what passes for thinking among consultants and "coaches" in the Agile development community today. Hopefully, DevOpsers will stick to solving problems and getting things done in an open way, and not wander down this path.

DevOps hasn’t come to terms with ITIL, and it needs to. ITIL is used as a straw man by some of the DevOps thinkers, in the way that Waterfall development is used by some Agile development enthusiasts: as an outmoded way of thinking and working, a collection of antipatterns. This naive approach is not fair or practical, it alienates a large community of people working in enterprise organizations and government IT organizations who have adopted more structured frameworks to get their work under control. There’s a lot to like in DevOps, but there’s a lot that DevOps can learn and has to learn from ITIL as well.

More fundamentally concerning to me is that DevOps has failed to take responsibility for security. They don’t seem to understand or care about what’s involved in deploying and operating secure systems, or if they do, they don’t explain how it fits. This will limit DevOps adoption, because most operations and development teams who want to take advantage of the ideas and practices and technologies coming out of DevOps, to become more efficient and more agile, also need to do this in a secure way. And this is reflected in the ongoing security problems that some of the organizations that are pioneering DevOps ideas continue to face. Would you trust your email to Facebook, would you really?

There are a lot of interesting ideas coming out of large-scale online social media companies like Facebook and Twitter, and Flickr and Etsy, and some of the online multi-player Internet game platforms. But there’s a gap in trying to apply their lessons to the problems and requirements that the rest of us deal with: smaller-scale, but just as hard, just as real. Secure design and deployment and operations, all kinds of compliance and auditing, real data privacy, real transaction reliability, B2B integration, protecting compatibility with customers and legacy systems. Ideas taken to extremes like Continuous Deployment are exciting and challenging, they force you to think about how you can get better. But they aren’t ready for the enterprise yet.

DevOps doesn’t represent or meet the needs and priorities of most of us working on enterprise systems. But that doesn’t mean that they shouldn’t keep going. And that doesn’t mean that the rest of us shouldn’t keep following them – and trying to keep up.

Saturday, January 8, 2011

Software Security and the Long Tail challenge

Secure and Resilient Software Development, a new book by Mark Merkow and Laksh Raghavan, explains the problems that developers face in designing and building secure software, and where to find resources to help deal with these problems.

The book covers basic principles of secure software, and how to include security in analysis and design and coding and testing. It does a good job of mapping security problems and requirements to available solutions: where you can find training and certification, commercial software and services, and open source tools and checklists, especially tools and checklists from OWASP. It walks through the OWASP Top 10 risk list in detail, and introduces how to use OWASP’s open source Enterprise Security API (ESAPI) library to build secure apps.

The authors lay out a high-level secure SDLC explaining where checks and controls are needed. Like most secure SDLCs, it’s serial, waterfall-based, with gates between analysis and design and coding and so on. There’s no guidance on how to adapt these ideas to Agile methods like Scrum, or to maintenance, but they point to another (inactive) OWASP project CLASP which can help you integrate security controls into the way that you build software.

The book also covers publicly-available software security maturity models: BSIMM and OWASP’s SAMM, used to assess an organization’s software security program and identify gaps and weaknesses. I’ve used SAMM before with my own team as a management scorecard: it helped me structure my thinking and planning on how to improve our software security program. BSIMM is a much bigger beast, intended for enterprise-scale software security programs.

The Enterprise

And there’s the rub. Maturity assessment frameworks, and review-heavy waterfall-based secure SDLCs, are targeted to big companies - like the expensive commercial security testing and checking tools that are also covered in this book. Too much of software security is about the enterprise. There is a lot of money being spent (and a lot of money to be made) here, and big companies have more people and time to invest in software security work as part of their broader risk-management and compliance programs. Software security has made the big time, which is why IBM and HP have bought their way in, and why the remaining independent technology vendors are tooling up for the enterprise (and raising their prices).

Enterprise customers are an important market and this market demands and deserves to be well-served.

The Long Tail

But it’s important to remember the Long Tail.

Sure, a lot of software is built and used by enterprises. But most application software developers will spend their careers working in small software companies or in internal IT groups in small and medium-sized businesses. Small teams like these are looking for simpler and easier ways to build software cheaper and faster and better. Because they can, and because they have to. Agile methods to break projects down into 1-week or 2-week chunks and deliver features to the customer faster. Open source tools and frameworks that work. Practices like unit testing and refactoring that help developers get control over development. Continuous integration, and continuous delivery, even continuous deployment. Ideas applied from Lean Manufacturing to strip out waste. Whatever is practical, whatever helps them to get the job done.

And these ideas and approaches are making it into the enterprise. Teams in enterprise businesses, and in the enterprise software vendors and offshore companies that service big businesses, are adopting lighter-weight project management and development practices because they help them cut costs and deliver faster, and because they work.

There aren’t many small teams that are going to bother with a software security maturity assessment framework, or understand it. BSIMM has 109 different measurement areas: that's around 100 more than most small teams would be able to deal with. And asking small teams who follow Agile practices to measure themselves against a comprehensive process model and work their way up a maturity ladder doesn’t compute. Agile methods were created as a reaction against this kind of thinking and the costs and inefficiencies involved.

It’s important and necessary to find solutions and ideas that will help smaller, faster-moving software organizations – and bigger organizations that are also trying to find ways to move faster. We need practices and ideas that fit with the way these teams work. And that’s not going to be easy.

At OWASP’s AppSec USA 2010 conference, Adrian Lane critically reviewed the security challenges that Agile teams face and the security problems that they create, in his presentation Agile + Security = Fail. As is clear from the title, there are more problems than solutions. Agile teams move too fast for heavyweight security models, they don’t leave time for exhaustive systemic tests and reviews, they don’t have time to do everything right the first time – and they are trying to move even faster.

OWASP tools like ESAPI and the Top 10, and OWASP's secure development and testing guides and secure coding checklists are important and useful resources for developers trying to build secure software. It's good to have a book that points out where to get help like this. And Microsoft's SDL-Agile shows how a company with Microsoft-level capabilities and resources can build secure software using Agile methods.

But we need more. We need more tools and ideas that work for smaller development and maintenance teams, that we can fit in with lightweight methods and that work under pressure. We need patterns and practices that are simple to understand, easy to follow and practical. We need improved languages and frameworks that make it harder to write bad code; and simple and affordable tools that make it easier to find and fix vulnerabilities, because people will still write bad code. And we need the leaders of the development community and the Open Source communities to take security seriously and include security in the way that we build and deliver systems, and in the Open Source code that we share and rely on. We need more solutions that small teams can understand, can afford, and will use.

Because the challenges and problems of writing secure software are both bigger than, and smaller than, the enterprise.