Technical Debt - What it is and what to do about it

Durer's Usury In spite of all that's been written on the subject of technical debt, it's still a common occurrence to see it defined as simply "bad code". Likewise, it's still common to see the solution offered being "stop writing bad code". Technical debt encompasses much more than that simplistic definition, so while "stop writing bad code" is good advice, it's wholly insufficient to deal with the subject.

Steve McConnell's definition is much more comprehensive (and, in my opinion, closer to the mark):

A design or construction approach that's expedient in the short term but that creates a technical context in which the same work will cost more to do later than it would cost to do now (including increased cost over time)

While it's a better definition, I'd differ with it in three ways. Technical debt may not only incur costs due to rework of the original item, but also by making more difficult changes that are dependent on the original item. Technical debt may also end up costing nothing extra over time (due to a risk not materializing or because the feature associated with the debt is eliminated). Lastly, it should be noted that the cost of technical debt can extend beyond just effort by also affecting customer satisfaction.

In short, I define technical debt as any technical deficit that involves a risk of greater cost and/or end user dissatisfaction.

This definition encompasses debts that are taken on deliberately and rationally, those that are taken on impulsively, and those that are taken on unconsciously.

Code that is brittle, redundant, unnecessary, unclear, insecure, and/or untested is, of course, a type of technical debt. Although Bob Martin argues otherwise, the risk of costs to be paid clearly makes it so. Likewise, aspects of design can be considered technical debt, whether in the form of poor decisions, intentional shortcuts, decisions deferred too long, or architectural "drift" (losing design coherence via new features being added using new technologies/techniques without bringing older components up to date, or failing to evolve the system as the needs of the business change). Deferring bug fixes is a form of technical debt as is deferring automation of recurring support tasks. Dependencies can be a source of technical debt, both in regard to debt they carry and in terms of their fitness to your purpose. The platform that hosts your application is yet another potential source of technical debt if not maintained.

As noted above, the "interest" on technical debt can manifest as the need for rework and/or more effort in implementing changes over time. This increase in effort can come through the proliferation of code to compensate for the effects of unresolved debt or even just through increased time to comprehend the existing code base prior to attempting a change. As Ruth Malan has noted, strategy may drive architecture, but once the initial architecture is in place, it serves to both enable and constrain the strategy of the system going forward (strategies requiring major architectural changes typically must offer extremely high ROI to get approval). Time spent on manual maintenance tasks (e.g. running scripts to add new reference values) can also be a form of interest, considering that time spent there is time that could be spent on other tasks.

Costs associated with technical debt are not always a gradual payback over time as with an ordinary loan. Some can be like a debt to the mob: "they come at night, pointing a gun to your head, and they want their money NOW". Security issues are a prime example of this type of debt. Obviously, debts that carry the danger of coming due with little or no notice should be considered too risky to take on.

Having proposed a definition for the term "technical debt" and identified the risks that it entails, it remains to discuss what to do about it. The first step is to recognize it when it's incurred (or as soon as possible thereafter). For debt taken on deliberately, recognition should be trivial going forward. Recognition of existing debt in an established system may require discovery if it has not been cataloged previously. Debt that has been taken on unconsciously will always require effort to discover. In all cases, the goal is to maintain a technical debt backlog that is as comprehensive as possible. Maintaining this backlog provides insight into both the current state of the system and can inform risk assessments for future decisions.

Becoming aware of existing debt is a critical first step, but is insufficient in itself. Taking steps to actively manage the system's debt portfolio is essential. The first step should be to stop unconsciously taking on new debt. Latent debt tends to fit into the immediate, unexpected payback model mentioned above. Likewise, steps taken to improve the quality up front (unit testing, code review, static analysis, process changes, etc.) should reduce the effort needed for detection and remediation on the back end. Architectural and design practices should also be examined. Too little design can be as counter-productive as too much. Striking the right balance can yield savings over the life of the application.

Deciding whether or not to take on intentional technical debt is less black and white. Often this type of debt is taken on for rational reasons. An example of this is what Ruth Malan characterizes as "...trading technical learning and code design improvement for market learning (getting features into user hands to improve them)". Other times, the balance between risk and reward (whether time to market or budget) may tilt in favor of taking on a debt. When this is the case, it is critical that the owner(s) of the system make the decision in possession of the best possible information you can provide. An impulsive decision taken on the basis of "feel" rather than information will likely carry more risk.

Retiring old debt should be the final link in the chain. Just as the taking on of new debt should be done in a rational manner, so should the retirement of old debt. Not all debt carries the same risk/reward ratio and efforts that carry more bang for the buck will be an easier sell. Although some may disagree, I firmly believe that better outcomes will result from making those who own the system active partners in its development and evolution.

It's highly unlikely that a system will be free of technical debt. Perversely, being free of such debt could actually be a liability. That being said, there is a world of difference between the two poles of debt-free and technical anarchy. Effort spent to rationally manage a system's debt load will free up time to be put to better use.

Re-posted from Form Follows Function