"Hence the paradox: how is
it that a team of brilliant senior engineers need 6 months to clean up
after that one early programmer’s weekend kludge job?"
I don't think
things happen quite like that - what I have seen happen is that a
"weekend kludge job" becomes the foundation for a lot of subsequent work
and sets the tone for it, so after a year you can have a vital
production system that has been extended and "improved" in a completely
ad-hoc fashion, and it's that which takes a lot of effort to reverse
engineer and structure properly.
In my experience it is really
difficult to raise the quality level on a project once it gets going -
things generally only get worse as external factors start to have an impact
(timescales, scope creep etc.). The only way things seem to keep high
quality is to consciously start high and fight to keep standards high -
which can be pretty difficult.
I see technical debt
differently. Technical debt is something you take on in exchange for
moving faster. Maybe you skipped writing unit tests. Maybe you didn't
respond to all the style feedback, or even do a code review. But at the
time you make the choice, you should know that you're borrowing time
from your future self in order to move faster now.
Like real-life
monetary debt, technical debt has two aspects: the principal and the
interest. Sometimes the interest ends up being variable rate and
backbreaking for your project - you should have written those tests up
front and saved yourself a lot of time. Sometimes the whole project dies
before you pay back the debt - you go bankrupt but it's no big deal,
your creditors are friendly. In either case, the principal usually isn't
smaller when you go to address it. You're just delaying payment in
favor of speed.
What the original article addresses are several
aspects that I might call "technical assets" that have depreciated
significantly. That kludge, if it wasn't done intentionally, isn't debt -
it's now an asset. If you didn't have a plan to pay it off in the
future, you've already bought it and now you need to replace it. And the
replacement is very expensive. That is a technical investment you're
deciding to make. You aren't just paying off debt at that point, it's a
whole new capital expense.
As an engineer I understand
and agree with what the author is saying: technical debt is a loaded
term and it feels wrong to throw it around liberally to cover things
that should have discrete and distinct meanings. However: the term
"technical debt" is not for engineer-to-engineer communication: it is
for communicating to management and other non-technical actors - and in
that role the term perfectly suits its purpose.
To understand why,
let's consider a normal interaction with a corporate legal team. The
content team submits some copy to the legal team for approval and the
legal team needs to communicate to the content team that they can't make
the claim they are trying to make about the product. Now, it's more
true in the technical sense for legal to say things like "Section 3
paragraph 7 of the Fair Sales Act states that Consumer Entities as
defined in Section 1 Paragraph 3 may not represent their products in
such a way as to potentially mislead a Purchaser as defined in Section 2
paragraph 9. Furthermore this issue has been thoroughly litigated at
both the federal appellate and Supreme Court levels which have
reinforced those interpretations unfavorable to the aforementioned
Consumer Entity." But that wouldn't be great for either department.
It's more useful for Legal to just say "You can't claim that - we'll get
sued." In a similar way, it's more useful as a software engineer to
just say "that decision will add to our technical debt" than to try and
discuss the minutiae of how a certain bad architecture decision will
make the system "resistant to change". As a fellow engineer, I'd rather
have the latter conversation with you and the one about "technical
debt" with my business manager.
I feel like this post
misses the point of the whole "technical debt" analogy. The way the
author talks, it seems like their org. uses it as a catch-all term for
complaining about the code base. But that's never been the point of the
term.
"Technical debt" is the justification for process and
development decisions that require more time up-front but will save time
in the long run (especially as a term to communicate this to
non-technical management). It only comes into play when you're talking
relatively about two or more different choices.
Just as an
example: when deciding to move ahead with less testing or to move more
slowly and write unit tests for every function, you should consider the
first to come with more technical debt. That doesn't mean it's the wrong
decision, it just means that it should be a factor in your thought
process.
If
you look at a practice like unit testing, you tend to find some
organizations are having success with it and others not. Some of them
see it as a "waste of time" and others see it as a "timesaver". How the
practices are implemented is as important as which practices you decide
to use. If the unit tests for your project take 3 minutes to run you
may have a problem. If you are writing a large number of hard to
maintain mock objects you may have a problem. If the program is not
divided into units that are easy to test that is a different problem.
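To make the "units that are easy to test" point concrete, here is a minimal sketch in Python. The function name and the discount rule are made up for illustration; the point is only that a calculation kept free of I/O can be tested in milliseconds without a single mock object.

```python
def apply_discount(subtotal_cents, percent):
    """Pure calculation: no database, network, or clock involved.

    Hypothetical business rule, used only to illustrate the idea.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return subtotal_cents - (subtotal_cents * percent) // 100


def test_ten_percent_off():
    assert apply_discount(2000, 10) == 1800


def test_rejects_invalid_percent():
    try:
        apply_discount(2000, 150)
    except ValueError:
        return
    raise AssertionError("expected ValueError for percent > 100")


if __name__ == "__main__":
    # Plain functions, so they also run unchanged under a runner like pytest.
    test_ten_percent_off()
    test_rejects_invalid_percent()
    print("all tests passed")
```

If the same rule were buried inside a function that also loads the order from a database and sends an email, the test would need mocks for both, which is usually where the "waste of time" complaints start.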
Another
good example is the use of Vagrant. It is very possible to have
Vagrant spinning like a top, but in a lot of places you hear software
management grumbling that a team of 3 people wasted 2 weeks screwing
around with it.
All too often I see developer discussions
center around "cargo cult" positions (e.g. "singletons are evil") and
not around a real dialog about the work to be done and how to do it.
There are a lot of ways to
parse technical debt. The simplest one I've found, while it has its
shortcomings, is this from a Deloitte article:
They state that, in general, technical debt costs $3.61 per line of code.
"Technical
debt is a way to understand the cost of code quality and the impacts of
architectural issues. For IT to help drive business innovation,
managing technical debt is a necessity. Legacy systems can constrain
growth because they may not scale; because they may not be extensible
into new scenarios like mobile or analytics; or because underlying
performance and reliability issues may put the business at risk." (Tech
Trends 2014, Deloitte University Press).
It's an easy to use
metric to weigh the cost of supporting programs, and relatively simple
for managers to understand. It has the added benefit of encouraging
reducing the size of the code-base when possible.
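As a rough sketch of how that per-line figure might be used to rank modules, assuming the $3.61 rate quoted above applies uniformly (the module names and line counts below are invented):

```python
# Per-line rate as quoted from the Deloitte article in this thread.
DEBT_PER_LINE_USD = 3.61


def estimated_debt(lines_of_code, rate=DEBT_PER_LINE_USD):
    """Return a rough technical-debt estimate in dollars."""
    if lines_of_code < 0:
        raise ValueError("lines_of_code must be non-negative")
    return round(lines_of_code * rate, 2)


# Hypothetical modules and sizes, for illustration only.
modules = {"billing": 42_000, "reports": 8_500, "auth": 1_200}

# Rank modules by estimated debt, largest first.
ranked = sorted(modules, key=lambda name: estimated_debt(modules[name]), reverse=True)

if __name__ == "__main__":
    for name in ranked:
        loc = modules[name]
        print(f"{name:10s} {loc:>7,d} lines  ~${estimated_debt(loc):,.2f}")
```

This is only a multiplication, which is the whole appeal: it gives a crude ordering, not a real dollar figure for any one module.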
Well the problem with that analysis is, what qualifies as a "line" of code?
An
efficient algorithm may be punished over an inefficient one. An
optimal one-liner can certainly cost a lot to develop (maybe you need a
smart engineer, a lot of testing and a lot of time to figure it out) yet
it would appear to be cheap.
Then there's just textual
differences. It depends on the programming language. And it can be
different even in the same language...
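To illustrate the "different even in the same language" point, here is a small Python sketch. The two snippets compute the same total, yet their line counts differ several times over, and the count also changes depending on whether blanks and comments are included. The counting rules here are one arbitrary choice among many.

```python
def physical_lines(source):
    """Count every line, including blanks and comments."""
    return len(source.splitlines())


def code_lines(source):
    """Count lines that are neither blank nor pure '#' comments."""
    stripped = (line.strip() for line in source.splitlines())
    return sum(1 for line in stripped if line and not line.startswith("#"))


def run(snippet, values):
    """Execute a snippet against `values` and return the `total` it computes."""
    namespace = {"values": values}
    exec(snippet, namespace)
    return namespace["total"]


COMPACT = "total = sum(x * x for x in values if x % 2 == 0)\n"

VERBOSE = """\
# Sum the squares of the even values.
total = 0

for x in values:
    if x % 2 == 0:
        total += x * x
"""

if __name__ == "__main__":
    for label, snippet in (("compact", COMPACT), ("verbose", VERBOSE)):
        print(label, physical_lines(snippet), code_lines(snippet),
              run(snippet, [1, 2, 3, 4]))
```

At a flat per-line rate, the verbose version would carry several times the "debt" of the compact one for identical behavior.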
You are pointing out that "lines of code" is a terribly imprecise metric. That is TRUE, but not HELPFUL. We all know
that "lines of code" is a terrible metric. And for only $100/hr, for a
few hours you can hire an expert to evaluate one file from your codebase
and determine its size in something more reliable -- "function points"
or some other system that you devise.
The thing about "lines of code"
is that it's CHEAP and EASY. It is also WAY better than having nothing.
Which system is more complex, system A written in C++ 2003 or system B
which was written in Java in 2013? Which system is more complex, System C
which is 200,000 lines or system D which is 10,000 lines? Even if I
don't tell you how "lines" were defined in those two estimates, you can
still tell more about C and D than about A and B.
"Lines of code"
is a terrible metric, which can give only order of magnitude estimates.
But that makes it enormously better than no metric at all, or a complex
metric which we haven't actually measured.
The idea is to encourage
the best behavior. A bad measurement encourages developers to focus on
the measurement instead of the real problem, and in this respect it is worse than nothing at all.
A shorter function is a great side effect but it should never be the point.
I want someone who can do things like: find obsolete code, create a
better algorithm, or decide on a better language for the task. If I say
that a system is bad because it's 200,000 lines, I may discover that
developers are really good at removing comments and obfuscating code to
"improve" it down to 150,000 lines. If I come up with specific
performance improvements, and state clear expectations such as "find
functions we never use", I end up with a more maintainable system.
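A goal like "find functions we never use" can be partly automated. Below is a minimal sketch using Python's standard `ast` module, assuming a single-module codebase. It only reports functions whose names never appear anywhere else in the same source, so it will miss dynamic calls (`getattr`, string dispatch) and anything used only from other modules. Treat the output as candidates to review, not a deletion list.

```python
import ast


def unused_functions(source):
    """Return names of functions defined in `source` but never referenced in it."""
    tree = ast.parse(source)

    # Every function or async function defined anywhere in the module.
    defined = {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }

    # Every bare name and attribute name that is mentioned anywhere.
    referenced = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            referenced.add(node.id)
        elif isinstance(node, ast.Attribute):
            referenced.add(node.attr)

    return defined - referenced


# Made-up sample module, for illustration only.
SAMPLE = """
def used():
    return 1

def never_called():
    return 2

print(used())
"""

if __name__ == "__main__":
    print(sorted(unused_functions(SAMPLE)))
```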
Let me be
clear: I will defend "lines of code" as a metric useful for getting an
order-of-magnitude estimate of the complexity of any given codebase.
Anyone using it as a metric to determine how much to reward developers
is just plain stupid.
Ideally,
you want minimal technical debt. Any code you develop should aim for
minimizing technical debt. Technical debt is not about evaluating the
value of a line of code or program, but about its potential maintenance
cost.
I work at a large organization that maintains over 300
modules of code and custom software programs, varying in size from a
thousand lines of code to massive ones. Any one of them can be sold at
any time, all of them are supported in some fashion or another, and
all of them should be able to work together.
You need some sort
of metric to see where to invest one's time. If you don't manage the
technical debt, everyone spends all their time doing tech support.
So,
how is the concept of technical debt helpful in evaluating where to put
effort into improving the code base, and in evaluating the potential
support cost of adding more modules? Lines of code, while flawed,
provides a pretty good starting point.
As I mentioned in another
comment, you don't want developers focusing on lines of code. The
directions have to be much clearer, such as the goals of finding unused
code or improving performance.
If resources are handed out based on
how bloated a project is, you would quickly discover how creative
developers can be in increasing the relative "importance" of their code.
Systems
are composed of lots of things, and sometimes a tiny piece is the most
complex and critical. A measure of lines of code should not prevent you
from adding 3 more people to a tiny project, and removing a person from
a team with a giant code base.
(This is not disagreement with the original post, but further musings.)
One of the other problems with the technical debt idea is, debt relative to what?
Since I think most people do not consciously instantiate an answer to
that question, I think most of us non-purposefully choose by default a
comparison to an idealized perfect code base that is somehow perfectly
correctly factored, yet also as fast as completely optimized code,
completely documented without being overdocumented, simultaneously
optimized for all possible future changes, and despite not existing at
the moment, also perfectly understood by the original team such that
they can do anything they like to it, and such understanding will
survive any such refactoring, all accomplished in exactly the same
amount of time that it took to write the terribad code that we actually
have.
OK, but that was never on the table, though. What I
described isn't even possible to manifest, and even if you back it down
to what is at least sort of possible, you don't have the resources to
manifest it before you simply run out due to lack of customers. You're
not really taking on "debt" if you fail to manifest that, any more than
you are taking on "debt" if you fail to make every possible dollar you
could if you perfectly correctly harnessed your skills and made the
perfect deals.
I think what a lot of people call "debt" is simply
the natural state of code. Debt is something somewhat more exceptional
than that. There is a core idea of value there, because there definitely
is an operation where we deliberately make a short-term choice at the
cost of long-term pain, but it requires more thought than I think has
traditionally been given to it. The baseline needs to be more carefully
defined.
As I said at the top, I'm just musing here; I don't have a ready-to-go definition of "baseline quality" here.
>OK, but that was never
on the table, though. What I described isn't even possible to manifest,
and even if you back it down to what is at least sort of possible, you
don't have the resources to manifest it before you simply run out due to
lack of customers. You're not really taking on "debt" if you fail to
manifest that, any more than you are taking on "debt" if you fail to
make every possible dollar you could if you perfectly correctly
harnessed your skills and made the perfect deals.
I don't think that's such a problem, though. The "perfect" codebase should be the baseline to which you compare your current debt.
Just
like in real life, you take on some debt to have a basic standard of
living. You might have a mortgage or maybe a car loan or some minor
credit card debt, and you're normally always working on paying a little
bit of that off. In a perfect world you'd want no debt at all, just like
in a perfect codebase. But, a small amount is natural. It's only when
it gets unruly and unmanageable that it becomes a problem.
"I don't think that's such a
problem, though. The "perfect" codebase should be the baseline to which
you compare your current debt."
What I described isn't perfect; it's
impossible. It destroys the utility of the idea of technical debt if
that's the baseline, because it means that all choices are between
"really bad" and "really bad".
"Aside, the first and last lines of your comment reminded of this post:"
Oh,
certainly. But people do tend to assume that all comments that are not
completely complimentary are intrinsically disagreement, and, well,
let's be honest... statistically, it's true, so it's hard to be too
annoyed that people's brains make that inference by default.
I assumed that "technical
debt" was an analogy to help management people understand that they
might have to allot some time to activities other than getting a program
to run for the first time.
I'm currently on a
month-long crusade against what initially was a single small poor code
choice (a denormalized model in Rails) from 4 years ago.
As it stands,
over 5% of the codebase has been deleted so far without reducing
functionality. One could argue that a significant part of the
application was just hacks to get around the original hack, and hacks
to get around those, etc.
Not sure how to break that down into the
categories provided, but given that the problems add up to proportions
unimaginable to anybody but the IRS, debt as an analogy works well enough
for me.
Technical debt is a very
artificial concept. With real financial debt one knows immediately the
exact dollar amount, the interest rate, the payment terms and so forth.
There is usually no ambiguity or judgment. In contrast, technical debt
is based on theories about software development and uncertain
predictions about future changes to the software. As the old saying
goes: "prediction is hard, especially about the future."
As a
business, if I buy a computer on a company credit card, I know
immediately that I owe $1200.00 at 23 percent interest per year. In
contrast, if I develop a piece of software and decide to use global
variables to speed up development, I may immediately gain a mythical
man-month in development time BUT I have no idea how much it may slow me
down in the future if at all. Sometimes global variables are the way
to go and development will be consistently faster for the entire
lifetime of the software compared to using local variables. This tends
to happen, not surprisingly, when the data in the global variables needs
to be used widely throughout the program.
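To make the credit-card half of that comparison concrete: the financial figure really is a one-line calculation. The sketch below assumes annual compounding and no payments, which is a simplification (real cards typically compound more often); the $1200 and 23 percent come from the example above.

```python
def balance_after(principal, annual_rate, years):
    """Balance after `years` with annual compounding and no payments made."""
    return round(principal * (1 + annual_rate) ** years, 2)


if __name__ == "__main__":
    for years in (0, 1, 2):
        print(years, balance_after(1200.00, 0.23, years))
```

There is no comparable function for the global-variables decision: neither the "rate" nor the "term" is known in advance, which is exactly the asymmetry being described.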
Obviously this is about
development but that's not the only place where you can accrue technical
debt. In operations it's shortcuts or bandaids so you can get something
done quickly or fix a production issue "ASAP". While it is a trade off
it's a crippling one. Without good engineering principles and management
support to regularly pay it down you can easily get to a point where
the collective fragility has you putting out fires constantly and no one
wants to touch anything.
Unfortunately management often thinks that
they're supportive, but ask them which project they need to axe or put on
the back burner to free up time, and the ephemeral promises and deferment
to some hazy future date get put on the table.
Perhaps it's not
"real" in development but it is very much a real issue that needs to be
thought about by anyone in charge of operations.
I combat this by ensuring that it is perfectly acceptable for a change to consist primarily of deletions.
You
don't want a culture that only adds features, forever growing the pile.
Developers need to be free to identify and aggressively throw out old
functions, old tests, obsolete documentation or anything else. This
means that a list of deprecated items should be expected in most
projects, and that you should mean it when you warn other teams that you plan to completely remove those items in 6 months.
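One way to make such a deprecation list visible in the code itself is a small decorator built on Python's standard `warnings` module. This is only a sketch: the decorator name, the removal date, and the example functions are all hypothetical.

```python
import functools
import warnings


def deprecated(removal_date, replacement=""):
    """Mark a function as deprecated and announce when it will be removed."""
    def decorator(func):
        message = (
            f"{func.__name__} is deprecated and scheduled for removal "
            f"on {removal_date}"
        )
        if replacement:
            message += f"; use {replacement} instead"

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 points the warning at the caller, not this wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)

        return wrapper
    return decorator


# Hypothetical example: the date and names are placeholders.
@deprecated(removal_date="2099-01-01", replacement="export_report_v2")
def export_report(data):
    return ",".join(str(item) for item in data)


if __name__ == "__main__":
    warnings.simplefilter("always")
    print(export_report([1, 2, 3]))
```

Grepping for the decorator then gives you the deprecation list for free, and the removal date is right there when the 6 months are up.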
This
freedom to delete doesn't mean that total rewrites will be encouraged.
Rather, it acknowledges that in an evolving system, some things do
become obsolete and mistakes will be found (despite highly competent
architects) that ought to be corrected. Ultimately it frees people to
easily identify and maintain only what really matters.
At my current job, I've
removed 20kloc more than I've written in a 200kloc ish code base.
Deleting code is one of the most enjoyable experiences of the
development process in my mind.
Technical debt is usually a
result of poor planning, a misunderstanding of requirements or complete
lack of requirements, or trying to do too much when you should have
built an MVP with limited scope.