Friday, December 3, 2010

DRY vs KISS

Two developers are playing poker. The chips are all on the table and it's time to show their hands. One throws down the DRY hand. The other throws down KISS. Which hand wins? Who gets to take the pot home?

There are many guidelines in software development. These are two prominent ones: DRY and KISS. For those who haven't heard of them, DRY stands for Don't Repeat Yourself and KISS stands for Keep It Simple, Stupid.

The underlying principle behind DRY is that there should be one authoritative source for any artifact in a piece of software. This can take different forms depending on the context. One example might be a constant literal. If a literal, for example a numeric value or a string value, exists more than one place in a program, it's a bad thing. The duplication should be removed and replaced with a named constant. The constant becomes the authoritative source for the value. This also has the side effect of making the software more readable if a good name is chosen because the meaning can be conveyed, not just the value. Another example is duplicated code. If lines of code are copied and pasted, there becomes more than one authority for the code's behavior. Instead, those duplicated lines should be put in a function that can be called from the various places the behavior is needed.

The KISS principle is based on the idea that complexity makes code that's hard to understand, maintain and debug. Code that is hard to understand will have more bugs in it to begin with and take longer to get working properly. It will also be harder for the person who has to maintain it to make changes and hence will increase the cost of ownership over the lifetime of the product.

On the surface, I think most developers will agree that both these principles are good. However, I think there is disagreement upon which one trumps the other.

An argument could be made that removing duplication increases complexity. In the examples for DRY above, adding a function or constant declaration adds indirection. As you read the code, the value is not immediately obvious and the behavior is not readily apparent. You have to go somewhere or use some feature in the IDE to find the value or determine what the routine does. In addition to readability, as you write, you have extra work to make a separate function. It is much faster to simply copy what you need and paste it somewhere else.

I understand these arguments. I know first hand the temptation to grab a section of code and paste it somewhere else. I know the pain of slowing down and having to think deeply about how to not make the duplicate copy of the code. Given all that...

DRY trumps KISS every time.

When I look at the long term maintainability, having one place where behavior or values or anything else is defined is always the better thing. I've experienced first hand maintenance of code with high levels of duplication. I have made changes in one place without knowing about the duplication and not also fixing the same code elsewhere. I have seen others do the same thing. The testing and validation in these scenarios can get very painful. When maintaining software and making changes, dealing with a single authority is much more accurate and faster, even with increased indirection.

In addition to long term maintainability when behavior should be changed, not repeating yourself reduces the chance of copy and paste errors. More times than I care to count, I've found bugs where a block of code was copied and minor changes made throughout it. Missing even one thing that should be changed will result in insidious bugs setting up residence in your application. Many times these won't be caught until much later in the product's life-cycle, increasing the cost of fixing them.

Just today I violated this principle... even after thinking "do I really want to do this?"... even with the draft of this article fresh in my mind... and I introduced a bug that took me a minute to figure out. I had these two lines:
const string name1 = "value one" ;
var value1 = !computeValueForName(name1);
That I duplicated and changed to this:
const string name2 = "value two" ;
var value2 = !computeValueForName(name1);
See the error? When I did, I promptly refactored this to:
var names = new [] { "value1" , "value2" };
var values = from n in names select !computeValueForName(n);
So, yeah, when I see violations of DRY I have almost a compulsive need to fix it. How about you?