Longer is Sometimes Better

The post that I swore never to write


(I never wanted to write this post. In fact, I wanted not to write it. I think it focuses on stylistic aspects which are overrated. I do think that a clear style is important, but I don't think we should be religious about it. In other words, If the method's style gets an "A" I don't see any point in trying to raise its score to "A+". Writing another unit test will give a much higher ROI).

A comment by Martin in response to one of my latest posts said that sometimes longer is better. Since people tend to prefer compactness, I think that it may be worthwhile to mention some points in favor of length. Please bare in mind that I am not saying that longer code is better. Not at all. Shorter should be the norm, but it should not be followed religiously.

Width vs. height

Programmers often attempt to reduce their method's height (number of lines) by consolidating two (or more) lines into one. If this attempt increases the width of the method (number of columns) then they actually make it more difficult to understand the code. This is because we don't read content left to right, line by line. Our eyes movement follows the F-shaped pattern:

  • Users first read in a horizontal movement, usually across the upper part of the content area

  • Next, users move down the page a bit and then read across in a second horizontal movement that typically covers a shorter area than the previous movement

  • Finally, users scan the content's left side in a vertical movement

Here's how it looks like:



(Copyright © 2006 by Jakob Nielsen. ISSN 1548-5552)

This means that we have greater chances of reading things on the left hand side of the screen. We can better understand greater parts if most of the code is located closer to the left.

One may argue that the cited research studied users looking at web-sites rather than programmers looking at source code. Thus, maybe the F-pattern is not relevant WRT programming. This is a valid argument, but it needs to be verified by research. To the best of my knowledge (and I'd be glad to stand corrected), there is no support for this argument in the literature.

Stacktraces

You get a bug report from a user. The stacktrace shows a null pointer exception on this line:

m1().m2().m3().m4()

You don't know which method call returned null. Was it m1(), m2(), or m3()?
If each call were on a line of its own you'd have much more information. Sometimes this can make a huge difference.

A single-line if

Writing the "else" part in the same line as the if() makes the code shorter but more difficult to debug:

for(int i = 0; i < getTop(); ++i)
if(get(i).equals(b)) return;

Suppose you're interested in the return statement, but since you have a single line if() You cannot place a breakpoint on the return statement. Placing a breakpoint on the if() is not practical if getTop() returns a large value (even 50 is large value in this context).

Another example:

void f(int n, int m) {
int x = g(n);
h(m, x);
if(!isValid(x)) return;

// do something useful
}

You are trying to find out why f() is not doing what it was supposed to do. So you step through f() and suddenly, surprise!, you find yourself in another method.

After some thinking you realize that you are back in f()'s caller. That's because the last thing you remember is that you were standing on "if(!isValid(x) return;".

Now, here are your thoughts: "Aha! That's exactly what's wrong. I should not exit from f() under these circumstances. Is the problem in g() that computes x's value or in isValid() that checks it? Gee, I don't know. If only I knew what x's value is. Oh boy! I have to start the debugging session all over again"

This problem will be a no-problem if you split the if() across two lines. During debugging you will have a visual indication that the if() succeeded.

Note that this discussion is pertaining not only to if()'s but also to the ternary operator, "?:".

Temporary variables explain the code

There is this refactoring step of Replace temp with query. Each application of this refactoring saves one line - the line where the variable was defined.

Martin Fowler is a very smart guy. He knows that reducing line count is not the only game in town. That's why immediately next to "replace temp with query" he placed Introduce explaining variable. He knows that a variable that is used only once is still useful: it explains the code, and it worth to have it, despite the extra line that it occupies.

BTW, this duality is quite common Fowler's Refactoring book. See, for example, Hide delegate vs. Remove middle man. It seems that Fowler is well aware that every decision has its trade offs.

Capturing return values

So many I found my self debugging a method and not knowing its return value. This happens when the code is written by someone who wants to save a line. Instead of writing

int f() {
// do something
int result = g();
return result;
}

He writes

int f() {
// do something
return g();
}

The problem with the latter form is that it hides f()'s return value. If this is the common style, then when you step out of f() you find that f()'s caller is returning f()'s return value directly to *its* caller, which in turn.....

This is so annoying. You have to step out of many methods. When you get to inspect this value, you realize that it is wrong but now you lost the original stack frame. Think what happens if your code was called from library whose source code is out of your reach.

(Although this is a special case of the previous point, it is related to events so annoying that it deserves a section of its own)

Summary

It should be noted that other than the first point (the F-shaped pattern) the points made here do not capture some cosmic truth about programming. They depend on the language, on the tools and probably on some cultural factors. You'll have to reconsider them when you switch to a new language/tool/team.

Finally, by all means, don't try to make your code longer. Try to make it shorter. But as you make it shorter, think about these points. They may help you avoid code that is too short.

4 comments :: Longer is Sometimes Better

  1. IMO "terse" implies a level of abruptness not found in "concise", to its detriment. "Concise", OTOH, doesn't remove any necessary, or useful, information. Instead it's the shortest use of language that still conveys full meaning.

    "Short" code can be either concise or terse, but terse (again IMO) may be short to the point of removing valuable (and/or helpful) information, including information that can reduce cognitive overhead.

    I think the trend these days is to make code easily understandable: concise, not terse. Even idiomatic code (in languages that allow both concise and terse code) can be made easily understandable--the trick is to retain enough contextual information to make it concise without becoming abruptly short.

  2. At least you realize that most of this stuff is not about readable code but rather about working around debugger deficiencies. :-)

  3. Dave,

    I like your distinction between these "terse" and "concise". Another similar statement is the one made in a comment by Yardena: "no one should sacrifice that [Writing correct and clear code] for brevity."

    Although many agree with this distinction, I too often see people who believe the software quality follows the formula of "terser is better".I don't know. Maybe I'm just spending time with the wrong people :)

  4. I totally agree with Andreas the fact that the only bad thing you are saying is that it will bug the debugger means that someone should work on a better debugger.

    I really wish you had a good point against it because I am constantly in a state of preferring short to very short.
    If its easy to read you need less debugging cause you can read it :).

Post a Comment