Inline temps? I think not

0 comments
Are these two snippets equivalent?

Snippet A:
  this.binding.type 
= this.initialization.resolvedType;
this.type.resolvedType = this.binding.type;
this.initialization.setExpectedType(
this.initialization.resolvedType);
this.initialization = new CastExpression(this.binding.type);


Sinppet B:
  TypeBinding temp 
= this.initialization.resolvedType;
this.binding.type = temp;
this.type.resolvedType = temp;
this.initialization.setExpectedType(temp);
this.initialization = new CastExpression(temp);


Which of the two snippets is easier to read?

The reason that the second piece of code is more straight forward is due to the local variable "temp" which gives a symbolic name to some value. Such a symbolic name makes it easy to understand the code and follow the flow of data therein.

Too many people advocate the eliminatation of temporary variables. Even the (otherwise excellent) refactroing catalog shows you how to "Inline Temps" and "Replace Temps with Queries". True, the motivation there is to transform the code such that further refactoring steps are made possible, but still, it leaves the wrong impression.

My two cents: Introduce temps! Replace queries with temps!

Safe and easy elimination of dead code

1 comments
One of my projects is an extension to an open source Java project. A new version of the open source project has been released so I sat down to upgrade my baseline code to this new version.
This transition created some compilation problems, which were quite easy to fix (quite easy ~ 2 hours). Naturally, I used this opportunity to do some refactoring.

In particular, I introduced a global factory, and changed the code such that instantiations of major classes are now delegated to a factory method in this factory object, instead of having an explicit, hard-wired, new T() expressions.

Having completed that I noticed I still have a compilation error in one of my classes. It was a small factory class, which I had used for similar purposes as the global one, but to a smaller extent.

Anyway, I thought that this class is not needed anymore since its functionality is already provided by the global factory. If this is indeed that case, then I don't need to bother about fixing this class, I just need to delete it.

The problem: How can I determine whether my program uses a certain class?
The solution: Just delete it. If no new compiler errors appear, then this class is a dead class, and can stay in the trash forever. If not, you can always restore it from the CVS.

So I deleted the class. A new compiler error appeared at another class, but (again) this class seemed to be redundant in the new design. So I deleted this other class, and that's it: All my compiler errors disappeared, indicating that these classes are really, absolutely, definitely dead ("These classes wouldn't "voom" if you put four million volts through it!").

The moral: The easiest way to know if you can delete a class is to delete it. Let the compiler tell you if it really needs it.

The insight: Can you really do it in a dynamic language (SmallTalk, Ruby and the rest of the gang)? Would you 100% trust some tool saying that your Smalltalk program does not use a certain class?

Bad idiom: Returning a parameter

2 comments
Some Background

It is common to design a method whose return type is the same as the enclosing class. Logger.log is such a method:

class Logger {
int n = 0;
Logger log(String s) {
System.out.println("[" + (n++) + "] " + s);
return this;
}
void flush() { System.out.flush(); }
}

Basically, there are two reasons for doing that. First, this design makes it possible to send several messages to an object in a single compact expression, as in:
logger.log("abc").log("def").flush();


The second reason is related to immutable classes. In such classes a "state changing" method is actually a method that returns a new object. This style of programming (which I personally quite fond of) is ripped off the world of funtional-programming. Here is how it comes about in an immutable Java List:

class List<T> {
public final T t;
public final List<T> ts;
public List(T x, List<T> xs) { t = x; ts = xs; }
public List<T> prepend(T x) {
return new List<T>(x, this);
}
}

The actual sin (i.e.: Returning a paramete)


So we now know what are chain methods. The bad idiom I want to talk about is that of returning a parameter:

class SomeClass {
public StringBuilder dump(StringBuilder sb) {
sb.append(this.toString());
return sb;
}
}


Why would a programmer design such a method?
Well, (again) it makes it possible to write a compact expression with several method calls:

void someClient(SomeClass o1) { 
// Assumes that dump(x) returns x
StringBuilder builder = new StringBuilder("o1=");
o1.dump(builder);
builder.append("\n");
System.out.println(builder.toString());
}


So what's the problem?
It is completely unclear whether such a method (a-la SomeClass.dump()) should return its parameter or a new copy of it.

someClient() assumes that dump() returns the object that is passed in as the sb parameter, so it ignores since it already has this object pointed by the builder variable. Consequently someClient() would break if SomeClass.dump() were to return a new object.

In the other case, if the client expects the method to return a new object, the caller may mutate the object (passed as parameter) without knowing it is effectively mutating the result:

void someClient(SomeClass o1) { 
// Assumes that a new object is returned from dump(x)
StringBuilder builder = new StringBuilder("o1=");
StringBuilder result = o1.dump(builder);
builder.clear(); builder.append("....");
System.out.println(builder.toString() + " " + result.toString());
}


Just imagine a class with such a method, which is also overridden in a sublcass. The superclass' method takes the first approach where the overriding method takes the second approach. A nightmare, isn't it?

Aftermath

Why is a method retuning a parameter more risky than a chain method (as described in the background)?
A misunderstanding between a client and a provider can occur also with chain methods. But, there is one difference that make chain methods somewhat safer: the protocol of a class usually indicates whether it is an immutable class (in which case the return values should not be ignored) or not.

In the case of a method returning a parameter the indication is buried inside the body of the method (which is usually not available to the client), so we are much more likely to make the wrong guess.

One may argue that this problem can be easily solved using proper documentation. As a principle, design should not rely on comments, for two reasons: (1) comments are written in plain English so they tend to unclear and imprecise. (2) When the code changes there is nothing that forces the developer to keeps the comment in sync with the changes.

Finally, It seems that (imperative) programming languages need to support a "non-ignorable-return-type" annotation. The compiler would issue an error if the return value of method carrying such an annotation is not assigned to a variable. This would turn the problem described in this post into a non-problem.