Who Cares?

From time to time you see people who passionately argue that code such as this:


// Version 1
public static String[] findClasses(String jarFile)
throws Exception
{
List<String> list = new ArrayList<String>();
ZipFile f = new ZipFile(jarFile);
Enumeration<? extends ZipEntry> entries
= f.entries();
while(entries.hasMoreElements())
{
ZipEntry ze = entries.nextElement();
String name = ze.getName();
list.add(name);
}

for(Iterator<String> i = list.iterator();
i.hasNext(); )
{
String name = i.next();
int pos = name.lastIndexOf('.');
String suffix = "";
if(pos >= 0)
suffix = name.substring(pos + 1);

if(!suffix.equals("class"))
i.remove();
}

String[] arr = new String[list.size()];
for(int i = 0; i < arr.length; ++i)
{
String name = list.get(i);
name = name.replace('/', '.');
arr[i] = name;
}

return arr;
}


... is evil because it can be rewritten in a much more compact way, like this:

// Version 2
public static String[] findClasses(String jarFile)
throws Exception {
List<String> list = new ArrayList<String>();
Enumeration<? extends ZipEntry> entries
= new ZipFile(jarFile).entries();
while(entries.hasMoreElements()) {
String name = entries.nextElement().getName();
if(name.endsWith(".class"))
list.add(name.replace('/', '.');
}

return list.toArray(new String[0]);
}


In this post I don't want to debate whether the latter version is indeed better than the former. My claim is quite different, and can be summarized in two words: Who Cares ?!

Let me put it this way: even if we assume, for the sake of the argument, that version 2 is better than 1, there's still no need to refactor version 1. This is because version 1 is designed/implemented in such a way that its quality does not affect the quality of the program as a whole.

Here a few key observations about version 1:

  • It relies (solely) on classes from Java's standard library. It does not depend on application code.

  • It does not change the program's internal state. Specifically, the method does not change its input object (the jarFile variable); it is a static method so it has no instance fields; and, the method does not touch any static fields.

  • The method's code does not impose any restrictions of its own on the input object. The only restrictions (such as: jarFile must specify the path to an existing, readable, legal jar file) come from Java's standard library classes to which the input is fed.



These observation point out that this method is actually quite close to a library method. A library method never depends on application code; Library code has no access to the program's internal data structure (except for what it receives as inputs); Library code will not realize your program's business logic simply because it is not aware of your program.

If version 1 resembles a library method, why don't we turn into an actual library? After all this would make it reusable in different contexts. So, we move the method into a dedicated class which we then compile into a jar. We add the jar to our program's class path.

We now notice a very strange thing: we no longer care how version 1 is written. This is because we never care about the inner workings of library methods or classes - we are interested only in their external behavior. The key issue is what the library does, not how it does it. This is a direct result of the fact that a library hides implementation details and exposes only interfaces.

At this point a light-bulb goes off over your head. You realize that you don't need to bother yourself with creating artificial libraries. You just need to keep in mind the familiar principle of thinking in terms of API design. This means, among other things, that you force yourself to program against the API of your own modules.

(I am not saying that the API of this findClasses() method is perfect. The point is that both version 1 and version 2 offer the exact same API so they are virtually identical, API-wise).

One may err and think that we found a magic cure to all ill-designed programs: "Hey, I just need to treat a crappy piece of my code as a module with an API and my program's quality will rise".

Clearly, this is a misunderstanding of the process: not every piece of code can be treated as a module. In this example, the key issue was that our method presented library-like characteristics from the very beginning. The right conclusion is therefore this one:

if you have a well-isolated piece of code -- no dependencies; minimal, predictable side effects; no domain knowledge; a crisp interface -- then you can write it anyway you want.


Actually, there is an even deeper theme that runs throughout this post. It goes like this:
if you have a well-isolated piece of code, then you have a well-written piece of code. Stop fiddling with it.

0 comments :: Who Cares?

Post a Comment