The Axioms of Programming

  1. There's no fundamental difference between programming and designing.
  2. Super-linearity: The effort required to build a program grows super-linearly with its size.
  3. Instability: The external behavior of a component will need to change over time.
  4. Discontinuity: Small changes yield non-proportional results. (This gives rise to the Butterfly Effect: a human cannot predict the full effect of a change to a program, and automatic tools are not much better at it.)
  5. Entropy: The total entropy of the code naturally increases. One needs to invest external energy (time) in order to decrease entropy (writing ugly code takes less time than writing clean code).
  6. The Uncertainty Principle: The more detailed the specification, the more error-prone it is.
  7. Unpredictability: An accurate estimation of development time requires time that is proportional to the estimate itself.

Quoting Allen Holub

Found this piece by Allen Holub that makes the same point as the one I made in this post: design is all about trade-offs.
Design, by nature, is a series of trade-offs. Every choice has a good and bad side, and you make your choice in the context of overall criteria defined by necessity. Good and bad are not absolutes, however. A good decision in one context might be bad in another. If you don't understand both sides of an issue, you cannot make an intelligent choice; in fact, if you don't understand all the ramifications of your actions, you're not designing at all. You're stumbling in the dark.

Who Cares?

From time to time you see people who passionately argue that code such as this:


// Version 1
public static String[] findClasses(String jarFile)
    throws Exception
{
  List<String> list = new ArrayList<String>();
  ZipFile f = new ZipFile(jarFile);
  Enumeration<? extends ZipEntry> entries = f.entries();
  while(entries.hasMoreElements())
  {
    ZipEntry ze = entries.nextElement();
    String name = ze.getName();
    list.add(name);
  }

  for(Iterator<String> i = list.iterator(); i.hasNext(); )
  {
    String name = i.next();
    int pos = name.lastIndexOf('.');
    String suffix = "";
    if(pos >= 0)
      suffix = name.substring(pos + 1);

    if(!suffix.equals("class"))
      i.remove();
  }

  String[] arr = new String[list.size()];
  for(int i = 0; i < arr.length; ++i)
  {
    String name = list.get(i);
    name = name.replace('/', '.');
    arr[i] = name;
  }

  return arr;
}


... is evil because it can be rewritten in a much more compact way, like this:

// Version 2
public static String[] findClasses(String jarFile)
    throws Exception {
  List<String> list = new ArrayList<String>();
  Enumeration<? extends ZipEntry> entries = new ZipFile(jarFile).entries();
  while(entries.hasMoreElements()) {
    String name = entries.nextElement().getName();
    if(name.endsWith(".class"))
      list.add(name.replace('/', '.'));
  }

  return list.toArray(new String[0]);
}


In this post I don't want to debate whether the latter version is indeed better than the former. My claim is quite different, and can be summarized in two words: Who Cares?!

Let me put it this way: even if we assume, for the sake of the argument, that version 2 is better than 1, there's still no need to refactor version 1. This is because version 1 is designed/implemented in such a way that its quality does not affect the quality of the program as a whole.

Here are a few key observations about version 1:

  • It relies (solely) on classes from Java's standard library. It does not depend on application code.

  • It does not change the program's internal state. Specifically, the method does not change its input object (the jarFile variable); it is a static method so it has no instance fields; and, the method does not touch any static fields.

  • The method's code does not impose any restrictions of its own on the input object. The only restrictions (such as: jarFile must specify the path to an existing, readable, legal jar file) come from Java's standard library classes to which the input is fed.



These observations suggest that this method is actually quite close to a library method. A library method never depends on application code; library code has no access to the program's internal data structures (except for what it receives as inputs); and library code cannot implement your program's business logic, simply because it is not aware of your program.

If version 1 resembles a library method, why don't we turn it into an actual library? After all, this would make it reusable in different contexts. So we move the method into a dedicated class, which we then compile into a jar, and add the jar to our program's class path.
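For concreteness, here is a rough sketch of what that dedicated class might look like (the package and class names, jarutils.JarClasses, are made up for illustration; the method body is just version 2 from above):

// A hypothetical "library" class holding the method; the names are invented.
package jarutils;

import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

public final class JarClasses {

  private JarClasses() { } // static utility class - no instances

  // Same external behavior as findClasses() above: the fully qualified
  // names of all .class entries in the given jar file.
  public static String[] findClasses(String jarFile) throws Exception {
    List<String> list = new ArrayList<String>();
    Enumeration<? extends ZipEntry> entries = new ZipFile(jarFile).entries();
    while (entries.hasMoreElements()) {
      String name = entries.nextElement().getName();
      if (name.endsWith(".class"))
        list.add(name.replace('/', '.'));
    }
    return list.toArray(new String[0]);
  }
}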

We now notice a very strange thing: we no longer care how version 1 is written. This is because we never care about the inner workings of library methods or classes - we are interested only in their external behavior. The key issue is what the library does, not how it does it. This is a direct result of the fact that a library hides implementation details and exposes only interfaces.

At this point a light-bulb goes off over your head. You realize that you don't need to bother yourself with creating artificial libraries. You just need to keep in mind the familiar principle of thinking in terms of API design. This means, among other things, that you force yourself to program against the API of your own modules.

(I am not saying that the API of this findClasses() method is perfect. The point is that both version 1 and version 2 offer the exact same API so they are virtually identical, API-wise).

One may err and think that we have found a magic cure for all ill-designed programs: "Hey, I just need to treat a crappy piece of my code as a module with an API and my program's quality will rise".

Clearly, this is a misunderstanding of the process: not every piece of code can be treated as a module. In this example, the key issue was that our method presented library-like characteristics from the very beginning. The right conclusion is therefore this one:

if you have a well-isolated piece of code -- no dependencies; minimal, predictable side effects; no domain knowledge; a crisp interface -- then you can write it any way you want.


Actually, there is an even deeper theme that runs throughout this post. It goes like this:
if you have a well-isolated piece of code, then you have a well-written piece of code. Stop fiddling with it.

A Solid Definition for Quality in Software

Is there a criterion for distinguishing well-written (or well-designed) programs from ill-written ones?

Here is what may very well be the best definition for quality in software:

A well-written program is a program where the cost of
implementing a feature
is constant throughout the program's lifetime.



The rationale is simple - if it gradually takes more and more time to add a feature to a program, you'll ultimately reach a point where it is just too expensive to change the code. This is plain economy: past that point it won't be worthwhile to add even the tiniest feature and the project will practically stall.

This was the intuition. To make things more precise, let me explain the basic terminology:
  • Cost: time needed for programming. In this post I will speak about the total cost of the program and about the cost per feature.
  • Feature: A change in the external behavior of the program whose completion can be determined objectively. For the purposes of this discussion there is no need to distinguish between adding something new, changing something old or fixing something that is broken. We will collectively refer to these activities as "implementing a feature". This definition coincides with the definition of a story in the Extreme-Programming world.

The last term that needs to be precisely defined is "constant". Providing a good definition for this term is hard because development costs tend to fluctuate so we need to find a definition that can tolerate fluctuations. In short, the challenge boils down to this: If it takes 10 hours to implement feature #1 and 12 hours to implement feature #2 does this mean the cost is not constant?

The best way to abstract away these natural fluctuations in costs is to use the big-O notation that is typically used for expressing the performance of algorithms. Using this notation we obtain the following precise definition of software quality:

A program is considered to be well-written if its total cost function behaves like O(n), where n is the number of features in the program.


Note that a total cost of O(n) reflects an O(1) cost per feature (amortized), so we can rewrite this definition in terms of a per-feature cost.
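To spell out the amortized argument, here is one way to formalize it (the symbols c_i and T(n) are introduced here only for this sketch):

T(n) \;=\; \sum_{i=1}^{n} c_i, \qquad \text{where } c_i \text{ is the cost of implementing feature } i.

T(n) = O(n) \;\Longleftrightarrow\; \frac{T(n)}{n} = O(1) \quad \text{(the amortized per-feature cost is constant).}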

Why do I like this approach for quality? Four reasons:

First, it can be objectively measured.

Second, it is normalized: this approach establishes a standard scale by which programs can be measured and compared. For example: a total cost of O(n*n*n) is certainly worse than O(n*n); or: a total cost of O(n*log n) is usually a close enough approximation of O(n). In short, we can quantify how far we are from the "good" end of the scale.

Third, this measure of quality captures the bottom line: value delivered to clients. Many software metrics capture artifacts of the development process (hours worked, lines of code, classes written, methods per class, etc.). Such metrics only testify to the quality of the process; they do not testify to the quality of the outcome. In order for these metrics to be useful, one should show that the measured artifact is correlated with properties of the outcome. This rarely happens. The definition presented here largely ignores the process. It cuts to the chase and measures the outcome.

Finally, this approach acknowledges the fact that a program is a complex system. It does so by measuring the program as a whole. It is difficult (or even impossible) to estimate the quality of a program from the quality of its individual sub-parts. Here are a few examples to illustrate the pitfalls of measuring software quality in pieces:
  • Example #1: Suppose I have a piece of code that - when examined in isolation - turns out to be really crappy. Somewhere in the program there is an excellent suite of tests that cover this piece from every possible angle. This suite makes the piece much more manageable than a similar well-written piece that has no tests.

  • Example #2: Sometimes bad code is located in unimportant modules. Imagine a program that performs sophisticated analysis of data coming from CSV (comma-separated values) files. The input module of this program reads the input file and populates an in-memory matrix. Even if this input module is badly written, its effect on the overall quality of the program is minimal. At any point I can replace the implementation of the input module because it has a crisp interface (the said matrix) that isolates it from the rest of the code.

  • Example #3: The problem that my program needs to solve can be nicely expressed as a Domain Specific Language (DSL). Thus, I define a DSL (an external DSL, to be precise), write an interpreter for this DSL, and then implement the solution to the problem by writing a program in that DSL. If the DSL is a good DSL (that is: solutions to domain problems can be easily expressed in it), the quality of the whole program is high regardless of the quality of the interpreter.


The definition of quality presented here is quite close to the notion of RTF as coined by Ron Jeffries. However, the definition here is a bit more relaxed - it does not require tested features. This is because constant delivery of features requires constant refactoring (you can't plan everything in advance), and constant refactoring requires testing - otherwise you will not be able to detect incorrect refactoring steps and you'll end up introducing bugs into the program, which sooner or later will slow you down.

The Essence of Programming

What's the difference between using a calculator and writing a computer program?

I argue that the fundamental difference lies in the mechanisms for reuse.
If you were to write the simplest program that computes n!, your code would look like this:

if(n == 1)
  return 1;
if(n == 2)
  return 2;
if(n == 3)
  return 6;
if(n == 4)
  return 1*2*3*4;
if(n == 5)
  return 1*2*3*4*5;
...


This is very simple. If you write computer programs like this, you are essentially using your computer as a calculator: anyone can write such code because you specify the output for every possible input, and the code is also very easy to understand. However, it would take many days to write a program (in this style) that computes n! for every n between 0 and 10000. Moreover, if you need to change something (for example, compute (n-1)! rather than n!), you have to make many changes, which again takes a lot of time.

Fundamentally, the power of a computer program over a calculator comes from loops. We can take a sequence of computations and repeatedly execute it many times:
int r = 1;
for(int i = 1; i <= n; ++i)
  r *= i;
return r;


Wow! We wrote only 4 lines of code and yet we now have a program that computes n! for practically every value of n. (Clearly, this also implies the notion of a variable - otherwise the body of the loop would compute the exact same value every time.) The downside is that this program is a bit more complicated: it requires some skill to write and to understand the code.

This loop is an example of a very simple reuse mechanism: instead of duplicating the computation of n! for every possible n value, we repeatedly execute the expression r *= i (with different values in the r and i variables) so that it computes the correct result for us. This illustrates the basic property of code reuse: write once, use many times.

(Side note: The full explanation regarding the importance of loops, goes back to the seminal works of Turing and Church: a programming model cannot be Turing complete if it does not support loops or recursion.)

Of course there are many other mechanisms for reuse: functions, inheritance, mixins, traits, genericity and the like. As soon as you start using these, the complexity of the code rises. With functions you can rewrite the factorial program as follows:
function factorial(n) {
  if (n == 1)
    return 1;
  else
    return n * factorial(n - 1);
}

Now the code is more elegant (that is: we reuse a bigger part of it) but, again, it becomes more difficult to write and read. Someone who has just started learning programming will probably be more comfortable with the loop version than with this recursive version.

If we move on to inheritance, then we are in object-oriented (OO) waters. To explain OO-based reuse we need to understand objects, methods, and overriding, which is an even greater step in terms of complexity.

My point is this: most of the complexity of programming comes from the need to reuse code, but we cannot get rid of this complexity because without reuse our programming language is reduced to a simple calculator.


This is something that should be remembered when you think about visual programming environments and other tools that are supposed to make it possible for non-programmers to write programs: many of these tools support only a simple form of reuse, so extensive use of them leads to the familiar problems of duplicated code. If you try to improve such a tool by introducing reuse mechanisms, it rapidly becomes so complicated that it eventually requires programming-like skills to use, which defeats its own purpose.

Dependencies

A dependency graph is often used for appreciating/understanding the design of a (statically typed) program. The idea is simple: take every class in your program and create a corresponding graph vertex. Then, for each class X that uses a class Y create a graph edge between the corresponding vertices.

This graph can then be presented in various formats: a diagram, a list of pairs (of class names), etc. It is common to check for the existence of cycles in a program's graph. Cycles are usually considered to be evil, so programmers are often complimented for developing cycle-free programs.

Having presented the background, I'd like to make two quick statements on this topic:
  1. Self-dependency. Every class (and I mean every last one of them) always depends upon itself. Why? You cannot shield a class from changes to itself. For example, if you remove a private field (let alone a public one) from a class C, which classes will be affected? Right: C itself.

    It is not rare to see programmers who think that a class will depend upon itself only if it has a recursive method or a recursive structure. This is a common fallacy. Every class has a self-dependency, even if its methods are not recursive.

  2. Cycles. Suppose you have a cycle of two classes, that is: X depends on Y and Y depends on X. The standard trick for breaking the cycle is to extract an interface Z out of Y and then make X use Z instead of Y (see the sketch after this list). You may now think that X no longer depends on Y.

    Now, ask yourself this question: "If I change something in Y, will X be affected?". The answer is: yes. When your program is running, X will still invoke Y's methods (through the Z interface), and if these methods behave differently, X will be affected. In other words, the introduction of Z eliminated the compile-time dependency but not the run-time dependency.

    Am I saying that there's no point in detecting compile-time cycles? No. Compile-time cycles are important, but for a different reason: reusability. If X and Y are cyclic, then you cannot just take X and reuse it in some other program/module. You will always have to carry class Y as baggage (and vice versa). This restricts reusability. On the other hand, if X only depends on the (extracted) interface Z, then you can take only X and reuse it just by plugging in some implementation of Z.
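Here is a minimal sketch of the interface-extraction trick in Java (the names X, Y and Z follow the text; the method doIt() and the setter are made up for illustration):

interface Z {
  void doIt();
}

class Y implements Z {
  private X x;                       // Y still depends on X ...
  void setX(X x) { this.x = x; }
  public void doIt() { /* ... uses x ... */ }
}

class X {
  private final Z z;                 // ... but X depends only on the interface Z,
  X(Z z) { this.z = z; }             // so X compiles (and can be reused) without Y.
  void work() {
    z.doIt();                        // at run time this still lands in Y's code,
  }                                  // so the run-time dependency on Y's behavior remains.
}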

The Four P's

In his excellent talk at OOPSLA 2005, Martin Fowler introduced the notion of the Four P's of software design: Philosophies, Principles, Practices and Patterns.

When discussing principles Fowler argued that a good principle should be universal: other than a few extremely rare cases, a good principle should hold for all kinds of programs, domains and languages. That is why good principles are hard to find.

Side note: After this talk I realized that many of the "principles" for producing high-quality code are conditioned by a trade-off and thus should be considered Practices rather than Principles. A quick example of such a trade-off is cohesion vs. coupling: if your class is doing too many things you can refactor some of its functionality into other classes. The refactored class will be more cohesive, but also more coupled: its sensitivity to changes in other classes has increased.

If I recall correctly, Fowler mentioned two example principles: (1) Eliminate Code Duplication and (2) Separate User Interface Code.

Below are two additional ideas, which should qualify as principles:
  • Abstract your Operations as Values - Your program should be structured such that its operations can be abstracted over. This abstractability means that operations can be stored in variables, polymorphically executed, rolled back, or composed. Applying this principle is trivial in Functional Programming (where every function is also a value). In the OO world you will usually need the Command pattern to achieve this type of abstraction (see the sketch after this list).
    Once the operations are represented as run-time values, the program can easily support features such as access control, redo/undo, filtering, monitoring, automation and even transaction safety.
  • POJO: Plain Old Java Objects - Sophisticated OO frameworks give rise to complex classes which cannot be easily used by the programmer. For example, although writing a servlet class is a fairly simple task (you just need to subclass HttpServlet and override its doGet() method), creating a fully working instance of this class is practically impossible: your code needs to create many other objects (ServletRequest, ServletContext, ...) and feed these into the Servlet object. The cost - in programming time - of writing the code for creating these additional objects is usually overwhelming.
    Of course, the servlet container (Tomcat) can do this for you when the servlet is deployed, but there are other scenarios where you may want to instantiate a servlet: How can you test a servlet if you can't create one? How can a stand-alone application reuse the servlet's code?
    This principle states that you should place the vast majority of your code in POJO classes (if you do have to use some framework-managed class, just write a very simple class that delegates to a POJO). In other words: always prefer the design with more POJO code.
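To make the first idea concrete, here is a minimal Command-pattern sketch in Java; the Command, AddItemCommand and CommandRunner names are made up for illustration:

import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// An operation represented as a value.
interface Command {
  void execute();
  void undo();
}

class AddItemCommand implements Command {
  private final List<String> target;
  private final String item;
  AddItemCommand(List<String> target, String item) {
    this.target = target;
    this.item = item;
  }
  public void execute() { target.add(item); }
  public void undo()    { target.remove(item); }
}

class CommandRunner {
  private final Deque<Command> history = new ArrayDeque<Command>();

  // Because commands are plain values, the runner can log, filter or reject
  // them here before execution (access control, monitoring, automation, ...).
  void run(Command c) {
    c.execute();
    history.push(c);
  }

  void undoLast() {
    if (!history.isEmpty())
      history.pop().undo();
  }
}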

A Side-Effecting toString() is Evil

The principle of command-query separation states that a method should either return a computed result or change the state of the receiver object. It should not do both. The rationale is twofold:
  • A method that does only one thing is more coherent and thus easier to write/read/debug
  • The overall structure of a class that follows that principle tends to be simpler (i.e.: better).

Although there is a public consensus as to the value of this principle, there are a few exceptions. For example, in java.util.List the remove(int) method removes an element from the underlying collection and returns that element. This design clearly violates command-query separation, but for a good reason: it enables implementations to be thread-safe. Anyway, there's one place where this principle should be followed blindly: the toString() method.

I had the misfortune to come across side-effecting toString() methods a few times. Every time, my flow was interrupted due to unexpected side effects that popped up out of nowhere, and it always took me a lot of time to find out why the object's state changed.

This bad-enough problem is aggravated by two factors:
First, some toString() calls are implicit, as in:
   "x=" + x;

If x is not a String then the expression above has an implicit method call, like this:
   "x=" + x.toString();


Second, toString() calls are performed by the debugger when it shows the contents of variables. Hence, in the presence of a side-effecting toString(), your program will behave differently at debug time. Moreover, this change in behavior is not consistent: it depends on which variables you decide to inspect. A programmer's living hell.

Real-life Example #1:

In Sun's Java compiler, ClassType.toString() invokes the typarams() method, which in turn invokes the side-effecting complete() method. I realized this only when I worked on the following fragment:

ClassType t = ....
if(t.interfaces_field.size() > 0) {
  ... // Do something
}

I got a NullPointerException since t.interfaces_field was null. When I opened the debugger the problem vanished: the debugger invoked t.toString(), which started the chain reaction of completing the object and assigning values to its fields.

Real-life Example #2:

public static String toCsv(String[] line) {
  String result = "";
  final Separator comma = new Separator(",");
  for (String s : line)
    result += comma + s;

  return result;
}


Looking at this method you'd expect it to return a comma-separated string with a comma before the very first item. After all, what else could "result += comma + s;" mean? Examining the Separator class reveals a counter-intuitive implementation:

public class Separator {
  private final String s;
  boolean first = true;

  public Separator(String s) { this.s = s; }

  @Override public String toString() {
    if (!first)
      return s;
    first = false;
    return "";
  }
}


Of course, the final keyword attached to the comma variable just adds to the confusion. For the sake of credibility, let me say that this is an exact copy of real code from a real project. I didn't invent it to make my point....
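For contrast, here is one side-effect-free way to get the same behavior; this is my own sketch, not code from that project:

// Keep the "skip the first separator" logic in the loop itself
// instead of hiding it inside a stateful toString().
public static String toCsv(String[] line) {
  StringBuilder result = new StringBuilder();
  boolean first = true;
  for (String s : line) {
    if (!first)
      result.append(",");
    result.append(s);
    first = false;
  }
  return result.toString();
}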

Testing UI

Whenever I hear someone saying that his team practices automated testing (TDD or whatever other buzzword), I find myself asking: "But how do you test the UI?". I get all sorts of answers, which only prove the inherent difficulties.

So today, I tried UISpec4J. It is too early for me to comment on the overall quality of this library, but the first impression is quite good. However, their documentation fails to provide the reader with a quick, fully-working example (which is the single most important piece of information that the user of a library needs). The following fragment is my attempt to fill this gap: take it, compile it, run it.


The subject program:

import java.awt.event.*;
import javax.swing.*;

public class Win extends JFrame
{
  private static final long serialVersionUID = -9206602636016215477L;

  public Win()
  {
    JPanel panel = new JPanel();
    final JButton b = new JButton("A Button");
    b.setVisible(false);

    JButton t = new JButton("Toggle");
    t.addActionListener(new ActionListener()
    {
      boolean show = false;
      public void actionPerformed(ActionEvent e)
      {
        show = !show;
        b.setVisible(show);
      }
    });

    panel.add(t);
    panel.add(b);

    getContentPane().add(panel);
    setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
  }
}


...and the test program:
import org.junit.Test;
import org.uispec4j.*;
import org.uispec4j.assertion.UISpecAssert;

public class Win_Test
{
  static
  {
    UISpec4J.init();
  }

  @Test
  public void test1() throws Exception
  {
    Panel panel = new Panel(new Win());

    Button b = panel.getButton("A Button");
    Button t = panel.getButton("Toggle");

    t.click();
    UISpecAssert.assertTrue(b.isVisible());

    t.click();
    UISpecAssert.assertFalse(b.isVisible());

    t.click();
    UISpecAssert.assertTrue(b.isVisible());
  }
}


Comments:
  • The constructor of the tested widget (class Win) should not display the widget; doing so will make UISpec4J angry because it runs the tested window in some private, hidden realm.
  • The test program needs both the JUnit and the UISpec4J jars to compile.
  • The current version supports Java 5 but not Java 6.

You can't always get what you want

So you're using TDD, you are test-infected, you have an extremely low defect rate, the world is smiling at you, and everything is super. Can you reach a point where your test suite does not need any more test cases?

Sadly, the mathematical answer is No. There's no such thing as a "big-enough" test suite.

Here is the explanation: Suppose your test suite contains n test cases. We can say that this suite is large enough if it guarantees that the (n+1)th test case is already covered by the existing n cases.

I claim that if you give me a class that passes these n tests, I can give you a class that still passes these n tests but fails the (n+1)th test. In other words: passing the first n tests guarantees nothing with respect to any subsequent test.

If you're really interested in the details, here's how I can make the class fail this last test: I will obtain the stack trace - new Exception().getStackTrace() - and look for the test method of that (n+1)th test. Of course, this is a highly contrived way for a program to fail a test, but it is still a bug just like any other "natural" bug (an int overflow with large inputs, a collection stored in a static field that eventually explodes, ...).
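Here is a minimal sketch of that trick (the class Sneaky and the test-method name testNPlusOne are made up for illustration):

public class Sneaky {
  public int add(int a, int b) {
    // Misbehave only when called from one particular test method.
    for (StackTraceElement frame : new Exception().getStackTrace()) {
      if (frame.getMethodName().equals("testNPlusOne"))
        return 0;                    // deliberately wrong answer for that one test
    }
    return a + b;                    // correct answer for the first n tests
  }
}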

A few corollaries:
  1. Every test suite always has one more thing that it can check for.
  2. A bug that was not caught is not an indication of the test suite's lack of strength.
As you can see, my point here is that blindly growing your test suite is an almost futile effort. (Mathematically speaking, no matter how many tests you write, the total will still be just a tiny fraction of infinity.) Instead of going for sheer numbers, you should strive to make your suite "smarter". Here are three suggestions:
  • Don't treat the tested class as a black box. If you know nothing about the implementation, then you have to test each method separately. On the other hand, if you allow yourself to know that method f() merely calls g() with some default parameters, then you can dramatically reduce the amount of testing for f() (see the sketch after this list). In theory, the black-box approach to testing sounds like an excellent idea (you test the interface, not the implementation). The sad reality is that this approach is impractical - the number of tests you have to write is astronomical.
  • Prefer thin interfaces. If the public interface of your class has just one method, it can be tested much more effectively than a class with (say) two public methods.
  • Invest in high-level tests. Unit testing is great, but higher-level tests (aka functional or acceptance tests) examine larger parts of your code in real-world scenarios. Thus, if you feel your suite is not strong enough, I would recommend adding more high-level tests. The return on investment (that is: the increase in the strength of your suite) will be much bigger than with adding low-level tests.
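A minimal sketch of the first suggestion (the class Formatter and the parameter values are made up; f() and g() follow the text):

// If we know that f() merely delegates to g() with default arguments,
// it is enough to test g() thoroughly and add a single test showing
// that f() forwards to g() as expected.
public class Formatter {
  // f() from the text: delegates to g() with default parameters.
  public String f(String text) {
    return g(text, 80, true);
  }

  // g() from the text: the method that does the real work.
  public String g(String text, int width, boolean trim) {
    String s = trim ? text.trim() : text;
    return s.length() > width ? s.substring(0, width) : s;
  }
}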