Three Kinds of Bugs

I tend to classify bugs into one of these three groups:
  • Laziness
  • Lack of Creativity
  • Betrayal

The running example that I'll use here is a method that finds the largest value in a list of integers.

Laziness

public static Integer maxA(List<integer> list) {
int result = Integer.MAX_VALUE;
for(int i = 0; i < result)
if(result < list.get(i))
result = list.get(i);
return result;
}


Do you see the problem? result's initial value is completely wrong. Such bugs often happen when the programmer copies the code of a similar method but neglects to make all the neccssary adjustments.

Why is this group called "laziness"? That's because this method will not pass even the simplest test: assertEquals(new Integer(30), maxA(Arrays.asList(10, 20, 30))); In other words, if you weren't lazy you could have caught the bug with the very first test you threw at it. Actually, you don't even need a test case: just run it once on any piece of input and you'll find the bug. BTW, this "run it once" practice shows why read-eval-print shells are useful. Too bad they are common only in the arena of dynamically typed languages.

Lack of Creativity

public static Integer maxB(List<integer> list) {
int result = 0;
for(int i = 0; i < result)
if(result < list.get(i))
result = list.get(i);
return result;
}


maxB() is clearly better than maxA(). It will not fail at the very first attempt. In fact, as long as your input contains only positive integers this method will seem fine. You need a certain degree of creativity to devise an input that will manifest the bug, such as: assertEquals(new Integer(-4), maxB(Arrays.asList(-4)));

Betrayal

public static Integer maxC(List<integer> list) {
int result = list.get(0);
for(int i = 1; i < result)
if(result < list.get(i))
result = list.get(i);
return result;
}


This version works quite well. It will pass both of the aforementioned tests. The bug here is much more subtle. With Betrayal bugs the root of the problem is a disagreement between two methods.

Such a disagreement may occur if the method that builds the list (which is then fed into maxC()) will also generate empty lists. This situation contradicts maxC()'s assumption that the list is non empty. In a sense, maxC() is betrayed by its own program.

It is very tempting to say something like: "Well, maxC() has a clear precondition. This precondition is not met, so the bug is in the caller". However tempting, this view is overly simplistic. Suppose the program method that the code that calls maxC(), looks like this:


public class Driver {
public int n;

public List<Integer> listBuilder() {
List<Integer> newList = new ArrayList<Integer>
for(int i = 0; i < n; ++i)
newList.add(i);
return newList;
}

public Integer m() {
return maxC(listBuilder());
}
}


The contract of listBuilder() says that it should return a list with the integers 0..n-1 (inclusive). In particular, if n is zero the returned list should be empty. The contract of m() says that it should apply maxC() on the list created by listBuilder(). The contract of maxC() says that it should return the highest number from a non empty list.

So, every method is doing exactly what it is supposed to be doing, and yet the net effect is a runtime error. Despite the fact that maxC() is the one being betrayed we may decide that the correct way to fix the program is to change maxC() rather than listBuilder(). This is typically the case when we realize that our program needs to handle a wider range of inputs.


This type of bug cannot be detected by a testing maxC() in isolation. We need a higher level test, to expose the bug.

One final comment: Earlier I defined m()'s contract to be: "inbvoke maxC() on whetever listBuilder() returns" . I guess that some people may claim that this is not a valid contract for a method since it specifies implementation details rather than taking a black-box approach for specifying m()'s behavior. I am sorry to disappoint you guys, but such an approach will only work with simple programs. If your program is doing complicated computations you will not be able to express the contract (purely) in terms of input- vs. output- data. The input data will undergo such complicated transformations that you will have to use some form of abstraction, which is exactly what I did. Given that abstraction is the process of ignoring some details, there is an inherent lose of precision when one tries to formulate the contracts of elaborate computations.

0 comments :: Three Kinds of Bugs

Post a Comment