Crup

Lately I've been using the acronym C.R.U.P. as shorthand for "Code that is RedUndant in the Program". In this post I will explore some of the negative implications of crup. To reduce verbosity I will use the word "crup" instead of the acronym C.R.U.P. (Too bad the word crup has no meaning. I really wish it did.)

Program development can be thought of as a process where you repeatedly add new pieces of functionality. The implementation of each such piece may depend on existing functionality, but may also require the introduction of new functionality, which in turn may spawn further pieces. This model can be naturally represented as a tree where child nodes represent pieces of functionality that were introduced as part of the implementation of their parent. Note that a node can be a method, a class, a collaboration of classes, a module, whatever.

Redundant functionality (that is: a redundant node, crup) is a node that provides more services than its parent expects. In other words, you can replace the subtree rooted at this node with a smaller tree and the program as a whole will still deliver the behavior it was expected to deliver.
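To make the tree model concrete, here is a minimal Python sketch. The names (`Node`, `subtree_size`, `crup_count`) are mine, invented for illustration; the one modeling decision baked in is that a crup node drags its entire subtree with it, which matches the accounting used in the example below.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A piece of functionality; its children were introduced to implement it."""
    name: str
    is_crup: bool = False          # True if the parent never needed this node's services
    children: list = field(default_factory=list)

def subtree_size(node):
    """Number of nodes in the subtree rooted at `node`."""
    return 1 + sum(subtree_size(c) for c in node.children)

def crup_count(node):
    """Nodes that could be cut: a crup node's whole subtree goes with it."""
    if node.is_crup:
        return subtree_size(node)
    return sum(crup_count(c) for c in node.children)
```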

Programmers often introduce crup into the code-base, thinking that a certain piece of functionality will be needed in the future, and that it is less effort to introduce it sooner rather than later. This resonates with the famous YAGNI debate: the pro-YAGNI camp says that crup is really bad. The pro-crup camp says that by predicting the future behavior of a program you can ensure its structure is well optimized for the long run, and that no post-facto refactoring will be able to match that.

So, let's throw some numbers into the picture: each node has 5 children, and 1 in 5 children is a crup node. The tree has 4 levels (that is: a root node, five children, 25 grandchildren, 125 great-grandchildren). The total number of nodes is therefore 1+5+25+125 = 156.

How much crup do we have in the program? Let's see. The root is obviously not redundant. However, one of its children (level-2) is redundant along with its whole subtree. That's 1+5+25 = 31.

The other four nodes at level-2 are not crup, but each of them is a proud parent of a level-3 crup node: 4*(1+5) = 24. We are left with sixteen non-crup level-3 nodes. These nodes have eighty children, of which one fifth are crup: 16.

So in this program the total crup factor is: (31 + 24 + 16) / 156 = 71 / 156 ≈ 45%. In other words, almost half of the effort went into redundant functionality. That's a lot of extra work. The programmer must possess good future-prediction skills to make this extra work pay off.
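For the skeptical reader, the arithmetic is easy to check mechanically. Here is a standalone sketch (my own, under the same assumptions: every node has five children, exactly one of which is crup):

```python
def counts(levels):
    """Return (total, crup) node counts for a tree of the given height,
    where every node has five children, exactly one of them crup.
    A crup node's entire subtree counts as crup."""
    if levels == 0:
        return 0, 0
    total = crup = 0
    for child_is_crup in (True, False, False, False, False):
        t, c = counts(levels - 1)
        total += t
        crup += t if child_is_crup else c
    return total + 1, crup

total, crup = counts(4)
print(total, crup, f"{crup / total:.1%}")   # 156 71 45.5%
```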

The crup factor keeps rising with the height of the tree, since the non-crup share shrinks geometrically. If the tree had 5 levels, our crup factor would rise to 56%.
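Under the same assumptions there is a closed form to play with: a tree of height n has (5^n - 1)/4 nodes in total, of which (4^n - 1)/3 are non-crup (there are 4^k non-crup nodes at level k). A quick sketch:

```python
def crup_factor(levels):
    """Crup share of a tree where every node has five children, one of them crup."""
    total = (5 ** levels - 1) // 4       # 1 + 5 + 25 + ... nodes overall
    non_crup = (4 ** levels - 1) // 3    # 1 + 4 + 16 + ... non-crup nodes
    return (total - non_crup) / total

for n in range(2, 8):
    print(n, f"{crup_factor(n):.1%}")
# 2 16.7%, 3 32.3%, 4 45.5%, 5 56.3%, 6 65.1%, 7 72.0%
```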

This little post is by no means a scientific study. There are many other factors to consider if one wants to make a scientific claim regarding crup (which is kind of ironic, since empirical study of real-life software projects is a thorny issue to begin with). Here is one factor that mitigates the negative effects of crup: if a prediction regarding a crup node close to the root turns out to be correct, then the vast majority of the cruppy nodes in the program stop being crup.

On the other hand, other factors compound the crup effect. For instance, maintenance and development costs are probably super-linear with respect to program size (e.g.: program size doubles but costs quadruple). Thus, even a small amount of residual crup has a significant negative impact on the program's complexity.
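To see how this compounds, take the quadratic cost model above at face value (a back-of-the-envelope assumption, not a measured figure): with a 45% crup factor the program is 156/85 ≈ 1.8 times its crup-free size, so its costs would be roughly 1.8² ≈ 3.4 times higher.

```python
crup_factor = 71 / 156           # from the example above
bloat = 1 / (1 - crup_factor)    # actual size / crup-free size = 156/85
print(f"{bloat:.2f}x larger, {bloat ** 2:.1f}x costlier under quadratic costs")
# 1.84x larger, 3.4x costlier under quadratic costs
```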

As I said, I am not trying to make a scientific claim here. I just wanted to make the intuition behind YAGNI a bit more concrete. The aforementioned example, with its 45% crup factor, suggests that you could be much more productive if you just cut the crap.
