Is there a criterion for distinguishing well-written (or well-designed) programs from ill-written ones?
Here is what may very well be the best definition for quality in software:
A well-written program is a program where the cost of implementing a feature is constant throughout the program's lifetime.
The rationale is simple: if it gradually takes more and more time to add a feature to a program, you will ultimately reach a point where it is simply too expensive to change the code. This is plain economics: past that point it will not be worthwhile to add even the tiniest feature, and the project will practically stall.
This was the intuition. To make things more precise, let me explain the basic terminology:
- Cost: time needed for programming. In this post I will speak about the total cost of the program and about the cost per feature.
- Feature: A change in the external behavior of the program whose completion can be determined objectively. For the purposes of this discussion there is no need to distinguish between adding something new, changing something old, or fixing something that is broken. We will collectively refer to these activities as "implementing a feature". This definition coincides with the definition of a story in the Extreme Programming world.
The last term that needs to be precisely defined is "constant". Providing a good definition for this term is hard because development costs tend to fluctuate, so we need a definition that can tolerate fluctuations. In short, the challenge boils down to this: if it takes 10 hours to implement feature #1 and 12 hours to implement feature #2, does this mean the cost is not constant?
The best way to abstract away these natural fluctuations in cost is to use the big-O notation that is typically used for expressing the performance of algorithms. Using this notation we obtain the following precise definition of software quality:
A program is considered to be well-written if its total cost function behaves like O(n), where n is the number of features in the program.
Note that a total cost of O(n) reflects an O(1) cost per feature (amortized), so we can rewrite this definition in terms of a per-feature cost.
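To make the amortization step explicit, here is a short derivation (a sketch; I am introducing c_i for the cost of implementing the i-th feature, T(n) for the total cost of a program with n features, and K for some constant bound):

```latex
T(n) = \sum_{i=1}^{n} c_i \le K \cdot n
\quad\Longrightarrow\quad
\frac{T(n)}{n} \le K = O(1)
```

Individual features may still cost 10 hours or 12 hours each; only the average cost per feature has to stay bounded as n grows.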
Why do I like this approach to quality? Four reasons:
First, it can be objectively measured.
Second, it is normalized: this approach establishes a standard scale by which programs can be measured and compared. For example, a total cost of O(n^3) is certainly worse than O(n^2), and a total cost of O(n log n) is usually a close enough approximation of O(n). In short, we can quantify how far we are from the "good" end of the scale.
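To see what it means to be far from that end, here is a sketch of what a quadratic total cost implies for individual features (same notation as above: T(n) is the total cost, c_n the cost of the n-th feature, K a constant):

```latex
T(n) = K \cdot n^2
\quad\Longrightarrow\quad
c_n = T(n) - T(n-1) = K \cdot (2n - 1) = O(n)
```

The cost of the n-th feature grows linearly with n, so every feature is more expensive than the one before it; that is exactly the gradual stall described at the beginning of this post.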
Third, this measure of quality captures the bottom line: value delivered to clients. Many software metrics capture artifacts of the development process (hours worked, lines of code, classes written, methods per class, etc.). Such metrics only testify to the quality of the process. They do not testify to the quality of the outcome. In order for these metrics to be useful, one would have to show that the measured artifact is correlated with properties of the outcome. This rarely happens. The definition presented here largely ignores the process. It cuts to the chase and measures the outcome.
Finally, this approach acknowledges the fact that a program is a complex system. It does so by measuring the program as a whole. It is difficult (or even impossible) to estimate the quality of a program from the quality of its individual sub-parts. Here are a few examples to illustrate the pitfalls of measuring software quality in pieces:
- Example #1: Suppose I have a piece of code that, when examined in isolation, turns out to be really crappy. Somewhere in the program there is an excellent suite of tests that covers this piece from every possible angle. This suite makes the piece much more manageable than a similar well-written piece that has no tests.
- Example #2: Sometimes bad code is located in unimportant modules. Imagine a program that performs sophisticated analysis on data coming from CSV (comma-separated values) files. The input module of this program reads the input file and populates an in-memory matrix. Even if this input module is badly written, its effect on the overall quality of the program is minimal: at any point I can replace the implementation of the input module, because it has a crisp interface (said matrix) that isolates it from the rest of the code (a minimal sketch of such an interface appears right after this list).
- Example #3: The problem that my program needs to solve can be nicely expressed as a Domain Specific Language (DSL). Thus, I define a DSL (an external DSL, to be precise), write an interpreter for this DSL, and then implement the solution to the problem by writing a program in that DSL. If the DSL is a good DSL (that is, solutions to domain problems can be easily expressed in it), the quality of the whole program is high regardless of the quality of the interpreter (a toy interpreter sketch also follows this list).
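To make Example #2 concrete, here is a minimal sketch of the kind of crisp interface I have in mind. The names (load_matrix, the file layout) are hypothetical; the point is only that everything behind this one function can be rewritten without touching the analysis code:

```python
import csv
from typing import List

def load_matrix(path: str) -> List[List[float]]:
    """Read a CSV file and return its contents as an in-memory matrix.

    This function is the entire interface the rest of the program sees.
    However messy the parsing code becomes, it can be replaced wholesale
    as long as it keeps returning the same kind of matrix.
    """
    with open(path, newline="") as f:
        return [[float(cell) for cell in row] for row in csv.reader(f)]
```

And a toy illustration of Example #3: a tiny external DSL plus an interpreter for it. The DSL itself (one command per line, either `add <number>` or `scale <number>`) is invented for this sketch; the claim is only that when domain problems read naturally in the DSL, the internal quality of the interpreter matters much less:

```python
def interpret(program: str, value: float = 0.0) -> float:
    """Run a tiny arithmetic DSL: each non-empty line is 'add X' or 'scale X'."""
    for line in program.splitlines():
        line = line.strip()
        if not line:
            continue
        command, argument = line.split()
        if command == "add":
            value += float(argument)
        elif command == "scale":
            value *= float(argument)
        else:
            raise ValueError(f"unknown command: {command}")
    return value

# The "domain program" is written in the DSL rather than in the host language:
print(interpret("add 10\nscale 3\nadd 2"))  # 32.0
```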
The definition of quality presented here is quite close to the notion of RTF (Running Tested Features) as coined by Ron Jeffries. However, the definition here is a bit more relaxed: it does not explicitly require tested features. There is no need to, because constant delivery of features requires constant refactoring (you can't plan everything in advance), and constant refactoring requires testing; otherwise you will not be able to detect incorrect refactoring steps, and you'll end up introducing bugs into the program, which will sooner or later slow you down.