An Arithemetic Expressions Solver in 64 Lines

A few months ago Martin Fowler published a post titled Parser Fear. As always (well, almost always) I found myself agreeing with his main point: people are often intimidated by parsers.

Fowler's post advocates the use parser generators (in particular, ANTLR). I'd like to go a little further and make the following points:
  • Writing a recursive decent parser is much easier than you think.
  • You don't need a parser generator to overcome your parser fear.
  • Get over the trauma induced by the compiler class you took.

Actually I only need to show a good argument for point #1. The other two points naturally follow.
I will make my case by using the well established technique of proof by example: A 64 lines Java class that evaluates arithmetic expressions supporting +, -, *, /, negation, and parenthesis.

public class CompactSolver {
private String input;

public CompactSolver(String input_) {
input = input_.trim();
}

private char peek(int offset) {
return offset >= input.length() ? '\0' :
input.charAt(offset);
}

private boolean consumeIf(char c) {
if (peek(0) != c)
return false;
consume(1);
return true;
}

private String consume(int numChars) {
if (numChars == 0)
throw new RuntimeException("syntax error");
String result = input.substring(0, numChars);
input = input.substring(numChars).trim();
return result;
}

public double eval() {
double lhs = mul();
if (consumeIf('+'))
return lhs + eval();
else if (consumeIf('-'))
return lhs - eval();
return lhs;
}

private double mul() {
double lhs = unary();
if (consumeIf('*'))
return lhs * mul();
else if (consumeIf('/'))
return lhs / mul();
return lhs;
}

private double unary() {
if (consumeIf('-'))
return -unary();

if (consumeIf('(')) {
double result = eval();
if (!consumeIf(')'))
throw new RuntimeException("Missing ')'");
return result;
}

return literal();
}

private double literal() {
for (int i = 0; true; ++i)
if (!Character.isDigit(peek(i)))
return Integer.parseInt(consume(i));
}
}



Update 1: Changed the precedence of the (unary) negation operator (Many thanks to Orlando).

3 comments :: An Arithemetic Expressions Solver in 64 Lines

  1. Nice post.

    I noticed that the expressions are evaluated in a different way to common interpretations. The binary operators are right associative instead of left and the unary negation operator has the lowest precedence instead of the highest.

    e.g.
    2-3*4+20 gives -30.0 instead of 10.0
    -2-3*4+20 gives 30.0 instead of 6.0

    Here's how I'd write the parser using Waxeye. I wrote evaluators for Ruby and Java and they ended up being about the same size as your parser. However, things really start to pay off once the language gets bigger.

    calc <- ws sum

    sum <- prod *([+-] ws prod)

    prod <- unary *([*/] ws unary)

    unary <= '-' ws unary | :'(' ws sum :')' ws | num

    num <- +[0-9] ws

    ws <: *[ \t\n\r]

  2. Hi Orlando,

    My main point is that parsing (recursive decent or otherwise) is simple. I am not preaching against parser generators...

    Anyway, thanks for the comment. I fixed the precedence of the negation operator. Left-associativity is also not very complicated: the penalty should be about 5-6 extra lines of code.

  3. Care to share the 5-6 extra lines of code?

Post a Comment