Trees: Expression Trees

    Trees: Expression Trees

      Your calculator doesn't have very many ways of expressing itself. Sure, there's 1337 speak, and maybe, if it's both patient and desperate, it could try commandeering your game of Tetris to spell something out, but for the most part it doesn't have many modes of expressing itself.

      Except for, of course, what it was built to do: calculate things.

      Ignoring the totally adorable idea of a rom com where a calculator falls in love with a human (we've seen weirder), most calculators in the real world like to bottle up their feelings and just do calculations for you as their way of expressing themselves.

      Turns out that you and the calculator might not even be speaking the same language, though, and it's all because they use trees to store values. Expression trees.

      Even simple math like 2 + 2 on a calculator from a few decades ago becomes pure gibberish. That calculator only accepts input in Reverse Polish Notation, where all the operators have to go at the end, turning 2 + 2 into 2 2 +.

      No big deal, right? That's easy to figure out, but let's say you've got a big old math problem like (((4 + 4) - 7) × 9) ÷ 3. Turn it into 4 4 + 7 – 9 3 ÷ and you lose your place as soon as you start.

      Don't start hyperventilating about all the things your calculator wanted to say but couldn't. As it turns out, you can turn this math expression into a tree—called an expression tree—where each node represents a different number or operator. All of the leaf nodes will be the numbers; the internal nodes will be your standard operators, like +, -, ×, and ÷.

      Start at the left-hand side and work to the right side, building the tree node by node. The first three nodes are 4, +, and 4. Since we're making any operator an internal node, + becomes the parent of 4 and 4, which tells the calculator that you add 4 and 4 to get the answer for this subtree. That way, 4 + 4 looks like this:


       

      Next, add nodes for the minus sign and 7. Since 4 + 4 happens before subtracting 7, – will become the parent of + and 7.


       

      So far, we just have (4 + 4) – 7. Now we need to multiply everything by 9.


       

      Almost done. All we need to do is divide everything by 3.


       

      Expression, meet binary tree. Sure, the tree's a teensy bit on the unbalanced side, but balance isn't much of a problem in this case. We (our calculator, more accurately) just need to be able to traverse the tree to get an answer and chances are good that the expression we build won't be in the tens of hundreds of numbers.

      If you traverse the tree using the in-order traversal, you'll get back the same expression you started with (minus all those parentheses): 4 + 4 – 7 × 9 ÷ 3. To mathematicians, this form of expressing math is called infix notation, since the operators are inside the expression.

      People love infix notation: it just makes sense to us. Historically, though, calculators have a hard time figuring out how things go. Since there aren't any parentheses, for example, it gets pretty tricky for the calculator to decide whether to go with the multiplication or the addition first. If it follows the PEMDAS order of operations, it's going to multiply and divide first, which gives the wrong answer this time.

      Nope, that tree traversal isn't too helpful.

      Let's try a different route. Using the pre-order traversal puts every operation before the numbers it acts on ending up with something like this:

      + –  × ÷ 4 4 7 9 3.

      This notation has two names: Polish notation (because it was invented by a Polish mathematician) and prefix notation (because…prefixes).

      If the calculator had to solve this, it would search through the list until it found an operator followed by two numbers. It would then apply the operator to the numbers and include the result in the list of numbers. On this particular list of numbers and operators, the calculator would evaluate

      1. 4 + 4 = 8 
      2. 8 – 7 = 1
      3. 1 × 9 = 9
      4. 9 ÷ 3 = 3

      Unlike infix notation, parentheses aren't needed to make sure the right operations go first.

      That's cool, but we've been boasting on and on about Reverse Polish Notation. Is it because we love how esoteric it sounds? Do we have a thing for three letter acronyms? Does including the word "Reverse" make it sound like we moonlight as competitive Cha Cha Slide dancers?

      Not really (but if you know anyone looking for Cha Cha Slide dancers, we have a great video for you to send along). Reverse Polish Notation is easier for English-speaking people to read, since the operators start from the left and move right the way that you read things in English. (Source)

      All you need is a post-order traversal of the list, which means RPN also known as—wait for it—postfix notation.

      If we roll with RPN for this expression, we'll get:

      4 4 + 7 – 9 × 3 ÷

      Expressions in RPN were (and still are) really easy for calculators to understand and solve, at least compared to infix notation. Humans have a little more trouble, but calculator engineers in the 60s and 70s thought making everyone learn RPN was much easier than trying to get the computer to understand how infixing worked.

      Yup, it's that hard.

      RPN works by taking the first two numbers encountered and using the first operation after them to manipulate those numbers, which is basically the opposite of PN. Just like with PN, we don’t need parentheses to resolve potential mathematical ambiguity.

      Conveniently for both us and the calculators, the expression trees made from the infix, prefix, and postfix versions of an expression will always look exactly the same, which can make it easier for us to communicate with our calculators.

      Unless trying to spell out commands in your game of Tetris is your thing. You do you.

      (Source)