Expression equivalence, as far as a numbers solver cares
When I wrote the recapping program for Innis and Mark's Comic Relief Countdown marathon, I wrote a numbers solver for it. Writing one of these things makes you think about whether two expressions are equivalent or not.
In principle, you have to generate every possible sub-list of the six numbers, then for each sub-list, generate every arithmetical expression that can be made using all numbers in the sub-list, evaluate all the expressions, and return the one closest to the target. This takes an enormous amount of time, though, so you have to use some short-cuts. Some of them are obvious. For example, when generating expressions, don't generate the expression A - B if A is less than B, don't generate A / B if A is not divisible by B, don't bother generating multiplications or divisions if one of the operands is 1 (unless your solver is running in Platt mode), and so on. However, there are still a huge number of expressions you have to generate - the number of expressions possible from a list of numbers increases exponentially as you add more numbers to the list.
It turns out that finding a good way of deciding that two expressions are equivalent is important when writing a numbers solver. When it generates an expression, it can check whether it's equivalent to any existing expression it already has, and ignore it if so. You might think this laborious checking would slow the program down, but in the long run it speeds it up. The more expressions that can be eliminated, especially when generating expression lists for the smaller lists of numbers which are then used as the basis for forming larger expressions, the less fuel there is for the combinatorial explosion of expressions that occur when you add more and more numbers.
In fact, the minimum a solver needs to do to decide whether two expressions - let's call them E1 and E2 - are equivalent, is to ask:
- Do both expressions come to the same total?
- Do both expressions use the same numbers?
If the answer to both of those questions is yes, then you can forget about E2, since E1 is just as useful to you in building larger expressions. As an example, if your program already has (10 - 3) * 5 = 35, then it doesn't also need (10 * 3) + 5 = 35. For the Countdown numbers game, the former is as useful as the latter in any situation. So forget about (10 * 3) + 5. It's a waste of time plugging both of them into every four-, five- and six-number expression you generate.
Expression equivalence as far as everyone else cares
The above definition of expression equality is useful when writing a solver which just needs to decide what the closest possible to the target is and come up with one example solution. However, it doesn't really work as an intuitive definition of equality for expressions. We wouldn't say that (10 - 3) * 5 and (10 * 3) + 5 are equivalent expressions, even though they're of equal use in all situations in the numbers game. And we certainly wouldn't say that, for example, (100 * 3) + (7 * 8) + 6 + 4 and (100 - 8) * 4 - (3 - (7 - 6)) are the same expression. Also, if you wanted your solver to come up with every solution, sans any rearrangements of earlier solutions, this isn't enough.
An ideal definition of equivalence
Handily, there's already a mathematical definition of two statements being
equivalent. There's even a special symbol, like the equals symbol but with a third horizontal line, to indicate equivalence. We might say that A + B = C for certain values of A, B and C, but something like A + B = B + A is always true, no matter what the values of A and B are, so we say A+B is equivalent to B+A.
For Countdown purposes, the ideal definition of equivalence for two expressions E1 and E2 might go like this. First, we've got the basic conditions that the two expressions have got to come to the same total and use the same numbers to be considered for equivalence. Suppose we replaced every distinct number in expression E1 with a symbol. We'll use the letters A to F, as we've got at most six distinct numbers. Then we also replace the same numbers in E2 with the same symbols. E1 and E2 are equivalent if they always come to the same total regardless of the values we substitute for the letters A-F. (Incidentally, in this hypothetical substitution we can't just pick any values - it still has to make a legal expression for Countdown. So no clever-clogs antics like dividing by zero, subtracting a larger number from a smaller one, dividing 6 by 4, etc).
So if E1 were 5*6+7, and E2 were 7+6*5, we could write them as E1 = A*B+C and E2 = C+B*A. Whatever the values of A, B and C, E1 = E2. So they're equivalent.
If E1 were (10-3)*5, and E2 were (10*3)+5, then E1 = (A-B)*C and E2 = (A*B)+C. If we say that A is 3, B is 2 and C is 1 then E1 does not equal E2, so they're not equivalent.
Note that this definition has a couple of quirks - some very similar expressions are not considered equivalent, such as 2 * 2 and 2 + 2. Furthermore, two different-looking expressions such as ((24/(6/2)) * (7 - (9-8))) / 3 and 2 * 24 * (7 + 8 - 9) / (3 * 6) are considered equivalent. But for the purposes of a solver which has been tasked to come up with all solutions for a numbers round, deciding whether to tell you about (5*100)/(6/3) if it's already told you about (5*100*3)/6, it could get away with saying "meh, that's basically the same solution as the other one".
Working out whether two expressions are equivalent
But how do we tell whether two expressions are equivalent? I've come up with a method of "normalising" an arithmetical expression. For each set of expressions that are all equivalent to each other, there should be one defined way of writing that expression. Once you've normalised your two expressions in this way, you can tell whether they're equivalent just by comparing them. In a computer program you'd compare the
expression trees, but you can equivalently think of it as just comparing the textual representation of the expression character by character.
The goal of the normalisation process is that expressions which only differ by rearrangement of their terms or factors should be considered identical.
Writing the expression in a standard form
Every expression that isn't just a number can either be thought of as:
- The sum of a series of expressions, possibly with the sum of another series of expressions subtracted, or
- The product of a series of expressions, possibly divided by the product of another series of expressions.
We're going to try to write every expression like this. So let's say we have the expression:
4 + 5*(1+6) - 10 + 6/2 - (7 - 6)
This is just the sum of 4, 5*(1+6), 6/2 and 6, minus the sum of 10 and 7. So we'll write it like this:
(4 + 5*(1+6) + 6/2 + 6) - (10 + 7)
Analogous rules apply if our expression is a product of a series of expressions, possibly with divisions, such as (6 / (10 / 5)) * (4+1) * (5*2 + 18) / 14. This might be written as (6 * 5 * (4+1) * (5*2 + 18)) / (10 * 14). It's the product of a series of expressions, divided by the product of another series of expressions.
Ordering of terms/factors
At this point, we need to define an order to write the summed (or multiplied) elements in. It doesn't matter how you define the order, as long as your definition can consistently say, for two differing expressions, which one should come first. Besides that, the nitty-gritty of this isn't particularly important, but if you care, I decided on this:
- If one expression has a higher value than the other, it goes on the left.
- Otherwise, compare the top-level operators of each of the two expressions (the "top-level" operator is the one you evaluate last, so in (3+4)*5 it's the multiplication sign). The operator with the highest rank (*, /, +, -) goes on the left. A bare number has the lowest rank. So (6 * 7) + (84 / 2) is the correct order, not (84 / 2) + (6 * 7).
- If it's still a tie, the expression with more numbers in it goes on the left.
- If it's still a tie, then look at the leftmost sub-expression in each of the two expressions, and compare them according to these rules. If they tie, look at the second sub-expression from the left in each expression, and so on.
- If it's still a tie after doing all that, then (I hope) you've just got two identical expressions, so it doesn't matter which way round you put them.
This is mainly for addition and multiplication, because for subtraction and division you can't just switch the order of the operands around. Well, you can if you're doing X-X to make 0 or X/X to make 1, but you never need to use 0 so subtraction isn't an issue. If you're dividing an expression by an equivalent-valued expression to make a 1, which is often useful, we'll say that we just apply the rules to each expression and put the higher-ranking one on the left as normal (or top, if you prefer), so it's (75+25)/100 and not 100/(75+25).
We'll have to apply the rule to all sub-expressions as well, and their sub-expressions recursively, so if we have our partly normalised expression as follows (the brackets around the first set of sums aren't strictly necessary, but they're to illustrate that we're doing the sum of a bunch of expressions minus the sum of another bunch):
(4 + 5*(1+6) + 6/2 + 6) - (10 + 7)
We'll rearrange the terms in the expressions on each side of the minus sign and make this:
(5*(6+1) + 6 + 4 + 6/2) - (10 + 7)
Our "products divided by products" expression:
(6 * 5 * (4+1) * (5*2 + 18)) / (10 * 14)
Would be re-ordered like this:
((18 + (5*2)) * 6 * (4+1) * 5) / (14 * 10)
Now we've got a set of rules for defining a normalised arithmetic expression. Any expression can be reduced to this normalised form, and an expression can only be normalised in one way. Then all you have to do is compare the two expressions, removing any redundant brackets before you do so. Having applied this to the solver I wrote earlier in the year, it seems to work, and when told to find all the solutions for a numbers game, I haven't yet seen it give two solutions that are just rearrangements of each other.
What I don't know is whether this method of normalising expressions always produces the same expression from two different expressions which are equivalent. In other words, given any two expressions using the symbols A-F in place of numbers, which are equivalent according to our ideal definition, will they always normalise to the same expression or have I missed something? Maybe it's possible to prove it, or maybe there's a glaring error I haven't spotted? Not sure. How should we treat no-ops? A solver probably wouldn't bother generating things like "multiply by 1", so it's not an issue for solvers, but if contestant 1 said "I did (100+6+5)*7" and contestant 2 said "I did (100+6+5)*7*1" should we treat those as equivalent? Our definition says no, because the two expressions don't use the same numbers, but C2 would probably be asked to show their paper.
I would try to answer some of these, but I've run out of day and this post is too long as it is. So, um, yeah. I'll just leave it for Gevin to pick holes in.
*wanders off just leaving this post sitting here*