## Thursday, 14 April 2011

### Sucking the pips off

Around any randomisation machine, humans build rules which allow you to throw away information.  How do you analyse this mathematically?  Here, somewhat artificially, I'll constrain the analysis to dice.

Imagine one die with no distinguishing marks.  What's the information content?  Yes, 0.  You learn nothing new on each throw.  Mathematically, the information content is $I = -\sum_i{p_i \log_2(p_i)}$.  You can think of the unmarked die as having a single outcome with probability $p = 1$, so $I = -1 \times \log_2(1) = 0$.
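As a quick numeric check (a Python sketch; the `entropy` helper is just my own naming for the formula above):

```python
import math

def entropy(probs):
    """Shannon information content in bits: -sum(p * log2(p)), skipping zero terms."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# The unmarked die has one distinguishable outcome with probability 1:
print(entropy([1.0]))  # 0.0 bits -- each throw tells you nothing
```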

Now re-imagine the pips. This time $I = -6\times{\frac{1}{6} \log_2(\frac{1}{6})} \approx 2.6$ bits. You can still throw a pipped die and mentally decide to ignore the number which lands face up.  What have you done?  You've created in your head an equivalence class.  You've said to yourself, 'for my own purposes I will assume an outcome which is a combination of all six elementary outcomes'. Given the elementary outcomes are mutually exclusive, you can find the probability of your all-encompassing equivalence class as the sum of the elementary probabilities. Let $E_c$ be your all-encompassing equivalence class and $e_1$ be the elementary outcome of getting a 'one' face up, etc. Then $E_c = e_1 \cup e_2 \cup e_3 \cup e_4 \cup e_5 \cup e_6$ and, by the additivity axiom (one of the three basic axioms of probability), $P(E_c) = P(\bigcup_{i=1}^6 e_i) = \sum_{i=1}^6 P(e_i) = \frac{1}{6}\times 6 = 1$.
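Both calculations are easy to verify (a sketch; `fair_die` and `entropy` are my own names):

```python
import math
from fractions import Fraction

def entropy(probs):
    """Shannon information content in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Six equiprobable faces: I = log2(6), roughly 2.585 bits
fair_die = [Fraction(1, 6)] * 6
print(entropy(fair_die))

# E_c = e_1 ∪ ... ∪ e_6: the faces are mutually exclusive,
# so the probabilities of the elementary outcomes simply add.
print(sum(fair_die))  # 1
```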

So, just by imagining it, you can turn off your randomisation machine.  The same trick can be used to turn your randomisation machine into a coin-flipper, which, as you can guess, provides just $1$ bit of information.  Just imagine two elementary outcomes: even-numbered pips and odd-numbered pips.  So what you have is a randomisation machine which offers you a maximal amount of information.  The rules of your game, your context, determine how you might want to throw some of that information away for the purposes of your game.  You've combined elementary outcomes.  So one die can deliver a uniform distribution of 6 events of probability $\frac{1}{6}$, or a coin flip.  You can see how you could imagine randomisation machines giving two unbalanced equivalence classes, of probability $\left\{ \frac{1}{6}, \frac{5}{6}\right \}$ with $I \approx 0.65$ bits, or $\left\{ \frac{1}{3}, \frac{2}{3}\right \}$ with $I \approx 0.92$ bits.  You could choose to implement these in a number of different ways.  For example, you could implement the $\left\{ \frac{1}{6}, \frac{5}{6}\right \}$ split by imagining 'one pip' to be the first of your equivalence classes and 'either 2 or 3 or 4 or 5 or 6 pips' to be your second.  $E_1$ and $E_{2\cup3\cup4\cup5\cup6}$, if you will.  But just as good a job could be achieved by $E_2$ and $E_{1\cup3\cup4\cup5\cup6}$, etc.
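The coarse-graining itself can be sketched directly: merge elementary probabilities into classes, then take the entropy of the merged distribution (the `coarse_grain` helper is my own construction, with faces indexed from 0):

```python
import math

def entropy(probs):
    """Shannon information content in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def coarse_grain(elementary, classes):
    """Merge elementary outcome probabilities into equivalence classes.
    `classes` is a list of index sets; mutually exclusive probabilities add."""
    return [sum(elementary[i] for i in cls) for cls in classes]

die = [1 / 6] * 6

# Coin flip: even pips vs odd pips -> 1 bit
print(entropy(coarse_grain(die, [{0, 2, 4}, {1, 3, 5}])))  # 1.0

# Unbalanced split E_1 vs E_{2∪3∪4∪5∪6} -> about 0.65 bits
print(entropy(coarse_grain(die, [{0}, {1, 2, 3, 4, 5}])))
```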

Equivalence classes are a mathematically formal way of expressing the idea of treating one or more possibilities as coming to the same thing for your current purposes.  The choice of classes maps out a geography of interest, and that geography is constrained only by the granularity of the maximal information state of your randomisation machine (playing cards, for example, are more fine-grained, since you can have $52$ distinct elementary outcomes).
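The finer granularity of cards shows up directly in the entropy (a sketch using the same helper as before):

```python
import math

def entropy(probs):
    """Shannon information content in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A single draw from a shuffled deck has 52 equiprobable elementary outcomes,
# so its maximal information state is finer-grained than a die's 2.585 bits:
print(entropy([1 / 52] * 52))  # log2(52), roughly 5.7 bits
```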

In the next post, I'll look at how to interpret multiple repeats of the die-tossing experiment, but I'll end by pointing out that, from an analytical point of view, it doesn't matter whether you consider multiple repeats as happening simultaneously (you roll two differently coloured dice) or serially (you roll the white die and note the result, then roll the blue die and note the result), as long as you are consistent about which of the two parallel-rolled dice you report first.  Since these two dice outcomes are genuinely independent, I'll also show how the informational additivity of independent random events works mathematically.  This leads in to considerations about becoming indifferent to order or retaining order (combinations and permutations respectively).
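The additivity claim can be previewed numerically: build the joint distribution of two independent dice by multiplying probabilities, and its entropy comes out as the sum of the individual entropies (a sketch; names are my own):

```python
import math
from itertools import product

def entropy(probs):
    """Shannon information content in bits: -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

die = [1 / 6] * 6

# Joint distribution of the white and blue dice: P(i, j) = P(i) * P(j),
# since the two rolls are genuinely independent.
joint = [p * q for p, q in product(die, die)]

# Information adds for independent events: H(white, blue) = H(white) + H(blue)
print(entropy(joint))    # log2(36), roughly 5.17 bits
print(2 * entropy(die))  # the same value, up to float rounding
```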

This reminds me of the Samuel Beckett sucking stones extract from Molloy.