Simulation of correlated variables

Imagine you want to generate a long sequence of outcomes (x,y) from the probability distribution

 (1-f)/2 f/2 f/2 (1-f)/2
For some reason (perhaps to do with cryptography), you want to generate the outcomes using three sources of random bits: one source at a central location, r; and two independent sources at locations x and y. So we implement the correlated distribution with a latent variable, thus:
```P(x,y) = sum_r P(r) P(x|r) P(y|r)
```
where r is a sequence of discrete random variables with probability distribution P(r) of your own devising, and P(x|r) and P(y|r) are probability distributions for the sequences x and y given the sequence r. Random bits are gobbled up at some rate by each of these distributions. The aim of the puzzle is to choose P(r), P(x|r), and P(y|r) such that the number of bits used to generate r, H(R), is minimized.

An example solution that achieves H(R)=1 bit per outcome is P(r=0/1) = { 1/2, 1/2 }; P(x|r) = delta( x=r ) ; P(y|r) = Flip r with probability f. This is a simple solution in which the strings {r,x,y} are generated one bit at a time, i.i.d. You're free to generate them in blocks if you want, and r can live in any discrete alphabet you want.

Information theory says you can't have H(R) < I(X;Y) = 1-H_2(f). How small can you get H(R)?

This puzzle set by Jonathan Oppenheim.

SPOILER WARNING

SPOILER WARNING

SPOILER WARNING

SPOILER WARNING

SPOILER WARNING

Solution: I don't know the optimal solution. I have a solution that achieves H(R) = H_2( .5/(1-f) )

David MacKay