The data processing theorem states that data processing can only destroy information.
Prove this by considering an ensemble WDR in which w is the
state of the world, d is data gathered, and r is the
processed data, so that these three variables form
a Markov chain
$$ w \rightarrow d \rightarrow r, $$
that is, the probability P(w,d,r) can be written
as
$$ P(w,d,r) = P(w)\,P(d \mid w)\,P(r \mid d). $$
Show that the information that R conveys about W, H(W;R),
is less than or equal to the information that D conveys about W,
H(W;D).
Incidentally, this theorem is as much a caution about our definition
of 'information' as it is a caution about data processing!
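
Before attempting the proof, it can help to check the inequality H(W;R) <= H(W;D) numerically. The following is a minimal sketch, not a proof: it assumes small discrete alphabets and randomly drawn distributions, and the function names (`random_dist`, `mutual_information`, `chain_informations`) are hypothetical helpers introduced only for this illustration.

```python
import math
import random


def random_dist(n, rng):
    """Return a random probability vector of length n."""
    xs = [rng.random() + 1e-9 for _ in range(n)]
    total = sum(xs)
    return [x / total for x in xs]


def mutual_information(joint):
    """Mutual information in bits of a joint distribution joint[x][y]."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:  # terms with p = 0 contribute nothing
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi


def chain_informations(p_w, p_d_given_w, p_r_given_d):
    """Return (H(W;D), H(W;R)) for the chain w -> d -> r.

    p_w[w] is P(w), p_d_given_w[w][d] is P(d|w), p_r_given_d[d][r] is P(r|d).
    """
    n_w = len(p_w)
    n_d = len(p_r_given_d)
    n_r = len(p_r_given_d[0])
    # P(w,d) = P(w) P(d|w)
    joint_wd = [[p_w[w] * p_d_given_w[w][d] for d in range(n_d)]
                for w in range(n_w)]
    # P(w,r) = sum_d P(w,d) P(r|d), which uses the chain assumption
    joint_wr = [[sum(joint_wd[w][d] * p_r_given_d[d][r] for d in range(n_d))
                 for r in range(n_r)]
                for w in range(n_w)]
    return mutual_information(joint_wd), mutual_information(joint_wr)


# Assumed setup: 3 world states, 4 data values, 3 processed values.
rng = random.Random(0)
for trial in range(1000):
    p_w = random_dist(3, rng)
    p_d_given_w = [random_dist(4, rng) for _ in range(3)]
    p_r_given_d = [random_dist(3, rng) for _ in range(4)]
    info_wd, info_wr = chain_informations(p_w, p_d_given_w, p_r_given_d)
    # Small tolerance for floating-point rounding only.
    assert info_wr <= info_wd + 1e-12
```

Two limiting cases are worth trying by hand: if P(r|d) is a deterministic one-to-one map, the two quantities are equal, and if r ignores d entirely, H(W;R) is zero.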