Better living through probability generating functions

The straightforward way to calculate the mean value, mean squared value, or any other moment of a distribution is by taking a weighted average:

\[\begin{align} \langle x \rangle &= \sum\limits_i x_ip(x_i) \\ \vdots \\ \langle x^m \rangle &= \sum\limits_i x_i^mp(x_i) \end{align} \]

For example, in the voting model, the mean value of $\hat{A}$ is given formally by

$\begin{aligned} \langle\hat{A}\rangle &= 0\cdot p(0)+1\cdot p(1)+2\cdot p(2)+\ldots+N\cdot p(N) \\ &= \sum\limits_m m\cdot p(m) \end{aligned}$

where $p(m)$ is given by the binomial distribution: $\binom{N}{m}p_A^m\left(1-p_A\right)^{N-m}$ .

As we expect, the mean value of $\hat{B}$ is

$\begin{aligned} \langle\hat{B}\rangle &= \sum\limits_m (N-m)\cdot p(m) \\ &= N - \langle \hat{A} \rangle \end{aligned}$

In principle, we can extract anything we'd like to know from a distribution using similar series. In general, however, these sums can get cumbersome and quite unfriendly. It would be preferred if we could find a way to bypass their use altogether, if we could find an equivalent representation that reveals its properties by simpler means.

A similar problem arises in statistical mechanics where we have a series of states, characterized by some energies $E_i$ , for which we'd like to know the likelihood of finding a system in any one of those states. In principle we have to calculate some dreadful weighted sum.

However, it is often possible to express the probability distribution in a compact form known as the partition function $Z(E_i)$ and obtain various bulk average by means of simple operations on $Z$ , rather than wading through the murky waters of series manipulation. Can we do something similar here?

Let's consider the object $\displaystyle Z = \left(p_A + p_B\right)^N$ . On its face this may not seem like much of anything. For one thing, it is always equal to 1.

If we expand it, we see that it is a sum of terms, each one corresponding to a $p(m)$ in the binomial distribution, i.e.

$\left(p_A + p_B\right)^N = \sum\limits_{m=0}^{N}\binom{N}{m}p_A^mp_B^{N-m}$

This is a good sign, we have found a concise expression that reproduces the binomial distribution. It would be great if we could implant something that allows us to easily transform the compact expression from distribution, to mean value, to mean squared value, etc.

Right now, we have an expression for $\displaystyle\sum\limits_{m=0}^{N}\binom{N}{m}p_A^mp_B^{N-m}$ while we'd like one for $\displaystyle\sum\limits_{m=0}^{N}m\binom{N}{m}p_A^mp_B^{N-m}$ .

Consider $\left(p_Az+p_B\right)^N$ . Upon expanding, we find that this is

$f(z) = \displaystyle\sum\limits_{m=0}^{N}z^m\binom{N}{m}p_A^mp_B^{N-m}$

and in particular $f(1) = \displaystyle\sum\limits_{m=0}^{N}\binom{N}{m}p_A^mp_B^{N-m}$ .

Cleaning things up a bit, we have $f(z) = \displaystyle\sum\limits_{m=0}^{N}z^mp(m)$

Now, if we take the derivative of $f$ with respect to $z$ , we can bring down a factor of $m$ in each term of the sum, i.e.

$f'(z) = \displaystyle\sum\limits_{m=0}^{N}mz^{m-1}p(m)$

and, by setting $z=1$ , we obtain the mean value of the distribution $f'(1) = \displaystyle\sum\limits_{m=0}^{N}m\cdot p(m) = \langle m \rangle$

This is amazing. Rather than summing the series by hand, manipulating the terms, applying identities, etc. we perform a single derivative and find $f'(1) = \langle m \rangle$ .

It is natural to then ask what happens if we take the second derivative. Let us see!

$\begin{aligned} f''(1) &= \sum\limits m(m-1)z^{m-2}p(m)\bigg|_{z=1} \\ &= \left[\displaystyle\sum m^2z^{m-2}p(m) - \displaystyle\sum mz^{m-2}p(m)\right]\bigg|_{z=1} \\ &= \langle m^2 \rangle - \langle m \rangle \end{aligned}$

Then, we have $\langle m^2 \rangle = f''(1) + f'(1)$

Alternatively,

$\begin{aligned} \sigma^2(m) &= \langle m^2 \rangle - \langle m \rangle ^2 \\ &= f''(1)+f'(1)-f'(1)^2 \end{aligned}$

It appears we have stumbled on something quite valuable: for each derivative we perform on $f(z)$ , we receive another piece of information about the distribution $p(m)$ .

At this point you might wonder what all of this is good for. After all, we began with some ugly looking sum, and now we have arrived at some slightly less ugly but more abstract sum to take its place. But remember, we are working with the abstract representation for $f(z)$ in order to derive rules to manipulate $f$ 's simpler form, $\left(p_Az+p_B\right)^N$ . Calculations are simpler and more transparent once these are in place.

Next, let's try out some of our tools.

Markdown	Appears as
`italics` or `_italics_`	italics
`bold` or `__bold__`	bold
- bulleted - list	bulleted list
1. numbered 2. list	numbered list
Note: you must add a full line of space before and after lists for them to show up correctly
paragraph 1 paragraph 2	paragraph 1 paragraph 2
`[example link](https://brilliant.org)`	example link
`> This is a quote`	This is a quote
# I indented these lines # 4 spaces, and now they show # up as a code block. print "hello world"	# I indented these lines # 4 spaces, and now they show # up as a code block. print "hello world"

Math	Appears as
Remember to wrap math in `\(` ... `\)` or `\[` ... `\]` to ensure proper formatting.
`2 \times 3`	$2 \times 3$
`2^{34}`	$2^{34}$
`a_{i-1}`	$a_{i-1}$
`\frac{2}{3}`	$\frac{2}{3}$
`\sqrt{2}`	$\sqrt{2}$
`\sum_{i=1}^3`	$\sum_{i=1}^3$
`\sin \theta`	$\sin \theta$
`\boxed{123}`	$\boxed{123}$

Better living through probability generating functions

Comments