Central Limit Theorem

  2 minute read


Let $X_{1}, X_{2}, X_{3}, \dots$ be i.i.d. random variables drawn from some distribution with finite mean $\mu$ and finite variance $\sigma^{2}$.

Let $S=\sum_{i=1}^n X_{i}$. As $n \rightarrow \infty$, we have $S \rightarrow \mathcal{N}(n\mu, n\sigma^{2})$ and $\frac{S-n\mu}{\sqrt{n\sigma^{2}}} \rightarrow \mathcal{N}(0,1)$

Equivalently, let $M=\frac{1}{n}\sum_{i=1}^n X_{i}$. Then $M \rightarrow \mathcal{N}(\mu,\frac{\sigma^2}{n})$ and $\frac{M-\mu}{\sqrt{\frac{\sigma^2}{n}}} \rightarrow \mathcal{N}(0,1)$


  • $\mathcal{N}(\mu,\sigma^2)$ denotes Normal distribution with mean of $\mu$ and variance of $\sigma^2$.
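To see the theorem in action, here is a small simulation sketch (Python, standard library only; the choice of the exponential distribution and the sample sizes are arbitrary, purely for illustration):

```python
import random
import statistics

random.seed(0)

n = 1000       # sample size per experiment
trials = 2000  # number of sample means M to collect
# For Exponential(1): mu = 1, sigma = 1.

# Collect `trials` independent sample means of n i.i.d. Exponential(1) draws.
means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(trials)
]

# CLT predicts M is approximately N(mu, sigma^2 / n),
# i.e. mean close to 1 and std close to 1 / sqrt(1000) ≈ 0.0316.
print(statistics.fmean(means))
print(statistics.stdev(means))
```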


Naturally, the CLT appears in questions that involve the sum or average of a large number of random variables, especially when the question only asks for an approximate answer.

Here are a few examples.

Example 1:

Suppose we have a fair coin and we flip it 400 times. What is the probability you will see 210 heads or more?

Exact answer

Let the outcome of each coin flip be a random variable $I_{i}$. Thus we are dealing with the random variable $S=\sum_{i=1}^{400}I_{i}$. $S$ is the sum of a series of i.i.d. Bernoulli trials, thus it follows the Binomial distribution. So the exact answer is: $P(S\geq210)= \sum_{k=210}^{400}C_{400}^{k}\left(\frac{1}{2}\right)^{400}$ which requires a program to calculate (actually try implementing this; beware of round-off errors and compare it against the approximate answer below).


  • $C_{n}^{k}$ is the notation for “n choose k”, which denotes the number of ways to choose k items from n items where order doesn’t matter.
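The exact sum can be evaluated with integer arithmetic to sidestep round-off entirely (a sketch in Python; `math.comb` requires Python 3.8+):

```python
from fractions import Fraction
from math import comb

# P(S >= 210) = sum_{k=210}^{400} C(400, k) / 2^400, computed exactly
# with big integers, then converted to float only at the very end.
numerator = sum(comb(400, k) for k in range(210, 401))
prob = Fraction(numerator, 2 ** 400)

print(float(prob))  # ≈ 0.17, close to the CLT estimate of 0.16
```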


We can use the CLT to get an approximate answer quickly. First recognize that each $I_{i}$ has $\mu=0.5$ and $\sigma^2=0.5\times(1-0.5)=0.25$. Then $Z=\frac{S-n\mu}{\sqrt{n\sigma^{2}}}=\frac{S-200}{10}$ is approximately $\mathcal{N}(0,1)$. For $S \geq 210$, we have $Z\geq1$.

The 68-95-99.7 rule tells us that for a standard Normal distribution $\mathcal{N}(0,1)$, the probability of the random variable taking a value more than 1 standard deviation away from the center is $1-0.68=0.32$, and thus the one-sided probability is $P(Z\geq1) \approx 0.32/2 = 0.16$.
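The rule-of-thumb value can be checked against the exact standard-normal tail via the complementary error function (a quick sketch in Python):

```python
import math

def normal_tail(z):
    """P(Z >= z) for standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

print(normal_tail(1.0))  # ≈ 0.1587, consistent with the 0.16 rule of thumb
```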

Example 2:

Suppose you use Monte Carlo simulation to estimate the numerical value of $\pi$.

  • How would you implement it?
  • If we require an error of 0.001, how many trials do you need?


One possible implementation is to start with a rectangle, say $x \in [-1,1], y\in[-1,1]$. If we uniformly randomly draw a point from this rectangle, the probability $p$ of the point falling into the circular region $x^2+y^2\lt1$ is the ratio of the areas of the circle and the rectangle, i.e. $p=\frac{\pi}{4}$.

Formally, let random indicator variable $I$ take value 1 if the point falls in the circle and 0 otherwise, then $P(I=1)=p$ and $E(I)=p$. If we do $n$ such trials, and define $M=\frac{1}{n}\sum_{k=1}^n I_{k}$, then $M$ follows approximately $\mathcal{N}(\mu_{I},\frac{\sigma_{I}^2}{n})$. In this setup, $\mu_{I}=E(I)=p$ and $\sigma_{I}^2=p(1-p)$ (see Probability Distribution section for details on $\sigma_{I}^2$).
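A minimal implementation of this setup might look as follows (Python; the trial count and seed are just examples):

```python
import random

def estimate_pi(n_trials, seed=0):
    """Estimate pi as 4 * M, where M is the fraction of uniform points
    in the square [-1,1]^2 that land inside the unit circle."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_trials):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y < 1.0:
            inside += 1
    return 4.0 * inside / n_trials

print(estimate_pi(1_000_000))  # ≈ 3.14
```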

One thing we need to clarify with the interviewer is what the error really means. She might tell you to treat it as the standard deviation of the estimated $\pi$. Since the estimate of $\pi$ is $4M$, the specified error translates into a required standard deviation of $\sigma_{req}=\frac{error}{4}=0.00025$ for the random variable $M$. Thus $n = \frac{\sigma_{I}^2}{\sigma_{req}^2}=\frac{p(1-p)}{(0.00025)^2}\approx2.7\times 10^6$.
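Plugging in the numbers confirms the order of magnitude (a quick check in Python):

```python
import math

p = math.pi / 4            # probability a point falls in the circle
error = 0.001              # required error on the estimated pi
sigma_req = error / 4      # required std of M, since the estimate is 4 * M
n = p * (1 - p) / sigma_req ** 2

print(n)  # ≈ 2.7e6 trials
```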

By the way, notice that the number of trials $n$ scales with $\frac{1}{error^2}$. This comes from the $\frac{1}{\sqrt{n}}$ scaling of $\sigma_{M}$ in the CLT, and is the computational complexity generally entailed by Monte Carlo integration.
