Can We Estimate the Area Under the Normal Distribution Using a Few Terms from the Taylor Expansion?

We know we can find the area under the normal curve by integrating it, but if you don't know the trick to doing so then you might spend a few hours running in place. Suppose we don't know the trick and we don't have a computer that lets us easily estimate this integral. Can we get a relatively good approximation by taking a Taylor Expansion and calculating a few terms? Short answer: not really. But let's see why.

The pdf of a normal distribution with $$\mu = 0$$ and $$\sigma = 1$$ is given by \[P(x) = \frac{1}{\sqrt{2\pi}}\exp\left({\frac{-x^{2}}{2}}\right).\] Let's do the Taylor Expansion. These are usually fairly friendly when we try to integrate them since they're just polynomials. The only thing we'll need is the Taylor expansion for $$\exp$$ since the rest is just a constant.
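
For reference, the expansion we need is the standard series for the exponential about $$u = 0$$ (the symbol $$u$$ is just a placeholder for whatever sits inside $$\exp$$): \[\exp(u) = \sum_{n=0}^{\infty}\frac{u^{n}}{n!} = 1 + u + \frac{u^{2}}{2} + \frac{u^{3}}{6} + \cdots\]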

Let's set $$u = \frac{-x^{2}}{2}$$ for a moment. If we keep terms only up to second order in $$u$$, \[\begin{align} P(x) &\approx \frac{1}{\sqrt{2\pi}}\left(1 + u + \frac{u^{2}}{2}\right)\\ &= \frac{1}{\sqrt{2\pi}}\left(1 + \frac{-x^{2}}{2} + \frac{\left(\frac{-x^{2}}{2}\right)^{2}}{2}\right)\\ &= \frac{1}{\sqrt{2\pi}}\left(1 - \frac{x^{2}}{2} + \frac{x^{4}}{8}\right) \end{align}\] You might see a pattern here!

Either way, let's integrate. \[\begin{align} \int_{a}^{b}P(x)\,dx &\approx \int_{a}^{b}\frac{1}{\sqrt{2\pi}}\left(1 - \frac{x^{2}}{2} + \frac{x^{4}}{8}\right)\,dx\\ &= \frac{1}{\sqrt{2\pi}}\left(\int_{a}^{b}1\,dx - \int_{a}^{b}\frac{x^{2}}{2}\,dx + \int_{a}^{b}\frac{x^{4}}{8}\,dx\right) \\ &=\frac{1}{\sqrt{2\pi}}\left(x\Big|_{a}^{b} - \frac{x^{3}}{6}\Big|_{a}^{b} + \frac{x^{5}}{40}\Big|_{a}^{b}\right) \end{align}\]

At this point, we can fairly easily evaluate things. Let's try within one standard deviation of the mean (-1 to 1). Note that the original function is even, so whenever $$a = -b$$ we can integrate from $$0$$ to $$b$$ and multiply by 2. Just to keep it general, we won't do this here. \[\begin{align} \int_{-1}^{1}P(x)\,dx &\approx \frac{1}{\sqrt{2\pi}}\left(2 - \frac{2}{6} + \frac{2}{40}\right)\\ &=\frac{1}{\sqrt{2\pi}}\left(2 - \frac{1}{3} + \frac{1}{20}\right)\\ &\approx \frac{1.71667}{2.5066} \approx 0.6849 \end{align}\] This is surprisingly close to the actual value of about 0.6827!

If we use two standard deviations, do we get the same close approximation? Unfortunately, no. As we get closer to the tail ends of the function, we get worse and worse approximations; in this case, the same formula gives about 1.17, which is not even a valid probability, when it should be around 0.95. Why? Let's look at a picture of how the Taylor Expansion converges compared to the original function. Python code is included here for reference.
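
The pattern in the expansion is that the $$n$$-th term of the series for $$\exp\left(\frac{-x^{2}}{2}\right)$$ is $$\frac{(-1)^{n}x^{2n}}{2^{n}\,n!}$$, which integrates to $$\frac{(-1)^{n}x^{2n+1}}{2^{n}\,n!\,(2n+1)}$$. As a rough check on the hand calculation, here is a minimal sketch that sums those integrated terms and compares the result against the exact area from the error function. The names `taylor_area`, `exact_area`, and `n_terms` are made up for this illustration, and only the standard library is assumed.

```python
import math


def taylor_area(a, b, n_terms):
    """Integrate the first n_terms of the Taylor series of the
    standard normal pdf, term by term, from a to b."""
    total = 0.0
    for n in range(n_terms):
        # Antiderivative of (-1)^n x^(2n) / (2^n n!) is
        # (-1)^n x^(2n+1) / (2^n n! (2n+1)).
        coeff = (-1) ** n / (2 ** n * math.factorial(n) * (2 * n + 1))
        total += coeff * (b ** (2 * n + 1) - a ** (2 * n + 1))
    return total / math.sqrt(2 * math.pi)


def exact_area(a, b):
    """Exact area under the standard normal pdf, via the error function."""
    return 0.5 * (math.erf(b / math.sqrt(2)) - math.erf(a / math.sqrt(2)))


# n_terms=3 corresponds to the polynomial 1 - x^2/2 + x^4/8.
for k in (1, 2):
    print(k, taylor_area(-k, k, 3), exact_area(-k, k))
```

Increasing `n_terms` shows how the estimate for a given interval changes as more terms are kept.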

import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

a = pow(2*np.pi, -0.5) # normalization constant
N = lambda x: a*np.exp(-x**2/2)
x = np.linspace(-4, 4, num = 100)

plt.axis([-3, 3, 0, 0.5])
# Actual Normal Curve.
plt.plot(x, N(x), c='b', linewidth = 5)
# Approximations.
plt.plot(x, a*np.ones(shape = x.shape))
plt.plot(x, a*(1 - pow(x, 2)/2))
plt.plot(x, a*(1 - pow(x, 2)/2 + pow(x, 4)/8))
# Higher-order terms are u**3/6 = -x**6/48 and u**4/24 = x**8/384, with u = -x**2/2.
plt.plot(x, a*(1 - pow(x, 2)/2 + pow(x, 4)/8 - pow(x, 6)/48))
plt.plot(x, a*(1 - pow(x, 2)/2 + pow(x, 4)/8 - pow(x, 6)/48 + pow(x, 8)/384))

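We see here that the approximations are very slow to converge in the tails: each truncated polynomial eventually runs off toward $$\pm\infty$$, while the true curve decays to zero. With only these few terms, anything beyond one standard deviation is not even close.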
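
To put a number on how far off the curves are, here is a minimal sketch that measures the largest pointwise gap between a truncated series and the true pdf on $$[-k, k]$$. The helper names `taylor_pdf` and `max_abs_error`, and the `samples` grid size, are assumptions made for this illustration rather than part of the plotting code.

```python
import math


def normal_pdf(x):
    """Standard normal pdf."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def taylor_pdf(x, n_terms):
    """First n_terms of the Taylor series of the pdf: sum of u^n / n!, u = -x^2/2."""
    u = -x * x / 2
    partial = sum(u ** n / math.factorial(n) for n in range(n_terms))
    return partial / math.sqrt(2 * math.pi)


def max_abs_error(n_terms, k, samples=201):
    """Largest |taylor_pdf - normal_pdf| on an even grid over [-k, k]."""
    xs = [-k + 2 * k * i / (samples - 1) for i in range(samples)]
    return max(abs(taylor_pdf(x, n_terms) - normal_pdf(x)) for x in xs)


# Three terms is the polynomial 1 - x^2/2 + x^4/8 used in the hand calculation.
for k in (1, 2, 3):
    print(k, max_abs_error(3, k))
```

With three terms the gap stays small within one standard deviation and grows quickly beyond it, which matches what the plot suggests.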

Homework for the reader: modify the code above (or write your own) and find what order the Taylor approximation must be to get "close" to the curve within two standard deviations. What about within three standard deviations?

So, we have our answer. Can you do a nice pen-and-paper approximation of the normal curve with just a few terms of the Taylor Expansion? You can, but you probably shouldn't.