[Note: **I've made a Jupyter Notebook (Python) for this so that you can mess around with a few of these ideas yourself. The figures come from this notebook.**]

⚫ ⚫ ⚫ ⚫

After hearing the definition and the "gist" of the covariance of two discrete random variables, $$X, Y$$, you might think it sounds pretty close to the slope of the line of best fit when looking at $$Y$$ as a function of $$X$$. And you'd be right! But how close is it to slope? We quickly investigate this using some randomly generated data.
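Before comparing against slope, it may help to see the covariance computed directly from its definition. Below is a minimal sketch (the variable names and the generated data are my own, not from the notebook) that computes the covariance "by hand" as the average product of deviations from the mean, and checks it against NumPy's built-in version:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = 3.0 * x + rng.normal(size=500)

# Covariance "by hand": the average product of deviations from the means
cov_manual = np.mean((x - x.mean()) * (y - y.mean()))

# NumPy's version; bias=True uses the same 1/n normalization as above
cov_numpy = np.cov(x, y, bias=True)[0, 1]

print(cov_manual, cov_numpy)  # the two agree up to floating-point error
```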

First, let's plot some sample data. Here, the title of each graph gives the slope (as "M") and the covariance of $$X$$ and $$Y$$ (as "CovXY").
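A sketch of how such sample data might be generated and summarized (my own parameter choices, not necessarily those used for the figures): fit a least-squares line with `np.polyfit` for the slope "M", and compute "CovXY" with `np.cov`.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(0, 10, size=200)
y = 8.0 * x + rng.normal(scale=5.0, size=200)  # roughly linear, noisy data

m = np.polyfit(x, y, 1)[0]               # slope of the least-squares line ("M")
cov_xy = np.cov(x, y, bias=True)[0, 1]   # "CovXY"
print(f"M = {m:.2f}, CovXY = {cov_xy:.2f}")
```

With `matplotlib`, a scatter plot titled with these two numbers reproduces the kind of figure described above.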

Judging from these plots, we see that Cov($$X$$, $$Y$$) seems to be related to the slope somehow. But how? Let's see how far the slope typically is from the covariance of $$X$$ and $$Y$$:

Here the slope is 8.72 (for this sample data). Interestingly, the same plot, but with $$X$$ against $$X$$, results in the following image:

This looks like nothing special until we realize that the average value of these points is 8.65. That's quite the coincidence!

In fact, it is possible to derive that the slope of the line of best fit for $$Y$$ as a function of $$X$$ is given by $$\frac{Cov(X, Y)}{Cov(X, X)}$$, or, equivalently, $$\frac{Cov(X, Y)}{Var(X)}$$. Similarly, the slope of the line of best fit for $$X$$ as a function of $$Y$$ is $$\frac{Cov(X, Y)}{Var(Y)}$$, with $$Var(Y)$$ in the denominator instead.
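We can check this identity numerically. The sketch below (my own data, not the notebook's) compares the slope from a least-squares fit against the ratio $$\frac{Cov(X, Y)}{Var(X)}$$; note that `bias=True` and NumPy's default `np.var` both normalize by $$n$$, so the conventions match (and the ratio is the same either way, since the normalization cancels).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 2.5 * x + rng.normal(size=300)

slope = np.polyfit(x, y, 1)[0]                       # least-squares slope
ratio = np.cov(x, y, bias=True)[0, 1] / np.var(x)    # Cov(X, Y) / Var(X)

print(slope, ratio)  # the two agree up to floating-point error
```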