Now in the picture you can see two values, denoted μ and σ^2, for each of the differently colored probability density functions. These are the two parameters that completely define a normally distributed random variable: μ is the Expected Value and σ^2 is the Variance.

This is incredibly important to understand. All normally distributed random variables have only 2 free parameters. What do I mean by “free” parameters? We will make this more precise over time, but for now think of it as follows: a given Expected Value and Variance completely define a normally distributed Random Variable. So even though these random variables can take on infinitely many values, the probability distribution across these values is very tightly constrained.
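To make "two free parameters" concrete, here is a minimal sketch of the normal probability density function written out in Python. The function name `normal_pdf` and the argument names `mu` and `sigma2` are my own choices for illustration. The point is that once you supply those two numbers, the density at every x is fixed.

```python
import math

def normal_pdf(x, mu, sigma2):
    """Density at x of a normal random variable with Expected Value mu and Variance sigma2."""
    if sigma2 <= 0:
        raise ValueError("the variance must be positive")
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# Two different (mu, sigma2) pairs give two different curves,
# but each curve is fully determined by its pair.
xs = [-1.0, 0.0, 1.0, 2.0]
curve_a = [normal_pdf(x, 0.0, 1.0) for x in xs]  # mu = 0, sigma^2 = 1
curve_b = [normal_pdf(x, 2.0, 0.5) for x in xs]  # mu = 2, sigma^2 = 0.5

for x, a, b in zip(xs, curve_a, curve_b):
    print(f"x = {x:5.1f}   curve_a = {a:.4f}   curve_b = {b:.4f}")
```

There is no third knob to turn here: changing the shape of the curve in any way means changing `mu`, `sigma2`, or both.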

Contrast this with a discrete random variable X with four possible values x1, x2, x3 and x4. Here the probability distribution p1, p2, p3, p4 has the constraint that p1 + p2 + p3 + p4 = 1 where pi = Prob(X = xi). That means there are 3 degrees of freedom, because the fourth probability is determined by the first three: p4 = 1 - (p1 + p2 + p3). Still, that is one more degree of freedom than for the Normal Distribution, despite having only four possible outcomes (instead of an infinity).
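The counting for the four-outcome case can be sketched in a few lines of Python. The helper name `complete_distribution` is hypothetical, made up for this illustration. You are free to pick three of the probabilities (as long as they are non-negative and sum to at most 1), and the fourth is then forced on you.

```python
def complete_distribution(free_probs):
    """Given p1, p2, p3, return [p1, p2, p3, p4] with p4 forced by the sum-to-one constraint."""
    if len(free_probs) != 3:
        raise ValueError("expected exactly three free probabilities")
    if any(p < 0 for p in free_probs):
        raise ValueError("probabilities cannot be negative")
    p4 = 1.0 - sum(free_probs)
    if p4 < -1e-12:  # small tolerance for floating-point rounding
        raise ValueError("the three free probabilities sum to more than 1")
    return list(free_probs) + [max(p4, 0.0)]

# Three free choices, one determined value.
dist = complete_distribution([0.1, 0.2, 0.3])
print(dist)
print(sum(dist))
```

Any valid distribution over the four outcomes can be produced this way, so three numbers are both necessary and sufficient to pin it down.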

Why does this matter? Assuming that something is normally distributed provides a super tight constraint. This should remind you of the discussion we had around independence. There we saw that assuming independence is actually a very strong assumption. Similarly, assuming that something is normally distributed is a strong constraint because it means there are only two free parameters characterizing the entire probability distribution.