Welcome back!
In this lecture we are going to introduce one of the most commonly found continuous
distributions – the normal distribution.
For starters, we define a Normal Distribution using a capital letter N followed by the mean
and variance of the distribution.
We read the following notation as “Variable “X” follows a Normal Distribution with
mean “mu” and variance “sigma” squared”.
When dealing with actual data we would usually know the numerical values of mu and sigma
squared.
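To make the notation concrete, here is a small sketch using only the Python standard library. The parameter values mu = 400 and sigma = 50 are illustrative assumptions, not values from the lecture; we draw a large sample from N(mu, sigma²) and check that its sample mean and variance land near the parameters we chose.

```python
import random

# Illustrative (assumed) parameters: X ~ N(mu, sigma^2)
mu, sigma = 400.0, 50.0

random.seed(0)
sample = [random.gauss(mu, sigma) for _ in range(100_000)]

# Sample mean and (population-style) sample variance
mean = sum(sample) / len(sample)
var = sum((x - mean) ** 2 for x in sample) / len(sample)

print(round(mean), round(var))  # close to mu = 400 and sigma^2 = 2500
```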
The normal distribution frequently appears in nature, as well as in life, in various
shapes and forms.
For example, the size of a full-grown male lion follows a normal distribution.
Many records suggest that the average lion weighs between 150 and 250 kilograms, or 330
to 550 pounds.
Of course, there exist specimens which fall outside of this range.
Lions weighing less than 150, or more than 250 kilograms tend to be the exception rather
than the rule.
Such individuals serve as outliers in our set, and the more data we gather, the smaller
the share of the data they represent.
Now that you know what types of events follow a Normal distribution, let us examine some
of its distinct characteristics.
For starters, the graph of a Normal Distribution is bell-shaped.
Therefore, the majority of the data is centred around the mean.
Thus, values further away from the mean are less likely to occur.
Furthermore, we can see that the graph is symmetric with regard to the mean.
That suggests values equally far away from the mean, in opposing directions, are equally likely.
Let’s go back to the lion example from earlier.
If the mean is 400, symmetry suggests a lion is equally likely to weigh 350 pounds or
450 pounds, since both are 50 pounds away from the mean.
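We can verify this symmetry directly from the normal density. The mean of 400 comes from the lecture; the standard deviation of 50 is an assumed value chosen for illustration. The density evaluated 50 pounds below the mean should equal the density 50 pounds above it.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

mu, sigma = 400.0, 50.0  # sigma is assumed; only mu = 400 appears in the lecture

# Both points sit 50 units from the mean, so the densities match exactly
print(normal_pdf(350.0, mu, sigma))
print(normal_pdf(450.0, mu, sigma))
```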
Alright!
Instead of going through the complex algebraic simplifications in this lecture, we are simply
going to talk about the expected value and the variance.
The expected value for a Normal distribution equals its mean - “mu”, whereas its variance
“sigma” squared is usually given when we define the distribution.
However, if it isn’t, we can deduce it from the expected value.
To do so we must apply the formula we showed earlier: “The variance of a variable is
equal to the expected value of the squared variable, minus the squared expected value
of the variable”.
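The formula Var(X) = E[X²] − (E[X])² can be checked numerically with a quick simulation. This is a sketch under assumed parameters (mu = 400, sigma = 50): we estimate E[X] and E[X²] from a sample and confirm that their combination recovers sigma².

```python
import random

random.seed(1)
mu, sigma = 400.0, 50.0  # assumed illustrative parameters
xs = [random.gauss(mu, sigma) for _ in range(200_000)]
n = len(xs)

e_x = sum(xs) / n                    # estimate of E[X]
e_x2 = sum(x * x for x in xs) / n    # estimate of E[X^2]

# Var(X) = E[X^2] - (E[X])^2, which should be close to sigma^2 = 2500
var = e_x2 - e_x ** 2
print(round(e_x), round(var))
```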
Good job!
Another peculiarity of the Normal Distribution is the “68, 95, 99.7” law.
This law states that for any normally distributed variable, 68% of all outcomes fall within one standard
deviation of the mean, 95% fall within two standard deviations, and 99.7% fall within
three.
The last part really emphasises the fact that outliers are extremely rare in Normal distributions.
It also shows how much we can infer about a dataset from the single fact that
it is normally distributed!
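The 68-95-99.7 figures can be recomputed from the normal CDF, which the standard library exposes through the error function. Since the rule is stated in standard deviations, mu and sigma drop out and the standard normal suffices:

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), written via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Probability mass within k standard deviations of the mean
shares = [normal_cdf(k) - normal_cdf(-k) for k in (1, 2, 3)]
for k, share in zip((1, 2, 3), shares):
    print(f"within {k} sd: {share:.1%}")  # roughly 68.3%, 95.4%, 99.7%
```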
Fantastic work, everyone!