- [Instructor] What we're going to do in this video
is talk about hypothesis testing,
which is the heart of all of inferential statistics,
statistics that allow us to make inferences about the world.
So, to give us the gist of this,
let's start with a tangible example.
Let's say, hypothetically,
you run a website that has the mission
of giving everyone on the planet a free education,
and you wanna think about how you might change
the amount of time people spend on the site.
Ideally, you wanna increase the amount of time
people spend on the site
so there's more learning on the planet.
Well, currently the website has a white background like this
and the mean amount of time people spend
when you have a white background,
the mean amount of time when you have a white background
is 20 minutes.
And you or someone on your team,
maybe you read some type of study that says
people like to spend more time on yellow backgrounds.
I don't actually think that's true,
but let's just go with that for the sake of this video.
And so you have a hypothesis
that if you actually have a yellow background,
if you change your background to yellow,
that the mean amount of time that people spend
on a yellow background, on yellow,
is going to be different, is not going to be equal to
the mean amount of time people spend on a white background.
So, the question is how do you test this,
and how do you feel good about your inferences
that you make from your test?
And that is the heart of hypothesis testing.
And medical research, actually almost all research
involves some form of hypothesis testing.
So, how would you do this?
Well, the standard way to do this
is to set up a couple of hypothesis.
Hypotheses, I should say.
The first one is known as your null hypothesis,
and I often think about this as the skeptic's hypothesis.
Skeptics think that,
hey, it's hard to make a difference in this world,
or cynics feel like it's hard
to make a difference in the world
and so they always have this null hypothesis
that's saying, "Hey, you think you're making a difference,
"but you aren't."
So, the null hypothesis is that
the mean amount of time people spend on the yellow site,
or on a yellow site,
is going to be equal to the mean amount of time
that people spend on the current site
or the existing site or on a white site,
while the people who are thinking about,
"Hey, how do I make change?
"How do I make improvements in the world?"
they had some type of hypothesis
and we call that the alternative hypothesis.
And so the alternative hypothesis, A for alternative,
is that the mean time on the yellow site,
on the yellow site,
is actually different.
Is actually different.
It is not equal to the mean amount of time
on the white site.
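
Written in symbols, the two hypotheses just described might look like the following. The notation here is an assumption for illustration (the video only states them in words): \mu_{\text{yellow}} and \mu_{\text{white}} stand for the true mean time spent on the yellow and white sites.

```latex
\begin{align*}
H_0 &: \mu_{\text{yellow}} = \mu_{\text{white}} \\
H_a &: \mu_{\text{yellow}} \neq \mu_{\text{white}}
\end{align*}
```

Because the alternative uses "not equal" rather than "greater than," a difference in either direction counts as evidence against the null.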
So, how do we think about this
now that we set up these hypotheses?
Well, what we're going to do is
we are going to assume,
we assume the null hypothesis.
Then we build this yellow site
and then we take a sample
of the people using the yellow site,
and we say, "What is the probability
"of getting that sample mean,"
which is an approximation of the parameter of the true mean,
"what is the probability of getting that sample mean
"if we assume the null hypothesis?"
And if the probability of getting that sample mean
on the yellow site,
assuming the null hypothesis, is really low,
then we reject the null hypothesis,
which suggests the alternative.
On the other hand, if we get a sample mean
that seems pretty reasonable to get
if you assume the null hypothesis,
then we fail to reject the null hypothesis
and then that would not suggest the alternative.
Now, to make this a little bit more tangible,
and we'll go over this in a lot of videos,
if you assume the null hypothesis,
then there's a few things you can think about.
You can think about just the general distribution
of the amount of time people spend on the site.
It would look something like this.
We will, for this sake,
assume that it's a normal distribution,
and normal distributions are very important,
and/or things that are close to normal distributions,
for hypothesis testing.
But let's say that it's a normal distribution
of the amount of time people spend on the site
and so there is some mean.
We know that mean,
so the mean that people spend on that white site
is equal to 20 minutes.
And, remember, we're assuming the null hypothesis,
so we're assuming that this is also the amount of time
that people would spend on the yellow site.
We've assumed, assuming, the null hypothesis,
and you could view this as time or distribution
of time spent.
Now, one of the things we're going to talk about
in future videos is if you have this distribution,
you can actually come up with another distribution
of the means of samples you might get.
So, there's something else called the sampling distribution,
and I know it's very confusing at first.
Sampling distribution of the sample
of the sample mean,
and it'll be for a given sample size,
for sample size, sample size.
Let's say this is sample size 1,000.
I'm just making things up.
I could've said N,
but I'm just gonna make this a little bit more tangible.
Well, we're going to get statistical methods
for how you can think about this distribution
assuming this distribution we have on the left.
And it turns out this distribution
is going to look like the one on the left,
but it's going to be narrower around that mean.
It's going to look something like this.
And, actually, the larger your sample sizes are going to be,
the narrower it's going to get.
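
To see that narrowing for yourself, here is a minimal simulation sketch. It assumes a normal population with the 20-minute mean from the example and a standard deviation of 10 minutes; that spread is a made-up value, since the video never gives one. The names `ASSUMED_SD`, `standard_error`, and `simulate_sample_means` are hypothetical helpers, not anything from the video.

```python
import math
import random
import statistics

NULL_MEAN = 20.0   # minutes, the white-site mean from the example
ASSUMED_SD = 10.0  # minutes, hypothetical: the video gives no spread


def standard_error(sd, n):
    """Theoretical standard deviation of the sample mean for sample size n."""
    return sd / math.sqrt(n)


def simulate_sample_means(n, trials, seed=0):
    """Draw `trials` samples of size n from the assumed normal population
    and return the mean of each sample."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(NULL_MEAN, ASSUMED_SD) for _ in range(n))
        for _ in range(trials)
    ]


if __name__ == "__main__":
    # For each sample size, compare the simulated spread of the sample means
    # with the theoretical value sd / sqrt(n).
    for n in (10, 100, 1000):
        means = simulate_sample_means(n, trials=500)
        print(n, round(statistics.stdev(means), 3),
              round(standard_error(ASSUMED_SD, n), 3))
```

The second and third columns should track each other, and both shrink as the sample size grows, which is the narrowing described above.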
Now, remember, this isn't just the distribution
of the amount of time people spend on the site.
This is the distribution that if I were to take a sample
of the amount of time people spend on the site
and calculate the means,
this is the distribution of those sample means I might get.
Now, the center of this distribution is still
our mean for white which is equal to the mean for yellow.
Remember, we're assuming the null hypothesis.
The mean for yellow.
But each of these points,
for example, if I think about this,
this is amount of time that someone might spend
and you can see that there's a low probability about it.
This over here, this would be a sample mean you might get
for a time that you sampled 1,000 people
and you calculated the mean,
and you see that there's a low probability for it.
So, then what you would do is,
if you were able to statistically generate these things
assuming the null hypothesis,
and don't worry too much,
we'll find out the techniques for doing this
and the assumptions we need to make for doing this,
what we do is then take a sample of 1,000.
So, you take your sample of 1,000,
so sample 1,000,
and then from that you are able to calculate a sample mean.
You are able to calculate that.
And let's say you get a sample mean of 30 minutes.
And let's say, actually, that that is right over here,
that this is 30 minutes right over here.
The center was 20 minutes.
The next thing, what you do is you say,
"What's the probability of getting a result
"at least that extreme assuming the null hypothesis?"
And that probability on these curves,
it would be this right tail here
and it would be the left tail
that is equally far on the left side,
so it'd be like that.
And what you do is you look.
You look at this probability,
which would be these yellow areas there,
and then we think about the probability
of getting a result at least as extreme as 30 minutes.
So, probability of getting,
getting a sample mean at least
as extreme
as the sample mean equaling 30 minutes,
assuming, assuming
your null hypothesis,
and that's exactly what those yellow areas are all about.
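
As a rough sketch, those two yellow tail areas can be computed like this, assuming the sampling distribution of the sample mean is normal and that the population standard deviation is known. The function name `two_tailed_p_value` and the 10-minute standard deviation in the usage lines are assumptions for illustration; the video supplies only the 20-minute mean, the 30-minute sample mean, and the sample size of 1,000.

```python
import math
from statistics import NormalDist


def two_tailed_p_value(sample_mean, null_mean, sd, n):
    """Probability, assuming the null hypothesis, of a sample mean at least
    as far from null_mean (in either direction) as the one observed."""
    se = sd / math.sqrt(n)              # spread of the sampling distribution
    z = (sample_mean - null_mean) / se  # distance from the center, in standard errors
    # Area in the left tail below -|z|, doubled to include the matching right tail.
    return 2.0 * NormalDist().cdf(-abs(z))


if __name__ == "__main__":
    # 30-minute sample mean, 20-minute null mean, n = 1,000 from the example;
    # sd = 10.0 is a made-up value.
    print(two_tailed_p_value(30.0, 20.0, sd=10.0, n=1000))
```

Doubling one tail works here because the normal curve is symmetric, so the left tail "equally far on the left side" has the same area as the right one.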
And you compare that to some pre-specified threshold.
So, that threshold is oftentimes 5%.
Sometimes it's 1%.
But if this probability is less than or equal to,
if it's less than or equal to your threshold,
and the threshold is oftentimes denoted
by the Greek letter alpha,
well, we say, "Hey, that was a very low probability
"of getting a result at least this extreme
"if we assume the null hypothesis,"
and so that will allow us to reject,
reject the null hypothesis,
which would suggest, suggest the alternative.
Notice we haven't proven the alternative.
We also haven't proven that the null hypothesis
is for sure false.
We've just said if we assume the null hypothesis,
there's a very low probability of getting a result
at least as extreme as what we just got,
so we will reject the null.
Now, if it's the other way around,
if the probability of getting a sample mean
at least as extreme as this is still reasonable,
if it's greater than your pre-specified threshold,
then you fail to reject the null.
You fail to reject
your null hypothesis.
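
The comparison against the threshold can be sketched in a few lines. The function name `decide` is hypothetical, and the default of 0.05 simply mirrors the 5% threshold mentioned above; it is a convention you pick before looking at the data, not a fixed rule.

```python
def decide(p_value, alpha=0.05):
    """Compare a p-value with the pre-specified threshold alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be a probability between 0 and 1")
    if p_value <= alpha:
        # A result this extreme would be unlikely if the null were true.
        return "reject the null hypothesis"
    # The result is plausible under the null; this is not proof the null is true.
    return "fail to reject the null hypothesis"


if __name__ == "__main__":
    print(decide(0.03))              # below the 5% threshold
    print(decide(0.03, alpha=0.01))  # same p-value, stricter 1% threshold
```

Note that the same p-value can lead to different decisions under different thresholds, which is why alpha has to be chosen in advance.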
So, I'll leave you there.
In future videos,
we'll go into much more depth into all of this,
but this is to give you a sense of how hypothesis testing
allows science or all of us in the world
to start making inferences that we can feel good about.