Name: T-Tests: A Matched Pair Made in Heaven: Crash Course Statistics #27
Uploaded: 2021-01-14T10:30:02.000Z
Duration: 11 min 17 s

Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics.

In the last episode we dove into the logic surrounding test statistics and talked about

a general formula that allows us to create them for lots of different situations.

There are so many questions we might want to answer, and it would be rough if we had

to memorize a new formula for EVERY single one.

And sometimes Statistics is taught in a way that makes it seem like there's a different

formula you need to know if you want to test whether your bus is late more often than the

Or if burns treated with aloe heal faster than those that are left alone.

We can adapt the general formula...in all sorts of situations.

Let's say that you just moved to a new place, and you're looking for the BEST coffee in town.

Since you've been watching Crash Course Statistics, you decide to do a little impromptu experiment.

Word on the street is there are two really popular coffee places near you, Caf-fiend

So one Sunday after brunch, you grab a random sample of 16 of your new friends, and randomly

give half of them an unmarked cup with coffee from Caf-fiend, and the other half an unmarked

You made sure to get the same roast--dark--to keep things as even as possible.

After delicate sniffs and sips of coffee in a process known as “cupping”, the tallies are in.

On a scale of 1 to 10, Caf-fiend got a mean score of 7.6 and The Blend Den got a mean

So we observe a difference between the coffee scores.

Coffee from Caf-fiend scored 0.3 points lower than coffee from The Blend Den.

There's no difference between the two coffee shops.

And then our alternative hypothesis, that there is a difference.

In this case, we're interested in whether the mean scores for coffee are different between

With a little algebra, we can see that this is the same thing as asking whether the difference

Now that we have our hypotheses, we can do a t-test.
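
Written out, the two hypotheses look something like this. The symbols are assumptions added for illustration: \mu_C stands for the true mean score of Caf-fiend coffee and \mu_B for the true mean score of The Blend Den coffee.

```latex
\begin{aligned}
H_0 &: \mu_C = \mu_B \quad\Longleftrightarrow\quad \mu_C - \mu_B = 0 \\
H_A &: \mu_C \neq \mu_B \quad\Longleftrightarrow\quad \mu_C - \mu_B \neq 0
\end{aligned}
```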

Specifically, we'll do a two sample t-test, also called an independent or unpaired t-test.

The formula for a two sample t-test follows our general test statistic formula:

If the null hypothesis were true and there's no difference between the coffee shops, we'd

For this kind of t-test, our measure of average variation is the standard error.

For two groups, the standard error is calculated a bit differently since we have to account

Here, we're squaring the standard deviation to get the variance and n1 and n2 are the

sizes of the two groups--both are 8 here.
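
As a sketch of what that looks like on paper, here is the usual unpooled form of the two sample t-statistic. The notation is an assumption for illustration: \bar{x}_1 and \bar{x}_2 are the two sample means, s_1 and s_2 are the sample standard deviations, and n_1 and n_2 are the group sizes. The 0 in the numerator is the difference we'd expect if the null hypothesis were true.

```latex
SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}},
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE}
```

With equal group sizes, like the 8 and 8 here, this standard error matches the one you'd get from pooling the two variances.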

Now that we have our t-value, we can figure out if there's a statistically significant

difference between the two coffee shops and there are two ways to do this.

We can calculate the critical t-value and if our t-statistic is GREATER than the critical

Or we can calculate the p-value from our t-statistic and we can reject the null hypothesis if the

p-value is SMALLER than our chosen alpha level.

To do either of these things, we'll need to choose our alpha level.

But usually people will use 0.05 since that means that in the long run, only 5% of tests

done on groups with no real difference will incorrectly reject the null.

So, we'll conform :) and use an alpha of 0.05 here.

To calculate our critical t-value we need to find the t-values which correspond to the

top 5% most extreme values in our t-distribution.

Usually a computer or a calculator will do this for you, so we won't go into the formula,

The cutoffs for our specific problem are about -2.145 and 2.145.

We have two cutoffs because we're doing a two tailed test.
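
To see what an alpha of 0.05 and those two cutoffs mean in the long run, here is a small simulation sketch in Python. It repeats the coffee experiment many times in a world where the two shops really are the same, and counts how often the t-statistic lands beyond -2.145 or 2.145. The true mean of 7.5, the standard deviation of 1.5, the seed, and the function names are all made-up assumptions for illustration, not numbers from the experiment.

```python
import math
import random
import statistics


def two_sample_t(a, b):
    """Two sample t-statistic: difference in means over the standard error."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def false_positive_rate(trials=5000, n=8, cutoff=2.145, seed=27):
    """Share of simulated no-difference experiments that reject the null."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Both groups are drawn from the SAME distribution, so the null is true.
        a = [rng.gauss(7.5, 1.5) for _ in range(n)]
        b = [rng.gauss(7.5, 1.5) for _ in range(n)]
        # Two tailed test: reject if t is beyond either cutoff.
        if abs(two_sample_t(a, b)) > cutoff:
            rejections += 1
    return rejections / trials


rate = false_positive_rate()
print(f"Share of no-difference experiments that reject the null: {rate:.3f}")
```

The printed share should land near 0.05, which is the false rejection rate we signed up for when we picked alpha.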

We want to reject the null if coffee from Caf-fiend is better or if coffee from The

We can already tell that we should fail to reject the null hypothesis that there's no difference in quality between the two coffees.

Our t-statistic of about 0.44 isn't close to -2.145 OR 2.145.

The critical value and p-value approach will give you identical results, so we don't

But for the sake of showing we get the same outcome…our calculated p-value is 0.6684.
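
As a rough check that the two approaches agree, here is a sketch in Python using only the standard library. It takes the rounded t-statistic of 0.44 and assumes 14 degrees of freedom (n1 + n2 - 2 for two groups of 8, which is what cutoffs of about ±2.145 imply). The helper names t_pdf, t_cdf, and t_critical are made up for this illustration, and the integration is a simple numerical approximation; in practice a statistics package would do this for you. Because 0.44 is rounded, the p-value comes out close to, but not exactly, 0.6684.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] plus symmetry around 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(alpha, df):
    """Positive two tailed cutoff, found by bisection on the CDF."""
    lo, hi = 0.0, 50.0
    target = 1 - alpha / 2
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t_stat = 0.44   # rounded t-statistic from the coffee experiment
df = 14         # assumed: 8 + 8 - 2
alpha = 0.05

critical = t_critical(alpha, df)
p_value = 2 * (1 - t_cdf(abs(t_stat), df))

reject_by_cutoff = abs(t_stat) > critical
reject_by_p = p_value < alpha

print(f"critical t: +/-{critical:.3f}, p-value: {p_value:.4f}")
print(f"reject by cutoff: {reject_by_cutoff}, reject by p-value: {reject_by_p}")
```

Both checks come out the same way: the t-statistic is well inside the cutoffs and the p-value is well above alpha, so neither approach rejects the null.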

We reject the null if the p-value is smaller than alpha, so again we fail to reject since

One thing that's nice about the p-value approach, and the reason we'll mainly rely

on it throughout the rest of these examples, is that p-values are easier for us non-computers

A p-value of 0.6684 means that if there were NO difference in scores between coffee from

Caf-fiend and coffee from The Blend Den, we'd still expect to see a difference in our sample

means that's 0.3 or greater pretty often...

Since a difference of 0.3 or greater is pretty common under the null hypothesis,

we haven't found evidence that the null is a bad fit.

So right now we don't have any evidence that one coffee shop is better than the other.

But remember, absence of evidence is not evidence of absence.

And while our coffee excursion and experiment were well designed, we can probably improve it.

If you look at the scores that your friends gave the coffees, you'll see that there's

one person who tried coffee from Caf-fiend and really hated it.

After looking through your scorecards, you realize it's Alex, who has mentioned in

the past that she just doesn't love coffee.

Even though you randomly assigned your friends to get either coffee from Caf-fiend or coffee

from The Blend Den, that design didn't account for the fact that some people just like coffee

Alex might give the best coffee in the world a measly 6 point rating just because...coffee's