  • In this video and the next few videos,

  • we're just really going to be doing a bunch of calculations

  • about this data set right over here.

  • And hopefully, just going through those calculations

  • will give you an intuitive sense of what

  • the analysis of variance is all about.

  • Now, the first thing I want to do in this video

  • is calculate the total sum of squares.

  • So I'll call that SST.

  • SS-- sum of squares total.

  • And you could view it as really the numerator

  • when you calculate variance.

  • So you're just going to take the distance between each

  • of these data points and the mean of all of these data

  • points, square them, and just take that sum.

  • We're not going to divide by the degrees of freedom, which

  • you would normally do if you were calculating

  • sample variance.

  • Now, what is this going to be?

  • Well, the first thing we need to do,

  • we have to figure out the mean of all of this stuff over here.

  • And I'm actually going to call that the grand mean.

  • And I'm going to show you in a second

  • that it's the same thing as the mean of the means of each

  • of these data sets.

  • So let's calculate the grand mean.

  • So it's going to be 3 plus 2 plus 1 plus 5 plus 3 plus 4

  • plus 5 plus 6 plus 7.

  • And then we have nine data points here

  • so we'll divide by 9.

  • And what is this going to be equal to?

  • 3 plus 2 plus 1 is 6.

  • 6 plus-- let me just add.

  • So these are 6.

  • 5 plus 3 plus 4 is 12.

  • And then 5 plus 6 plus 7 is 18.

  • And then 6 plus 12 is 18 plus another 18 is 36, divided by 9

  • is equal to 4.
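The grand mean arithmetic above can be checked with a minimal Python sketch. The grouping of the nine values into three lists follows the order they are read aloud; the variable names (`groups`, `all_points`, `grand_mean`) are illustrative and not from the video.

```python
# The nine data points from the video, grouped as they are read aloud.
groups = [
    [3, 2, 1],  # group 1
    [5, 3, 4],  # group 2
    [5, 6, 7],  # group 3
]

# Flatten the groups into one list of all observations.
all_points = [x for group in groups for x in group]

# Grand mean: sum of every observation divided by the number of observations.
grand_mean = sum(all_points) / len(all_points)

print(sum(all_points))  # 36
print(grand_mean)       # 4.0
```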

  • And let me show you that that's the exact same thing

  • as the mean of the means.

  • So the mean of this group 1 over here--

  • let me do it in that same green--

  • the mean of group 1 over here is 3 plus 2 plus 1.

  • That's that 6 right over here, divided by 3 data

  • points so that will be equal to 2.

  • The mean of group 2, the sum here is 12.

  • We saw that right over here.

  • 5 plus 3 plus 4 is 12, divided by 3

  • is 4 because we have three data points.

  • And then the mean of group 3, 5 plus 6

  • plus 7 is 18 divided by 3 is 6.

  • So if you were to take the mean of the means, which

  • is another way of viewing this grand mean, you have 2 plus 4

  • plus 6, which is 12, divided by 3 means here.

  • And once again, you would get 4.

  • So you could view this as the mean

  • of all of the data in all of the groups

  • or the mean of the means of each of these groups.
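The two views of the grand mean can be compared in a short sketch. One caveat worth stating: the mean of the group means matches the grand mean here because every group has the same number of points. The second data set below (`unequal_groups`) is a made-up example, not from the video, included only to show the two quantities differing when group sizes differ.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

# Data from the video: three groups of three points each.
groups = [[3, 2, 1], [5, 3, 4], [5, 6, 7]]

group_means = [mean(g) for g in groups]
grand_mean = mean([x for g in groups for x in g])
mean_of_means = mean(group_means)

print(group_means)    # [2.0, 4.0, 6.0]
print(grand_mean)     # 4.0
print(mean_of_means)  # 4.0

# Hypothetical data with unequal group sizes (an assumption for illustration).
unequal_groups = [[1, 2, 3], [10]]
print(mean([x for g in unequal_groups for x in g]))  # 4.0
print(mean([mean(g) for g in unequal_groups]))       # 6.0
```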

  • But either way, now that we've calculated it,

  • we can actually figure out the total sum of squares.

  • So let's do that.

  • So it's going to be equal to 3 minus 4--

  • the 4 is this 4 right over here-- squared plus 2 minus 4

  • squared plus 1 minus 4 squared.

  • Now, I'll do these guys over here in purple.

  • Plus 5 minus 4 squared plus 3 minus 4 squared plus 4 minus 4

  • squared.

  • Let me scroll over a little bit.

  • Now, we only have three left, plus 5 minus 4 squared

  • plus 6 minus 4 squared plus 7 minus 4 squared.

  • And what does this give us?

  • So up here, this is going to be equal to 3 minus 4.

  • Difference is 1.

  • You square it.

  • It's actually negative 1, but you square it, you get 1,

  • plus you get negative 2 squared is 4, plus negative 3 squared.

  • Negative 3 squared is 9.

  • And then we have here in the magenta 5 minus 4

  • is 1 squared is still 1.

  • 3 minus 4 is negative 1.

  • You square it, you still get 1.

  • And then 4 minus 4 is just 0.

  • So we could-- well, I'll just write the 0 there just

  • to show you that we actually calculated that.

  • And then we have these last three data points.

  • 5 minus 4 squared.

  • That's 1.

  • 6 minus 4 squared.

  • That is 4, right?

  • That's 2 squared.

  • And then plus 7 minus 4 is 3 squared is 9.

  • So what's this going to be equal to?

  • So I have 1 plus 4 plus 9 right over here.

  • That's 5 plus 9.

  • This right over here is 14, right?

  • 5 plus-- yup, 14.

  • And then we also have another 14 right over here

  • because we have a 1 plus 4 plus 9.

  • So that right over there is also 14.

  • And then we have 2 over here.

  • So it's going to be 28-- 14 times 2, 14

  • plus 14 is 28-- plus 2 is 30.

  • Is equal to 30.
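The total sum of squares just computed by hand can be reproduced with a small sketch: subtract the grand mean from each point, square the difference, and add everything up. The names `squared_deviations` and `sst` are illustrative choices, not notation from the video.

```python
# Data from the video: three groups of three points each.
groups = [[3, 2, 1], [5, 3, 4], [5, 6, 7]]
all_points = [x for g in groups for x in g]

grand_mean = sum(all_points) / len(all_points)  # 4.0

# Squared distance of each point from the grand mean.
squared_deviations = [(x - grand_mean) ** 2 for x in all_points]

# SST is the sum of those squared distances (no division by degrees of freedom).
sst = sum(squared_deviations)

print(squared_deviations)  # [1.0, 4.0, 9.0, 1.0, 1.0, 0.0, 1.0, 4.0, 9.0]
print(sst)                 # 30.0
```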

  • So our total sum of squares-- and actually,

  • if we wanted the variance here, we

  • would divide this by the degrees of freedom.

  • And we've learned multiple times the degrees of freedom

  • here so let's say that we have-- so we

  • know that we have m groups over here.

  • So let me just write it as m and I'm not

  • going to prove things rigorously here,

  • but I want to show you where some

  • of these strange formulas that show up in statistics books

  • actually come from without proving it rigorously.

  • More to give you the intuition.

  • So we have m groups here.

  • And each group here has n members.

  • So how many total members do we have here?

  • Well, we had m times n or 9, right?

  • 3 times 3 total members.

  • So our degrees of freedom-- and remember,

  • you have however many data points

  • you had minus 1 degrees of freedom

  • because if you know the mean of means,

  • if you assume you knew that, then only 9 minus 1,

  • only eight of these are going to give you new information

  • because if you know that, you could calculate the last one.

  • Or it really doesn't have to be the last one.

  • If you have the other eight, you could calculate this one.

  • If you have eight of them, you could always

  • calculate the ninth one using the mean of means.

  • So one way to think about it is that there's

  • only eight independent measurements here.

  • Or if we want to talk generally, there

  • are m times n-- so that tells us the total number of samples--

  • minus 1 degrees of freedom.

  • And if we were actually calculating the variance here,

  • we would just divide 30 by m times n minus 1

  • or this is another way of saying eight degrees of freedom

  • for this exact example.

  • We would take 30 divided by 8 and we would actually

  • have the variance for this entire group,

  • for the group of nine when you combine them.
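As a rough check of the degrees-of-freedom step, the sketch below divides SST by m times n minus 1 and compares the result with the sample variance from Python's standard `statistics` module. The names `m`, `n`, and `sst` mirror the video's symbols; the comparison against `statistics.variance` is an added sanity check, not something done in the video.

```python
import math
import statistics

# Data from the video: m = 3 groups with n = 3 members each.
groups = [[3, 2, 1], [5, 3, 4], [5, 6, 7]]
m = len(groups)
n = len(groups[0])

all_points = [x for g in groups for x in g]
grand_mean = sum(all_points) / len(all_points)
sst = sum((x - grand_mean) ** 2 for x in all_points)

# Degrees of freedom for the combined sample: m * n - 1.
degrees_of_freedom = m * n - 1
variance = sst / degrees_of_freedom

print(degrees_of_freedom)  # 8
print(variance)            # 3.75

# The standard library's sample variance also divides by (count - 1),
# so it should agree with SST / (m * n - 1).
assert math.isclose(variance, statistics.variance(all_points))
```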

  • I'll leave you here in this video.

  • In the next video, we're going to try to figure out

  • how much of this total variance, how much of this total

  • squared sum, total variation comes

  • from the variation within each of these groups

  • versus the variation between the groups.

  • And I think you get a sense of where

  • this whole analysis of variance is coming from.

  • It's the sense that, look, there's

  • a variance of this entire sample of nine,

  • but some of that variance-- if these groups are

  • different in some way-- might come from the variation

  • from being in different groups versus the variation from being

  • within a group.

  • And we're going to calculate those two things

  • and we're going to see that they're

  • going to add up to the total squared sum variation.


ANOVA 1: Calculating SST (total sum of squares) | Probability and Statistics | Khan Academy

  • Posted by Jack on January 14, 2021