Hi, I'm Adriene Hill, and welcome back to Crash Course Statistics. Last week I ordered a pair of gold lamé pants with DFTBAQ embroidered on them. The delivery guy said they could come by the next day at exactly 11am on the dot! Just kidding. That never happens. Instead of an exact time, the pants guy gave me a range of times...he said they'd be there sometime between 8am and 2pm. A lot of anticipation… We've focused a lot on point estimates, like the mean, which are our best guesses, but we can give ourselves a little more wiggle room. Let's talk about Confidence Intervals. INTRO It's useful to give pregnant mothers a “due date” when their children will most likely be born. But it might be more accurate to say that doctors expect the baby to come around the due date, not exactly on it. And pollsters often claim that a candidate will get around 30% of the vote, plus or minus 2%. We can represent the “around” part with a confidence interval. You may have seen the term “confidence interval” paired with a percentage like 95%. A “confidence interval” is an estimated range of values that seem reasonable based on what we've observed. Its center is still the sample mean, but we've got some room on either side for our uncertainty. So when the delivery guy says my pants are coming between 8 and 2--he's reflecting his uncertainty...the very LARGE frustrating uncertainty, about when he'll be there. For example, a dentist thinks the mean number of cavities the average person has in a 5-year span is greater than 1 and wants to calculate a 95% CI to see if there's evidence that he's right. He rounds up a random sample of 100 patients from around the country, and finds that this group has a mean of 3 cavities with a standard deviation of 0.5 cavities. The way we choose that confidence range is related to the distribution of sample means.
The dentist's estimate of the sampling distribution looks like this: And instead of grabbing just the mean, the dentist can include a range of the most common 95% of the sample means that we expect from this estimate of the distribution of sample means. So now we have a 95% confidence interval from 2.902 to 3.098 cavities. Giving a range of numbers instead of just an estimate for the mean better represents the fact that there's some uncertainty and variation when we estimate population parameters--like the mean, proportion, or regression slope--from a sample. The interpretation of this confidence interval is a bit more complex. To understand what a confidence interval really is, we have to ask ourselves “what if?”. If the dentist's sample was taken again, we wouldn't expect that the mean and standard deviation of cavities would be exactly 3 and 0.5. They'd probably be a little different. Which means that our 95% confidence interval would be different than the one we got before. And if we did it 100 more times with the same sample size, we'd get 100 slightly different confidence intervals. The 95% in a 95% confidence interval tells us that if we calculated a confidence interval from 100 different samples, about 95 of them would contain the true population mean. Our “confidence” is in the fact that the procedure of calculating this confidence interval will only exclude the population mean 5% of the time. That definition implies that it's possible that the confidence interval that we created doesn't include the true population mean. We have no way of knowing for sure. But the confidence intervals usually contain the true population mean. Now that we know what a confidence interval is, it might be useful to calculate it. A 95% CI is the range that contains the middle 95% of the values of our estimated sampling distribution. And to get that range, we can use a z-score. 
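The “what if we sampled again?” idea can be tried out with a small simulation. This is only a sketch under assumptions that are not in the episode: it pretends the true population of cavity counts is normal with a mean of 3 and a standard deviation of 0.5, and the function name `coverage_rate` is made up for illustration. It draws many samples of 100, builds a 95% interval from each one, and counts how often the interval contains the true mean.

```python
import random
import statistics

# z cutoff for the middle 95% of a normal distribution (about 1.96)
Z_95 = statistics.NormalDist().inv_cdf(0.975)


def coverage_rate(true_mean, true_sd, n, trials, seed=0):
    """Fraction of simulated 95% intervals that contain the true mean.

    Assumption for illustration: the population is normal with the
    given mean and standard deviation.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        mean = statistics.fmean(sample)
        # standard error estimated from this sample alone
        se = statistics.stdev(sample) / n ** 0.5
        if mean - Z_95 * se <= true_mean <= mean + Z_95 * se:
            hits += 1
    return hits / trials


rate = coverage_rate(true_mean=3.0, true_sd=0.5, n=100, trials=2000)
print(f"{rate:.1%} of the simulated intervals contained the true mean")
```

Each simulated sample gives a slightly different interval, and the share that capture the true mean should land near 95%, which is what the percentage in a 95% confidence interval refers to.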
A z-score tells us the distance between the mean of a distribution and a data point in standard deviations. Previously, we've used z-scores to help us find percentiles. And we want the middle 95% of the data. So we want our cutoffs to be at the 2.5th percentile and the 97.5th percentile so that 95% of the values are within our range, and 5%--2.5% on either side--are not. To calculate the 95% confidence interval for a sample of 49 chocolate cakes with a mean of 3,000 calories and a standard deviation of 500 calories, we can use a z-score of 1.96 (which we got from a table) to calculate the 97.5th percentile, and a z-score of -1.96 to calculate the 2.5th percentile. But we need to turn our z-scores back into calorie values. To do so, we multiply each z-score by the standard error, 71.4 calories (the standard deviation of 500 divided by the square root of 49), and add the mean of 3,000 calories to get the 95% confidence interval for our sample: about 2,860 to 3,140 calories. We think it's likely that the real population mean for number of calories in a chocolate cake is in that range, though we're not sure. What we can have confidence in is that if we're in a situation where we're constantly taking samples like this and we assume that the true mean is inside of every confidence interval, we'll only be wrong about 5% of the time. For example, a gummy worm factory periodically checks whether their bagging machines are calibrated correctly. So each week, they take a sample of 100 bags of gummy worms, measure the mean weight and standard deviation, and calculate a 95% confidence interval. They use the confidence interval to make a decision about whether to pay an expensive repairman to come repair the gummy worm bagging machine. They want their bags of gummy worms to have around 10oz of gummy treats, and decide that as long as the confidence interval contains 10oz--their ideal weight--they'll assume their machine is fine. Decisions based on their confidence intervals will lead them to call a repairman unnecessarily only about 5% of the time.
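The chocolate cake arithmetic can be written out in a few lines. This is a minimal sketch, not the episode's own code: the helper name `mean_confidence_interval` is invented here, and it assumes the sample is large enough for the z-based interval to apply. The same helper is also run on the dentist's numbers from earlier.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(mean, sd, n, level=0.95):
    """z-based confidence interval for a mean (assumes a large sample)."""
    # A 95% level leaves 2.5% in each tail, so look up the 97.5th percentile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


# 49 cakes, mean 3,000 calories, standard deviation 500 calories
cake_low, cake_high = mean_confidence_interval(3000, 500, 49)
print(f"cakes: {cake_low:,.0f} to {cake_high:,.0f} calories")

# 100 patients, mean 3 cavities, standard deviation 0.5 cavities
dentist_low, dentist_high = mean_confidence_interval(3, 0.5, 100)
print(f"cavities: {dentist_low:.3f} to {dentist_high:.3f}")
```

The z-score from `inv_cdf` is the table value of 1.96 carried to more decimal places, so the results agree with the hand calculation after rounding.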
Many researchers use confidence intervals to see if they contain a certain value of interest. A researcher may want to know if, say, a certain number of calories in cake is plausible. If that value were to fall within their CI it would seem possible, but it's not possible to rule it out even if it's outside the interval, because you don't know if you got one of the 95% of CIs that contain the true mean or one of the 5% that don't. You don't always need to use a confidence interval of 95%; we can calculate other confidence intervals too. You can calculate a 99% confidence interval, or really any percentage confidence interval. But if you try to calculate a 100% confidence interval, it'll always be negative infinity to positive infinity, which just shows that the larger you want your confidence percentage to be, the wider your interval will be. You can be more hopeful that your confidence interval contains the true population mean, but it's not going to be that helpful. So there's a balancing act going on. You want a confidence interval that's narrow enough to be useful, but wide enough that the true population mean will usually be inside a confidence interval of that percent. We can't always have large samples. It's often the case that there's not enough time or money to collect hundreds of data points to calculate a confidence interval. With small sample sizes, the distribution of sample means isn't always exactly normal, so we often use a t-distribution instead of a z-distribution to find out where the middle 95% of our data is. The t-distribution, like the z-distribution, is a continuous probability distribution that's unimodal. It's a useful way to represent sampling distributions. The t-distribution changes its shape according to how much information there is. With small sample sizes there's less information so the t-distribution has thicker tails to represent that our estimates are more uncertain when there's not much data.
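The balancing act between confidence percentage and interval width, described a moment ago, shows up clearly in numbers. As a rough sketch, the block below reuses the chocolate cake standard error (500 divided by the square root of 49) and prints how wide a z-based interval would be at several confidence levels. The helper name `interval_width` and the particular levels are illustrative choices, not something from the episode.

```python
from statistics import NormalDist


def interval_width(standard_error, level):
    """Total width of a z-based confidence interval at the given level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return 2 * z * standard_error


# Standard error from the chocolate cake example
cake_se = 500 / 49 ** 0.5

# Illustrative confidence levels; any value strictly between 0 and 1 works.
widths = {
    level: interval_width(cake_se, level)
    for level in (0.80, 0.90, 0.95, 0.99, 0.999)
}

for level, width in widths.items():
    print(f"{level:.1%} interval: about {width:,.0f} calories wide")
```

A level of exactly 100% is left out because its cutoff would be infinite, which matches the negative-infinity-to-positive-infinity interval mentioned above.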
However, as we get more and more data, the t-distribution becomes practically identical to the z-distribution. Generally, sample sizes that are greater than 30 are considered “large enough” because scientists generally believe that sampling distributions where the sample is 30+ are close enough to normal...though 30 is an arbitrary cutoff just like 0.05. However, when we're estimating population proportions, like the proportion of people who are colorblind, the general rule is that your sample size needs to be big enough so that on average, you'd expect to get at least 10 colorblind, and at least 10 non-colorblind people. For similar reasons, most people consider that “close enough”. Since about 8% of males are colorblind, if I only had a sample of 50 males, on average I'd expect around 4 of them to be colorblind, so my sample size wouldn't be quite big enough to assume it's normal. Instead I'd use the almost normal t-distribution. If a drug that's being developed claimed to reduce the proportion of colorblind males born to mothers who took it, we could take a sample of 50 male infants to see if the proportion of colorblindness is different from 8%. Though colorblindness isn't usually life threatening, it can be inconvenient, so you decide to calculate a confidence interval to see if it's likely to be effective. After randomly selecting 50 male infants from mothers who took the drug, you calculate the sample proportion of colorblind infants, which is 6%, and calculate the distribution of sample proportions which has a mean of 6%--the same as the sample proportion--and a standard error of 0.033. Since our sample size isn't big enough to assume that the distribution of sample proportions is shaped like the z-distribution, we can use the t-distribution to calculate the range of our 95% confidence interval. I mentioned before that the t-distribution's shape changes with how much data we have.
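To see how the t-distribution's shape depends on the amount of data, we can compute its 97.5th percentile cutoff for a few sample sizes. This is a rough numerical sketch built on assumptions: it uses “degrees of freedom” equal to the sample size minus 1, a common convention the episode has not introduced yet, and it integrates the t density numerically rather than calling a statistics package. The function names are made up for illustration.

```python
import math


def t_pdf(x, df):
    """Density of the t-distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] and symmetry."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(level, df):
    """Positive cutoff that leaves (1 - level) / 2 in each tail."""
    target = 0.5 + level / 2
    low, high = 0.0, 50.0
    for _ in range(50):  # bisection search for the cutoff
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


# Assumed convention: degrees of freedom = sample size - 1
critical_values = {n: t_critical(0.95, n - 1) for n in (5, 10, 30, 50, 1000)}
for n, value in critical_values.items():
    print(f"sample size {n:>4}: 95% cutoffs at plus or minus {value:.3f}")
```

With only a handful of observations the cutoffs sit well outside the z-distribution's 1.96, and they shrink toward it as the sample grows, which is the thicker-tails behavior described above.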
We'll talk more in detail later as to how to choose the right t-distribution, but for now, we'll use this one: While t-score tables do exist, it's often easier to have a statistical program calculate the t-values that correspond to the 2.5th and 97.5th percentiles, since there are many different t-distributions. Your computer tells you that the t-values corresponding to those percentiles are -2.01 and 2.01. And to convert to a raw score from a t-score, we again use this formula, just with a t-score instead of a z-score. Our confidence interval for proportion of colorblind males is -0.63% to 12.63%. 8% is inside our confidence interval, so it's not too much of a stretch to think that 8% could be the true population proportion, even though we only observed a sample proportion of 6%. Based on this confidence interval we don't have any evidence to conclude whether this medicine is effective or not. So since the company researching the drug is pretty cautious, they decide not to go ahead with it. One place you may have seen confidence intervals “in the wild” is in the news during election season. When newscasters report results from exit polls they'll usually say something like “Candidate A is tracking at 64%, with a margin of error of 3%” or you may see a chart like this: The margin of error is usually telling you how far the bounds of the confidence interval are from the mean, and is represented by this part of the confidence interval formula: The margin of error, just like a confidence interval, reflects the uncertainty that surrounds sample estimates of parameters like the mean or a proportion. If a poll shows that a Presidential candidate is tracking at 64% of the vote, plus or minus 3%, we shouldn't be surprised if it turns out that the true vote was 61%, since that's within the margin of error. You can think of values inside the margin of error or confidence interval as values that might be reasonable estimates of the true population parameter.
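Here is the colorblindness calculation as a short sketch, using only the numbers quoted above: a sample proportion of 6%, a standard error of 0.033, and a t-value of 2.01. The helper name `interval_from_margin` is invented for illustration. The margin of error is the t-value multiplied by the standard error.

```python
def interval_from_margin(estimate, critical_value, standard_error):
    """Return (low, high, margin) for estimate plus or minus the margin."""
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin, margin


# Figures quoted in the episode for the colorblindness example
low, high, margin = interval_from_margin(0.06, 2.01, 0.033)
print(f"margin of error: {margin:.2%}")
print(f"95% confidence interval: {low:.2%} to {high:.2%}")
print("8% is inside the interval:", low <= 0.08 <= high)

# Exit poll example, in whole percentage points: 64% plus or minus 3%
poll_estimate, poll_margin = 64, 3
poll_low, poll_high = poll_estimate - poll_margin, poll_estimate + poll_margin
print("61% is within the poll's margin:", poll_low <= 61 <= poll_high)
```

With these inputs the interval runs from about -0.63% to 12.63%. The lower bound falls below zero, which a real proportion can't do; that's a side effect of adding and subtracting the same margin around a small estimate.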
Confidence intervals quantify our uncertainty. They also demonstrate the tradeoff of accuracy for precision. A 100% confidence interval will always contain the true population mean, but it's useless. We have to sacrifice a little bit of accuracy in order to gain more precision. A 99% confidence interval will give us a more useful range since it won't be infinitely long...but it's now possible that our confidence interval won't contain the true mean. And you've probably encountered this tradeoff in your daily life. Say you're running a marathon (like everybody does) and you want to load up your iPhone with music, but you don't know how long you're going to take. You could buy 150 songs on iTunes, which is expensive, or you could buy only 70 and have a chance of running out of music. You increase your risk of not having enough, but then again you're saving yourself from having to buy 80 extra songs… Maybe it's time for a streaming service? Confidence intervals demonstrate this delicate balancing act...and help us understand how to hit the sweet spot of information vs. accuracy. Thanks for watching, I'll see you next time in my gold lamé pants.