  • Hi, I’m Adriene Hill, and this is Crash Course Statistics.

  • Welcome to a world of probabilities, paradoxes and p-values.

  • There will be games.

  • And thought experiments.

  • And coin flipping.

  • A lot of coin flipping.

  • Statisticians love to talk about coin flipping.

  • By the time we finish the course, you’ll know why we use statistics.

  • And how.

  • And what questions you ought to be asking when you run across statistics in the world.

  • Which is ALL THE TIME.

  • Statistics can help you make a guess whether or not you’re going to be accepted to Harvard.

  • Marketers use them to sell us gold-lamé pants.

  • Netflix uses stats to predict what show we might want to watch next.

  • You use statistics when you look at the weather forecast and decide what to wear--dress or jeans.

  • Policy makers use them to decide whether or not to invest in more early childhood education,

  • whether or not to spend more on mental health services.

  • Statistics is all about making sense of data--and figuring out how to put that information to use.

  • Today, we’re going to answer the question “What IS Statistics?”

  • INTRO

  • The legend says that during a late-1920s English tea at Cambridge, a woman claimed

  • that a cup of tea with milk added last tasted different than tea where the milk was added first.

  • The brilliant minds of the day immediately began to think of ways to test her claim.

  • They organized eight cups of tea in all sorts of patterns to see if she really could tell

  • the difference between the milk first and tea first cups.

  • But even after they had seen her guesses, how could they really decide?

  • Because, she’d get about half the cups right just by randomly guessing either milk or tea.

  • And even if she really could tell the difference, it’s completely possible that she would

  • miss a cup or two.

  • So how could you tell if this woman was actually a tea-savant?

  • What is the line between lucky tea guesser and tea supertaster?
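
The "lucky guesser" side of that line can be put into numbers. What follows is a minimal sketch, not Fisher's own analysis: it assumes each of the eight cups is an independent 50/50 guess, as the transcript describes, and the function name `prob_at_least` is made up for illustration.

```python
from math import comb

def prob_at_least(k, n=8, p=0.5):
    """Chance of getting at least k of n cups right when every cup
    is an independent guess that is correct with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# How surprising is each score if she is only guessing?
for k in range(4, 9):
    print(f"P(at least {k} of 8 correct by guessing) = {prob_at_least(k):.4f}")
```

Under this assumption a perfect score happens by luck about 1 time in 256, while four or more correct is entirely ordinary. A different design, such as telling her in advance how many cups of each kind there are, would change these numbers.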

  • As fate would have it, future super-statistician and part time potato scientist Ronald A. Fisher

  • was in attendance.

  • During his lifetime, Fisher began work that set the stage for a large portion of Statistics

  • which is the focus of this series.

  • These statistics can help us make decisions in uncertain situations, tea-taste-tests and beyond.

  • Fisher’s insights into experimental design helped turn statistics into its own scientific

  • discipline.

  • And, although Fisher didn’t publish results of this tea-test...the story has it...the

  • woman sorted all the tea cups correctly.

  • Just in case you were curious.

  • At this point, it’s worth mentioning that there are two related--but separate--meanings

  • of the word statistics.

  • We can refer to the field of statistics... which is the study and practice of collecting

  • and analyzing data.

  • And we can talk about statistics as in facts about... or summaries... of data.

  • To answer the question “What is statistics?”, we should first...

  • ...ask the question “What can statistics do?”

  • Let's say you wake up at your desk after a long evening studying for finals with a cheeseburger

  • wrapper stuck to your face.

  • And you wonder... "why do I eat this stuff?

  • Is fast food controlling my life?"

  • But then you tell yourself, "No.

  • It's just super convenient."

  • But you're worried, you're thinking about how great it is that McDonald's serves breakfast

  • all day RIGHT NOW.

  • But maybe that's normal, finals are this week after all, so you google the question “Fast

  • Food consumption” and you find the results of a fast food survey.

  • The first thing you might do is start asking questions that interest you.

  • For example, you could ask, Why do people eat fast food?

  • Do people eat more fast food on the weekend than on weekdays?

  • Does eating fast food stress me out?

  • Now that we have some interesting questions, we need to ask ourselves an even more important

  • one: Can these questions be answered by statistics?

  • Like I mentioned earlier, statistics are tools for us to use, but they can’t do all the

  • heavy lifting.

  • To answer the question about why people eat fast food, you can ask them to fill out a

  • questionnaire, but you can’t know whether their answers truly represent what they’re

  • thinking.

  • Maybe they answer dishonestly because they don’t want to admit that they scarf McDonald’s

  • because they’re too tired to cook dinner, or because they are ashamed to admit they

  • think Del Taco is delicious, or because none of the given answers represented their reasons,

  • or they may not really know why they eat fast food.

  • Armed with the results of the survey, you could say that the most common reason

  • that people reported eating fast food was convenience, or that the average number of

  • meals they eat out each week is five.

  • But you’re not truly measuring why people eat so much fast food.

  • You’re measuring what we call a “proxy”, something that is related to what we want

  • to measure, but isn’t exactly what we want to measure.

  • To answer whether people eat more fast food on the weekends, or whether eating it more

  • than twice a week increases stress, we’d not only need to know how much people are

  • eating fast food, which our questionnaire asked, but also which days they eat it.

  • And we’d need an additional measure of “stress”.

  • You can use statistics to give a good answer about whether you’re going through the drive-thru

  • more on the weekend, but even the question of whether eating fast food is associated

  • with higher levels of stress is hard to answer directly.

  • What is stress and how can we measure it?

  • And are people eating fast food because they are stressed?

  • Or does eating all those calories make them stressed?

  • It’s often the case that some of the most interesting questions are the ones that can’t

  • be directly answered by statistics--like why people eat fast food.

  • Instead we find questions that we can answer--like whether people who eat fast food often

  • work more than eighty hours a week.

  • The tools we use to answer these questions are statistics--plural--and there are two main

  • types: Descriptive and Inferential.

  • Descriptive statistics, well... they describe what the data show!

  • Descriptive statistics usually include things like where the middle of the data is--what

  • statisticians call measures of central tendency--and measures of how spread out the data are.

  • They take huge amounts of information that may not make much intuitive sense to us, and

  • compress and summarize them to ...hopefully... give us more useful information.

  • Let’s go to the Thought Bubble.

  • You’ve been working for two years in the local waffle factory.

  • Day in and day out, you create the golden-browny-iest, tastiest frozen waffles ever created.

  • The holes are perfectly spaced.

  • Screaming for syrup.

  • And now you want a raise.

  • You deserve a raise.

  • No one can make a waffle as well as you can.

  • But how much do you ask for?

  • An extra thousand dollars?

  • An extra 5-thousand dollars?

  • You know you’re valuable, but have no idea what other waffle makers get paid.

  • So you dig around online and find there’s an entire subreddit devoted to waffle makers.

  • And someone with the username “waffleleaks” has posted a spreadsheet of waffle maker salaries.

  • Now with a quick glance at this huge list of numbers, you can see whether the woman

  • who works a similar job at the rival frozen waffle company makes more than you.

  • You can see how much more you are making than the new guy, who’s just now learning to

  • mix batter.

  • But you still don’t know much about the paychecks of your waffle company as a whole.

  • Or the industry.

  • Cause it turns out there are thousands of waffle makers out there.

  • And all you see is a list with data points, not patterns that can help you learn more

  • about how much you might be able to convince the boss to pay you.

  • Here is where descriptive statistics come in.

  • You could calculate the average salary at your company as well as how spread out everyone’s

  • salaries are around that average.

  • You’d be able to see whether the CEO’s paychecks are relatively close to the entry-level

  • batter makers, or incredibly far away.

  • And how your salary compares to both of their salaries.

  • You could calculate the average salary of everyone in the industry with your job title.

  • And see the high and low end of that pay.

  • And then, armed with those descriptive statistics, you could confidently walk into the waffle

  • boss’s office and demand to be paid for your talents.

  • Thanks, Thought Bubble.
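
The descriptive statistics from the waffle example can be sketched in a few lines. This is only an illustration: the salary figures below are made up, since the transcript does not give the actual spreadsheet, and `describe` is a hypothetical helper name.

```python
import statistics

# Made-up salaries for illustration only; the last value stands in for a CEO.
salaries = [31000, 33500, 36000, 38000, 41000, 45000, 250000]

def describe(values):
    """Summarize a list of numbers with measures of center and spread."""
    return {
        "mean": statistics.mean(values),      # the average
        "median": statistics.median(values),  # the middle value
        "stdev": statistics.stdev(values),    # sample standard deviation (spread)
        "min": min(values),                   # low end of the pay range
        "max": max(values),                   # high end of the pay range
    }

summary = describe(salaries)
for name, value in summary.items():
    print(f"{name}: {value:,.2f}")
```

In this made-up list the single very large salary pulls the mean well above the median, which is one reason to look at more than one measure of the middle before asking for a raise.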

  • While descriptive statistics can be great, they only tell us the basics.

  • Inferential statistics allow us to make... inferences.

  • (Clever namers, those statisticians.)

  • Inferential statistics allow us to make conclusions that extend beyond the data we have in hand.

  • Imagine you have a candy barrel full of salt water taffy.

  • Some pink, some white, some yellow.

  • If you wanted to know how many of each color you have, you could count them.

  • One by one by one.

  • That’d give you a set of descriptive statistics.

  • But who has time for all that?

  • Or, you could grab a giant handful of taffy, and count just those you have pulled out,

  • which would be using descriptive statistics.

  • If your candy was, in fact, mixed pretty evenly throughout the barrel, and you got a big enough

  • handful, you could use inferential statistics on that “sample” to estimate the content

  • of the entire taffy stash.
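
The handful-to-barrel step can be sketched as a simple scale-up of the sample proportions. This assumes, as the transcript does, that the barrel is well mixed; the barrel contents and the function name `estimate_counts` are invented for the example.

```python
import random
from collections import Counter

def estimate_counts(sample, barrel_size):
    """Scale the color proportions seen in a sample up to the whole barrel."""
    counts = Counter(sample)
    n = len(sample)
    return {color: barrel_size * c / n for color, c in counts.items()}

# A made-up, well-mixed barrel, used only to try the estimate out.
rng = random.Random(0)
barrel = ["pink"] * 500 + ["white"] * 300 + ["yellow"] * 200
rng.shuffle(barrel)

handful = rng.sample(barrel, 100)  # the "giant handful"
print(estimate_counts(handful, len(barrel)))
```

Counting the handful is descriptive; treating those proportions as an estimate for the whole barrel is the inferential step, and a larger handful will usually land closer to the true counts.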

  • We ask inferential statistics to do all sorts of much more complicated work for us.

  • Inferential statistics let us test an idea or a hypothesis.

  • Like answering whether people in the US under the age of 30 eat more fast food than people

  • over 30.

  • We don’t survey EVERY person to answer that question.

  • Let’s say someone tells you that their new brain vitamin--Smartie-vite--improves your

  • IQ.

  • Do you rush out and buy it?

  • What if they told you that the average IQ increase for Group A--twenty people who took

  • Smartie-vite for a month--was two IQ points, and the average IQ increase for Group B--twenty

  • people who took nothing--was one IQ point.

  • How about now?

  • Still not sure?

  • It is a pretty small difference, right?

  • Inferential statistics give you the ability to test how likely it is that the two populations

  • we sampled actually have different IQ increases.
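
One way such a test can work is a permutation test, sketched below. This is an assumption-laden illustration rather than the method the series goes on to teach: the individual IQ changes are made up so that the group averages come out to two and one, and `permutation_p_value` is a hypothetical name.

```python
import random
from statistics import mean

def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Estimate how often shuffling the group labels produces a gap in
    means at least as large as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # pretend the labels are arbitrary
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Made-up IQ changes: Group A averages 2 points, Group B averages 1 point.
group_a = [5, -3, 2, 8, 0, 4, -1, 6, 3, -2, 1, 7, -4, 2, 5, 0, 3, -1, 4, 1]
group_b = [2, -4, 1, 6, -1, 3, -2, 5, 0, -3, 2, 4, -5, 1, 3, 0, 2, -1, 4, 3]

print(permutation_p_value(group_a, group_b))
```

If a large share of the shuffles produce a gap as big as the observed one, the one-point difference is easy to get by chance alone, and it is weak evidence that Smartie-vite does anything.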