Placeholder Image

字幕列表 影片播放

  • The following content is provided under a Creative

  • Commons license.

  • Your support will help MIT OpenCourseWare

  • continue to offer high-quality educational resources for free.

  • To make a donation or to view additional materials

  • from hundreds of MIT courses, visit MIT OpenCourseWare

  • at ocw.mit.edu.

  • CHARLES E. LEISERSON: OK, let's get started.

  • So today we're going to talk about measurement and timing.

  • And I want to start out by just showing you a study

  • that one of my students did--

  • actually, at that point, he was a former student--

  • whereas timing a code for sorting.

  • So here's the code.

  • This isn't exactly his code, but it's in the same spirit

  • so that you get the idea.

  • And so let's just run through this code

  • and take a look to see what it's doing.

  • It's pretty straightforward.

  • We're going to use the time.h header file to get access

  • to the clock_gettime() routine, which is going to be what we

  • used to get timing measurements.

  • And then we have a sorting routine

  • that we're going to time.

  • That I'm not showing you.

  • And there is also a fill routine,

  • which is going to fill up the array with numbers--

  • with random numbers, so we have something to sort.

  • And the clock_gettime() uses a struct that is defined here,

  • and so I'm defining two timing structs-- a start and an end.

  • So this is just absolute boilerplate setting up

  • for taking timing measurements.

  • And in this case, basically the high order part of the struct

  • tells the seconds, the lower part tells the nanoseconds.

  • And then we're going to loop over

  • arrays of increasing length.

  • And then what we're going to do is fill them up--

  • oh, I forgot the fill-- and then we're going to measure how much

  • time--

  • what the time is just before we sort.

  • Then we're going to sort, and then we're

  • going to measure the time after sorting.

  • And then we compute the difference,

  • and figure out what the elapsed time is, print that out,

  • and then we do it again for a little bit larger array.

  • So is that clear what's going on?

  • So we're just sorting a bunch of numbers,

  • then we're sorting some bigger ones,

  • sorting some bigger ones, so we can

  • see what the growth of the sorting routine should be.

  • People have a pretty good understanding

  • of what the code does?

  • OK, so what do we expect to see?

  • What's this curve going to look like?

  • What are some properties of it?

  • Yep?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: So micro is n log n,

  • but it's certainly going to grow, right?

  • It should be up and to the right.

  • In fact, one rule, if you ever get into marketing,

  • is that all graphs must go up and to the right.

  • If they're going down and to the right,

  • then your company's in trouble.

  • So it should be going up and to the right,

  • and it should follow, for example, n log n,

  • if it's an n log n sort, which is what this one was.

  • I think he was, in this case, timing a merge sort.

  • They should go up and to the right,

  • and should follow n log n, or whatever.

  • So let's see what actually happened

  • when he took the measurements.

  • This is actually his data from--

  • gosh, this must have been 20 years ago or something.

  • Here's what it looked like.

  • So the blue Xs there are the runtimes.

  • And then through that, we've plotted two curves,

  • one which is the best fit to order n

  • log n growth, and the best fit to order n growth.

  • You notice that for--

  • even though we're going up to 4 million here,

  • there's not that much difference between n log n and order n.

  • You can see it mostly down in the tails.

  • Definitely the n log n follows a little bit better,

  • but really, log n is pretty small already.

  • But wow, those measured times--

  • so if you look, there are points way up here--

  • really slow.

  • It starts out-- it goes slow a little bit,

  • and then it gets a little bit worse, and then a little bit

  • worse, and a little bit worse.

  • Notice also that the bumps are getting closer and closer

  • together.

  • What is going on?

  • Why?

  • I don't know about you, but I thought

  • the data would follow the green dots reasonably closely.

  • But you can see it doesn't.

  • It's always good to have a model for what

  • you think is going on because then, when you--

  • because some people will just take numbers.

  • They'll say, here's my numbers for my--

  • that I've measured.

  • And if you don't actually have a model

  • for what those numbers mean, you're

  • probably fooling yourself.

  • You're more likely to have made some sort of error,

  • or there's something going on that you're not

  • observing, or whatever, if you don't actually

  • have a model for what you think should be going on.

  • So what's going on here?

  • Who can suggest a hypothesis for what is going on?

  • So he took these numbers on his laptop, by the way.

  • What do you suppose is happening here?

  • Some ideas.

  • Yeah?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: Maybe it doesn't fit in the cache.

  • What would you expect to happen if things

  • didn't fit in the cache?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: Yeah, you sort of

  • think that it would go along, and then it would jump.

  • So interesting issue, but that doesn't seem to be what's

  • happening there.

  • It's going up, then it's going back down.

  • It's going up and going back down--

  • roller coaster.

  • What other ideas are there?

  • Good idea.

  • Good idea.

  • Let's think a little bit about what's going on in the machine.

  • What are some other good ideas?

  • Or bad ideas?

  • Let's eliminate some.

  • Yeah.

  • AUDIENCE: They're not powers of 2.

  • CHARLES E. LEISERSON: They're not powers of 2.

  • These are not powers of 2, right?

  • Because they're getting closer and closer as

  • we get bigger and bigger.

  • Yeah, so you're right.

  • It's not correlated with powers of 2.

  • Weird.

  • Because sometimes things are alignment issues,

  • and we'll talk more about that.

  • It will come up when we talk about caching after the quiz.

  • Everybody knows there's a quiz next time--

  • especially all of you who aren't here?

  • OK, so what else might be going on in the machine here?

  • Because this is reality.

  • This is what happens when you take measurements.

  • So we're being very nice to you.

  • We're giving you AWS run.

  • We have done everything we can to make

  • sure those numbers come out clean, and beautiful,

  • and untouched.

  • There they are.

  • That is quality measurements we're taking for you.

  • But if you had to do it yourself,

  • that's what this lecture, in part, is about.

  • Yeah?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: So you think

  • that there may be something having to do with the cache.

  • But I'm going through each time and I'm refilling the array

  • each time, so they're kind of starting from a clean slate--

  • similar clean slate each time.

  • What else is going on in the machine here?

  • Yeah?

  • AUDIENCE: [INAUDIBLE] totally unrelated stuff [INAUDIBLE]

  • CHARLES E. LEISERSON: Yeah, there

  • could be totally unrelated stuff running.

  • You might have daemons, you might have all kinds of things,

  • and so forth.

  • So he thought of that, and he shut down

  • everything he possibly could.

  • And this is what he got still.

  • But that's a great idea because often, there's

  • some external things going on.

  • In this case, it's called multi-tenancy.

  • There's more than one thing using the computer at a time.

  • Good idea, but happens not to be the one.

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: Could be precision with the timing.

  • Yeah, sometimes there can be issues there,

  • but this was not a precision issue.

  • He could have used a really dumb timer

  • and gotten something very similar to this.

  • What else is going on your machine?

  • Yeah?

  • AUDIENCE: Maybe his machine's checking

  • for updates every minute.

  • CHARLES E. LEISERSON: Yeah, maybe it's

  • checking for updates.

  • That's once again some external things.

  • No, it wasn't checking for updates.

  • Wasn't checking for updates.

  • What is going on here?

  • What is going on?

  • Let's have some more ideas.

  • What other things might disrupt measurements?

  • Yeah?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: Yeah.

  • This was actually merge sort he was timing,

  • so there's no randomization.

  • But even that, if it were quick sort,

  • it'd be at random that things would tend to take longer,

  • rather than following this crazy pattern.

  • What is causing that crazy pattern?

  • Yeah?

  • AUDIENCE: Does the random fill have to do with the time?

  • CHARLES E. LEISERSON: No, because the random fill

  • is done outside the timer.

  • Each time through the loop, we fill,

  • and then we start the timer, and then

  • we take the measurement, and so forth.

  • Yeah?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: It's not allocating memory.

  • But that's an interesting idea, because sometimes you

  • have things going on where you think things are happening

  • right away, but the system is being clever and delaying it.

  • And so you end up paying for it at some later time,

  • and that could possibly create something.

  • Turns out not to be what's going on here.

  • So what's happening here is that the machine is

  • changing the clock frequency.

  • Why is the machine changing the clock frequency?

  • Your laptops change the--

  • the systems that we have, they change clock frequency.

  • Why do they change it?

  • AUDIENCE: [INAUDIBLE]

  • CHARLES E. LEISERSON: Because the laptop is getting hot.

  • So what do are doing?

  • We're running something computational.

  • And the smaller ones--

  • OK, we get a lot of those done, until it starts heating up,

  • and so it slows down the system clock to save power.

  • OK, and then what happens?

  • Slows it down a little bit, cools off a little bit,

  • starts to speed up again.

  • And then we run longer.

  • And why are these things getting closer and closer together?

  • Yeah?

  • AUDIENCE: Takes longer and longer to run the sorts.

  • CHARLES E. LEISERSON: Yeah, it takes longer and longer

  • to run the sorts, so you're going

  • to see the effect closer in an interval

  • here, even if it happened to be equal in time.

  • Even if it was equal in time, we're

  • doing bigger and bigger problems.

  • This is nuts, right?

  • We want to take measurements.

  • We want to know whether the software is faster.

  • What are you supposed to do?

  • So here, if you just took a measurement and said,