Placeholder Image

字幕列表 影片播放

  • When alex did a video on linked list there's a lot of comments on it it sort of thing

  • I want to use a linked list arrays a factor

  • What's the point of reading this and so I was sort of thinking about is that well actually "Are arrays

  • faster than the linked list or are the linked lists faster than an array, how do we go about finding out?"

  • Which one's faster, so let's have a shoot out arrays versus linked list

  • First question to start is

  • I should just remind ourselves what the difference is between a linked list and

  • An array so let's start with the array so probably the simplest

  • To understand so an array is just a way of storing

  • several

  • values

  • in memory so we start off with the first value and

  • for the purposes of this demonstration, I'm going to use a

  • Structure containing multiple values. We did talk about objects a while ago structures are what you had before

  • We saw this object which is a collection of values in memories and store to integer values

  • So this is array element zero and it's got two numbers. We then got array element one

  • And it's got two numbers

  • Looks like I'm playing break out got a great element

  • To and it's got two numbers and so we can refer to any of these

  • by

  • reference to the Index starts at 0 3 4 5 6

  • 7 10 so you can have an array of any size we have to know how big it is if you wanted to be when

  • You create so you allocate the memory for all the things and then?

  • You can store some values so we can write 42 in this one. We can put 21 in here

  • I'm going to put 3 in that one and we can store them or we can put different values in here as

  • Well as we need them so that's how an array works. Which is one big block of memory each thing stored one after the other

  • linked list works slightly different we start off at the beginning of

  • the linked list and it points at nothing but then as we add things to the list we

  • Allocate the memory for it. We're going to store two things so the first thing we'll label these things inside it p

  • And Q so it's a lot p here

  • And we've got q but we also have an extra thing that tells us where the next they will be and that's empty, too

  • So it's all put 42 and 16 so we just allocate space for that one

  • Structure that one thing that we're going to store and we connect it up

  • So we point this at the first thing so we often call this something like first

  • And we see Alex videos to see how that have done

  • So when we want to store the next thing we allocate explai for another one

  • And we can store the value, so we've still got 8

  • 21 in it now the difference between this could be anywhere else in memory, and we connect to the next pointer

  • to point in

  • To the next thing so we have some known starting point

  • which points to the first thing and the first thing 22 the second thing and the second thing would point to the third thing and

  • So on and that would point down and these could be all over

  • anywhere

  • In the computer's memory and then at the end you normally point it at some known value

  • Which is often referred to as null and is often the number zero?

  • So you can then walk through your list

  • And when you find at the next point of zero or null you know you've come to the end of it

  • There are two different ways of storing data the best or the same amount of data now

  • He's got some differences this one as I says it's fixed inside, and you create it you can store

  • How many things you know I've actually filled in but as soon as you get more than eleven since final top array?

  • You can't go past eleven if you can see yeah the numbers all go to

  • 11 on this however if we wanted to put something else thing we can just get rid of this thing

  • And update this to point

  • To a new thing so we can make this grow

  • Quite easily, and we can also be really clever

  • We can add things in the middle and the beginning very very simply, but we're going to go into that today

  • So creating them. We're probably only going to do once but which one is actually faster

  • Well, let's think about what actually to do. Let's define a very very simple problem

  • We've created the structure here, and it's got two things in it

  • It's got an integer called p and an integer called Q so saving the linked list and in

  • The array let's consider an operation where we're just going to go through all the elements in our array

  • Or all the elements in our linked list and sum the value of p are going to add up all the value

  • So we're going to add up. What's in array

  • Element Zeroes P in array of them at once P and so on so we're going to effectively do

  • Something's going to add them all up. We'll do the same with the linked list as well

  • So we're going to set what the first thing is to point to p

  • Now that one. No. There's no time out of its value of p the next one and so on and we'll add them all up

  • Why are we using missiles they want to the unique thing about this is

  • That it's going to visit every single element in either the array or every single element in the linked list

  • so we're going to visit every single element if you only have to visit one element then the very nature of

  • Random access into array means as we fast

  • There's no way you can make any difference unless it's the first element if you only one to ever access the first element

  • Then look at the same speed we fuel directly to random element

  • Then the array or win hands down you can just think about how it works that will be the case. We're going to consider

  • We want to access every single element and do something to it and in this case

  • You're just going to sum up the values, but it could be if things were representing say windows on the screen

  • We want to move them around we want to redraw them

  • Whatever it is will then do something to every single wall so we sort of set out the problem?

  • I'm going to go over an array of values and a linked list of values

  • I'm going to in sum in the visits all and we do actually add up all the numbers

  • Stored in the element P. Which is part of the structure so what I've done is I've written

  • To see poems which are more or less identical

  • I'm going to create

  • 125,000 elements in my array

  • or mailing list and in the linked list I've got a structure which contains an integer p and an integer Q and

  • in the array version

  • I've got a structure which contains an integer p and interested Q in the array item

  • The only difference is that the linked list has one extra item which is the next pointer to the next thing?

  • So very listen see so hopefully we'll go as fast as we possibly can most of the rest of the program

  • is

  • identical

  • so I have my main Routine here which creates a

  • List or an array of random number generator?

  • 125,000 random numbers which is a slowest part of the program you'll see and then store them

  • So allocate memory for each element in the linked list and then we set the value

  • Everything else will just leave with 0 and we do the same with the array just as we did on the paper

  • We then run the program

  • 100 times and all going to here is time

  • How long it takes to run a function which adds afforda numbers either in the list or in the array?

  • I will do that 100 times

  • And then we can work out the average time it takes to calculate one and we'll print out the values as we go

  • So we're just using the built-in clock function in C. Which should be accurate accurate enough for what we want to do

  • so the real meat the body of the program is this function here some list or

  • Some array

  • Let's start with the array one because possibly the simplest we said a variable to be zero that's going to be asked some

  • You have another variable

  • I which is going to be our index through the array and we go from I equals zero

  • I first element until we get to the end the number of elements in there

  • and we just say this sum equals the value that was already in sum plus the value in the ice element in the array a

  • P element

  • Remember we said the array had a p and a Q in it we want the value stored in the p space within the structure

  • in our array, so I'm just going to add get them together the list version very very similar we set the someone to be 0

  • We say while p does not equal null

  • So why we haven't reached the end of the list and we set p initially to be the first thing in the linked list?

  • So pointing out that first thing sum equals sum plus

  • P inside p. We using P inside p

  • Confusing variable names, but then I'm asleep programmer, and then we set p ir pointer to point to the next thing

  • So we follow the link to the next thing in the linked list and we keep doing that until we come to the end

  • There are two programs as simple as you get in terms of implementing a linked list or on array

  • so we're going to do is run them and

  • We should get some values output

  • But we're not going to run them on the I max we're not going to run them on the raspberry Pi

  • we're actually going to run them on the atari behind me which means I need a very high Peak high-tech piece of

  • Equipment, so I need a floppy disk to transfer other programs from the iMac

  • Go topping them off the iMac, and then we can run them on the AtAri

  • so go over to the Atari and

  • Assurance that it's cameras and spot. We can see it. So what I'm gonna do at first is they got to run

  • the list first also going to generate

  • 125,000

  • linked list values to let it Run

  • Right while it loads off the floppy disk

  • that's what it used to be like back in the 80s and 90s you had to wait one you crave them ran a

  • While a

  • long While

  • And this is just a simple program

  • So there is good

  • So it's not

  • Initializing the link list it's allocating all the memory for the different elements and then putting a random number

  • In the P value of each one, so we'll let that run

  • In the words of the old Hobbit computer game time passes or in the words of every apple of this

  • secret short

  • How long does this take?

  • A while lots of oil, I don't know another time this bit

  • 125,000 times however long it takes to allocate a random number and allocate some memory

  • On the clock speed of what this is actually an 8 MegAhertz cPU

  • So it's a takes a while the computer loaded off all the code for the program and off the floppy disk into memory

  • And it's now running it to generate the data set that we need then Gonna sum all the values

  • So everything is happening now will happen off memory

  • It's just taking a while to process it

  • But we are recording it in 4k which is a slight overkill. I hope you're getting into that

  • extra bit

  • So the programs going through now and we'll let it run 400 times

  • We'll get an average, but actually looking at the values of the coming up

  • I think we can safely assume that the average is going to end up being

  • 166 clock ticks now when I say qwop takes each of the machines. We're going to look at are going to have

  • Different clocks perms that they use the time things so we can't compare the values directly what we could come read with the seconds

  • But we can certainly see the relative patterns between it will call that

  • 166 clock ticks over to run the linked list version of the program it takes

  • 166 clock ticks

  • So let's now run the array version of the program

  • So now we're going to do exactly the same thing

  • we're going to allocate an array every time tWenty-five thousand elements and

  • populate them with random values because of the way

  • I've written the program

  • It will actually be the same pSeudo random number sequence that the sum should be the same and then we're going to fill

  • each of them up and time how long that takes and we'll they'll see whether an array actually is faster than the linked list or

  • Whether a linked list will be b. Array

  • So that to me looks like that little spinny thing is going faster

  • Yeah, well so and I haven't just upgraded the processor while they're off camera

  • What's actually happening is?

  • When we allocate the space for the linked list we have to allocate space for each element and then for the gym there

  • We allocate space for the next element. We could do that in a different way and speed things up, but we did it

  • That's the classic way with the array. We allocate the whole lot in one go, so we all take one

  • Huge block of memory under 25,000 times eight is about one meg's worth of memory

  • So allocate all that in one go, and then we're just going through it filling in all the values setting all the values

  • So that should be quicker

  • and it seems and spinning wheel is going slightly faster what why would that Spin wheel very fast can you just learn how that

  • You can read it if you've heard this thing you it yeah?

  • So to give some nice sort of visual Feedback while this is running

  • Well, I'm just a plain white screen what I'm doing is I'm printing every hundred element ideals

  • I update that thing and just printing either a dash a

  • Vertical line or one of the two slashes to make it look like it's getting round and those are printing it it creates a little

  • Effect which is see that something's happened and the computer hasn't crashed

  • So this should be faster, and we should get a value so is now working and the immediate thing we see is

  • That the array is taking about

  • 179 or

  • 178 ticks it's the same quad kick so we can compare these two values

  • the Array is

  • slower than a linked list I

  • Know down, okay? All right now. So you can't argue in - okay, so the numbers have come in the computers around

  • 170 angular call is to that 170 8.5. Always for the final average to come in but for that. So they're roughly comparable

  • Basically there's not much it sort of thing, but there's something about 200 qwop ticks per second on this machine

  • It's is probably a noticeable. It's about 10 percent slower

  • Okay, so that's it. I mean we can stop here come a that's it. We know. What's what arrays are slow

  • You wrinkly yep, so arrays are slower than ling Li

  • Gotta get my vendor video yeah and a video you can now bring up the slash computer, so

  • virtually

  • There are two ways you can walk over the Ray

  • We went up the array from the smaller numbers to the larger numbers

  • But we could also start at the larger indexes and move back to the smaller indexes, and I thought well

  • Let's try it both ways let's not have anyone posting some comments saying no, but if you went the other way

  • So let's run it the other way, so I wrote a version which walks down and because we're adding the numbers together

  • We'll get exactly the same answer a plus b equals b. Plus a and all that you can watch numberphile to find out

  • More online. I'm sure

  • You think this would take exactly the same amount of time, so just update our table

  • To do this and we're going to have the reverse array here

  • I'll just wait while it does that

  • So going backwards through the array the only thing can we do the same thing with the linked list

  • Well the way we've designed the linked list we

  • Have a pointer to the next thing in each thing, but we don't have a pointer back to the previous thing

  • It's a singly linked list so we can only move forwards along it

  • We could design it to be a doubly linked list and have a backwards and forwards pointer

  • but if you think about it if we started at the end

  • And move backwards that's going to be exactly the same

  • Operations as if we start at the beginning and before so there's actually no point in testing

  • It'd be exactly the same code. We just be using different offsets into memory so taking exactly the same amount of time

  • While we wait for that to do its thing let's run the same programs on

  • The Imac and see how much faster it is so let's compile them up. I am turning on as I did before all the optimizations

  • For the array and things that'll make it go as fast as poss but some of the comparison to the array

  • Test and I going to compile with the speed test

  • Done, and I'm going to do the same thing

  • for the link

  • Test I'm going to say something really obvious now

  • But it's honestly a heck of a lot quicker this machine just to get compiled up and ready. Yeah

  • Those machines much faster because while I'm compiling everything on here and then transferring it across

  • Rather than trying to get my old C compiler going on the AtAri

  • And also the chances are this will produce better code, so it'll take make the most benefit of the speed

  • We can pile them up, so let's run the linked list test as we did before

  • boom

  • It's done

  • Everything now you notice that took a lot quicker?

  • The numbers are still roughly in the same order still about 100 and around 200 but remember this is a different

  • Clock we cannot compare the ticks from that one - it's on the different numbers, but we get the average here two hundred and nine

  • Point five four I'm acting this is 166 atari ticks

  • They're much slower chicks than the I like ticks we can't compare that to that we can compare

  • horizontally

  • let's do the same now with the array test and

  • Wow

  • When we did the linked list test on the atari it was faster than the array test

  • Roughly take you about ten percent faster

  • Nothing really in it you could say look at the difference on

  • The Imac the IMac takes forty three point four four clock ticks to do the array

  • 209 that's five times

  • Faster for the Array was on the Atari the linked list was faster

  • So a reverse array is now going on

  • the Atari

  • 114 and

  • That's quicker. That's quicker than the original array, and that's quicker than the linked list, so if we do everything backwards

  • Doogie where it's very good if you do everything forwards

  • It's much slower. I'm a bit confused

  • Is there any possibility that's just a problem the code or something always are you going to reveal something to this year?

  • We need to delve a bit deeper here to see what's really going on

  • Remember we wrote these programs in C

  • So we wrote these in the high-level language and then they got compiled down

  • To the instructions that the machine can execute and what's actually happening here. Is that the research shows that the machine can execute?

  • Favor walking backwards down something compared with going further upwards now

  • Let's run that same array backwards

  • Program on the IMac for completing this so let me compile that one up and again. I'm

  • Optimizing everything to the hilt

  • Test and so we'll run speed tests back

  • And we'll run this one so before the arrays now much faster

  • According to this a reversal rate should be faster than the forward array. So what would you expect the value to be shown for?

  • Run on the reverse away about fifteen or twenty directly if it was the same as that did

  • All right

  • Would you like to stick with that answer or would you like to change that oh?

  • nervous excited actually

  • Because the Atari was faster to do a linked list in an array and the iMac was the other way around

  • I'm going to say it's going to be slower to do it in reverse. So you reckon firm supposed to do it in reverse

  • Oh, I do like an indecisive person there. We go average time forty seven

  • point four six

  • So marginally slow marginally slower about 10% slower

  • but still an order of magnitude

  • faster than this we could do the same on the raspberry Pi

  • So again, we'll compile all of these up, so we'll do it array test

  • Compile it up noticeably slower to compile will do the LL test. This is the list version

  • And we'll do the reverse test

  • So compile them all up on the last repair, so let's run these and get the numbers we've run the array test

  • we now get nine 165 point seven five as an average for the array will run the link list and

  • We get one eight five eight

  • Point six one for the linked list test and now we run the reverse test

  • and

  • We get 101

  • 9.5 five so we've run the test now. So we've we've assumed nothing

  • We've ran some tests or to get some data so we can see which is faster an array or liquids with this operation

  • We've been trying

  • And we've got some pretty interesting results

  • So we're running on the atari the linked list was faster than the array

  • unless he ran the array backwards in which case the array was faster than the linked list and

  • The array going forward we'll come back to that

  • We ran it on the raspberry Pi and here the array was about twice as fast as the linked list

  • Baby rarely backwards the Array was slower, and if you do it on the IMac view range height, I'm faster than the linked list

  • Whichever way you went so what's going on here?

  • Well, let's ignore the iMac for the minute the apple Haters will love that bit

  • but let's ignore the Imac for though and let's just have a look at the raspBerry Pi and

  • the attali so we've got

  • the Atari and the Raspberry Pi and we'll just go with

  • The array speed and the link to this so we've got about one hundred and seventy nine

  • For that and one hundred and sixty-six clock ticks that now when we can't compare the clock ticks between the different machines because they're different

  • different clocks used in the machines

  • but we can compare them relatively between the same thing on the same machine for the raspberry Pi it was

  • 966 and

  • that was a

  • 1859 Kish now what's going on here?

  • Well the thing we need to remember

  • if you look at the machine code

  • It's roughly the same length the same number of operation now on the atari some of the instructions will take slightly longer than to execute

  • But that's not what gets going on here

  • We need to think back to the video on

  • Cashiers that we did the difference between the raspberry Pi CPU would have been much faster and more modern and the Ataris

  • Is that the AtAris?

  • Doesn't have a cache

  • So every instructionally needs to fetch every bit of data that needs to fetch has to be fetched from memory each time

  • No, it's not cached

  • broadly speaking we believe something going on there, but broadly speaking with no cache is about half the

  • Prefetch buffer if you want to get into the details, but we can see there's no cash

  • So if we think about the cPU in the atari, then it's having to access

  • memory

  • For everything so everything that the cPU needs to fetch on the Atari the instructions

  • The data from the array or a linked list and of course the next pointer from the linked list has to come from memory

  • So it takes the same amount of time we get are the two weather fetching data or fetching instructions on the raspBerry Pi however?

  • We still got the main body the cPU which is going to execute things and it has memory

  • as well

  • but in between there we have a cache in fact we actually probably have two caches one for Data and

  • one for instructions as

  • I want to see if you activate as we looked at in the previous video is

  • Access to it via the caches, and then they if they haven't got it to get it

  • From memory, so why does this make a difference?

  • Surely things will still wake up so well on the atari the fastest

  • thing in the theory is basically the memory the memory is much faster than the cPU that's about twice as fast and

  • So the memory can provide a data exactly when the cPU needs it. There's no real need for cash

  • Move ahead to the raspberry Pi and the are make of course then the cPU is much faster than the memory

  • So when cPU ask for something it has to wait while the memory?

  • provides it

  • Now let's figure out how what happens when we run our program?

  • with the array

  • on the raspberry Pi

  • Every instruction after the first time will be accessed will be cached in the instruction cache

  • So the first time we go through the loop all the instructions are going to be used in that loop will have been cached in

  • The instruction cap so you can get these immediately the data

  • particularly with the array would also collapse a bit with a linked list we

  • Don't get just one or two bytes in each time

  • I'll get what we call a cache line pickups 128 bytes in a go, so we'll get some of the data that we already need

  • Into the cache as well, so some of the data will already be there in the array more so with your right

  • so the reason of the array runs faster on the raspberry Pi is that all the instructions are coming straight out of the

  • Institute basically giving us a fast lane for those instructions, so they get there immediately

  • There's only the data that needs to get from main memory, which will get cached as well

  • So the same happens with the route list program

  • Except for one crucial difference with the linked list poem we have to make one memory access

  • to get the data value that we're going to sum and

  • Then we have to make another memory access which has to go through the cat into main memory

  • for the address of the next thing

  • we have to make to

  • Memory request which may get satisfied by main Memory here

  • Where's our the array one?

  • We only have to make one for the value that we're interested in and because of the cache

  • We get sort of a fast passing instructions, and so actually not having to do that second memory access to get the next pointer here

  • Means that this the arraylist of the program ends are working about twice as fast on the atari because everything is coming from

  • memory it doesn't matter whether you're reading an instruction or reading a

  • bit of Data

  • It's still going to come from memory each time so actually the front of the value is already pre-calculated in memory means that actually runs

  • Slightly faster, I mean, we're talking perhaps 10% faster. Not really a billion so we get a slight speed benefit

  • Now to show that this is the case

  • I took the same version of the program on the atari and ran it on the atari router now the difference for the AtAri

  • St. We use there and the Falcon is that the falcon has a slightly later version of the 68000 processor

  • which has a cache

  • In it both instructions and data cache and when I ran that on there the times that came out

  • So the array was 46 clock ticks is much faster processor anyway, and the link list

  • was

  • 58.5

  • Clock ticks so on the Falcon because you've got the cache. They're just like on the raspberry Pi

  • The array version ends up being traffic exactly the same program exactly the same machine code

  • Because you've got the cache there the array version ends up being faster because the instructions of their

  • Professional and that sort of fat packs from the cache rather having to go to main memory each time

  • mean that the

  • irradiation becomes faster than the linked list version

  • Now tell them other things we haven't talked about yet. Why is the reverse array faster on the AtAri?

  • Simple answer to that it

  • Just uses slightly different instructions their instructions on the six 8000 that allow you to do a decrement

  • Testing with zero branch which isn't all in one instruction

  • So you can actually make the code slightly more compact and run slightly faster again. It's a small enough time

  • Why then is the imac significantly faster? We're getting sort of five times faster?

  • Well that's because I use a slightly different compiler that I use the clang C

  • Compiler on the IMac rather than GCC for the other two and it cheats it

  • It squats that you've got an array axises in the loop, and it says well

  • Okay, rather than doing a loop around one array access

  • I will do a loop around eight array accesses all in one go

  • So it actually removes some of the tests for the loop and it makes the program much faster for the compilers being clever

  • And optimizing the program

  • So it runs faster the answer to whether the raise faster than the link list or not?

  • Very much depends on how you cpu in the machine that you've built is configured to actually execute the code

  • So you can't make assumptions?

  • About how a cpu will execute code you really need to do the test to see what's happening even within the same

  • architecture family the difference between the six 8006 8030 in the Falcon mean that

  • Assumptions we make based on what they charity did where the Rally was slower than the link list aren't true on

  • The Falcon and when you move it on to a completely different architecture like the raspberry Pi or the X86 chip in the iMac

  • Then again you get different effects, so the best thing is when you're faced with a question question like this, and you're not sure

  • Come up with some test and collect real data, and you'll be able to see what's going on

  • could you

When alex did a video on linked list there's a lot of comments on it it sort of thing

字幕與單字

單字即點即查 點擊單字可以查詢單字解釋

B1 中級

陣列與鏈接列表 - Computerphile (Arrays vs Linked Lists - Computerphile)

  • 7 0
    林宜悉 發佈於 2021 年 01 月 14 日
影片單字