Placeholder Image

字幕列表 影片播放

  • SPEAKER 1: All right, this is CS50 and this is week five.

  • And let's take a look at where we left off last time.

  • You may recall this guy here, Binky from our friends at Stanford.

  • And we used Binky to start talking about pointers.

  • What is a pointer?

  • So, a pointer is just an address, the location

  • of some piece of data in memory, because recall at the end of the day

  • your computer just has a few pieces of hardware inside of it, one of which

  • is RAM or Random Access Memory.

  • And in RAM you have the ability to store bunches and bunches of bytes,

  • or kilobytes, or megabytes, or gigabytes,

  • depending on how much memory you have.

  • And if you assume that no matter how much RAM you have you

  • can enumerate the bytes-- this is byte 0, this is byte 1, this is byte 2,

  • and so forth-- you can give each of the bytes of your computer's memory

  • an address and those addresses are simply called pointers.

  • And now in C we have the ability to use pointers

  • both to go to any location in memory that we want

  • and even to dynamically allocate memory in case we don't necessarily

  • know a priori how much memory we might need for a program.

  • Now, in terms of your computer's RAM, recall

  • that we divided the world into this picture

  • here whereby if this rectangular region, arbitrarily, represents your computer's

  • memory, here is how the computer divvies it up

  • when you're actually using a program.

  • At the bottom of your computer's area of memory, you have the so-called stack.

  • And recall that the stack is where any time you call a function,

  • it gets a slice of memory-- a frame of memory,

  • if you will-- for all of its local variables, all of its arguments

  • and anything else that it might need.

  • On top of that might go another slice or frame of memory

  • if that first function calls another.

  • And if that second function in turn calls another function,

  • you might have a third frame on the stack.

  • Of course, this doesn't end well if you keep calling function after function

  • after function after function.

  • And so, hopefully you don't accidentally induce some kind of infinite loop

  • such that these frames pile on top of each other infinitely

  • many times, because eventually they'll run the risk of hitting the heap.

  • Now, the heap is the same type of physical memory.

  • You're just using it in a slightly different way.

  • The heap is used any time you want to dynamically allocate memory,

  • when you don't know in advance how many bytes

  • you need but you do know once the program is running how many you now

  • want.

  • You can ask via functions like malloc the operating

  • system for some number of bytes, and those bytes

  • are allocated from the heap.

  • So, those two have addresses or numbers.

  • And so, the operating system, by way of malloc,

  • just figures out which of those bytes are not yet

  • being used so that you can now put whatever piece of data

  • you have in that particular place.

  • Now, beyond that [? appear ?] things like initialized data,

  • uninitialized data.

  • That's where things like global variables that are initialized or not

  • end up that might be outside of your main function.

  • And then above that is the so-called text segment,

  • which are these zeros and ones that actually compose your program.

  • So when you double click an icon on Windows or Mac OS

  • to run a program or you type dot slash something in the Linux command line

  • environment in order to run a program, the bits that compose your program

  • are loaded also into memory up into this region here.

  • So, at the end of the day, you have access to just pretty generic memory,

  • but we use it in these different ways.

  • And it allows us to ultimately solve problems

  • that we might not have been able to in the past.

  • Recall for instance this example here, deliberately shown in red because it

  • was [? buggy. ?] This does not work.

  • Now, logically, it does do the swap that we intend whereby a goes into b and b

  • goes into a.

  • And we achieve that result by way of this temporary variable

  • so that we have a temporary placeholder into which to store one of those values

  • while doing the swap.

  • But it had no permanent impact on the two variables that were passed into it.

  • And that was because by default in C any time you pass arguments to a function,

  • those arguments are passed so to speak, by value.

  • You get copies of those values being passed into a function.

  • And so, if main, for instance, has two variables, x and y--

  • as they did last time-- and you pass x and y

  • into a function like this one here swap, x and y

  • are going to get copied as a and b respectively.

  • So you might perfectly, logically, correctly swap a and b,

  • but you're having no permanent impact on x and y themselves.

  • But what if, per this green version here,

  • we reimplement swap to be a little more complicated

  • looking, but at the end of the day actually correct?

  • Notice now we've declared a and b not to be

  • integers but to be pointers to integers, the addresses of integers.

  • And that's what's implied by the star that we're putting

  • right there before the variable's name.

  • Meanwhile, inside of the body of this function,

  • we still have three lines of code.

  • And we're still using a temporary variable, and that in itself

  • is not a pointer.

  • It's just an integer as before, but notice

  • we're using this star notation again, albeit for a different purpose

  • to actually dereference these pointers.

  • Recall that int star a and int star b means give me a variable that

  • can store the address of an integer.

  • That's declaring a pointer.

  • Meanwhile, if you just say star a without declaring something

  • to the left of it with a data type like int,

  • you're saying go to the address that is in a.

  • So if a is an address, star a is at that address,

  • which of course per its declaration is going to be an integer.

  • Similarly, star b means go to the address in b.

  • Star a means go to the address in a and put the former into the latter,

  • ultimately putting the value of temp at the address in b-- so absolutely more

  • complicated at first glance, but if you consider again

  • the first principles of what's going on here, all we are doing

  • are moving things around in memory.

  • And we can do that now because we have the ability

  • to express the locations, the numeric locations of where

  • things are in memory.

  • But nicely enough, we, the programmer, don't have

  • to care where things are in memory.

  • We can access things symbolically as we're doing here with a and b.

  • So even though we might have seen on the screen

  • or you might see while debugging actual addresses of memory,

  • rarely does that actually matter in practice.

  • We can deal with everything we've learned thus far symbolically.

  • Now, last time we also took a look at the world of forensics,

  • and we took a look at how images are implemented and specifically file

  • formats like BNP, and JPEG, and GIF, and yet others.

  • And we glanced into [? Asmila's ?] here as we tried to enhance this image,

  • but of course, there was only finite amount of information.

  • So, what you see is what you get in terms of any kind of glint

  • or suspect in her eyes.

  • But we did this in part so that we could also

  • introduce another feature of C that allows us to declare our own data

  • types, indeed our own data structures.

  • For instance, we proposed that if you wanted

  • to write a program that stores a student,

  • you could actually declare your own student data type

  • inside of which is a name and inside of which is a dorm,

  • and anything else that you might actually want.

  • Meanwhile, this syntax here gives us a new data type called student

  • so that if we want to write a program that implements students,

  • we can actually wrap related information together like name and dorm

  • without having to maintain a whole bunch of strings

  • for just names and a whole bunch of strings for just dorms.

  • We can actually encapsulate things all inside of one structure.

  • And indeed encapsulation is another principle of computer science

  • that you'll see throughout program and throughout the field itself.

  • So, what do we now do this time?

  • So, today we introduce more sophisticated ingredients

  • with which we can solve problems and we revisit a problem from the past

  • that we thought we had rather knocked off and had solved.

  • So, this might represent a whole bunch of names,

  • a whole bunch of numbers, a whole bunch of telephone numbers in a phone book

  • back to back to back to back stored in this case

  • in the form of an array, the simplest of data structure, so to speak,

  • that we've discussed thus far.

  • And an array, again, is a contiguous block of memory each of whose element--

  • typically are of the same data type, integers, or strings, or the like--

  • and they are by definition back to back to back to back,

  • which allows you random access.

  • Which means you can jump to any of these locations

  • instantly just by using in C that square bracket notation

  • or as we saw last time using pointer arithmetic,

  • actually using the star operator and maybe adding some number two

  • and address to get at some subsequent address.

  • But it turns out there's a few problems with this fundamental approach.

  • Nice and as simple as it is, it would seem that we rather paint ourselves

  • into a corner with this approach.

  • This array has 1, 2, 3, 4, 5, 6 total elements, at least as depicted here.

  • So that's fine if you want to insert a number, and then

  • another number, and then four more numbers.

  • But what if you want to then insert a seventh number,

  • not to mention an eighth number or a ninth number or the like?

  • Well, where do you put them?

  • Well, you might think, well, that's fine.

  • I'm just going to go put the seventh number over here,

  • or the eighth number over here, or the ninth number over there.

  • But you can't just blindly do that.

  • If this memory is being managed not by you

  • per se but by malloc and by the computer itself inside-- and your program,

  • this memory over here, while it might physically exist,

  • might be used by some other part of your program all together.

  • It doesn't necessarily belong to you unless you've asked for it.

  • And the problem with an array is that as we've seen it typically

  • you declare their size in advance, as with the square bracket notation,

  • and say give me six integers or give me six something or others, but that's it.

  • You have to decide in advance.

  • You can't just grow it as you can in some programming languages thereafter.

  • You've rather painted yourself into a corner.

  • But with malloc and other functions like we saw last time,

  • you can actually allocate more memory using malloc.

  • Unfortunately, it might end up in another location in your computer's

  • memory, so you might have to do some copying

  • to take the original six elements and move them

  • elsewhere just to make room for more.

  • And there is a data function for that, something called [? re-alloc ?]

  • or reallocate.

  • And indeed it can do exactly that.

  • It can give you a bigger chunk of memory and reallocate

  • what was previously there to be a larger [? size. ?]

  • But you have to do a little bit of work.

  • You have to invoke it in order achieve that.

  • You can't just blindly keep adding things at the end of this array.

  • Now, unfortunately, while a solution that might not be very efficient.

  • Even if you can allocate a bigger chunk of memory that's bigger than six

  • because you have more numbers, for instance, to store,

  • what if that takes a bit of time?

  • And indeed it's going to.

  • If you allocate more integers somewhere else in memory

  • you still have to copy those original values,

  • and now it just feels like you're wasting time.

  • Now, instead of just inserting things into the list,

  • you might have to copy it into a bigger space, reallocate things, grow.

  • It's a lot more work.

  • And all of that discussion of running time and performance

  • comes back into play, because if that whole copying process and reallocating

  • is costing you time, your algorithm or your program

  • ultimately might not really be as fast as you might want it.

  • So, what could we do instead?

  • What could we do instead in order to solve

  • this problem dynamically, so to speak, that being the operative word.

  • And luckily enough, last week we learned that there is dynamic memory allocation

  • in C by way of that function malloc.

  • And we also learned that there is ways of representing structures

  • in C that you don't necessarily get with the language itself,

  • because they're not primitives.

  • They're not built in.

  • In other words, let me propose this as a solution to our problem.

  • This is a list of, let's see, five numbers it would seem,

  • 9, 17, 22, 26, and 34.

  • Pretty arbitrary right now, but you might

  • imagine very simply drawing those same numbers-- 9, 17, 22, 26,

  • 34-- in the form of an array and they're clearly deliberately sorted.

  • But again, what if you wanted to grow that array or even shrink that array

  • dynamically over time?

  • Well, let me propose that we not draw those numbers back to back to back

  • to back literally next to each other but allow ourselves potentially

  • a little bit of space?

  • But if that's the case and nine is here in my computer's memory and 17 is here

  • and 22 is here, or over here, or over here-- in other words,

  • what if I relax the constraint that my numbers or my data types more

  • generally have to be stored contiguously back to back to back to back in memory

  • and instead allow them to be anywhere, indeed anywhere a function like malloc