  • VOICEOVER: The following content is provided under a Creative

  • Commons license.

  • Your support will help MIT OpenCourseWare

  • continue to offer high-quality educational resources for free.

  • To make a donation or to view additional materials

  • from hundreds of MIT courses, visit MIT OpenCourseWare

  • at ocw.mit.edu.

  • JULIAN SHUN: Good afternoon, everyone.

  • So today we're going to talk about storage allocation.

  • This is a continuation from last lecture

  • where we talked about serial storage allocation.

  • Today we'll also talk a little bit more

  • about serial allocation.

  • But then I'll talk more about parallel allocation and also

  • garbage collection.

  • So I want to just do a review of some memory allocation

  • primitives.

  • So recall that you can use malloc to allocate memory

  • from the heap.

  • And if you call malloc with a size of s,

  • it's going to allocate and return

  • a pointer to a block of memory containing at least s bytes.

  • So you might actually get more than s bytes,

  • even though you asked for s bytes.

  • But it's guaranteed to give you at least s bytes.

  • The return value is a void *, but good programming practice

  • is to typecast this pointer to whatever type

  • you're using this memory for when you receive

  • this from the malloc call.
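
As a minimal sketch of the malloc usage just described (the function name make_int_array is hypothetical, not from the lecture), the following allocates an array, casts the returned void * to the element type, and checks for failure:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate an array of n 64-bit integers on the heap and zero it.
   Returns NULL if malloc fails. make_int_array is a hypothetical name. */
int64_t *make_int_array(size_t n) {
    /* malloc(0) is implementation-defined, so this sketch requires n > 0. */
    assert(n > 0);
    /* malloc returns void *; the cast states the intended element type. */
    int64_t *a = (int64_t *)malloc(n * sizeof(int64_t));
    if (a == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        a[i] = 0;
    }
    return a;
}
```

The caller owns the returned block and is expected to pass it to free exactly once.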

  • There's also aligned allocation.

  • So you can do aligned allocation with memalign,

  • which takes two arguments, an alignment a as well as a size s.

  • And a has to be an exact power of 2,

  • and it's going to allocate and return

  • a pointer to a block of memory again containing

  • at least s bytes.

  • But this time this memory is going

  • to be aligned to a multiple of a,

  • so the address is going to be a multiple of a,

  • where this memory block starts.

  • So does anyone know why we might want to do an aligned memory

  • allocation?

  • Yeah?

  • STUDENT: [INAUDIBLE]

  • JULIAN SHUN: Yeah, so one reason is

  • that you can align memory blocks so that they're

  • aligned to cache lines, so that when you access an object that

  • fits within the cache line, it's not

  • going to cross two cache lines.

  • And you'll only get one cache access instead of two.

  • So one reason is that you want to align

  • the memory to cache lines to reduce

  • the number of cache misses.

  • Another reason is that the vectorization operators

  • also require you to have memory addresses that

  • are aligned to some power of 2.

  • So if you align your memory allocation with memalign,

  • then that's also good for the vector units.
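
A sketch of cache-line-aligned allocation follows. It is an illustration under assumptions: it uses posix_memalign, a POSIX alternative to the memalign call mentioned above, and it assumes a 64-byte cache line. The names alloc_aligned, is_aligned, and CACHE_LINE are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64  /* assumed cache-line size in bytes */

/* Return a block of at least s bytes whose address is a multiple of a,
   or NULL on failure. posix_memalign requires a to be a power of 2 and
   a multiple of sizeof(void *). */
void *alloc_aligned(size_t a, size_t s) {
    assert(a != 0 && (a & (a - 1)) == 0);  /* a must be a power of 2 */
    void *p = NULL;
    if (posix_memalign(&p, a, s) != 0) {
        return NULL;
    }
    return p;
}

/* Return 1 if the address p is a multiple of a, 0 otherwise. */
int is_aligned(const void *p, size_t a) {
    return ((uintptr_t)p % a) == 0;
}
```

Memory obtained this way is released with the ordinary free function.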

  • We also talked about deallocations.

  • You can free memory back to the heap with the free function.

  • So if you pass it a pointer p to some block of memory,

  • it's going to deallocate this block

  • and return it to the storage allocator.

  • And we also talked about some anomalies of freeing.

  • So what is it called when you fail to free

  • some memory that you allocated?

  • Yes?

  • Yeah, so if you fail to free something that you allocated,

  • that's called a memory leak.

  • And this can cause your program to use more and more memory.

  • And eventually your program is going

  • to use up all the memory on your machine,

  • and it's going to crash.

  • We also talked about freeing something more than once.

  • Does anyone remember what that's called?

  • Yeah?

  • Yeah, so that's called double freeing.

  • Double freeing is when you free something more than once.

  • And the behavior is going to be undefined.

  • You might get a seg fault immediately,

  • or you'll free something that was allocated

  • for some other purpose.

  • And then later down the road your program

  • is going to have some unexpected behavior.
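
One common defensive idiom against double freeing, shown here as a sketch and not as something the lecture prescribes, is to reset a pointer to NULL right after freeing it, since free(NULL) is defined to do nothing. The names FREE_AND_CLEAR and demo_release are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Free p and reset it to NULL so that a repeated release is harmless. */
#define FREE_AND_CLEAR(p) do { free(p); (p) = NULL; } while (0)

/* Small demonstration: returns 1 if the pointer is NULL after release,
   0 if the initial allocation failed. */
int demo_release(void) {
    int *p = (int *)malloc(sizeof(int));
    if (p == NULL) {
        return 0;
    }
    *p = 7;
    FREE_AND_CLEAR(p);  /* block returned to the allocator, p is now NULL */
    FREE_AND_CLEAR(p);  /* no double free: free(NULL) does nothing */
    assert(p == NULL);
    return p == NULL;
}
```

This only protects the one pointer variable that is cleared. Other copies of the same address still dangle, so it reduces the risk but does not remove it.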

  • OK.

  • I also want to talk about mmap.

  • So mmap is a system call.

  • And usually mmap is used to treat some file on disk

  • as part of memory, so that when you

  • write to that memory region, it also backs it up on disk.

  • In this context here, I'm actually

  • using mmap to allocate virtual memory without having

  • any backing file.

  • So mmap has a whole bunch of parameters here.

  • The second to the last parameter indicates

  • the file I want to map, and if I pass a negative 1,

  • that means there's no backing file.

  • So I'm just using this to allocate some virtual memory.

  • The first argument is where I want to allocate it.

  • And 0 means that I don't care.

  • The size in terms of number of bytes

  • is how much memory I want to allocate.

  • Then there's also permissions.

  • So here it says I can read and write this memory region.

  • MAP_PRIVATE means that this memory region

  • is private to the process that's allocating it.

  • And then MAP_ANON means that there is no name associated

  • with this memory region.

  • And then as I said, negative 1 means

  • that there's no backing file.

  • And the last parameter is just 0 if there's no backing file.

  • Normally it would be an offset into the file

  • that you're trying to map.

  • But here there's no backing file.
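
The parameters walked through above correspond to a call like the one in this sketch. The wrapper names map_anonymous and unmap_region are hypothetical, and the _DEFAULT_SOURCE define is an assumption about how MAP_ANON is exposed on glibc-based systems.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Allocate size bytes of virtual memory with no backing file.
   Returns NULL on failure. */
void *map_anonymous(size_t size) {
    void *p = mmap(0,                       /* address: 0 means "don't care" */
                   size,                    /* number of bytes to allocate */
                   PROT_READ | PROT_WRITE,  /* region is readable and writable */
                   MAP_PRIVATE | MAP_ANON,  /* private to this process, no name */
                   -1,                      /* no backing file */
                   0);                      /* offset: 0 without a backing file */
    return p == MAP_FAILED ? NULL : p;
}

/* Return a region obtained from map_anonymous to the operating system. */
void unmap_region(void *p, size_t size) {
    int rc = munmap(p, size);
    assert(rc == 0);
    (void)rc;  /* avoid an unused-variable warning when NDEBUG is set */
}
```

Note that mmap signals failure with MAP_FAILED, not NULL, which is why the wrapper translates the result.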

  • And what mmap does is it finds a contiguous unused region

  • in the address space of the application that's large enough

  • to hold size bytes.

  • And then it updates the page table

  • so that it now contains an entry for the pages

  • that you allocated.

  • And then it creates the necessary virtual memory management

  • structures within the operating system

  • to make it so that the user's accesses to this area

  • are legal, and accesses won't result in a seg fault.

  • If you try to access some region of memory without using--

  • without having the OS set these parameters,

  • then you might get a seg fault because the program might not

  • have permission to access that area.

  • But mmap is going to make sure that the user can access

  • this area of virtual memory.

  • And mmap is a system call, whereas malloc,

  • which we talked about last time, is a library call.

  • So these are two different things.

  • And malloc actually uses mmap under the hood

  • to get more memory from the operating system.

  • So let's look at some properties of mmap.

  • So mmap is lazy.

  • So when you request a certain amount of memory,

  • it doesn't immediately allocate physical memory

  • for the requested allocation.

  • Instead it just populates the page table

  • with entries pointing to a special zero page.

  • And then it marks these pages as read only.

  • And then the first time you write to such a page,

  • it will cause a page fault. And at that point,

  • the OS is going to modify the page table,

  • get the appropriate physical memory,

  • and store the mapping from the virtual address space

  • to physical address space for the particular page

  • that you touch.

  • And then it will restart the instruction

  • so that it can continue to execute.

  • It turns out that you can actually

  • mmap a terabyte of virtual memory,

  • even on a machine with just a gigabyte of DRAM.

  • Because when you call mmap, it doesn't actually

  • allocate the physical memory.

  • But then you should be careful, because a process might

  • die from running out of physical memory

  • well after you call mmap.

  • Because mmap is going to allocate this physical memory

  • whenever you first touch it.

  • And this could be much later than when you actually

  • made the call to mmap.
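
The lazy behavior can be observed with a sketch like the following, which reserves a large region and then touches a single byte of it. The function name touch_one_page and the region size used in the usage note are hypothetical. Whether a very large reservation succeeds depends on the operating system's overcommit settings.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve size bytes of virtual memory, write one byte at offset off,
   and return the value read back, or -1 on failure. */
int touch_one_page(size_t size, size_t off) {
    assert(off < size);
    char *p = mmap(0, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANON, -1, 0);
    if (p == MAP_FAILED) {
        return -1;
    }
    int before = p[off];  /* untouched anonymous memory reads as zero */
    p[off] = 42;          /* first write: page fault, OS supplies a physical page */
    int after = p[off];
    munmap(p, size);
    return before == 0 ? after : -1;
}
```

Calling touch_one_page((size_t)1 << 28, 12345) reserves 256 MiB of address space, but only the page containing offset 12345 needs physical memory during the call.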

  • So any questions so far?

  • OK.

  • So what's the difference between malloc and mmap?

  • So as I said, malloc is a library call.

  • And it's part of--malloc and free are part of the memory

  • allocation interface of the heap-management code in the C

  • library.