VOICEOVER: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high-quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

JULIAN SHUN: Good afternoon, everyone. So today we're going to talk about storage allocation. This is a continuation from last lecture, where we talked about serial storage allocation. Today we'll also talk a little bit more about serial allocation. But then I'll talk more about parallel allocation and also garbage collection.

So I want to just do a review of some memory allocation primitives. So recall that you can use malloc to allocate memory from the heap. And if you call malloc with a size of s, it's going to allocate and return a pointer to a block of memory containing at least s bytes. So you might actually get more than s bytes, even though you asked for s bytes. But it's guaranteed to give you at least s bytes. The return value is a void *, but good programming practice is to typecast this pointer to whatever type you're using this memory for when you receive this from the malloc call.

There's also aligned allocation. So you can do aligned allocation with memalign, which takes two arguments, a size a as well as a size s. And a has to be an exact power of 2, and it's going to allocate and return a pointer to a block of memory, again containing at least s bytes. But this time this memory is going to be aligned to a multiple of a, so the address where this memory block starts is going to be a multiple of a. So does anyone know why we might want to do an aligned memory allocation? Yeah?

STUDENT: [INAUDIBLE]

JULIAN SHUN: Yeah, so one reason is that you can align memory to cache lines, so that when you access an object that fits within a cache line, it's not going to cross two cache lines.
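For reference, the two allocation calls reviewed above might be sketched as follows. This is only an illustration, not code from the lecture: it uses posix_memalign, a POSIX relative of memalign, in place of memalign itself, and the helper names alloc_ints, alloc_aligned, and is_aligned are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for n ints with malloc.
   The void * result is cast to the type it will be used as. */
int *alloc_ints(size_t n) {
    return (int *)malloc(n * sizeof(int));
}

/* Hypothetical helper: allocate at least s bytes whose starting address
   is a multiple of a. posix_memalign (swapped in here for memalign)
   requires a to be a power of 2 and a multiple of sizeof(void *).
   Returns NULL on failure. */
void *alloc_aligned(size_t a, size_t s) {
    assert(a != 0 && (a & (a - 1)) == 0); /* a must be a power of 2 */
    void *p = NULL;
    if (posix_memalign(&p, a, s) != 0) {
        return NULL;
    }
    return p;
}

/* Returns nonzero if the address p is a multiple of a. */
int is_aligned(const void *p, size_t a) {
    return (uintptr_t)p % a == 0;
}
```

With an assumed 64-byte cache line, alloc_aligned(64, 100) would return a block that starts on a cache-line boundary. Blocks from either helper are returned to the allocator with free.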
And you'll only get one cache access instead of two. So one reason is that you want to align the memory to cache lines to reduce the number of cache misses. Another reason is that the vectorization operators also require you to have memory addresses that are aligned to some power of 2. So if you align your memory allocation with memalign, then that's also good for the vector units.

We also talked about deallocation. You can free memory back to the heap with the free function. So if you pass it a pointer p to some block of memory, it's going to deallocate this block and return it to the storage allocator. And we also talked about some anomalies of freeing. So what is it called when you fail to free some memory that you allocated? Yes? Yeah, so if you fail to free something that you allocated, that's called a memory leak. And this can cause your program to use more and more memory. And eventually your program is going to use up all the memory on your machine, and it's going to crash.

We also talked about freeing something more than once. Does anyone remember what that's called? Yeah? Yeah, so that's called double freeing. Double freeing is when you free something more than once. And the behavior is going to be undefined. You might get a segfault immediately, or you'll free something that was allocated for some other purpose. And then later down the road your program is going to have some unexpected behavior.

OK. I also want to talk about mmap. So mmap is a system call. And usually mmap is used to treat some file on disk as part of memory, so that when you write to that memory region, it also backs it up on disk. In this context here, I'm actually using mmap to allocate virtual memory without having any backing file. So mmap has a whole bunch of parameters here. The second to the last parameter indicates the file I want to map, and if I pass a negative 1, that means there's no backing file. So I'm just using this to allocate some virtual memory.
The first argument is where I want to allocate it. And 0 means that I don't care. The size, in terms of number of bytes, is how much memory I want to allocate. Then there's also permissions. So here it says I can read and write this memory region. MAP_PRIVATE means that this memory region is private to the process that's allocating it. And then MAP_ANON means that there is no name associated with this memory region. And then as I said, negative 1 means that there's no backing file. And the last parameter is just 0 if there's no backing file. Normally it would be an offset into the file that you're trying to map. But here there's no backing file.

And what mmap does is it finds a contiguous unused region in the address space of the application that's large enough to hold size bytes. And then it updates the page table so that it now contains an entry for the pages that you allocated. And then it creates the necessary virtual memory management structures within the operating system to make it so that the user's accesses to this area are legal, and accesses won't result in a segfault. If you try to access some region of memory without having the OS set these parameters, then you might get a segfault because the program might not have permission to access that area. But mmap is going to make sure that the user can access this area of virtual memory. And mmap is a system call, whereas malloc, which we talked about last time, is a library call. So these are two different things. And malloc actually uses mmap under the hood to get more memory from the operating system.

So let's look at some properties of mmap. So mmap is lazy. So when you request a certain amount of memory, it doesn't immediately allocate physical memory for the requested allocation. Instead it just populates the page table with entries pointing to a special zero page. And then it marks these pages as read only. And then the first time you write to such a page, it will cause a page fault.
And at that point, the OS is going to modify the page table, get the appropriate physical memory, and store the mapping from the virtual address space to the physical address space for the particular page that you touch. And then it will restart the instruction so that it can continue to execute. It turns out that you can actually mmap a terabyte of virtual memory, even on a machine with just a gigabyte of DRAM. Because when you call mmap, it doesn't actually allocate the physical memory. But then you should be careful, because a process might die from running out of physical memory well after you call mmap. Because mmap is going to allocate this physical memory whenever you first touch it. And this could be much later than when you actually made the call to mmap. So any questions so far? OK.

So what's the difference between malloc and mmap? So as I said, malloc is a library call. And malloc and free are part of the memory allocation interface of the heap-management code in the C library.