字幕列表 影片播放 列印英文字幕 [MUSIC PLAYING] DAVID MALAN: All right. This is CS50, and last time where we left off was here, focusing on data structures. And indeed, one of the last data structures we looked at was that of a hash table. But that was the result of a progression of data structures that we began with this thing here, an array. Recall that an array was actually a data structure that was actually introduced back in week two of CS50, but it was advantageous at the time, because it allowed us to do things efficiently, like binary search, and it was very easy to use with its square bracket notation and adding integers or strings or whatever it is. But it had limitations, recall. And among those limitations were its lack of resizeability, its lack of dynamism. We had to decide in advance how big we wanted this data structure to be, and if we wanted it to be any bigger, or for that matter any smaller, we would have to dynamically ourselves resize it and copy all of the old elements into a new array, and then go about our business. And so we introduced last time this thing here instead, a linked list that addresses that problem by having us on demand allocate these things that we called nodes, storing inside them an integer or really any data type that we want, but connecting those nodes with these arrows pictured here, specifically connecting them or threading them together using something called pointers. Whereby pointers are just addresses of those nodes in memory. So while we pay a bit of a price in terms of more memory in order to link these nodes together, we gain this flexibility, because now when we want to grow or shrink this kind of data structure, we simply use our friend malloc or free or similar functions still. But we then, using linked lists-- and at a lower level, pointers as a new building block, did we begin to solve other problems. We considered the problem of a stack of trays in the cafeteria, and we presented an abstract data type known as a stack. And a stack supports operations like push and pop. But what's interesting about a stack for our purposes recall is that we don't need to commit necessarily to implementing it one way or another. Indeed, we can abstract away the underlying implementation details of a stack, and implement it using an array if we want, if we find that easier or convenient. Or for that matter we can implement it using a linked list, if we want that additional ability to grow and shrink. And so the data type itself in a stack has these two operations, push and pop, but they're independent, ultimately, of how we actually implement things underneath the hood. And that holds true as well for this thing you here. A line, or more properly a queue, whereby instead of having this last in, first out or LIFO property, we want something more fair in the human world, a first in, first out. So that when you NQueue or Dqueue some piece of data, whatever was NQueued first is the first thing to get out of that queue as well. And here too did we see that we could implement these things using a linked list or using array, and I would wager there is yet other possible implementations as well. And then we transitioned from these abstract data types to another sort of paradigm for building a data structure in memory. Rather than just linking things together in a unidirectional way, so to speak, with a linked list, we introduced trees, and we introduced things like binary search trees, that so long as you keep these data structures pretty well balanced, such that the height of them is logarithmic and is not linear like a linked list, can we achieve the kind of efficiency that we saw back in week zero when we did binary search on a phone book. But now, thanks to these pointers and thanks to malloc and free can we grow and shrink the data structure without committing in advance to an actual fixed size array. And similarly did we solve another real world problem. Recall that a few weeks ago we looked at forensics, and most recently did we look at compression, both of which happen to involve files. And in this case, the goal was to compress information, ideally losslessly, without throwing away any of the underlying information. And thanks to Huffman coding did we see one technique for doing that, whereby instead of using seven or eight bits for every letter or punctuation symbol in some text, we can instead come up with our own coding that we use one bit like a one to represent a super common letter like e, and two or three or four or more bits for the less common letters in our world. And then again we came to hash tables. And hash tables too is an abstract type that we could implement using an array or using a linked list or using an array and a linked list. And indeed, we looked first at a hash table as little more than an array. But we introduced this idea of a hash function, that allows you, given some input, to decide on some output and index a numeric value, typically, that allows you to decide where to put some value. But if you use something like an array, of course, you might paint yourself into a corner, such that you don't have enough room ultimately for everything. And so we introduced separate chaining, whereby a hash table in this form is really just an array, pictured here vertically, and a set of linked lists hanging off that array, pictured here horizontally, that allows us to get some pretty good efficiency in terms of hashing, finding the chain that we want in pretty much constant time, and then maybe incurring a bit of linear cost if we actually have a number of collisions. Now today, after leaving behind these data structures-- among them a try, which recall was our last data structure that allowed us in theory in constant time to look up or insert or even delete words in a data structure, depending only on the length of the string, not how many strings were in there-- do we continue to use these ideas, these building blocks, these data structures. But now today we literally leave behind the world of C and starts to enter the world of web programming, or really the world of web pages and dynamic outputs and databases, ultimately, and all of the things that most of us are familiar with every day. But it turns out that this time we don't have to leave behind those ingredients. Indeed, something like this, which you'll soon know as HTML-- the language in which web pages are written-- HyperText Markup Language-- even this textual document, which seems to have a bit of structure to it, as you might glean here from the indentation, can underneath the hood be itself represented as a tree. A DOM, or Document Object Model, but indeed, we'll see now some real world, very modern applications of the same data structures in software that we ourselves use. Because today, we look at how the internet works, and in turn how we actually build software atop it. But first, a teaser. [VIDEO PLAYBACK] [MUSIC PLAYING] -He came with a message, with a protocol all his own. He came to a world of cruel firewalls, uncaring routers, and dangers far worse than death. He's fast. He's strong. He's TCPIP, and he's got your address. Warriors of the Net. [END PLAYBACK] DAVID MALAN: All right, so coming soon is how the internet works. And it's not quite like that. But we'll see in a bit more detail. But let's consider first something a little more familiar, if abstractly, like our own home. So odds are, before coming to a place like this, you had internet access at home or at school or at work or the like. And inside of that building-- let's call it your home-- you had a number of devices. Maybe a laptop, maybe a desktop, maybe both, maybe multiple. And you had some kind of internet service provider, Comcast or Verizon or companies like that, that actually run some kind of wired connection, typically-- though it could be wireless-- into your home, and via that connection are you on your laptop or desktop able to get out onto the internet. Well it turns out that the internet itself is a pretty broad term. The internet is really just this interconnection of lots of different networks. Harvard here has a network. Yale has a network. Google has a network. Facebook has a network. Your home has a network and the like. And so the internet really is the interconnection of all of those physical networks. And on top of this internet, do there run services, things like the web or the world wide web. Things like email. Things like Facebook Messenger. Things like Skype. And any number of applications that we use every day