  • [MUSIC PLAYING]

  • DAVID MALAN: All right.

  • This is CS50, and last time, where we left off

  • was here, focusing on data structures.

  • And indeed, one of the last data structures we looked at

  • was that of a hash table.

  • But that was the result of a progression of data structures

  • that we began with this thing here, an array.

  • Recall that an array was a data structure that was actually introduced

  • back in week two of CS50, but it was advantageous at the time,

  • because it allowed us to do things efficiently, like binary search,

  • and it was very easy to use with its square bracket notation

  • when adding integers or strings or whatever it might be.

  • But it had limitations, recall.

  • And among those limitations were its lack

  • of resizeability, its lack of dynamism.

  • We had to decide in advance how big we wanted this data structure to be,

  • and if we wanted it to be any bigger, or for that matter any smaller,

  • we would have to dynamically resize it ourselves and copy

  • all of the old elements into a new array, and then go about our business.
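
In C, that manual dance might look something like the sketch below; the function name grow and the choice of int elements are ours, purely for illustration, and realloc bundles these same steps into one call.

```c
#include <stdlib.h>
#include <string.h>

// Grow an array of ints from old_size to new_size elements by hand
// (assuming new_size >= old_size): allocate a bigger block, copy the
// old elements over, then free the old block. realloc does this in one step.
int *grow(int *old, size_t old_size, size_t new_size)
{
    int *new = malloc(new_size * sizeof(int));
    if (new == NULL)
    {
        return NULL;
    }
    memcpy(new, old, old_size * sizeof(int));
    free(old);
    return new;
}
```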

  • And so we introduced last time this thing here instead,

  • a linked list that addresses that problem by having us on demand

  • allocate these things that we called nodes,

  • storing inside them an integer or really any data type that we want,

  • but connecting those nodes with these arrows pictured

  • here, specifically connecting them or threading them together using something

  • called pointers,

  • whereby pointers are just the addresses of those nodes in memory.

  • So while we pay a bit of a price in terms of more memory in order

  • to link these nodes together, we gain this flexibility,

  • because now when we want to grow or shrink this kind of data structure,

  • we simply use our friends malloc and free or similar functions.
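
A minimal sketch of such a node and its on-demand allocation in C, in the spirit of last lecture's examples; the field names number and next follow CS50's convention.

```c
#include <stdio.h>
#include <stdlib.h>

// A node: an int plus a pointer to the next node in the list.
typedef struct node
{
    int number;
    struct node *next;
} node;

int main(void)
{
    node *list = NULL;

    // Prepend the numbers 1 through 3, allocating each node on demand.
    for (int i = 1; i <= 3; i++)
    {
        node *n = malloc(sizeof(node));
        if (n == NULL)
        {
            return 1;
        }
        n->number = i;
        n->next = list;
        list = n;
    }

    // Walk the list by following the pointers.
    for (node *ptr = list; ptr != NULL; ptr = ptr->next)
    {
        printf("%i\n", ptr->number);
    }

    // Free each node before exiting.
    while (list != NULL)
    {
        node *next = list->next;
        free(list);
        list = next;
    }
}
```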

  • But then, using linked lists-- and at a lower level,

  • pointers-- as new building blocks, did we begin to solve other problems.

  • We considered the problem of a stack of trays in the cafeteria,

  • and we presented an abstract data type known as a stack.

  • And a stack supports operations like push and pop.

  • But what's interesting about a stack for our purposes,

  • recall, is that we don't need to commit necessarily

  • to implementing it one way or another.

  • Indeed, we can abstract away the underlying implementation

  • details of a stack, and implement it using an array if we want,

  • if we find that easier or convenient.

  • Or for that matter we can implement it using a linked list,

  • if we want that additional ability to grow and shrink.

  • And so the stack data type itself has these two operations, push and pop,

  • but they're independent, ultimately, of how we actually

  • implement things underneath the hood.
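
Below is one hypothetical array-backed implementation in C; CAPACITY, the struct, and the function signatures are all our own choices for the sketch, and a linked-list-backed version could swap in underneath the same push and pop interface.

```c
#include <stdbool.h>
#include <stdio.h>

// A stack of ints backed by a fixed-size array; a linked-list-backed
// version would trade this CAPACITY limit for per-node allocations.
#define CAPACITY 10

typedef struct
{
    int numbers[CAPACITY];
    int size;
} stack;

// Push: add to the top, failing if the stack is full.
bool push(stack *s, int number)
{
    if (s->size == CAPACITY)
    {
        return false;
    }
    s->numbers[s->size++] = number;
    return true;
}

// Pop: remove from the top, failing if the stack is empty.
bool pop(stack *s, int *number)
{
    if (s->size == 0)
    {
        return false;
    }
    *number = s->numbers[--s->size];
    return true;
}

int main(void)
{
    stack s = {.size = 0};
    push(&s, 1);
    push(&s, 2);
    int n;
    while (pop(&s, &n))
    {
        printf("%i\n", n); // prints 2, then 1: last in, first out
    }
}
```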

  • And that holds true as well for this thing here.

  • A line, or more properly a queue, whereby

  • instead of having this last in, first out or LIFO property,

  • we want something more fair in the human world, a first in, first out.

  • So that when you enqueue or dequeue some piece

  • of data, whatever was enqueued first is the first thing

  • to get out of that queue as well.

  • And here too did we see that we could implement these things using a linked

  • list or using an array, and I would wager there are yet

  • other possible implementations as well.
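
And here is the analogous sketch for a queue, again array-backed, this time wrapping around circularly; CAPACITY and the field names are, as before, assumptions of the sketch.

```c
#include <stdbool.h>
#include <stdio.h>

// A queue of ints backed by a fixed-size array, wrapping around circularly.
#define CAPACITY 10

typedef struct
{
    int numbers[CAPACITY];
    int front; // index of the oldest element
    int size;
} queue;

// Enqueue: add to the back, failing if the queue is full.
bool enqueue(queue *q, int number)
{
    if (q->size == CAPACITY)
    {
        return false;
    }
    q->numbers[(q->front + q->size) % CAPACITY] = number;
    q->size++;
    return true;
}

// Dequeue: remove from the front, failing if the queue is empty.
bool dequeue(queue *q, int *number)
{
    if (q->size == 0)
    {
        return false;
    }
    *number = q->numbers[q->front];
    q->front = (q->front + 1) % CAPACITY;
    q->size--;
    return true;
}

int main(void)
{
    queue q = {.front = 0, .size = 0};
    enqueue(&q, 1);
    enqueue(&q, 2);
    int n;
    while (dequeue(&q, &n))
    {
        printf("%i\n", n); // prints 1, then 2: first in, first out
    }
}
```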

  • And then we transitioned from these abstract data types

  • to another sort of paradigm for building a data structure in memory.

  • Rather than just linking things together in a unidirectional way, so to speak,

  • with a linked list, we introduced trees, and we introduced things

  • like binary search trees, whereby, so long as you

  • keep these data structures pretty well balanced, such that their height

  • is logarithmic and not linear like a linked list's,

  • can we achieve the kind of efficiency that we saw back in week zero

  • when we did binary search on a phone book.

  • But now, thanks to these pointers and thanks to malloc and free

  • can we grow and shrink the data structure without committing in advance

  • to an actual fixed size array.
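
A minimal sketch of such a binary search tree node and its search in C; the node layout is our own, and balance is assumed here rather than enforced.

```c
#include <stdbool.h>
#include <stdlib.h>

// A binary search tree node: smaller values go left, larger go right.
typedef struct node
{
    int number;
    struct node *left;
    struct node *right;
} node;

// Recursive search: each step descends one level, discarding half of the
// remaining tree, so a balanced tree gives O(log n) lookups.
bool search(node *tree, int number)
{
    if (tree == NULL)
    {
        return false;
    }
    else if (number < tree->number)
    {
        return search(tree->left, number);
    }
    else if (number > tree->number)
    {
        return search(tree->right, number);
    }
    else
    {
        return true;
    }
}
```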

  • And similarly did we solve another real world problem.

  • Recall that a few weeks ago we looked at forensics,

  • and most recently did we look at compression, both of which

  • happen to involve files.

  • And in this case, the goal was to compress information,

  • ideally losslessly, without throwing away any of the underlying information.

  • And thanks to Huffman coding did we see one technique

  • for doing that, whereby instead of using seven or eight bits for every letter

  • or punctuation symbol in some text, we can come up with our own coding

  • whereby we use just one bit, like a 1, to represent a super common letter like E,

  • and two or three or four or more bits for the less

  • common letters in our world.
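
A full Huffman coder derives its codes from the letters' actual frequencies in the text; the hand-picked, prefix-free table below is just a toy in that spirit, to show the payoff.

```c
#include <stdio.h>
#include <string.h>

// A hand-picked, prefix-free code table in the spirit of Huffman coding
// (a real Huffman coder would derive these from letter frequencies):
// the common letter 'e' costs one bit, rarer letters cost more.
const char *code(char c)
{
    switch (c)
    {
        case 'e': return "1";
        case 't': return "01";
        case 'a': return "001";
        case 'n': return "000";
        default:  return NULL; // outside our toy alphabet
    }
}

int main(void)
{
    const char *text = "eaten";
    int bits = 0;
    for (int i = 0, n = (int) strlen(text); i < n; i++)
    {
        printf("%s", code(text[i]));
        bits += (int) strlen(code(text[i]));
    }
    // 10 bits here, versus 40 at 8 bits per letter.
    printf("\n%i bits instead of %i\n", bits, (int) strlen(text) * 8);
}
```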

  • And then again we came to hash tables.

  • And a hash table too is an abstract data type that we could implement using an array

  • or using a linked list or using an array and a linked list.

  • And indeed, we looked first at a hash table as little more than an array.

  • But we introduced this idea of a hash function, which,

  • given some input, produces some output-- an index,

  • typically a numeric value-- that allows

  • you to decide where to put some value.

  • But if you use something like an array, of course,

  • you might paint yourself into a corner, such

  • that you don't have enough room ultimately for everything.

  • And so we introduced separate chaining, whereby a hash table in this form

  • is really just an array, pictured here vertically, and a set of linked

  • lists hanging off that array, pictured here horizontally,

  • that allows us to get some pretty good efficiency in terms

  • of hashing, finding the chain that we want in pretty much constant time,

  • and then maybe incurring a bit of linear cost

  • if we actually have a number of collisions.
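
A sketch of that arrangement in C; the 26 buckets and the first-letter hash function are deliberately simplistic stand-ins for a real design.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// A node in one of the chains: a word plus a pointer to the next node.
typedef struct node
{
    char *word;
    struct node *next;
} node;

// The hash table itself: an array of 26 linked-list heads, one per letter.
#define N 26

node *table[N];

// A deliberately simplistic hash function: bucket by first letter
// (assuming it's alphabetic). Hashing itself is O(1).
unsigned int hash(const char *word)
{
    return toupper((unsigned char) word[0]) - 'A';
}

// Lookup: hash to the right chain in constant time, then walk just that
// one (hopefully short) chain linearly.
bool check(const char *word)
{
    for (node *ptr = table[hash(word)]; ptr != NULL; ptr = ptr->next)
    {
        if (strcmp(ptr->word, word) == 0)
        {
            return true;
        }
    }
    return false;
}
```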

  • Now today, after leaving behind these data structures-- among them

  • a trie, which, recall, was our last data structure, one that allowed us

  • in theory in constant time to look up or insert or even delete words in a data

  • structure, depending only on the length of the string,

  • not how many strings were in there-- do we continue to use these ideas,

  • these building blocks, these data structures.
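
For reference, a trie node and lookup might be sketched in C as below, assuming words of lowercase a through z.

```c
#include <stdbool.h>
#include <stddef.h>

// A trie node: one child pointer per letter of the alphabet, plus a flag
// marking where a stored word ends.
#define ALPHABET 26

typedef struct node
{
    bool is_word;
    struct node *children[ALPHABET];
} node;

// Lookup follows one child pointer per letter, so the cost depends only
// on the length of the word, not on how many words the trie holds.
bool search(node *root, const char *word)
{
    node *cursor = root;
    for (int i = 0; word[i] != '\0'; i++)
    {
        int index = word[i] - 'a'; // assumes lowercase a-z input
        if (cursor == NULL || cursor->children[index] == NULL)
        {
            return false;
        }
        cursor = cursor->children[index];
    }
    return cursor != NULL && cursor->is_word;
}
```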

  • But now today we literally leave behind the world of C

  • and start to enter the world of web programming,

  • or really the world of web pages and dynamic outputs

  • and databases, ultimately, and all of the things that most of us

  • are familiar with every day.

  • But it turns out that this time we don't have to leave behind those ingredients.

  • Indeed, something like this, which you'll soon

  • know as HTML-- the language in which web pages are written-- HyperText Markup

  • Language-- even this textual document, which seems to have a bit of structure

  • to it, as you might glean here from the indentation, can underneath the hood

  • be itself represented as a tree.

  • A DOM, or Document Object Model. And indeed, we'll

  • see now some real world, very modern applications of the same data

  • structures in software that we ourselves use.
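
To connect that back to today's data structures, one could imagine representing such a document as a tree of nodes like the C struct below; the layout is our own sketch, not any real browser's internals.

```c
// Our own sketch of a DOM-like tree node: each element ("html", "body",
// "p", ...) carries its tag, any text, and pointers to its nested children.
typedef struct element
{
    char *tag;                 // e.g., "html", "head", "body"
    char *text;                // text content, if any
    struct element **children; // the nested elements, in order
    int num_children;
} element;
```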

  • Because today, we look at how the internet works,

  • and in turn how we actually build software atop it.

  • But first, a teaser.

  • [VIDEO PLAYBACK]

  • [MUSIC PLAYING]

  • -He came with a message, with a protocol all his own.

  • He came to a world of cruel firewalls, uncaring routers,

  • and dangers far worse than death.

  • He's fast.

  • He's strong.

  • He's TCP/IP, and he's got your address.

  • Warriors of the Net.

  • [END PLAYBACK]

  • DAVID MALAN: All right, so coming soon is how the internet works.

  • And it's not quite like that.

  • But we'll see in a bit more detail.

  • But let's consider first something a little more familiar, if abstractly,

  • like our own home.

  • So odds are, before coming to a place like this,

  • you had internet access at home or at school or at work or the like.

  • And inside of that building-- let's call it your home--

  • you had a number of devices.

  • Maybe a laptop, maybe a desktop, maybe both, maybe multiple.

  • And you had some kind of internet service provider, Comcast or Verizon

  • or companies like that, that actually run some kind of wired connection,

  • typically-- though it could be wireless-- into your home,

  • and via that connection are you on your laptop or desktop able to get out

  • onto the internet.

  • Well, it turns out that the internet itself is a pretty broad term.

  • The internet is really just this interconnection

  • of lots of different networks.

  • Harvard here has a network.

  • Yale has a network.

  • Google has a network.

  • Facebook has a network.

  • Your home has a network and the like.

  • And so the internet really is the interconnection

  • of all of those physical networks.

  • And on top of this internet, do there run services, things like the web

  • or the World Wide Web.

  • Things like email.

  • Things like Facebook Messenger.

  • Things like Skype.

  • And any number of applications that we use every day