  • [MUSIC PLAYING]

  • DAVID MALAN: Recall that an algorithm is just step-by-step instructions

  • for solving some problem.

  • Not unlike this problem here wherein I sought Mike Smith among the whole phone

  • book of names and numbers.

  • But up until now, we've really only focused

  • on those step-by-step instructions and not

  • so much on how the data we are searching is stored.

  • Of course, in this version of that problem, it's stored here on paper,

  • but in the digital world, it's of course not going to be paper, but 0's and 1's.

  • But it's one thing to say that the numbers and maybe even the names

  • are stored ultimately as 0's and 1's, but where and how exactly?

  • There's all those transistors and they're flipping on and off,

  • but with respect to each other, are those numbers laid out left to right,

  • top to bottom, are they all over the place?

  • Let's actually take a look at that question now

  • and consider how a computer leverages what

  • are called data structures to facilitate implementation of algorithms.

  • Indeed, how you lay out a computer's data inside of its memory

  • has non-trivial impacts on the performance or efficiency

  • of your algorithms. The algorithm itself

  • can be correct, as we've seen, but not necessarily efficient logically,

  • and both space and the representation underneath the hood of your data

  • can also make a significant impact.

  • But let's simplify the world first.

  • And rather than focus on, say, a whole phone book of names and numbers,

  • let's focus just on numbers, and much smaller numbers that aren't even

  • phone numbers, but just integers, and only save seven of them at a time.

  • And I've hidden these seven numbers, if you will,

  • behind these seven yellow doors.

  • And so by knocking and opening, one of these doors

  • will reveal one number at a time.

  • And the goal at hand, though, is to find a very specific number,

  • just like I sought one specific phone number before, this time I want

  • to find the number 50 specifically.

  • Well, where to begin?

  • I'll go with the one closest to me and knock, knock, knock--

  • 15 is the number.

  • So a little bit low.

  • Let's proceed from there to see 23.

  • We seem to be getting closer.

  • Let's open this door next and-- oh, we seem to have veered down smaller,

  • so I'm a little confused.

  • But I have four doors left to check.

  • So 50 is not there and 50 is not there and 50 is, in fact, there.

  • So not bad.

  • Within just six steps, I have found the number in question.

  • But of course, to be fair, there were only seven doors.

  • So if we generalize and say that there were n doors, where n is just

  • some number, well, I had to open roughly n of those doors

  • just to find the one that I sought.
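
To make that concrete, here is one possible sketch in Python of that left-to-right algorithm, so-called linear search; the particular arrangement of numbers is just illustrative:

    # Linear search: open each door from left to right until the target appears.
    def linear_search(doors, target):
        for i in range(len(doors)):
            if doors[i] == target:
                return i  # found the target at index i
        return None  # the target is not behind any door

    doors = [15, 23, 4, 16, 8, 50, 42]  # illustrative, randomly ordered numbers
    print(linear_search(doors, 50))  # 5, found only after six steps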

  • So could I have done better?

  • You know, my instincts like yours were perhaps to start at the left

  • and move to the right, and we seem to be on a good path initially.

  • We went from 15 to 23, and then darn it if 16

  • didn't throw a wrench in the works, because I expected it,

  • perhaps naively, to be bigger and bigger as I moved right.

  • But honestly, had I not told you anything-- and indeed I told you very little--

  • then you wouldn't have known anything about these numbers other than maybe

  • that the number 50 is actually there.

  • I told you nothing as to the magnitude or the size

  • of any of the other numbers, let alone the order,

  • but in the world of the phone book, of course,

  • we were able to take for granted that those names were

  • sorted by the phone company for us-- from left to right, from A to Z.

  • But in this case, if your data is just added to the computer's memory one

  • at a time in no particular order, the onus

  • is on you, the programmer or algorithm, to find that number

  • you're interested in nonetheless.

  • Now what was left here?

  • And indeed 4 is even smaller than 50.

  • So these seven doors were by design randomly assigned a number.

  • And so you could do no better.

  • I might have gotten lucky.

  • I might not have gone with my initial instincts

  • and touched the number 15 at left.

  • I might have, effectively blind, gone and touched 50 and just gotten lucky,

  • and then it would have been just one step.

  • But there's only a one in seven chance I would have been correct so quickly,

  • so that's not really an algorithm that I could

  • reproduce with the same efficiency again and again.

  • So how can I do better?

  • And how does the phone company enable us to do better?

  • Well they, of course, put in a huge amount of effort upfront

  • to sort all of those names and associated numbers from left

  • to right, from A to Z. And so that's a huge leg up for us,

  • because then I can do divide and conquer, or so-called binary

  • search, dividing that phone book in two, as

  • implied by the "bi" in "binary," halving the problem again and again.

  • But someone's got to do that work for us, be it the phone company

  • or perhaps me with these numbers.

  • So let's take one more stab at this problem,

  • this time presuming that the seven doors in question

  • do, in fact, have the numbers behind them sorted from left to right,

  • small to big.

  • So where to find the number 50 now?

  • I have seven doors behind which are those same numbers,

  • but this time they are sorted from left to right.

  • And no skipping ahead thinking that, well, I remember all the other numbers,

  • so I know immediately where 50 is.

  • Let's assume for the moment that we don't know anything

  • about the other numbers other than the fact that they are sorted.

  • Well, my inclination is not to start at the left with this first door,

  • much like my inclination ultimately with that phone book was not to start

  • with the first page, but the middle.

  • And indeed, I'm going to go here to the middle of these doors and--

  • 16.

  • Not quite the one I want.

  • But if the doors are sorted now, I know that that number 50 is not to the left,

  • and so I'm going to go to the right.

  • Where do I go to the right?

  • Well, I have three doors left, I'm going to follow the same algorithm

  • and open that door in the middle and-- oh, so close.

  • I found only, if you will, the meaning of life.

  • So 42, though, is not the number I care about,

  • but I do know something about 50-- it's bigger than 42.

  • And so now, it's quite simply the case that-- aha, 50 is there,

  • it's going to be in that last number.

  • So whereas before it took me up to six steps to find the number 50,

  • and only then by luck did I find it where

  • it was because it was just randomly placed,

  • now I spent 1, 2, 3 steps in total, which is, of course, fewer than six.

  • And as these numbers of doors grow in size

  • and I have hundreds or thousands of doors,

  • surely it will be the case, just like the phone book, that halving this problem

  • again and again is going to get me to my answer,

  • if it's there, in logarithmic instead of linear time, so to speak.

  • But what's key to the success of this algorithm-- binary search--

  • is that the doors are not only sorted, but they are back-to-back-to-back.

  • Now I have the luxury of feet and I can move back and forth

  • among these numbers, but even my steps take me some amount of time and energy.

  • But fortunately, each such step just takes one unit of energy, if you will,

  • and I can immediately jump wherever I would like one step at a time.

  • But a computer is purely electronic, and in the context of memory,

  • doesn't actually need to take any steps.

  • Electronically a computer can jump to any location in memory

  • instantly in so-called constant time.

  • So that's just one step for the computer, where it might take me several.

  • And so that's an advantage a computer has and it's just one of the reasons

  • why they are so much faster than us at solving so many problems.

  • But the key ingredient to laying out the data for a computer to solve

  • your problems quickly is that you need to put your data back-to-back-to-back.

  • Because a computer at the end of the day,

  • yes, stores only 0's and 1's, but those 0's and 1's are generally

  • treated in units of, say, eight--

  • 8 bits per byte.

  • But those bytes, when storing numbers like this,

  • need those numbers to be back-to-back-to-back and not just

  • jumbled all over the place.

  • Because it needs to be the case that the computer is

  • allowed to do the simplest of arithmetic to figure out where to look.

  • Even I in my head am sort of doing a bit of math figuring out,

  • well where's the middle?

  • Even though among few doors you can pretty much eyeball it quickly.

  • But a computer's going to have to do a bit of arithmetic, so what

  • is that math?

  • Well if I have 1, 2, 3, 4, 5, 6, 7 doors initially,

  • and I want to find the middle one, I'm actually just going to do what?

  • 7 divided by 2, which gives me 3 and 1/2-- that's

  • not an integer that's that useful for counting doors,

  • so let's just round it down to 3.

  • So 7 divided by 2 is 3.5, rounded down to 3, which suggests

  • mathematically that the door in the middle of my doors

  • should be the one known as 3.

  • Now recall that a computer generally starts

  • counting at 0 because 0 bits represent 0 in decimal,

  • and so this is door 0, 1, 2, 3, 4, 5, 6.

  • So there's still seven doors, but the first is 0 and the last is called 6.

  • So if I'm looking for number 3, that's 0, 1, 2, 3.

  • And indeed, that's why I jumped to the middle of these doors,

  • because I went very specifically to location 3.

  • Now why did I jump to 42 next?

  • Of course, that was in the middle of the three remaining doors,

  • but how would a computer know mathematically where to go,

  • whereas we can just rather eyeball it here?

  • Well if you've got 3 doors divided by 2, that gives me, of course, 1.5--

  • let's round that down to 1.

  • So if we now re-number these doors, it's 0, 1, 2,

  • because these are the only three doors that exist, well door 1 is 0, 1--

  • the 42, and that's how a computer would know to jump right to 42.

  • Of course, with just one door left, it's pretty simple.

  • You needn't even do any of that math if there's just one,

  • and so we can immediately access that in constant time.
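
As a minimal sketch in Python, that divide-and-conquer algorithm, binary search, might look like this; the floor division (//) is the divide-and-round-down arithmetic just described, and the function name is this example's own:

    # Binary search: repeatedly open the middle door of the remaining range.
    def binary_search(doors, target):
        low, high = 0, len(doors) - 1
        while low <= high:
            middle = (low + high) // 2  # divide and round down
            if doors[middle] == target:
                return middle
            elif doors[middle] < target:
                low = middle + 1   # the target must be to the right
            else:
                high = middle - 1  # the target must be to the left
        return None  # the target is not behind any door

    doors = [4, 8, 15, 16, 23, 42, 50]  # sorted, as binary search presupposes
    print(binary_search(doors, 50))  # 6, found in three steps: 16, 42, 50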

  • In other words, even though my human feet are taking a bit of energy

  • to get from one door to another, a computer has the leg-up, so to speak,

  • of getting to these doors even quicker, because all it has to do

  • is a little bit of division, maybe some rounding,

  • and then jump exactly to that position in memory.

  • And that is what we call constant time, but it presupposes, again,

  • that the data is laid out back-to-back-to-back so that every one

  • of these numbers is an equal distance away from every other.

  • Because otherwise, if you were to do this math

  • and come up with the numbers 3 or 1, you

  • have to be able to know where you're jumping in memory,

  • because that number 42 can't be down here; it has to be, numerically in order,

  • exactly where you expect.

  • And so in computer science and in programming, this kind of arrangement,

  • where you have doors-- or really data-- back-to-back-to-back, is known

  • as what's called an array.

  • An array is a contiguous block of memory wherein values are stored

  • back-to-back-to-back-to-back-- from left to right conceptually,

  • although of course, direction has less meaning once you're inside

  • of a computer.

  • Now it is thanks to these arrays that we were able to search

  • even something like a phone book so quickly.

  • After all, you can imagine in the physical world,

  • a phone book isn't all that unlike an array,

  • albeit a more arcane version here, because its pages are indeed

  • back-to-back-to-back-to-back from left to right, which is wonderful.

  • And you'll recall when we searched a phone book,

  • we were already able to describe the efficiency via which

  • we were able to search it-- via each of those three algorithms.

  • One page at a time, two pages at a time, and then one-half

  • of the remaining problem at a time.

  • Well it turns out that there's a direct connection even

  • to the simplification of that same problem.

  • If I have n doors and I search them from left to right, that of course

  • might take me as many as six or seven total steps-- or n, if the number I'm

  • seeking is all the way at the end.

  • I could have gone two doors at a time, although that really

  • would have gone off the rails with the randomly ordered numbers,

  • because there would have been no logic to just going left to right twice as

  • fast; I would be missing every other element, never knowing

  • when to go back.

  • And in the case of binary search, my last algorithm where

  • I started in the middle and found 16, and then

  • started in the middle of that middle and found 42, and then

  • started in the middle of the middle and found my last number,

  • binary search is quite akin to what we did by tearing that problem in half

  • and in half.

  • So how did we describe the efficiency of that algorithm last time?

  • Well we proposed that my first algorithm was linear, this straight line in red

  • represented here by the label n, because for every page in the phone book,

  • in the worst case you might need one extra step to find someone

  • like Mike Smith.

  • And indeed, in the case of these doors, if there's just one more door added,

  • you might need one more step to find that number 50 or any other.

  • Now I could, once those doors are sorted,

  • go through them twice as fast, looking two doors at a time,

  • and if I go too far and find, say, 51, I could double-back and fix that mistake.

  • But what I ultimately did was divide and conquer.

  • Starting in the middle, and then the middle of the middle,

  • and the middle of the middle of the middle,

  • and that's what gives me this performance.

  • This so-called logarithmic time-- log base 2 of n--

  • which if nothing else means that we have a different shape fundamentally

  • to the performance of this algorithm.

  • It grows so much more slowly in time even as the problem gets really big.

  • And even off the screen here, imagine that even as n gets huge,

  • that green line would not seem to be going very high

  • even as the red and yellow ones do.

  • So in computer science, there are actually

  • formal labels we can apply to this sort of methodology of analyzing algorithms.

  • When you talk about upper bounds on just how much time an algorithm takes,

  • you might say this-- big O, quite literally.

  • That an algorithm is in a big O of some formula.

  • For instance, among the formulas it might be are these here--

  • n squared, or n log n, or n, or log n, or 1.

  • Which is to say, you can represent somewhat simply,

  • mathematically, using n-- or really any other placeholder--

  • a variable that represents the size of the problem in question.

  • So for instance, in the case of linear search,

  • when I'm searching that phone book left to right

  • or searching these doors left to right, in the worst case,

  • it might take me as many as n steps to find Mike or that 50,

  • and so we would say that that linear algorithm is

  • in big O of n, which is just a fancier way of saying quite simply that it's

  • indeed linear in time.

  • But sometimes I might get lucky, and indeed in the best case,

  • I might find Mike or 50 or anything else much faster,

  • and computer scientists also have ways of expressing lower bounds

  • on the running times of algorithms.

  • Whereby in the best case, perhaps, an algorithm

  • might take only this much time-- at least this much time, that is.

  • And we use a capitalized omega to express that notion of a lower bound,

  • whereas again, a big O represents an upper bound on the same.

  • So we can use these same formulas, because depending on the algorithm,

  • it might indeed take n squared steps or just 1 or a constant number thereof,

  • but we can consider even linear search to have a lower bound,

  • because in the best case, maybe Mike or maybe 50

  • or any other input to the problem just so

  • happens to be at the very beginning of that book or those doors.

  • And so in the best case, a lower bound on the running time of linear search

  • might indeed be omega of 1 because you might just

  • get lucky and take one step or two or three

  • or terribly few, but independent of the number n.

  • And so there, we might express this lower bound as well.

  • Now meanwhile there's one more Greek symbol

  • here, theta, capitalized here, which represents a coincidence of upper

  • and lower bounds.

  • Whereby if it happens to be the case for some algorithm

  • that you have an upper bound and a lower bound that are the same,

  • you can equivalently say, not both of those statements, but quite simply

  • that the algorithm is in theta of some formula.

  • Selection sort, for instance, makes roughly the same n squared comparisons

  • whether it gets lucky or not, and so it's in theta of n squared.

  • Now suffice it to say, this green line is good.

  • Indeed, any time we achieve logarithmic time instead of, say, linear time, we

  • have made an improvement.

  • But what did we presuppose?

  • Well, we presupposed in both the case of the phone book

  • and in the case of those doors that they were sorted in advance for us.

  • By me in the case of the doors and by the phone

  • company in the case of the book.

  • But what did it cost me and what did it cost

  • them to sort all of those numbers and names

  • just to enable us ultimately to search logarithmically?

  • Well let's consider that in the context of, again, some numbers, this time

  • some numbers that I myself can move around.

  • Here we have eight cups, and on these eight cups are eight numbers from 1

  • through 8.

  • And they're indeed sorted from smallest to largest, though I could equivalently

  • do this problem from largest to smallest so long as we all

  • agree what the goal is.

  • Well let me go ahead and just randomly shuffle some of these cups

  • so that not everything is in order anymore,

  • and indeed now they're fairly jumbled, and indeed not in the order I want,

  • so some work needs to be done.

  • Now why might they arrive in this order?

  • Well in the case of the phone book, certainly new people

  • are moving into a town every day, and so they're coming in not themselves

  • in alphabetical order, but seemingly random,

  • and it's up to the phone company to slot them

  • into the right place in a phone book for the sake of next year's print.

  • And the same thing with those doors.

  • Were I to add more and more numbers behind those doors,

  • I'd need to decide where to put them, and they're not necessarily

  • going to arrive from my input source in the order I want.

  • So here, then, I have some randomly-ordered data,

  • how do I go about sorting it quickly?

  • Well, let's take a look at the first problem I see.

  • 2 and 1 are out of order, so let me just go ahead and swap, so to speak,

  • those two.

  • I've now improved the state of my cups,

  • and I've made some progress, but 2 and 6

  • seem OK even though maybe there should be some cups in between.

  • So let's look at the next pair now.

  • We have 6 and 5, which definitely are out of order, so let's switch those.

  • 6 and 4 are the same, out of order.

  • 6 and 3, just as much.

  • 6 and 8 are not quite back-to-back, but there's probably

  • going to be a number in-between, but they are at least in the right order,

  • because 6, of course, is less than 8.

  • And then lastly we have 8 and 7.

  • Let's swap those here and done--

  • or are we not?

  • Well I've made improvements with every such swap, but some of these cups

  • still remain out of order.

  • Now these two are all set.

  • 2 and 5 are as well, even though ultimately we

  • might need some numbers between them, but 4 and 5 are indeed out of order.

  • 3 and 5 just as much.

  • 6 and 5 are OK, 7 and 6 are OK, and 8 and 7 as well.

  • So we're almost done there, but I do see some glitches.

  • So let's again compare all of these cups pairwise--

  • 1, 2; 2, 4-- oops, 4, 3, let's swap that.

  • Let's keep going just to be safe.

  • 4, 5; 5, 6; 6, 7; 7, 8.

  • And by way of this process, just comparing cups back-to-back,

  • we can fix any mistakes we see.

  • Just for good measure, let me do this once more.

  • 1, 2; 2, 3; 3, 4; 4, 5; 5, 6; 6, 7; 7, 8.

  • Now this time that I've gone all the way from left to right checking

  • that every cup is in order, I can safely conclude that these cups are sorted.

  • After all, if I just went from left to right and did no work,

  • why would I presume that if I do that same algorithm again,

  • I'd make any changes?

  • I wouldn't, so I can quit at this point.

  • So that's all fine and good, but perhaps we could have sorted these differently.

  • That felt a little tedious and I felt like I was doing a lot of work.

  • What if I just try to select the cups I want rather than deal

  • with two cups at a time?

  • Let's go ahead and randomly shuffle these again in any old order,

  • making sure to perturb what was otherwise left to right.

  • And here we have now another random assortment of cups.

  • But you know what I'm going to do this time?

  • I'm just going to select the smallest I see.

  • 2 is already pretty small, so I'll start as before on the left.

  • So let's now check the other cups to see if there's something smaller that I

  • might prefer to be in this location.

  • 3, 1-- ooh, 1 is better, I'm going to make mental note of this one.

  • 5, 8, 7, 6, 4-- all right, so 1 would seem to be the smallest number.

  • So I'm going to go ahead and put this where it belongs,

  • which is right here at the side.

  • There's really no room for it, but you know what?

  • These were randomly ordered, so let me just go ahead

  • and evict whatever's there, too, and put 1 in its place.

  • Now to be fair, I might have messed things up a little bit,

  • but no more so than I might have when I received these numbers randomly.

  • In fact, I might even get lucky-- by evicting a cup,

  • I might end up putting it in the right place so it all washes out in the end.

  • Now let's go ahead and select the next smallest number,

  • but not bother looking at that first one anymore.

  • So 3 is pretty small, so I'll keep that in mind.

  • 2 is even smaller, so I'll forget about 3 and now remember 2.

  • 5 is bigger, 8 and 7 and 6 and 4--

  • all right, 2 now seems to be the next smallest number I can select.

  • I know it belongs there, but 3's already there, so let's evict 3

  • and there you go, I got lucky.

  • Now I have 1 and 2 in the right place.

  • Let's again select the next smallest number.

  • I see 3 here, and again, I don't necessarily know as a computer

  • if I'm only looking at one number at a time

  • if there are, in fact, anything smaller to its side.

  • So let's check-- 5, 8, 7, 6, 4-- nope.

  • So 3 I shall select, and I got lucky, I'll leave it alone.

  • How about the next smallest number?

  • 5 is pretty small, but-- 8, 7, 6-- 4 is even smaller.

  • Let's select this one, put it in its place, evicting the 5

  • and putting it where there's room.

  • 8 is not that small, but it's all I know now.

  • But ooh-- 7 is smaller, I'll remember this.

  • 6 is even smaller, I'll remember that, and it feels

  • like I'm creating some work for myself.

  • 5 is the next smallest, 8's in the way.

  • We'll evict 8 and put 5 right there.

  • 7 is pretty small, but 6 is even smaller-- and still smaller than 8--

  • so let's pick up 6, evict 7, and put 7 in its place.

  • Now for good measure, we're obviously done, but I as the computer

  • don't know that yet if I'm just looking at one of these cups or, if you will,

  • doors at a time.

  • 7's pretty small, 8 is no smaller, so 7 I've selected

  • to stay right there in its place.

  • 8 as well, by that same logic, is now in its right place.

  • So it turns out that these two algorithms

  • that I concocted along the way actually do have some formal semantics.

  • In fact, in computer science, we'd call the first

  • of those algorithms that thing here, bubble sort.

  • Because in fact, as you compare two cups side-by-side and swap them on occasion

  • in order to fix transpositions, well, your largest numbers

  • would seem to be bubbling their way up to the top,

  • or equivalently, the smallest ones down to the end, and so bubble sort

  • is the formal name for that algorithm.

  • How might I express this more succinctly than with my voice over there?

  • Well let me propose this pseudocode.

  • There's no one way to describe this or any algorithm,

  • but this was as few English words as I could come up with and still

  • be pretty precise.

  • So repeat until no swaps the following-- for i from 0 to n minus 2,

  • if the i-th and i-th plus 1 elements are out of order, swap them.

  • Now why this lingo?

  • Well computational thinking is all about expressing yourself

  • very methodically, very clearly, and ultimately

  • defining, say, some variables or terms that you'll need in your arguments.

  • And so here what I've done is adopt a convention.

  • I'm using i to represent an integer--

  • some sort of counter--

  • to represent the index of each of my cups or doors or pages.

  • And here, we are adopting the convention, too,

  • of starting to count from 0.

  • And so if I want to start looking at the first cup, a.k.a.

  • 0, I want to keep looking up, up to the cup called n minus 2,

  • because if my first cup is cup 0, and this is then 1, 2, 3, 4, 5, 6, 7,

  • indeed the cup is labeled 8, but it's in position 7.

  • And so this position more generally, if there are n cups, would be n minus 1.

  • So bubble sort is telling me to start at 0 and then look up to n minus 2,

  • because in the next line of code, I'm supposed

  • to compare the i-th element and the (i plus 1)-th, so to speak.

  • So I don't want to look all the way to the end,

  • I want to look one shy of the end, because I know in looking at pairs,

  • I'm looking at this one as well as the one to its right, a.k.a.

  • i plus 1.

  • So the algorithm ultimately is just saying,

  • as you repeat that process again and again until there are no swaps, just

  • as I proposed, you're swapping any two cups that with respect to each other

  • are out of order.

  • And so this, too, is an example more generally of solving small local problems

  • and achieving ultimately a global result, if you will.

  • Because with each swap of those cups, I'm improving the quality of my data.

  • And each swap in and of itself doesn't necessarily solve the big picture,

  • but together, when we aggregate all of those smaller solutions, we have

  • assembled the final result.
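
As a minimal sketch, that pseudocode might translate into Python along these lines; the cups list holds the jumbled arrangement from this example:

    # Bubble sort: repeat until no swaps, fixing adjacent pairs that are out of order.
    def bubble_sort(values):
        n = len(values)
        swapped = True
        while swapped:  # repeat until no swaps
            swapped = False
            for i in range(n - 1):  # for i from 0 to n - 2
                if values[i] > values[i + 1]:  # i-th and (i+1)-th out of order?
                    values[i], values[i + 1] = values[i + 1], values[i]  # swap them
                    swapped = True

    cups = [2, 1, 6, 5, 4, 3, 8, 7]  # the jumbled cups from this example
    bubble_sort(cups)
    print(cups)  # [1, 2, 3, 4, 5, 6, 7, 8]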

  • Now what about that second algorithm, wherein

  • I started again with some random cups, and then that time I

  • selected one at a time the number I actually wanted in place?

  • I first sought out the smallest.

  • I found that to be 1 and I put it all the way there on the left.

  • And I then sought out the next smallest number,

  • which after checking the remaining cups, I determined was 2.

  • And so I put 2 second in place.

  • And then I repeated that process again and again,

  • not necessarily knowing in advance from anyone what numbers I'd find.

  • Because I checked each and every remaining cup,

  • I was able to conclude safely that I had indeed found the next smallest element.

  • And so that algorithm, too, has a name--

  • selection sort.

  • And I might describe it in pseudocode similar

  • in structure but with different logic ultimately.

  • Let me propose that we do for i from 0 to n minus 1,

  • where again, n is the number of cups, and 0 is by convention my first cup,

  • and n minus 1, therefore, is my last.

  • And what I then want to do is find the smallest element between the i-th

  • element and the last--

  • that is, the (n minus 1)-th.

  • That is, find the smallest element between wherever you've begun

  • and that last element, n minus 1.

  • And then if-- when you've found that smallest element,

  • you swap it with the i-th element.

  • And that's why I was picking up one cup and another

  • and swapping them in place-- evicting one and putting one where it belongs.

  • And you do this again and again and again,

  • because each time you're incrementing i by 1.

  • So whereas the first iteration of this loop will start here all the way left,

  • the second iteration will start here, and the third iteration

  • will start here.

  • And so the amount of problem to be solved

  • is steadily decreasing until I have 1 and then 0 cups left.
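
And here is a minimal Python sketch of that selection sort pseudocode, with the inner loop playing the role of checking each remaining cup; again, the function name and data are just this example's choices:

    # Selection sort: repeatedly select the smallest remaining element.
    def selection_sort(values):
        n = len(values)
        for i in range(n):  # for i from 0 to n - 1
            smallest = i
            for j in range(i + 1, n):  # find the smallest between i and n - 1
                if values[j] < values[smallest]:
                    smallest = j  # make mental note of the new smallest
            values[i], values[smallest] = values[smallest], values[i]  # swap into place

    cups = [2, 3, 1, 5, 8, 7, 6, 4]  # the second random assortment from this example
    selection_sort(cups)
    print(cups)  # [1, 2, 3, 4, 5, 6, 7, 8]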

  • Now it certainly took some work to sort those n cups,

  • but how much work did it take?

  • Well in the case of bubble sort, what was I doing on each pass

  • through these cups?

  • Well I was comparing and then potentially swapping

  • each adjacent pair of cups, and then repeating myself again and again.

  • Well if we have here n cups, how many pairs

  • can you create which you then consider swapping?

  • Well if I have n cups, I would seem to be able to make 1, 2, 3, 4, 5, 6,

  • 7 pairs out of 8 cups, so more generally n minus 1 pairs.

  • So on each pass here, it would seem that I'm comparing n minus 1 pairs of cups.

  • Now how many passes do I need to ultimately make?

  • It would seem to be roughly n, because in the worst case,

  • these cups might be completely out of order.

  • Which is to say, I might indeed do n things n minus 1 times,

  • and if you multiply that out, I'm going to get some factor of n squared.

  • But what about selection sort, wherein I instead looked through all of the cups,

  • selecting first the smallest, and then repeating

  • that process for the next smallest still?

  • Well in that case, I started with n cups,

  • and I might need to look at all n, and then

  • once I found that, I might instead look at n minus 1.

  • So there, too, I seem to be summing something like n plus n minus 1

  • plus n minus 2 and so forth, so let's see

  • if we can't now summarize this as well.

  • Well let me propose more mathematically, that, say, with selection sort,

  • what we've done is this.

  • In looking for that smallest cup, I had to make n minus 1 comparisons.

  • Because as I identified the smallest cup I'd yet seen,

  • I compared it to no more than n minus 1 others.

  • Now if the first selection of a cup took me n minus 1 steps, but then it's done,

  • the next selection, of the next smallest cup, would

  • have taken me only n minus 2 steps.

  • And if you continue that logic with each pass,

  • you have to do a little bit less work until you're left with just one

  • very last cup at the end, such as 8.

  • So what does this actually sum to?

  • Well you might not remember or see it at first glance,

  • but it turns out, particularly if you look at one of those charts

  • at the back of a textbook, that this summation or series actually

  • aggregates to n times n minus 1, all divided by 2.

  • Now this you can perhaps multiply out a bit more readily as

  • in n squared minus n all divided by 2.

  • And if we factor that out, we can now get n squared

  • divided by 2 minus n divided by 2.
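
Written out as one formula, that series is, in LaTeX:

    (n - 1) + (n - 2) + \cdots + 1 = \frac{n(n - 1)}{2} = \frac{n^2}{2} - \frac{n}{2}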

  • Now which of these terms, n squared divided by 2 or n divided by 2,

  • tends to dominate the other?

  • That is to say, as n gets larger and larger,

  • which of these mathematical expressions has the biggest effect

  • on the number of steps?

  • Well surely it's n squared, albeit divided by 2, because as n gets large,

  • n squared is certainly larger than n.

  • And so what a computer scientist here would typically do

  • is just ignore those lower-ordered terms, so to speak.

  • And they would say, with a figurative or literal wave of the hand,

  • that this algorithm is on the order of n squared.

  • That isn't to say it's precisely that many steps,

  • but rather as n gets really large, it is pretty much

  • that n squared term that really matters the most.

  • Now this is not a form of proof, but rather a proof by example, if you will,

  • but let's see if I can't convince you with a single example numerically

  • of the impact of that square.

  • Well if we start again with n squared over 2 minus n over 2 and say n

  • is maybe 1 million initially-- so not eight cups, not 1,000 pages in a book,

  • but 1 million numbers or any other elements.

  • What does this actually sum to?

  • Well 1 million squared divided by 2 minus 1 million divided by 2

  • happens to be 500 billion minus 500,000, which of course is 499,999,500,000.

  • Now I daresay that is pretty darn close to big O of n squared.

  • Why?

  • Well if we started with, say, 1 trillion, then

  • halved it, and ended up with nearly 500 billion, that's still pretty close.

  • Now in real terms, that does not equal the same number of steps,

  • but it gives us a general sense it's on the order of this many steps,

  • because if we plugged in larger and larger values for n,

  • that difference would matter even less.

  • Well why don't we take a look now at these algorithms in a different form

  • altogether without the physical limitation of me as the computer?

  • Pictured here is, if you will, an array of numbers, but pictured graphically.

  • Wherein we have vertical bars, and the taller

  • the bar, the bigger the number it represents.

  • So big bar is big number, small bar is small number,

  • but they're clearly, therefore, unsorted.

  • Via these two algorithms we've seen, bubble sort and selection sort,

  • what does it actually look like to sort this many elements?

  • Let's take a look.

  • In this tool, I proceed to choose my first algorithm,

  • which shall be, say, bubble sort.

  • And you'll see rather slowly that this algorithm is indeed comparing

  • pairwise elements, and if--

  • and only if they're out of order, swapping them again and again.

  • Now to be fair, this quickly gets tedious,

  • so let me increase the animation speed here.

  • And now you can rather see that bubbling up of the largest.

  • Previously it was my 8 and my 7 and 6.

  • Here we have 99, 98, 97, but indeed, those tallest bars

  • are making their way up.

  • So let's turn our attention next to this other algorithm, selection sort,

  • to see if it looks or perhaps feels rather different.

  • Here now we have selection sort each time

  • going through the entire list looking for the smallest possible element.

  • Highlighted in red for just a moment here is

  • 9, because we have not yet-- oh, now-- found

  • a smaller element, now 2, and now 1.

  • And we'll continue looking through the rest of the numbers just

  • to be sure we don't find something smaller, and once we do,

  • 1 goes into place.

  • And then we repeat that process, but we do fewer steps now,

  • because whereas there are n total bars, we don't need to look at the leftmost

  • now because it's sorted, we only need look at n minus 1.

  • So this process again will repeat.

  • We found 2.

  • We're just double-checking that there's not something smaller,

  • and now 2 is in its place.

  • Now we humans, of course, have the advantage

  • of having an aerial view, if you will, of all this data.

  • And certainly a computer could remember more than just

  • the smallest number it's recently seen.

  • Why not for efficiency remember the two smallest numbers?

  • The three smallest numbers?

  • The four smallest numbers?

  • That's fine, but that argument is quickly devolving into--

  • just remember all the original numbers.

  • And so yes, you could perhaps save some time,

  • but it sounds like you're asking for more and more space

  • with which to remember the answers to those questions.

  • Now this, too, would seem to be taking us all day.

  • Even if we down here increase the animation speed,

  • it now is selecting those elements a bit faster and faster,

  • but there's still so much work to be done.

  • Indeed, these comparison-based sorts that are comparing things again

  • and again and then redoing that work in some form to improve the problem still

  • just tend to end up on the order of--

  • bingo, of n squared.

  • Which is to say that n squared or something quadratic

  • tends to be rather slow.

  • And this is in quite a contrast to our logarithmic time before,

  • but that logarithm thus far was for searching, not sorting.

  • So let's compare these two now side by side,

  • albeit with a different tool that presents the same information

  • graphically sideways.

  • Here again we have bars, and small bar is small number,

  • and big bar is big number, but here, they've simply been rotated 90 degrees.

  • On the left here we have selection sort, on the right here bubble sort,

  • both of whose bars are randomly ordered so that neither

  • has an edge necessarily over the other.

  • Let's go ahead and play all and see what happens here.

  • And you'll see that indeed, bubbles bubbling up

  • and selection is improving its selections as we go.

  • Bubble would seem to have won because selection's got a bit more work,

  • but there, too, it's pretty close to a tie.

  • So can we do better?

  • Well it turns out we can, so long as we use a bit more of that intuition

  • we had when we started thinking computationally

  • and we divided and conquered, again and again.

  • In other words, given n doors or n cups or n pages,

  • why don't we divide and conquer that problem again and again?

  • In other words, in the context of the cups,

  • why don't I simply sort for you the left half and then the right half,

  • and then with two sorted halves, just interweave them for you together.

  • That would seem to be a little different from walking back and forth

  • and back and forth and swapping elements again and again.

  • Just do a little bit of work here, a little bit

  • more now, and then reassemble your total work.

  • Now of course, if I simply say, I'll sort this left half,

  • what does it mean to sort this left half?

  • Well, I dare say this left half can be divided

  • into a left half of the left half, thereby making the problem smaller.

  • So somehow or other, we could leverage that intuition of binary search,

  • but apply it to sort.

  • It's not going to be in the end quite as fast as binary search,

  • because with sort, you have to deal with all of the elements,

  • you can't simply tear half of the problem

  • away because you'd be leaving half of your elements unsorted.

  • But it turns out there's many algorithms that

  • are faster than selection and bubble sort, and one of those

  • is called merge sort.

  • And merge sort leverages precisely this intuition of dividing

  • a problem in half and in half, and to be fair, touching all of those halves

  • ultimately, but doing it in a way that's more efficient and less

  • comparison-based than bubble sort and selection sort themselves.
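
As a minimal sketch in Python-- just one of many ways to write it-- merge sort might look like this; note the extra merged list it builds, which is the additional space discussed below:

    # Merge sort: sort each half recursively, then merge the two sorted halves.
    def merge_sort(values):
        if len(values) <= 1:
            return values  # one element (or none) is already sorted
        middle = len(values) // 2
        left = merge_sort(values[:middle])   # sort the left half
        right = merge_sort(values[middle:])  # sort the right half
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):  # interweave the two halves
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])   # append whatever remains of either half
        merged.extend(right[j:])
        return merged

    print(merge_sort([2, 1, 6, 5, 4, 3, 8, 7]))  # [1, 2, 3, 4, 5, 6, 7, 8]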

  • So let me go ahead and play all now with these three sets of bars

  • and see just which one wins now.

  • And after just a moment, there's nothing more

  • to say-- merge sort has already won, if you will, even though now bubble has,

  • and now selection.

  • And perhaps this was a fluke-- to be fair,

  • these numbers are random, maybe merge sort got lucky.

  • Let's go ahead and play the test once more with other numbers.

  • And indeed it again is done.

  • Let me play it a third and final time, and notice the pattern

  • that now emerges with merge sort.

  • You can see if you look closely the actual halving again and again.

  • And indeed, it seems that half of the list gets sorted,

  • and then you reassemble it at the very end.

  • And indeed, let's zoom in on this algorithm

  • now and look specifically at merge sort alone.

  • Here we have merge sort, and highlighted in colors

  • as we do work are exactly the elements being sorted again and again.

  • The reason so few of these bars are being

  • looked at at a time is because again, logically or recursively, if you will,

  • we are sorting first the left half--

  • but no, the left half of the left half--

  • but no, the left half of the left half of the left half, and so

  • forth, and what this really boils down to

  • ultimately is sorting eventually individual elements.

  • But if I hand you one element and I say, please sort this,

  • it has no halves, so your work is done-- you don't need to do a thing.

  • But then if you have two halves, each of size 1,

  • there might indeed be work to be done there,

  • because if one is smaller than the other or one is larger than the other,

  • you do need to interleave those for me to merge them.

  • And that's exactly what merge sort's doing here.

  • Allow me to increase the animation speed and you'll see as we go,

  • that half of the list is getting sorted at a time.

  • It's not perfect and it's not perfectly smooth,

  • because the other half of the elements are elsewhere,

  • but now we are re-merging the two halves.

  • And that was fast-- it finished faster

  • indeed than bubble and selection sort would have,

  • but there was a price being paid.

  • If you think back to our vertical visualization of bubble sort

  • and selection sort, they were doing all of their work in place.

  • Merge sort seemed to be getting a little greedy on us, if you will,

  • in that it was temporarily putting some of those bars down here,

  • effectively using twice as much space as those first two algorithms, selection

  • and bubble.

  • And indeed, that's where merge sort gets its edge fundamentally.

  • It's not just a better algorithm, per se, and better thought-out,

  • but it actually additionally consumes more resources-- not time, but space.

  • By using twice as much space-- not just the top half of the screen,

  • but the bottom--

  • can merge sort temporarily put some of its work over here,

  • continue doing some other work, and then reassemble them together.

  • Both selection sort and bubble sort did not have that advantage.

  • They had to do everything in place, which

  • is why we had to swap so many things so many times.

  • We had far fewer spots in which to work on that table.

  • But with merge sort, spend a bit more space,

  • and you can reduce that amount of time.

  • Now all of these algorithms assume that our data is back-to-back-to-back--

  • that is, stored in an array.

  • And that's great, because that's exactly how a computer is so inclined

  • to store data inherently.

  • For instance, pictured here is a stick of memory, or RAM--

  • Random Access Memory.

  • And indeed, albeit a bit of a misnomer, that R in RAM-- random-- actually

  • means that a computer can jump in instant, or constant, time

  • to a specific byte.

  • And that's so important when we want to jump

  • around our data, our cups, or our pages in order to get at data instantly,

  • if you will.

  • And the reason it is so conducive to laying out information back-to-back,

  • contiguously, in memory becomes clear if we consider one of these black chips

  • on this DIMM-- or Dual In-line Memory Module--

  • of which we have here, really, if you will,

  • an artist's rendition at hand.

  • That artist's rendition might propose that if you

  • have some number of bytes in this chip, say 1 billion for 1 gigabyte,

  • it certainly stands to reason that we humans could

  • number those bytes from 0 on up--

  • from 0 to 1 billion, roughly speaking.

  • And so the top left one here might be 0, the next one might be 1,

  • the next one thereafter should be 2, and so we can

  • number each and every one of our bytes.

  • And so when you store a number on a cup or a number behind a door,

  • that amounts to just writing those numbers inside of each of these boxes.

  • And each is next to the other, and so with simple arithmetic,

  • a bit of division and rounding, you

  • are able to jump instantly to any one of these addresses.

  • There are no moving parts here to do any work like my human feet might

  • have to do in our real world.

  • Rather the computer can jump instantly to that so-called address

  • or index of the array.
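
That constant-time jump is just arithmetic. As a sketch, with the base address and the 4-byte element size being illustrative assumptions rather than anything this particular chip dictates:

    # Random access: the address of element i is computed, not searched for.
    def address_of(base_address, index, element_size):
        return base_address + index * element_size

    # If an array of 4-byte integers starts at (illustrative) address 100,
    # element 3 lives at 100 + 3 * 4 = 112 -- one multiplication, one addition.
    print(address_of(100, 3, 4))  # 112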

  • Now what can we do when we have a canvas that

  • allows us to lay out memory in this way?

  • We can represent any number of types.

  • Indeed in Python, there are all sorts of types of data.

  • For instance, bool for a Boolean value and float for a floating point

  • value, a real number with a decimal.

  • An int for an integer and str for a string.

  • Each of those is laid out in memory in some particular way that's

  • conducive to accessing it efficiently.

  • But that's precisely why, too, we've run into issues

  • when using something like a float, because if you decide a priori to use

  • only so many bytes, the bytes to the left and to the right,

  • above and below, might end up getting used by other parts of your program.

  • And so if you've only asked for, say, 32 or 64 bits or 4 or 8 bytes,

  • because you're then going to be surrounded by other data,

  • that floating point value or some other can only be ultimately so precise.

  • Because ultimately yes, we're operating in bits,

  • but those bits are physically laid out in some order.

  • So with that said, what are the options via which we can paint on this canvas?

  • Surely it would be nice if we could store

  • data not necessarily always back-to-back in this way,

  • but we can create more sophisticated data structures

  • so as to support not only these types here, but also ones like these.

  • Dict in Python for dictionary, otherwise known as a hash table.

  • And list for a sort of array that can grow and shrink, and range

  • for a range of values.

  • Set for a collection of values that contain

  • no duplicates, and tuples, something like x, y or latitude, longitude.
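
For instance, each of these might be written in Python as follows, with the particular names and values being illustrative:

    ages = {"Alice": 30, "Bob": 25}   # dict: keys mapped to values, a hash table
    numbers = [4, 8, 15, 16, 23, 42]  # list: an array that can grow and shrink
    digits = range(10)                # range: the values 0 through 9
    unique = {4, 8, 15}               # set: a collection with no duplicates
    point = (37.7749, -122.4194)      # tuple: e.g., a latitude, longitude pair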

  • These concepts-- surely it would be nice to have

  • accessible to us in higher level contexts like Python,

  • but if at the end of the day all we have is bytes of memory back-to-back,

  • we need some layers of abstraction on top of that memory

  • so as to implement these more sophisticated structures.

  • So we'll take a look at a few in particular-- int

  • and str and dict and list-- because all of those

  • somehow need to be built on top of these lower-level principles of memory.

  • So how might this work and what problems might we solve?

  • Let's now use the board as my canvas, drawing on it

  • that same grid of rows and columns in order

  • to divide this screen into that many bytes.

  • And I'll go ahead and divide this board into these squares, each one of which

  • represents an individual byte, and each of those bytes, of course,

  • has some number associated with it.

  • That number is not the number inside of that box, per se, not the bits

  • that compose it, but rather just metadata-- an index

  • or address that exists implicitly, but is not actually stored.

  • This then might be index 0 or address 0, this

  • might be 1, this 2, this 3, this one 4, this one 5.

  • And if we, for the artist's sake, move to the next row,

  • we might call this 6 and this 7, and so forth.

  • Now suppose we want to store some actual values in this memory,

  • well let's go ahead and do just that.

  • We might store the actual number 4 here, followed by 8,

  • followed by 15 and 16, perhaps followed by 23, and then 42.

  • And so we have some random numbers inside of this memory,

  • and because those numbers are back-to-back,

  • we can call this an array of size 6.

  • Its first index is 0, its last index is 5,

  • and between there are six total values.

  • Now what can we do if we're ready to add a seventh number to this list?

  • Well, we could certainly put it right here

  • because this is the next appropriate location,

  • but it depends whether that spot is still available.

  • Because the way a computer typically works

  • is that when you're writing a program, you

  • need to decide in advance how much memory you want.

  • And you tell the computer by way of the operating system,

  • be it Windows or macOS, Linux, or something else,

  • how many bytes of memory you would like to allocate to your particular problem.

  • And if I only had the foresight to say, I

  • would like 6 bytes in which to store 6 numbers,

  • the operating system might have handed me that back and said,

  • fine, here you go, but the operating system thereafter

  • might have proceeded to allocate subsequent adjacent bytes, like 6

  • and 7, to some other aspect of your program.

  • Which is to say, you might have painted yourself into a bit of a corner

  • by only in code asking the operating system for just those initial 6 bytes.

  • You instead might have wanted to ask for more bytes

  • so as to allow yourself this room to grow,

  • but if you didn't do that in code, you might just be unlucky.

  • But that's the price you pay for an array.

  • You have this wonderfully efficient ability

  • to search it randomly, if you will, which

  • is to say instantly via arithmetic.

  • You can jump to the beginning or the end or even the middle,

  • as we've seen, by just doing perhaps some addition, subtraction, division,

  • and rounding, and that gets you ultimately right where

  • you want to go in some constant and very few number of steps.

  • But unfortunately, because you wanted all of that memory

  • back-to-back-to-back, it's up to you to decide how much of it you want.

  • And if the operating system has already allocated 6, 7,

  • and elsewhere on the board to other parts of the program,

  • you might be faced with the decision as to just say, no,

  • I cannot accept any more data, or you might say, OK, operating system,

  • what if I don't mind where I am in memory-- and you probably don't--

  • but I would like you to find me more bytes somewhere else?

  • Rather like going from a one-bedroom to a two-bedroom apartment

  • so that you have more room, you might physically have to pack your bags

  • and go somewhere else.

  • Unfortunately, just like in the real world, that's not without cost.

  • You need to pack those bags and physically move, which takes time,

  • and so it will take you and the operating system some time

  • to relocate every one of your values.

  • So sure, there might be plenty of space down here below on multiple rows

  • and even not pictured, but it's going to take a non-zero amount of time

  • to relocate that 4 and 8 and 15 and that 16 and 23 and 42 to new locations.
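
As a minimal sketch in Python of that relocation (Python's own list type normally hides all of this underneath): allocate a bigger block somewhere else, copy every value over, and only then store the new one:

    # Growing a full array: allocate a bigger block elsewhere, then copy each value.
    def grow_and_append(old_array, new_value):
        new_array = [None] * (len(old_array) + 1)  # a bigger block, somewhere else
        for i in range(len(old_array)):            # pack your bags: copy each value
            new_array[i] = old_array[i]
        new_array[len(old_array)] = new_value      # finally, store the new value
        return new_array

    numbers = [4, 8, 15, 16, 23, 42]
    numbers = grow_and_append(numbers, 50)  # n copying steps, just to add one value
    print(numbers)  # [4, 8, 15, 16, 23, 42, 50]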

  • That might be your only option if you want to support more data,

  • and indeed, most programs would want to. It would be an unfortunate situation

  • if you had to tell your user or boss, I'm sorry, I ran out of space--

  • and that's certainly foolish

  • if you actually do have more space; it's just not right there next to you.

  • So with an array, you have the ability physically

  • to perform very sophisticated, very efficient algorithms such as we've

  • seen-- binary search and bubble sort and selection sort

  • and merge sort, and do so in quite fast time.

  • Even though selection sort and bubble sort were big O of n squared,

  • merge sort was actually n times log n,

  • which is slower than log n alone, but faster than n squared.

  • But they all presuppose that you have random access to elements

  • arithmetically via their indexes or addresses-- and with arrays

  • in your computer's memory, you do--

  • but you need to commit to some size in advance.

  • All right, fine.

  • Let's not ask the operating system for 6 bytes initially, let's say, give me 7

  • because I'm going to leave one of them blank.

  • Now of course, that might buy you some runway, so to speak,

  • so that you can accommodate a seventh element if and when it arrives,

  • but what about an eighth?

  • Well, you could ask the operating system from the get-go, don't give me

  • 6 bytes of space, but give me 8 or give me 16 or give me 100.

  • But at that point, you're starting to get a little greedy,

  • and you're starting to ask for more memory than you might actually

  • need anytime soon, and that, too, is unfortunate,

  • because now you're being wasteful.

  • Your computer, of course, only has a finite amount of space,

  • and if you're asking for more of it than you actually need,

  • that memory, by definition, is unavailable to other parts

  • of your program and perhaps even others.

  • And so your computer ultimately might not

  • be able to get as much work done because it's been holding off to the side

  • just some empty space.

  • Empty parking spaces you've reserved for yourself or empty seats

  • at a table that might potentially go unused, it's just wasteful.

  • And hardware costs money.

  • And hardware enables you to solve problems.

  • And with less hardware available, you can solve fewer problems at hand,

  • and so that, too, doesn't feel like a perfect solution.

  • So again, this series of trade-offs, it depends

  • on what's most important to you-- time or space or money or development

  • or any number of other scarce resources.

  • So what can we do instead as opposed to an array?

  • How do we go about getting the dynamism that we so clearly want here?

  • Wouldn't it be nice if we could grow these great data

  • structures, and better yet, even shrink them?

  • If I no longer need some of these numbers,

  • I'm going to give you back that memory so that I can use it elsewhere

  • for more compelling purposes.

  • Well it turns out that in computer science,

  • programmers can create even fancier data structures

  • but at a higher level of abstraction.

  • It turns out, we could start making lists out of our values.

  • In fact, suppose I wanted to add some number to the screen, but, for instance,

  • maybe these two spots were blocked off by something else.

  • But you know what?

  • I do know there's some room elsewhere on the screen,

  • it just happens to be available here.

  • And so if I want to put the number 50 in my list of values,

  • I might just have to say, I don't care where you put it,

  • go ahead and put it right there.

  • Well where is there?

  • Well if we continue this indexing-- this is 6 and 7 and 8 and 9, 10, 11, 12,

  • 13, 14, and 15, if 50 happens to end up by chance at location 15

  • because it's the first byte available, because not only these two, but maybe

  • even all of these are taken for some other reason--

  • ever since you asked for your first six, that's OK,

  • so long as you can somehow link your original data to the new.

  • And pictorially here, I might be inclined just to say, you know what?

  • Let me just leave a little breadcrumb, so to speak, and say that after the 42,

  • I should actually go down here and follow this arrow.

  • Sort of Chutes and Ladders style, if you will.

  • Now that's fine and you can do that-- after all, at the end of the day,

  • computers will do what you want, and if you

  • can write the code to implement this idea,

  • it will, in fact, remember that value.

  • But how do we achieve this?

  • Here, too, you have to come back to the fundamental definition

  • of what your computer is doing and how.

  • It's just got that chip of memory, and those bytes back-to-back,

  • such as those pictured here.

  • So this is all you get-- there is no arrow feature inside of a computer.

  • You have to implement that notion yourself.

  • So how can you go about doing that?

  • Well, you can implement this concept of an arrow,

  • but you need to implement it ultimately at a lower level or trust

  • that someone else will for you.

  • Well, as best I can tell, I do know that my first several elements happened

  • to be back-to-back from 4 on up to 42 in locations 0 through 5.

  • Because those are contiguous, I get my random access

  • and I can immediately jump from beginning to middle to end.

  • This 50 and anything after it needs to be handled a little better.

  • If I want to implement this arrow, the only possible way

  • seems to be to somehow remember that the next element after 42

  • is at location 15.

  • And that location, a.k.a.

  • address or index, just has to be something I remember.

  • Unfortunately I don't have quite enough room left to remember that.

  • What I really want to do is not store this arrow, but by the way,

  • parenthetically go ahead and store the number 15--

  • not as the index of that cell, but as the next address

  • that should be followed.

  • The catch, though, is that I've not left myself enough room.

  • I've made mental note in parentheses here

  • that we've got to solve this a bit better.

  • So let's start over for the moment, and no longer worry

  • about this very low level, because it's too messy at some point.

  • It's like talking in 0's and 1's--

  • I don't want to talk in bytes in this way.

  • So let's take things up in abstraction level,

  • if you will, and just agree to agree that you can store values in memory,

  • and those values can be data, like numbers you want--

  • 4, 8, 15, 16, 23, 42, and now 50.

  • And you can also store somehow the addresses or indexes--

  • locations of those values.

  • It's just up to you how to use this canvas.

  • So let's do that and clear the screen and now start

  • to build a higher-level concept.

  • Not an array, but something we'll call a linked list.

  • Now what is a linked list?

  • A linked list is a data structure that's a higher-level abstraction

  • on top of what ultimately is just chunks of memory or bytes.

  • But this linked list shall enable me to store more and more values

  • and even remove them simply by linking them together.

  • So here, let me go ahead and represent those same values starting with 4,

  • followed by 8 and 15, and then 16 and 23, and finally, 42.

  • And now eventually I'm going to want to store 50, but I've run out of room.

  • But that's fine-- I'm going to go ahead and write 50 wherever there's space.

  • But now let's not worry about that grid of rows and columns of memory.

  • Let's just stipulate that yes, that's actually there,

  • but it's not useful to operate at that level.

  • Much like it's not useful to continually talk in terms of 0's and 1's.

  • So let me go ahead and wrap these values with a higher-level idea

  • called a node or just a box.

  • And this box is going to store for us each of these values.

  • Here I have 4, here I have 8 and 15, here I have 16,

  • I have 23, and finally, 42.

  • And then when it comes time to add 50 to the mix,

  • it, too, will come in this box.

  • Now what is this box?

  • It's just an artist's rendition of the underlying bytes,

  • but now I have the ability to draw a prettier picture, if you will,

  • that somehow interlinks these boxes together.

  • Indeed, what I ultimately want to remember

  • is that 4 comes first and 42 comes last-- but then wait, if I add 50,

  • it shall now come last.

  • So we could do this as an artist quite simply with those arrows pointing

  • each box to the next, implying that the next element in the list,

  • whether it's next door or far away, happens to be at the end of that arrow.

  • But what are those arrows?

  • Those are not something that you can represent in a computer

  • if at the end of the day all you have are blocks of memory and in them bytes.

  • If all you have are bytes-- and, therefore, patterns

  • of 0's and 1's-- whatever you store in the computer

  • must be representable with those 0's and 1's, and among the easiest things

  • to represent, we know already, are numbers, like indexes or addresses

  • of these nodes.

  • So for instance, depending on where these nodes are in memory,

  • we can simply check that address and store it as well.

  • So for instance, if the 4 still happens to be at address 0,

  • and this time the 8 is at address 4, and this one at 8, and this one at 12,

  • and this one at 16, and this one at 20-- just by chance back-to-back, 4 bytes

  • apart, or 32 bits--

  • well, 50 might be some distance away.

  • Maybe it's actually at location 100, that's OK.

  • We can still do this.

  • Because if we use part of this node, part of each box

  • to implement those actual arrows, we can actually store all the information

  • we need to know how to get from one box to another.

  • For instance, to get from 4 to the next element,

  • you're going to want to coincidentally go to not number 4, but address 4.

  • And if you want to go from value 8 to the next value, 15,

  • you're going to want to go to address 8.

  • And if you want to go from 15 to 16, the next address

  • is going to be 12, followed by 16, followed by 20.

  • And herein lies the magic--

  • if you want to get from 42 to that newest element that's

  • just elsewhere at address 100, that's what gets associated with 42's node.

  • As for 50, it's the dead end.

  • There's nothing more there, so we might simply

  • draw a line through that box saying, eh, just

  • store in it all 0 bits, or follow some other equivalent convention.
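
  • Here's a minimal sketch of that idea in Python-- a simulation, not real hardware-- modeling memory as a map from each address to the value stored there plus the address of the next node, using the same addresses as above:

```python
# A simulation: memory as a map from address to (value, next_address).
# In real memory, those two numbers would simply sit in adjacent bytes,
# and a convention like an all-0-bits address would mark the dead end;
# here, None plays that role.
memory = {
    0:   (4, 4),      # the value 4 lives at address 0; next node at address 4
    4:   (8, 8),
    8:   (15, 12),
    12:  (16, 16),
    16:  (23, 20),
    20:  (42, 100),   # the "arrow" from 42 to 50 is just the number 100
    100: (50, None),  # 50 is the dead end: no next address
}

address = 0  # all we ever remember is where the first node lives
while address is not None:
    value, next_address = memory[address]
    print(value)      # prints 4, 8, 15, 16, 23, 42, 50 in order
    address = next_address
```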

  • So there's so many numbers now on the screen,

  • but to be fair, that's all that's going on inside of a computer--

  • just storing of these bytes.

  • But now we can stipulate that, OK, I can somehow

  • store the location of each node in memory using its index or address.

  • It's just frankly not all that pleasant to stare at these values,

  • I'd much rather look at and draw the arrows graphically,

  • thereby representing the same idea of these pointers, if you will,

  • a term of art in some languages that allows me to remember

  • which element goes to which.

  • And what is the upside of all this new complexity?

  • Well now we have the ability to string together all of these nodes.

  • And frankly, if we wanted to remove one of these elements

  • from the list, that's fine-- we can simply snip it out.

  • And we can simply update what the arrow is pointing to,

  • and equivalently, we can update the next address in that node.

  • And we can certainly add to this list by drawing more nodes here or perhaps

  • over here and just link them with arrows conceptually, or more specifically,

  • by changing that dead end to the address of the next element.

  • And so we can create the idea of the abstraction of a list using

  • just this canvas of memory.
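
  • As a sketch of that abstraction in Python-- with object references playing the role of the arrows, and Node, append, and remove as my own illustrative names-- growing and shrinking the list looks like this:

```python
# A sketch of the linked-list abstraction, with Python object references
# standing in for the arrows.
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None          # the arrow; None marks a dead end

def append(head, value):
    """Grow the list: put a new node wherever memory happens to be free."""
    node = Node(value)
    if head is None:
        return node               # the new node is the whole list
    cursor = head
    while cursor.next is not None:
        cursor = cursor.next      # walk to the last node
    cursor.next = node            # point its arrow at the new node
    return head

def remove(head, value):
    """Shrink the list: snip a node out by redirecting one arrow."""
    if head is not None and head.value == value:
        return head.next          # the second node is now the first
    cursor = head
    while cursor is not None and cursor.next is not None:
        if cursor.next.value == value:
            cursor.next = cursor.next.next  # skip over the snipped node
            return head
        cursor = cursor.next
    return head

head = None
for n in [4, 8, 15, 16, 23, 42]:
    head = append(head, n)
head = append(head, 50)   # grow: no need to have planned for 7 from the start
head = remove(head, 16)   # shrink: 15's arrow now points straight at 23
```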

  • But not all is good here.

  • We've surely paid a price, right?

  • Surely we couldn't get dynamism for addition and removal

  • and updating of a list without paying some price.

  • This dynamic growth, this ability to store as many more elements

  • as we want without having to tell the operating system from the get-go how

  • many elements we expect.

  • And indeed, while we're lucky at first, perhaps,

  • if we know from the get-go we need at least six values here,

  • they might be a consistent distance apart--

  • 4 bytes or 32 bits.

  • And so I could do arithmetic on some of these nodes,

  • but that is no longer, unfortunately, a guarantee of this structure.

  • Whereas arrays do guarantee you random access, linked lists do not.

  • And linked lists instead require that you traverse them in linear time

  • from the first element potentially all the way to the last.

  • There is no way to jump to the middle element,

  • because frankly, if I do that math as before, 100 bytes away is the last,

  • so 100 divided by 2 is 50-- rounding down, keeping me at 50,

  • puts me somewhere over here, and that's not right.

  • The middle element is earlier, but that's

  • because there's now no support for random access or instant arithmetic

  • access to elements like the first, last, or middle.

  • All we'll remember now for the linked list is that first element,

  • and from there, we have to follow all of those breadcrumbs.
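
  • To make that cost concrete, here's a small self-contained sketch of searching such a list: the only way forward is one arrow at a time, so the worst case is linear.

```python
# Searching a linked list: no jumping to the middle-- just follow one
# arrow at a time from the first node, so the worst case is linear, O(n).
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

# Build 4 -> 8 -> 15 -> 16 -> 23 -> 42 -> 50, last node first.
head = None
for n in reversed([4, 8, 15, 16, 23, 42, 50]):
    head = Node(n, head)

def search(head, target):
    steps, cursor = 0, head
    while cursor is not None:
        steps += 1
        if cursor.value == target:
            return steps          # found after visiting this many nodes
        cursor = cursor.next
    return None                   # fell off the end: not in the list

print(search(head, 50))  # 7: the last node costs as many steps as there are nodes
```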

  • So that might be too high of a price to pay.

  • And moreover, there's overhead now, because I'm not storing

  • for every node one value, but two--

  • the value or data I care about, and the address or metadata that lets me

  • get to the next node.

  • So I'm using twice as much space there, say,

  • at least when storing numbers, but in exchange

  • I'm getting that dynamic support for growth.

  • So again, it depends on that trade-off and what is less costly to you.

  • But never fear.

  • This is just another problem to solve.

  • To be clear, we'd like to retain the dynamism that something

  • like a linked list offers-- the ability to grow and even shrink that data

  • structure over time without having to decide a priori just how much memory we

  • want.

  • But at the moment we've lost the ability to search it quickly, as with something

  • like binary search.

  • So wouldn't it be nice if we could get both properties together?

  • The ability to grow and shrink as well as to search fast?

  • Well I daresay we can if we're just a bit more

  • clever about how we draw on our canvas.

  • Again, let's stipulate that we can certainly

  • store values anywhere in memory and somehow stitch them together

  • using addresses.

  • Now those addresses, otherwise known as pointers,

  • we no longer need to draw, because frankly, they're now just a distraction.

  • It suffices to know we can draw them pictorially as with some arrows,

  • so let's do just that.

  • Let me go ahead now and draw those values,

  • say 16 up here followed by my 8 and 15, as well as my 4.

  • Over here, well I draw that 42 and my 23,

  • and now it remains for me to somehow link these together.

  • Since I don't need to leave room for those actual addresses,

  • it suffices now to just draw arrows.

  • I'll go ahead and draw just a box around 16 and 8, as well as my 4 and my 15,

  • as well as my 23 and my 42.

  • Now how should I go about linking them?

  • Well let me propose that we no longer link just from left to right,

  • but rather assemble more of a hierarchy here with 16 pointing at 8,

  • and 16 also pointing at 42.

  • And 42, meanwhile, pointing at 23 with 8 pointing at 4 as well as 15.

  • Now why have I done it this way?

  • Well by including these arrows, sometimes two per node,

  • have I stitched together a two-dimensional data

  • structure, if you will?

  • Now this again surely could be mapped to that lower level of memory

  • just by jotting down the addresses that each of these arrows represents,

  • but I like thinking at this level of abstraction

  • because I now can think in more sophisticated form about how

  • I might lay out my data.

  • So what properties do I now get from this structure?

  • Well, dynamism was the first goal at hand,

  • and how might I go about adding a new value?

  • Say it's 50 that I'd like to add to this structure.

  • Well, if I look at the top here, 16, it's already

  • got two arrows, so it's full, but I know 50 is bigger than 16,

  • so let's start to apply that same logic and say 50

  • shall definitely go down to the right.

  • Unfortunately, 42 already has one arrow off it, but there is room for more,

  • and it turns out that 50 is, in fact, greater than 42.

  • So you know what?

  • I'm just going to slot 50 right there and draw 42's second arrow to 50.

  • And what picture seems to be emerging here?

  • It's perhaps reminiscent of a family tree of sorts.

  • Indeed, with parents and children, or a tree more generally with roots.

  • Now whereas in our human world, trees tend to grow up,

  • these trees in computer science tend to grow down.

  • But henceforth, let's call this 16 our root,

  • and to its left is its left child, to its right is its right child, or more

  • generally, a whole left subtree and a whole right subtree.

  • Because indeed, starting at 42, we have another tree of sorts.

  • Rooted at 42 is a child called 23, and another child called 50.

  • So in this case, where each of the nodes in our structure--

  • otherwise known in computer science as a tree-- has zero, one, or two children,

  • you can create that second dimension.

  • And you can preserve not only the ability

  • to add data dynamically, like 50,

  • but we also now gain back that ability to search.

  • After all, if I'm asked now the question,

  • is the number 15 in this structure?

  • Well let me check for you.

  • Starting at 16, which is where this structure begins, just like a linked

  • list starts conceptually at the left, I'll

  • check if 16 is the value you want-- it's not, it's too big,

  • but I do know that 15, if it's here, it's to the left.

  • Now 8, of course, is not the value you want either,

  • but 8 is smaller than 15, so I'll now go to the right.

  • And indeed, sure enough, I now find 15.

  • And it only took me one, two steps, not n to find it,

  • because through this second dimension am I able to lift up some of those nodes

  • rather than draw them just down as a straight line,

  • or, in the linked list, all the way from left to right.

  • With the second dimension can I now organize things more tightly.

  • And notice the key characteristics of this tree.

  • It is what's generally known, indeed, as a binary search tree.

  • Not only because it's a tree that lends itself to search,

  • but also because each of the nodes has no more than two-- "bi"-- children:

  • zero, one, or two.

  • And notice that to the left of the 16 is not only the value

  • 8, but every number that can be reached to the left of 16 happens to be,

  • by design, less than 16.

  • And that's how we found 15.

  • Moreover to the right of 16, every value is greater than 16,

  • just as we have here.

  • And that definition can be applied so-called recursively.

  • You can make that claim about every node in this tree at any level,

  • because here, at 42, every node to its left, albeit just one, is less.

  • Every node to its right, albeit just one, is indeed more.

  • So long as we bring to bear on our data the same sort of intuition

  • we brought to our phone book can we achieve these same properties

  • and goals, this efficiency of logarithmic time.

  • Log base 2 of n is indeed how long it might take us-- big O of that--

  • to find or insert some value.
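
  • Here's a minimal sketch of such a binary search tree in Python, mirroring the tree drawn above; TreeNode, insert, and contains are illustrative names, and note that this simple version does no rebalancing.

```python
# A minimal binary search tree: each node has zero, one, or two children,
# with smaller values down to the left and larger values down to the right.
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None          # left child (smaller values)
        self.right = None         # right child (larger values)

def insert(root, value):
    if root is None:
        return TreeNode(value)    # an empty spot: the value goes here
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def contains(root, value):
    if root is None:
        return False              # dead end: the value isn't in the tree
    if value == root.value:
        return True
    if value < root.value:
        return contains(root.left, value)   # smaller? it can only be left
    return contains(root.right, value)      # bigger? it can only be right

root = None
for n in [16, 8, 42, 4, 15, 23, 50]:  # inserting 16 first makes it the root
    root = insert(root, n)
print(contains(root, 15))  # True, found just two hops below the root
```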

  • Now to be fair, there are some prices paid here.

  • If I'm not careful, a data structure like this

  • could actually devolve into a linked list

  • if I just keep adding, by coincidence or intent,

  • bigger and bigger numbers.

  • The tree might just so happen to get longer and longer and stringy

  • unless we're smart about how we rebalance it occasionally.

  • And indeed, there are other forms of these trees that

  • are smart, and with more code, will rebalance themselves to make sure

  • that they don't get long and stringy, but stay as high up as possible.

  • But there's another price paid beyond that potential gotcha--

  • more space.

  • Whereas my array used no arrows whatsoever and thus no extra space,

  • my linked list did use one extra chunk of space for each node--

  • storage for that pointer, or address, of its neighbor.

  • But in a tree structure, if you're storing multiple children,

  • you're using as many as two additional chunks of memory

  • to store as many as two of those arrows.

  • And so with a tree structure are you spending more space,

  • but potentially it's saving you time.

  • So again, we see this theme of trade-offs,

  • whereby if you really want less time to be spent,

  • you're going to have to spend more of that space.

  • Now can we do even better?

  • With an array, we had instant access to data,

  • but we painted ourselves into that corner.

  • With a linked list did we solve that particular problem,

  • but we gave up the ability to jump right where we want.

  • But with trees, particularly binary search trees,

  • can we rearrange our data intelligently and regain that logarithmic time.

  • But wouldn't it be nice if we could achieve even better, say,

  • constant time searches of data and insertions thereof?

  • Well for that, perhaps we could amalgamate some of the ideas

  • we've seen thus far into just one especially clever structure.

  • And let's call that particular structure a hash table.

  • And indeed, this is perhaps, in theory, the holy grail of data structures,

  • insofar as you can store anything in it in ideally constant time.

  • But how best to do this?

  • Well let's begin by drawing ourselves an array.

  • And that array this time I'll draw vertically simply

  • to leave ourselves a bit more room for something clever.

  • This array, as always, can be indexed into by way of these locations

  • here where this might be location 0 and 1, 2, and 3,

  • followed by any number of others.

  • Now how do I want to use this array?

  • Well suppose that I want to store names and not numbers.

  • Those names, of course, could just be inserted in any old location,

  • but if unsorted, we already know we're going

  • to suffer as much as big O of n time--

  • linear time with which to find a particular name in that array

  • if you know nothing a priori about the order.

  • Well we know already, too, we could do better just like the phone company,

  • and if we sort the names we're putting into this structure,

  • we can at least then do binary search and whittle that search time down

  • to log base 2 of n.

  • But wouldn't it be nice if we can whittle that down further

  • and get to any name we want in nearly constant time-- one step, maybe two

  • or a few?

  • Well with a hash table can you approximately or ideally do that,

  • so long as we decide in advance how to hash those strings.

  • In other words, those strings of characters, here called names,

  • they have letters inside of them, say D-A-V-I-D for my own.

  • Well what if we looked at not the whole name,

  • but that first letter, which is, of course, constant time

  • to just look at one value.

  • And so if D is the fourth letter in the English alphabet, what if I store

  • DAVID--

  • or really, any D name at the fourth index in my array,

  • location 3 if you start counting at 0?

  • So here might be the A names, and here the B names, and here the C names,

  • and someone like David now belongs in this bucket, if you will.

  • Now suppose I want to store other names in this structure.

  • Well Alice belongs at location 0, and Bob, for instance, location 1.

  • And we can continue this logic and can continue

  • to insert more and more names so long as we hash those names

  • and jump right to the right location.

  • After all, I can in one step look at A or B or D

  • and instantly know 0 or 1 or 3.

  • How?

  • Well recall that in a computer you have ASCII or Unicode.

  • And we already have numbers predetermined to map

  • to those same characters.

  • Now to be fair, capital A, I'm pretty sure, is 65 in ASCII,

  • but we could certainly subtract 65 from 65 to get 0.

  • And if capital B was 66, we could certainly subtract 65 from 66 to get 1.

  • So we can look, then, at the first letter of any name, convert it to ASCII

  • and subtract quite simply 65 if it's capital,

  • and get precisely to the index we want.

  • So to be fair, that's not one, but it is two or three steps,

  • but that is a constant number of steps again and again independent

  • of n, the total number of names.

  • Now what's nice about this is that we have a data structure into which we

  • can insert names instantly by hashing them and getting as output

  • that number or index 0 through 25, in the case of an English alphabet.
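
  • That first-letter hash might be sketched in Python as follows, with hash_name as just an illustrative name:

```python
# Map a name's first letter, 'A' through 'Z', to an index 0 through 25--
# a constant number of steps per name, independent of how many names exist.
def hash_name(name):
    return ord(name[0].upper()) - ord("A")  # ord("A") is 65 in ASCII/Unicode

print(hash_name("Alice"))  # 0
print(hash_name("Bob"))    # 1
print(hash_name("David"))  # 3
```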

  • But what problem might arise?

  • The catch, though, is that we have someone else, like Doug,

  • whose name happens to start with the same letter.

  • Unfortunately, there seems to be no room at this moment for Doug

  • since I'm already there.

  • But there we can draw inspiration from other data structures still.

  • We could maybe not just put David in this array,

  • but not even treat this array as the entire data structure,

  • but really as the beginning of another.

  • In fact, let me go ahead and put David in his or my own box

  • and give Doug his own as well.

  • Now Doug and I are really just nodes in a structure.

  • And we can use this array still to get to the right nodes of interest,

  • but now we can use arrows to stitch them together.

  • If I have multiple names, each of which starts with a D,

  • I just need to remember to link those together,

  • thereby allowing myself to have any number of names

  • that start with that same letter, treating that list really

  • as a linked list.

  • But I get to that linked list instantly by looking at that first letter

  • and jumping here to the right location.

  • And so here I get both dynamic growth and instant access to that list,

  • thereby decreasing significantly the amount of time

  • it takes me to find someone-- to maybe 1/26 of what it would otherwise be.
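
  • Here's a sketch of that whole idea-- an array of buckets, each holding a list of the names that hashed there-- using Python's own lists in place of the hand-drawn linked lists:

```python
# A hash table with "chaining": 26 buckets, one per letter, where each
# bucket is a list of all the names that hashed to that index.
table = [[] for _ in range(26)]            # 26 empty buckets

def hash_name(name):                       # same first-letter hash as above
    return ord(name[0].upper()) - ord("A")

def insert(name):
    table[hash_name(name)].append(name)    # jump straight to the right bucket

def lookup(name):
    return name in table[hash_name(name)]  # search only that one short list

for name in ["Alice", "Bob", "David", "Doug"]:
    insert(name)
print(lookup("Doug"))   # True; "David" and "Doug" share bucket 3
```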

  • Now to be fair, wait a minute, we're already

  • seeing collisions, so to speak, whereby I have multiple inputs hashing

  • to the same output--

  • three in this instance.

  • And in the worst case, perhaps everyone in the room

  • has a name that starts with D, which means really,

  • you don't have a hash table or array at all,

  • you just have one really long linked list, and thus, linear.

  • But that would be considered a more perverse scenario, which you should try

  • to avoid by way of that hash function.

  • If that is the problem you're facing, then your hash function is just bad.

  • In that case, you should not have looked

  • at just the first letter of every name.

  • Perhaps you should have looked at the first two letters

  • back-to-back, and put anyone's name that starts with D-A in one list;

  • and D-B, if there is any, in a second list; and D-C, if there's any of those,

  • in some third list altogether; and D-D and D-E and D-F

  • and so forth, and actually have multiple combinations of every two letters,

  • and have as many buckets, so to speak, as many indexes in your array

  • as there are pairs of two alphabetical letters.

  • Now to be fair, you might have two people

  • whose names start with D-A or D-O, but hopefully there's even fewer.
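
  • A sketch of that two-letter variant, assuming every name has at least two alphabetical letters:

```python
# The two-letter variant: 26 * 26 = 676 buckets, one per pair of letters.
def hash_name2(name):
    first = ord(name[0].upper()) - ord("A")
    second = ord(name[1].upper()) - ord("A")
    return first * 26 + second

print(hash_name2("David"))  # 78: the "D-A" bucket
print(hash_name2("Doug"))   # 92: the "D-O" bucket-- no longer colliding
```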

  • And indeed, I say a hash table--

  • this whole structure approximates the idea of constant time

  • because it can devolve in places to linear time with longer lists of names.

  • But if your hash function is good and you don't have these collisions,

  • and therefore ideally you don't have any linked lists, just names, then

  • you indeed have a structure that gives you constant time access,

  • ultimately, combining all of these underlying

  • principles of dynamic growth and random access

  • to achieve the storage of all your values.

  • How, then, might a language like Python implement data types like int and str?

  • Well in the case of Python's latest version,

  • it allows ints to grow as big as you need them to be.

  • And so it surely can't just be using contiguous memory, once allocated,

  • that stays in the same place.

  • Rather, if you want a number to grow over time,

  • well you're probably going to need to allocate some variable number of bytes

  • in that memory.

  • Strings, too.

  • If you want to allocate strings, you're going to need to allow them to grow,

  • which means finding extra space in proximity

  • to the characters you already have, or maybe relocating the whole structure

  • so that that value can keep growing.

  • But we know now, we can do this with our canvas of memory.

  • How the particular language does it isn't even necessarily of interest,

  • we just know that it can, and even underneath the hood, how

  • it might do so.
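
  • You can glimpse this behavior in Python itself; note that treating id() as a memory address is specific to CPython, the reference implementation:

```python
# Python's ints really do grow as needed, and its strings, being immutable,
# get a new home in memory when they "grow."
x = 2 ** 100          # far too big for a fixed 32- or 64-bit chunk of memory
print(x)              # 1267650600228229401496703205376

s = "hello"
print(id(s))          # in CPython, id() happens to be the object's address
s = s + ", world"     # "growing" s actually builds a brand-new string
print(id(s))          # typically a different address: the value has moved
```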

  • As for these other structures in Python like dict or dictionary and list,

  • well those, too, are exactly what we've seen here.

  • A dictionary in Python is really just a hash table, some sort of variable

  • that has indexes that are not necessarily numbers, but words,

  • and via those words can you get back a value.

  • Indeed, more generally does a hash table have keys and values.

  • The keys are the inputs via which you produce those outputs.

  • So in our data structure, the inputs might have been names.

  • The output of my hash function was an index value like some number.

  • And in Python do you have a wonderful abstraction in code that

  • allows you to express that idea of associating keys

  • with values-- names with yes or no, true or false,

  • whether they are present-- so that you can ask those questions yourself in your code.

  • And as for list, it's quite simply that.

  • It's the idea of an array but with that added dynamism,

  • and as such, a linked list of sorts.
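
  • Here are those two types in action, with made-up names and numbers for illustration:

```python
# A dict (a hash table under the hood) and a list, at Python's level of
# abstraction; the names and phone numbers here are invented.
phonebook = {}                        # keys map to values
phonebook["David"] = "617-555-0100"   # hashing "David" finds the right bucket
phonebook["Doug"] = "617-555-0101"
print("David" in phonebook)           # True, in roughly constant time

numbers = [4, 8, 15, 16, 23, 42]
numbers.append(50)                    # the dynamism a plain array lacks
numbers.remove(16)                    # shrinking, too
print(numbers)                        # [4, 8, 15, 23, 42, 50]
```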

  • And so now at this higher level of code can you not only think computationally,

  • but express yourself computationally knowing and trusting

  • that the computer can do that bidding.

  • How the data structures are organized really

  • is the secret sauce of these languages and tools,

  • and indeed, when you have some database or backend system, too,

  • the intellectual property that underlies those systems

  • ultimately boils down not only to the algorithms

  • in use, but also the data structures.

  • Because together-- and we've seen this--

  • they combine to produce not only the correctness of the answers you want,

  • but the efficiency with which you can get to those answers.

[MUSIC PLAYING]
