字幕列表 影片播放
-
The following content is provided
-
under a Creative Commons license.
-
Your support will help MIT OpenCourseWare continue
-
to offer high quality educational resources for free.
-
To make a donation or view additional materials
-
from hundreds of MIT courses, visit MIT OpenCourseWare
-
at ocw.mit.edu.
-
PROFESSOR: Hi.
-
I'm Srini Devadas.
-
I'm a professor of electrical engineering and computer
-
science.
-
I'm going to be co-lecturing 6.006-- Introduction
-
to Algorithms-- this term with professor Erik Domane.
-
Eric, say hi.
-
ERIK DOMANE: Hi.
-
[LAUGHTER]
-
PROFESSOR: And we hope you're going
-
to have a fun time in 6.006 learning
-
a variety of algorithms.
-
What I want to do today is spend literally a minute or so
-
on administrative details, maybe even less.
-
What I'd like to do is to tell you
-
to go to the website that's listed up there and read it.
-
And you'll get all information you
-
need about what this class is about from a standpoint
-
of syllabus; what's expected of you; the problem set
-
schedule; the quiz schedule; and so on and so forth.
-
I want to dive right in and tell you about interesting things,
-
like algorithms and complexity of algorithms.
-
I want to spend some time giving you
-
an overview of the course content.
-
And then we're going to dive right
-
in and look at a particular problem of peak
-
finding-- both the one dimensional version and a two
-
dimensional version-- and talk about algorithms to solve
-
this peak finding problem-- both varieties of it.
-
And you'll find that there's really
-
a difference between these various algorithms
-
that we'll look at in terms of their complexity.
-
And what I mean by that is you're
-
going to have different run times of these algorithms
-
depending on input size, based on how
-
efficient these algorithms are.
-
And a prerequisite for this class is 6.042.
-
And in 6.042 you learned about asymptotic complexity.
-
And you'll see that in this lecture
-
we'll analyze relatively simple algorithms today
-
in terms of their asymptotic complexity.
-
And you'll be able to compare and say
-
that this algorithm is fasten this other one-- assuming
-
that you have large inputs-- because it's
-
asymptotically less complex.
-
So let's dive right in and talk about the class.
-
So the one sentence summary of this class
-
is that this is about efficient procedures
-
for solving problems on large inputs.
-
And when I say large inputs, I mean things
-
like the US highway system, a map
-
of all of the highways in the United States;
-
the human genome, which has a billion letters
-
in its alphabet; a social network responding to Facebook,
-
that I guess has 500 million nodes or so.
-
So these are large inputs.
-
Now our definition of large has really changed with the times.
-
And so really the 21st century definition of large
-
is, I guess, a trillion.
-
Right?
-
Back when I was your age large was like 1,000.
-
[LAUGHTER]
-
I guess I'm dating myself here.
-
Back when Eric was your age, it was a million.
-
Right?
-
[LAUGHTER]
-
But what's happening really the world is moving faster,
-
things are getting bigger.
-
We have the capability of computing on large inputs,
-
but that doesn't mean that efficiency
-
isn't of paramount concern.
-
The fact of matter is that you can, maybe,
-
scan a billion elements in a matter of seconds.
-
But if you had an algorithm that required cubic complexity,
-
suddenly you're not talking about 10 raised to 9,
-
you're talking about 10 raised to 27.
-
And even current computers can't really
-
handle those kinds of numbers, so efficiency is a concern.
-
And as inputs get larger, it becomes more of a concern.
-
All right?
-
So we're concerned about--
-
--efficient procedures-- for solving large scale problems
-
in this class.
-
And we're concerned about scalability,
-
because-- just as, you know, 1,000
-
was a big number a couple of decades ago,
-
and now it's kind of a small number-- it's
-
quite possible that by the time you guys are professors
-
teaching this class in some university
-
that a trillion is going to be a small number.
-
And we're going to be talking about-- I don't know--
-
10 raised to 18 as being something
-
that we're concerned with from a standpoint of a common case
-
input for an algorithm.
-
So scalability is important.
-
And we want to be able to track how our algorithms are going
-
to do as inputs get larger and larger.
-
You going to learn a bunch of different data structures.
-
We'll call them classic data structures,
-
like binary search trees, hash tables-- that
-
are called dictionaries in Python-- and data
-
structures-- such as balanced binary search trees-- that
-
are more efficient than just the regular binary search trees.
-
And these are all data structures
-
that were invented many decades ago.
-
But they've stood the test of time,
-
and they continue to be useful.
-
We're going to augment these data structures in various ways
-
to make them more efficient for certain kinds of problems.
-
And while you're not going to be doing a whole lot of algorithm
-
design in this class, you will be
-
doing some design and a whole lot of analysis.
-
The class following this one, 6.046 Designing Analysis
-
of Algorithms, is a class that you
-
should take if you like this one.
-
And you can do a whole lot more design of algorithms in 6.046.
-
But you will look at classic data structures
-
and classical algorithms for these data structures,
-
including things like sorting and matching, and so on.
-
And one of the nice things about this class
-
is that you'll be doing real implementations of these data
-
structures and algorithms in Python.
-
And in particular are each of the problem
-
sets in this class are going to have both a theory
-
part to them, and a programming part to them.
-
So hopefully it'll all tie together.
-
The kinds of things we're going to be talking about in lectures
-
and recitations are going to be directly connected
-
to the theory parts of the problem sets.
-
And you'll be programming the algorithms that we talk about
-
in lecture, or augmenting them, running them.
-
Figuring out whether they work well on large inputs or not.
-
So let me talk a little bit about the modules
-
in this class and the problem sets.
-
And we hope that these problem sets
-
are going to be fun for you.
-
And by fun I don't mean easy.
-
I mean challenging and worthwhile, so at the end of it
-
you feel like you've learned something,
-
and you had some fun along the way.
-
All right?
-
So content wise--
-
--we have eight modules in the class.
-
Each of which, roughly speaking, has
-
a problem set associated with it.
-
The first of these is what we call algorithmic thinking.
-
And we'll kick start that one today.
-
We'll look at a particular problem, as I mentioned,
-
of peak finding.
-
And as part of this, you're going
-
to have a problem set that's going to go out today as well.
-
And you'll find that in this problem set
-
some of these algorithms I talk about today will
-
be coded in Python and given to.
-
A couple of them are going to have bugs in them.
-
You'll have to analyze the complexity of these algorithms;
-
figure out which ones are correct and efficient;
-
and write a proof for one of them.
-
All right?
-
So that's sort of an example problem set.
-
And you can expect that most of the problem sets
-
are going to follow that sort of template.
-
All right.
-
So you'll get a better sense of this
-
by the end of the day today for sure.
-
Or a concrete sense of this, because we'll
-
be done with lecture and you'll see your first problem set.
-
We're going to be doing a module on sorting and trees.
-
Sorting you now about, sorting a bunch of numbers.
-
Imagine if you had a trillion numbers
-
and you wanted to sort them.
-
What kind of algorithm could use for that?
-
Trees are a wonderful data structure.
-
There's different varieties, the most common being binary trees.
-
And there's ways of doing all sorts of things,
-
like scheduling, and sorting, using various kinds of trees,
-
including binary trees.
-
And we have a problem set on simulating a logic network
-
using a particular kind of sorting algorithm in a data
-
structure.
-
That is going to be your second problem set.
-
And more quickly, we're going to have modules on hashing,
-
where we do things like genome comparison.
-
In past terms we compared a human genome to a rat genome,
-
and discovered they were pretty similar.
-
99% similar, which is kind of amazing.
-
But again, these things are so large that you
-
have to have efficiency in the comparison methods
-
that you use.
-
And you'll find that if you don't get the complexity low
-
enough, you just won't be able to complete--
-
your program won't be able to finish running within the time
-
that your problem set is do.
-
Right?
-
Which is a bit of a problem.
-
So that's something to keep in mind as you test your code.
-
The fact is that you will get large inputs to run your code.
-
And you want to keep complexity in mind
-
as you're coding and thinking about the pseudocode,
-
if you will, of your algorithm itself.
-
We will talk about numerics.
-
A lot of the time we talk about such large numbers
-
that 32 bits isn't enough.
-
Or 64 bits isn't enough to represent these numbers.
-
These numbers have thousands of bits.
-
A good example is RSA encryption,
-
that is used in SSL, for example.
-
And when you go-- use https on websites,
-
RSA is used at the back end.
-
And typically you work with prime numbers
-
that are thousands of bits long in RSA.
-
So how do you handle that?
-
How does Python handle that?
-
How do you write algorithms that can
-
deal with what are called infinite precision numbers?
-
So we have a module on numerics in the middle of the term that
-
talks about that.
-
Graphs, really a fundamental data structure
-
in all of computer science.
-
You might have heard of the famous Rubik's cube assignment
-
from .
-
006 a 2 by 2 by 2 Rubik's cube.
-
What's the minimum number of moves
-
necessary to go from a given starting configuration
-
to the final end configuration, where all of the faces-- each
-
of the faces has uniform color?
-
And that can be posed as a graph problem.
-
We'll probably do that one this term.
-
In previous terms we've done other things
-
like the 15 puzzle.
-
And so some of these are tentative.
-
We definitely know what the first problem set is like,
-
but the rest of them are, at this moment, tentative.
-
And to finish up shortest paths.
-
Again in terms past we've asked you
-
to write code using a particular algorithm that
-
finds the shortest path from Caltech to MIT.
-
This time we may do things a little bit differently.
-
We were thinking maybe we'll give you a street map of Boston
-
and go figure out if Paul Revere used
-
the shortest path to get to where he was going,
-
or things like that.
-
We'll try and make it fun.
-
Dynamic programming is an important algorithm design
-
technique that's used in many, many problems.
-
And it can be used to do a variety of things, including
-
image compression.
-
How do you compress an image so the number of pixels