
  • The following content is provided

  • under a Creative Commons license.

  • Your support will help MIT OpenCourseWare continue

  • to offer high quality educational resources for free.

  • To make a donation or view additional materials

  • from hundreds of MIT courses, visit MIT OpenCourseWare

  • at ocw.mit.edu.

  • PROFESSOR: Hi.

  • I'm Srini Devadas.

  • I'm a professor of electrical engineering and computer

  • science.

  • I'm going to be co-lecturing 6.006-- Introduction

  • to Algorithms-- this term with Professor Erik Demaine.

  • Erik, say hi.

  • ERIK DEMAINE: Hi.

  • [LAUGHTER]

  • PROFESSOR: And we hope you're going

  • to have a fun time in 6.006 learning

  • a variety of algorithms.

  • What I want to do today is spend literally a minute or so

  • on administrative details, maybe even less.

  • What I'd like to do is to tell you

  • to go to the website that's listed up there and read it.

  • And you'll get all the information you

  • need about what this class is about from a standpoint

  • of syllabus; what's expected of you; the problem set

  • schedule; the quiz schedule; and so on and so forth.

  • I want to dive right in and tell you about interesting things,

  • like algorithms and complexity of algorithms.

  • I want to spend some time giving you

  • an overview of the course content.

  • And then we're going to dive right

  • in and look at a particular problem of peak

  • finding-- both the one dimensional version and a two

  • dimensional version-- and talk about algorithms to solve

  • this peak finding problem-- both varieties of it.

  • And you'll find that there's really

  • a difference between these various algorithms

  • that we'll look at in terms of their complexity.

  • And what I mean by that is you're

  • going to have different run times of these algorithms

  • depending on input size, based on how

  • efficient these algorithms are.
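
To make that concrete, here is a minimal sketch of the divide-and-conquer idea for the 1D version, assuming the input is a Python list and defining a peak as any index whose value is at least as large as its existing neighbors (the function name is just for illustration):

```python
# Minimal sketch: 1D peak finding by divide and conquer, O(log n).
# A "peak" is any index whose value is >= its existing neighbors.
def find_peak_1d(a):
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] < a[mid + 1]:
            lo = mid + 1   # some peak must lie to the right of mid
        else:
            hi = mid       # some peak lies at mid or to its left
    return lo              # index of one peak

print(find_peak_1d([1, 3, 4, 3, 5, 1, 3]))  # prints 6 (value 3, a peak at the right edge)
```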

  • And a prerequisite for this class is 6.042.

  • And in 6.042 you learned about asymptotic complexity.

  • And you'll see that in this lecture

  • we'll analyze relatively simple algorithms today

  • in terms of their asymptotic complexity.

  • And you'll be able to compare and say

  • that this algorithm is faster than this other one-- assuming

  • that you have large inputs-- because it's

  • asymptotically less complex.

  • So let's dive right in and talk about the class.

  • So the one sentence summary of this class

  • is that this is about efficient procedures

  • for solving problems on large inputs.

  • And when I say large inputs, I mean things

  • like the US highway system, a map

  • of all of the highways in the United States;

  • the human genome, which has a billion letters

  • in its alphabet; a social network corresponding to Facebook,

  • that I guess has 500 million nodes or so.

  • So these are large inputs.

  • Now our definition of large has really changed with the times.

  • And so really the 21st century definition of large

  • is, I guess, a trillion.

  • Right?

  • Back when I was your age large was like 1,000.

  • [LAUGHTER]

  • I guess I'm dating myself here.

  • Back when Eric was your age, it was a million.

  • Right?

  • [LAUGHTER]

  • But what's happening is really that the world is moving faster,

  • things are getting bigger.

  • We have the capability of computing on large inputs,

  • but that doesn't mean that efficiency

  • isn't of paramount concern.

  • The fact of the matter is that you can, maybe,

  • scan a billion elements in a matter of seconds.

  • But if you had an algorithm that required cubic complexity,

  • suddenly you're not talking about 10 raised to 9,

  • you're talking about 10 raised to 27.

  • And even current computers can't really

  • handle those kinds of numbers, so efficiency is a concern.

  • And as inputs get larger, it becomes more of a concern.

  • All right?
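
A rough back-of-the-envelope version of that argument, assuming a machine that does about 10^9 simple operations per second:

```python
# Back-of-the-envelope: linear vs. cubic work on n = 10**9 elements,
# assuming roughly 10**9 simple operations per second (an assumption).
n = 10**9
ops_per_sec = 10**9

linear_seconds = n / ops_per_sec       # about 1 second
cubic_seconds = n**3 / ops_per_sec     # 10**27 ops -> about 10**18 seconds
cubic_years = cubic_seconds / (3600 * 24 * 365)

print(linear_seconds)  # 1.0
print(cubic_years)     # roughly 3e10 years -- more than the age of the universe
```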

  • So we're concerned about--

  • --efficient procedures-- for solving large scale problems

  • in this class.

  • And we're concerned about scalability,

  • because-- just as, you know, 1,000

  • was a big number a couple of decades ago,

  • and now it's kind of a small number-- it's

  • quite possible that by the time you guys are professors

  • teaching this class in some university

  • that a trillion is going to be a small number.

  • And we're going to be talking about-- I don't know--

  • 10 raised to 18 as being something

  • that we're concerned with from a standpoint of a common case

  • input for an algorithm.

  • So scalability is important.

  • And we want to be able to track how our algorithms are going

  • to do as inputs get larger and larger.

  • You're going to learn a bunch of different data structures.

  • We'll call them classic data structures,

  • like binary search trees, hash tables-- that

  • are called dictionaries in Python-- and data

  • structures-- such as balanced binary search trees-- that

  • are more efficient than just the regular binary search trees.

  • And these are all data structures

  • that were invented many decades ago.

  • But they've stood the test of time,

  • and they continue to be useful.
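
For a small taste of that, Python's built-in dict is a hash table, so inserts, lookups, and membership tests are expected constant time no matter how many keys it holds (the highway figures below are rough and purely illustrative):

```python
# Python's dict is a hash table: insert, lookup, and membership tests
# are expected O(1), no matter how many keys are stored.
highway_miles = {}                  # illustrative data, rough figures
highway_miles["I-90"] = 3020
highway_miles["I-95"] = 1900

if "I-90" in highway_miles:         # constant-time membership test
    print(highway_miles["I-90"])    # 3020
```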

  • We're going to augment these data structures in various ways

  • to make them more efficient for certain kinds of problems.

  • And while you're not going to be doing a whole lot of algorithm

  • design in this class, you will be

  • doing some design and a whole lot of analysis.

  • The class following this one, 6.046, Design and Analysis

  • of Algorithms, is a class that you

  • should take if you like this one.

  • And you can do a whole lot more design of algorithms in 6.046.

  • But you will look at classic data structures

  • and classical algorithms for these data structures,

  • including things like sorting and matching, and so on.

  • And one of the nice things about this class

  • is that you'll be doing real implementations of these data

  • structures and algorithms in Python.

  • And in particular, each of the problem

  • sets in this class is going to have both a theory

  • part to them, and a programming part to them.

  • So hopefully it'll all tie together.

  • The kinds of things we're going to be talking about in lectures

  • and recitations are going to be directly connected

  • to the theory parts of the problem sets.

  • And you'll be programming the algorithms that we talk about

  • in lecture, or augmenting them, running them.

  • Figuring out whether they work well on large inputs or not.

  • So let me talk a little bit about the modules

  • in this class and the problem sets.

  • And we hope that these problem sets

  • are going to be fun for you.

  • And by fun I don't mean easy.

  • I mean challenging and worthwhile, so at the end of it

  • you feel like you've learned something,

  • and you had some fun along the way.

  • All right?

  • So content wise--

  • --we have eight modules in the class.

  • Each of which, roughly speaking, has

  • a problem set associated with it.

  • The first of these is what we call algorithmic thinking.

  • And we'll kick start that one today.

  • We'll look at a particular problem, as I mentioned,

  • of peak finding.

  • And as part of this, you're going

  • to have a problem set that's going to go out today as well.

  • And you'll find that in this problem set

  • some of these algorithms I talk about today will

  • be coded in Python and given to you.

  • A couple of them are going to have bugs in them.

  • You'll have to analyze the complexity of these algorithms;

  • figure out which ones are correct and efficient;

  • and write a proof for one of them.

  • All right?

  • So that's sort of an example problem set.

  • And you can expect that most of the problem sets

  • are going to follow that sort of template.

  • All right.

  • So you'll get a better sense of this

  • by the end of the day today for sure.

  • Or a concrete sense of this, because we'll

  • be done with lecture and you'll see your first problem set.

  • We're going to be doing a module on sorting and trees.

  • Sorting you know about-- sorting a bunch of numbers.

  • Imagine if you had a trillion numbers

  • and you wanted to sort them.

  • What kind of algorithm could you use for that?
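
One classic O(n log n) answer is merge sort; a minimal sketch, assuming the numbers fit in a single Python list (a trillion of them would not, but the idea is the same):

```python
# Minimal merge sort sketch: O(n log n) comparisons,
# which is what makes sorting very large inputs feasible.
def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    # Merge the two sorted halves.
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```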

  • Trees are a wonderful data structure.

  • There's different varieties, the most common being binary trees.

  • And there's ways of doing all sorts of things,

  • like scheduling, and sorting, using various kinds of trees,

  • including binary trees.

  • And we have a problem set on simulating a logic network

  • using a particular kind of sorting algorithm and a data

  • structure.

  • That is going to be your second problem set.

  • And more quickly, we're going to have modules on hashing,

  • where we do things like genome comparison.

  • In past terms we compared a human genome to a rat genome,

  • and discovered they were pretty similar.

  • 99% similar, which is kind of amazing.

  • But again, these things are so large that you

  • have to have efficiency in the comparison methods

  • that you use.
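
One hashing-flavored version of that idea, sketched with Python sets (which are hash-based) on tiny made-up sequences: count how many length-k substrings two strings share.

```python
# Toy sketch: compare two sequences by counting shared k-mers
# (length-k substrings), using hash-based Python sets.
def shared_kmers(s, t, k=4):
    kmers_s = {s[i:i + k] for i in range(len(s) - k + 1)}
    kmers_t = {t[i:i + k] for i in range(len(t) - k + 1)}
    return len(kmers_s & kmers_t)   # set intersection via hashing

# Tiny made-up "genomes", purely for illustration.
print(shared_kmers("ACGTACGTGA", "ACGTTACGTG"))  # 3
```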

  • And you'll find that if you don't get the complexity low

  • enough, you just won't be able to complete--

  • your program won't be able to finish running within the time

  • that your problem set is due.

  • Right?

  • Which is a bit of a problem.

  • So that's something to keep in mind as you test your code.

  • The fact is that you will get large inputs to run your code.

  • And you want to keep complexity in mind

  • as you're coding and thinking about the pseudocode,

  • if you will, of your algorithm itself.

  • We will talk about numerics.

  • A lot of the time we talk about such large numbers

  • that 32 bits isn't enough.

  • Or 64 bits isn't enough to represent these numbers.

  • These numbers have thousands of bits.

  • A good example is RSA encryption,

  • that is used in SSL, for example.

  • And when you go-- use https on websites,

  • RSA is used at the back end.

  • And typically you work with prime numbers

  • that are thousands of bits long in RSA.

  • So how do you handle that?

  • How does Python handle that?

  • How do you write algorithms that can

  • deal with what are called infinite precision numbers?

  • So we have a module on numerics in the middle of the term that

  • talks about that.
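
Python, for its part, has arbitrary-precision integers built in, so arithmetic on numbers with thousands of bits just works; here is a toy sketch of the kind of modular exponentiation RSA relies on (the values are placeholders, not real RSA parameters):

```python
# Python ints have arbitrary precision, so thousand-bit arithmetic just works.
# Toy modular exponentiation of the kind RSA relies on; these values are
# placeholders, not real RSA parameters.
modulus = 2**521 - 1          # a large (Mersenne) prime
message = 12345678901234567890
exponent = 65537              # a commonly used public exponent

ciphertext_like = pow(message, exponent, modulus)  # built-in fast modular exponentiation
print(ciphertext_like.bit_length())                # at most 521 bits
```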

  • Graphs, really a fundamental data structure

  • in all of computer science.

  • You might have heard of the famous Rubik's cube assignment

  • from 6.006-- a 2 by 2 by 2 Rubik's cube.

  • What's the minimum number of moves

  • necessary to go from a given starting configuration

  • to the final end configuration, where all of the faces-- each

  • of the faces has uniform color?

  • And that can be posed as a graph problem.
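
Posed that way, the minimum number of moves is a breadth-first search over the configuration graph; a minimal sketch, assuming a hypothetical neighbors(state) function that returns every configuration reachable in one move:

```python
from collections import deque

# Sketch: fewest moves from one configuration to another, treating
# configurations as graph nodes and single moves as edges.
# `neighbors(state)` is a hypothetical function returning the states
# reachable from `state` in one move.
def min_moves(start, goal, neighbors):
    dist = {start: 0}                 # breadth-first search
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            return dist[state]
        for nxt in neighbors(state):
            if nxt not in dist:       # first visit = shortest distance
                dist[nxt] = dist[state] + 1
                queue.append(nxt)
    return None                       # goal not reachable
```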

  • We'll probably do that one this term.

  • In previous terms we've done other things

  • like the 15 puzzle.

  • And so some of these are tentative.

  • We definitely know what the first problem set is like,

  • but the rest of them are, at this moment, tentative.

  • And to finish up shortest paths.

  • Again, in past terms we've asked you

  • to write code using a particular algorithm that

  • finds the shortest path from Caltech to MIT.

  • This time we may do things a little bit differently.

  • We were thinking maybe we'll give you a street map of Boston

  • and go figure out if Paul Revere used

  • the shortest path to get to where he was going,

  • or things like that.

  • We'll try and make it fun.
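
For a weighted street graph, that kind of query is typically answered with Dijkstra's algorithm; a minimal sketch, assuming the graph is a dict mapping each intersection to a list of (neighbor, edge_length) pairs, with a tiny made-up example rather than a real street map:

```python
import heapq

# Minimal Dijkstra sketch on a graph stored as
# {node: [(neighbor, edge_length), ...]}.
def dijkstra(graph, source):
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                  # stale heap entry
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist                       # shortest distance to each reachable node

# Tiny made-up example, not a real street map.
g = {"A": [("B", 2), ("C", 5)], "B": [("C", 1)], "C": []}
print(dijkstra(g, "A"))               # {'A': 0, 'B': 2, 'C': 3}
```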

  • Dynamic programming is an important algorithm design

  • technique that's used in many, many problems.

  • And it can be used to do a variety of things, including

  • image compression.

  • How do you compress an image so the number of pixels