Machine Learning - CS50 Podcast, Ep. 6

SPEAKER: This is CS50.
[MUSIC PLAYING]
DAVID MALAN: Hello world.
This is the CS50 podcast.
My name is David Malan.
And I'm here with CS50's own Colton, no Brian Yu.
BRIAN YU: Hi everyone.
DAVID MALAN: So Colton could no longer be here today.
He's headed out west.
But I'm so thrilled that CS50's own Brian Yu is, indeed,
now with us for our discussion today of machine learning.
This was the most asked about topic in a recent Facebook
poll that CS50 conducted.
So let's dive right in.
Machine learning is certainly all over the place these days in terms
of the media and so forth.
But I'm not sure I've really wrapped my own mind around what
machine learning is and what its relationship to artificial intelligence
is.
Brian, our resident expert, would you mind bringing me and everyone up to speed?
BRIAN YU: Yeah, of course.
Machine learning is sometimes a difficult topic
to really wrap your head around, because it
comes in so many different forms and different shapes.
But, in general, when I think about machine learning, the way
I think about it is how a computer is performing a task.
And usually when we're programming a computer to be able to do a task,
we're giving it very explicit instructions-- do this.
And if this is true, then do that or do this some number
of times using a for loop, for example.
But in machine learning, what we do is, instead of giving the computer
explicit instructions for how to do something,
we, instead, give the computer instructions for how
to learn to do something on its own.
So instead of giving it instructions for how to perform a task,
we're teaching the computer how to learn for itself
and how to figure out how to perform some kind of task on its own.
DAVID MALAN: And I do feel like I hear about machine learning
and AI, artificial intelligence, almost always in the same breath.
But is there a distinction between the two?
BRIAN YU: Yeah, there is.
So artificial intelligence or AI is usually a little bit broader.
It's used to describe any situation where a computer is acting rationally
or intelligently.
Machine learning is a way of getting computers
to act rationally or intelligently by learning from patterns
and learning from data and being able to learn from experiences.
But there are certainly forms of AI of being able to act intelligently
that don't require the computer to actually be able to learn, for example.
DAVID MALAN: OK.
And I feel like I've certainly heard about artificial intelligence, AI,
especially for at least 20 years, if not 30 or 40, especially in the movies
or anytime there's some sort of robotic device.
Like, artificial intelligence has certainly been with us for some time.
But I feel like there's quite the buzz around machine
learning, specifically these days.
So what is it that has changed in recent months, recent years that
put this at the top of this poll, even among CS50's own students?
BRIAN YU: Yeah, so a couple of things have changed, certainly.
One has definitely been just an increase in the amount of data
that we have access to-- the big companies that have a lot of data
from people on the internet that are using devices and going on websites,
for instance.
There's a lot of data that companies have access to.
And as we talk about machine learning, you'll
soon see that a lot of the way that these machine learning algorithms work
is that they depend upon having a lot of data
from which to draw understanding and to try and analyze
in order to make predictions or draw conclusions, for example.
DAVID MALAN: So, then, is it fair to say,
because I have more familiarity myself with networking and hardware
and so forth that because we just have so much more disk space available to us
now and such higher CPU rates at which machines can operate that that's partly
what's driven this that we now have the computational abilities
to answer these questions?
BRIAN YU: Yeah, absolutely.
I would say that's a big contributing factor.
DAVID MALAN: So if we go down that road, like,
at what point are the algorithms really getting fundamentally
smarter or better, as opposed to the computers just getting so darn
fast that they can just think so many steps ahead
and just come up with a compelling answer to some current problem quicker
than, say, a human?
BRIAN YU: Yeah, it's a good question.
And the algorithms that we have right now tend to be pretty good.
But there's a lot of research that's happening
in machine learning right now about like,
trying to make these algorithms better.
Right now, they're pretty accurate.
Can we make them even more accurate, given the same amount of data?
Or even given less data-- can we make our algorithms
able to perform tasks just as effectively?
DAVID MALAN: OK, all right.
Well, so I feel like the type of AI or machine
learning that I grew up with or knew about or heard about
was always related to, like, games.
Like, chess was a big one.
I knew Google made a big splash with Go some years ago-- the game,
not the language-- and then video games more generally.
Like, if you ever wanted to play back in the '80s against the "CPU,"
quote, unquote, I'm pretty sure it was mostly just random at the time.
But there's certainly been some games that
are ever more sophisticated where it's actually
really hard to beat the computer or really easy to beat the computer,
depending on the settings you choose.
So how are those kinds of games implemented when
there's a computer playing the human?
BRIAN YU: Yeah, so this is an area that's seen a lot of development
in the last couple of decades. 30 years ago, it was probably unimaginable
that a computer could beat a human at chess, for example.
But now, the best computers can easily beat the best humans.
No question about it.
And one of the ways that you do this is via a form of machine learning known
as reinforcement learning.
And the idea of this is just letting a computer learn from experience.
So if you want to train a computer to be good at chess,
you could try and give it instructions about you
thinking of strategies yourself as the human and telling the computer.
But then the computer can only ever be as good as you are.
But in reinforcement learning, what we do is,
you let the computer play a bunch of chess games.
And when the computer loses, it's able to learn from that experience,
figure out what not to do and then in the future, know to do less of that.
And if the computer wins, then whatever it did to get to that position,
it can do more of that.
And so you imagine just having a computer play millions
and millions and millions of games.
And eventually, it starts to build up this intelligence, so to speak,
of knowing what worked and what didn't work.
And so in the future, it's able to get better and better
at playing this game.
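A minimal sketch of the learn-from-wins-and-losses idea described above, using a much smaller game than chess. Everything here is an assumption made for illustration: the game is a toy version of Nim (take 1 to 3 stones, whoever takes the last stone wins), and the names `train` and `best_move` are hypothetical. The program plays against itself, and each move's score is simply the average of the rewards (+1 for a win, -1 for a loss) from the games in which it was played.

```python
import random

def legal_moves(pile):
    # A player may take 1, 2, or 3 stones, but never more than remain.
    return [take for take in (1, 2, 3) if take <= pile]

def best_move(q, pile):
    # Greedy choice: the legal move with the highest learned score so far.
    return max(legal_moves(pile), key=lambda take: q.get((pile, take), 0.0))

def train(episodes=20000, max_pile=10, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}       # (pile, take) -> average reward seen after making that move
    counts = {}  # (pile, take) -> number of times that move was played
    for episode in range(episodes):
        pile = 1 + episode % max_pile    # cycle through starting positions
        history = [[], []]               # moves made by player 0 and player 1
        player = 0
        while pile > 0:
            if rng.random() < epsilon:
                take = rng.choice(legal_moves(pile))   # explore: try something random
            else:
                take = best_move(q, pile)              # exploit: use what was learned
            history[player].append((pile, take))
            pile -= take
            player = 1 - player
        winner = 1 - player   # the player who just moved took the last stone
        for who in (0, 1):
            reward = 1.0 if who == winner else -1.0
            for key in history[who]:
                counts[key] = counts.get(key, 0) + 1
                old = q.get(key, 0.0)
                q[key] = old + (reward - old) / counts[key]   # running average
    return q
```

After `q = train()`, calling `best_move(q, 2)` picks taking both stones, since that move has only ever been followed by a win. Nobody told the program that rule; it falls out of the averaged rewards.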
DAVID MALAN: So is this all that different from even the human
and the animal world where, like, if humans
have tried to domesticate animals or pets where you sort of reinforce
good behavior positively and negatively reinforce, like, bad behavior?
I mean, is that essentially what we're doing with our computers?
BRIAN YU: Yeah, it's inspired by the same idea.
And when a computer does something right or does something that works,
you give the computer a reward, so to speak, is what people actually call it.
And then there's the penalty if the computer isn't able to perform as well.
And so you just train the computer algorithm to maximize that reward,
whether that reward is the result of like winning a game of chess or a robot
being able to move a certain number of paces.
And the result is that with enough training,
you end up with a computer that can actually perform the task.
DAVID MALAN: Fascinating.
So I feel like another buzzword these days is, like,
smart city where somehow, cities are using computer science
and using software more sophisticatedly.
And I gather that you can even use this kind of reinforcement
learning for, like, traffic lights, even in our human world?
BRIAN YU: Yeah.
So traffic lights traditionally are just controlled by a timer
that after a certain number of seconds, the traffic light switches.
But recently, there's been growth in, like, AI-controlled traffic lights
where you have traffic lights that are connected to radar and cameras.
And that can actually see, like, when the cars
are approaching in different places--
what times of day they tend to approach.
And so you can begin to, like, train an AI traffic
light to be able to predict, all right, when should I
be switching lights and maybe even having traffic lights coordinated
across multiple intersections across the city to try
and figure out what's the best way to flip the lights in order
to make sure that people are able to get through those intersections quickly.
DAVID MALAN: So that's pretty compelling,
because I've definitely, in Cambridge, been, like, in a car
and stopped at a traffic light.
And there's, like, no one around.
And you wish it would just notice either via sensor or timer
or whatever that, like, this is clearly not the most efficient use of,
like, anyone's time.
So that's pretty amazing that it could adapt sort of seamlessly like that.
Though, what is the relationship between AI
and the buttons that the humans push to cross the street
that according to various things I've read
are actually placebos and don't actually do anything and in some cases,
aren't even connected to wires.
BRIAN YU: I'm not actually sure.
I've also heard that they may be placebos.
I've also heard that, like, the elevator close button is also
a placebo, that you press it,
and it sometimes doesn't actually work.
DAVID MALAN: Yes, I've seen it on Reddit even, which is not necessarily
an authoritative source.
There is, like, a photo where someone showed
a door close button had fallen off.
But there was nothing behind it.
Now, it could have been Photoshopped.
But I think there's evidence of this, nonetheless.
BRIAN YU: It might be the case.
I don't think there's any AI happening there.
But I think it's more just psychology of the people and trying to make people
feel better by giving them a button to press.
DAVID MALAN: Do you push the button when you want to cross the street?
BRIAN YU: I do usually push the button when I want to cross the street.
DAVID MALAN: This is such a big scam, though, on all of us it would seem.
BRIAN YU: Do you not push the button?
DAVID MALAN: No, I do, because just, what if?
And actually it's so gratifying, because there's
a couple places in Cambridge, Massachusetts
where the button legitimately works.
When you want to cross the street, you hit the button.
Within half a second, it has changed the light.
It's the most, like, empowering feeling in the world
because that never happens.
Even in an elevator, half the time you push it, like, nothing happens,
or eventually it does. And it's very good positive reinforcement
to see the traffic lights changing.
I'm very well behaved at the traffic lights as a result.
OK, so more recently, I feel like, computers
have gotten way better at some technologies that kind of sort of
existed when I was a kid, like, handwriting recognition.
There was the Palm Pilot early on, which was
a popular PDA or personal digital assistant, which has now been replaced
with Androids and iPhones and so forth.
But handwriting recognition is a biggie for machine learning, right?
BRIAN YU: Yeah, definitely.
And this is an area that's gotten very, very good.
I mean, I recently have just started using an iPad.
And it's amazing that I can be taking handwritten notes.
But then my app will let me, like, search for them by text:
it will look at my handwriting, convert it to text
so that I can search through it all.
It's very, very powerful.
And the way that this is often working now
is just by having access to a lot of data.
So, for example, if you wanted to train a computer
to be able to recognize handwritten digits, like, digits on a check
that you could deposit virtually now, like,
my banking app can deposit checks digitally.
What you can do is give the machine learning algorithm
a whole bunch of data, basically a whole bunch of pictures
of handwritten numbers that people have drawn
and labels for them associated with what number it actually is.
And so the computer can learn from a whole bunch of examples of here
are some handwritten ones, and here are some handwritten twos,
and here's some handwritten threes.
And so when a new handwritten digit comes along,
the computer just learns from that previous data and says,
does this look kind of like the ones, or does it look more like the twos?
And it can make an assessment as a result of that.
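The "does this look more like the ones or the twos?" step can be sketched as a nearest-neighbor classifier. This is only an illustration under stated assumptions: the 3-by-5 pixel "digits" below are invented, real systems work on far larger images and usually use more sophisticated models, and the function names are hypothetical.

```python
def pixels(rows):
    # Flatten a small picture (a list of row strings) into one string of pixels.
    return "".join(rows)

def distance(a, b):
    # Count the pixels at which two equally sized pictures differ.
    return sum(1 for x, y in zip(a, b) if x != y)

def classify(image, labeled_examples):
    # Return the label of the training picture that is most similar to the new one.
    nearest_image, nearest_label = min(
        labeled_examples, key=lambda example: distance(image, example[0])
    )
    return nearest_label

# Labeled training data: pictures of handwritten digits plus the digit each one is.
TRAINING = [
    (pixels([".#.", ".#.", ".#.", ".#.", ".#."]), 1),
    (pixels([".#.", "##.", ".#.", ".#.", ".#."]), 1),
    (pixels(["###", "..#", "###", "#..", "###"]), 2),
    (pixels(["##.", "..#", ".#.", "#..", "###"]), 2),
]

# Two new digits that match none of the training pictures exactly.
new_one = pixels([".#.", "##.", ".#.", ".#.", "###"])
new_two = pixels(["###", "..#", ".#.", "#..", "###"])

print(classify(new_one, TRAINING))  # prints 1
print(classify(new_two, TRAINING))  # prints 2
```

Neither new picture appears in the training data, yet each lands closest to an example with the right label, which is the tolerance for variation that comes up a little later in the conversation.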
DAVID MALAN: So how come we humans are sometimes
filling out those little CAPTCHAs--
the little challenges on websites where they're asking us, the humans,
to tell them what something says?
BRIAN YU: Yeah.
Part of the idea is that with the CAPTCHAs, you're trying to prove to the computer
that you are, in fact, human.
They're asking you to prove that you're a human.
And so they're trying to give you a task that a computer might struggle
to do, for instance, like, identify which of these images happened to have,
like, traffic lights in them, for example.
Although nowadays, computers are getting pretty good at that:
using machine learning techniques,
they can tell which of them are traffic lights.
DAVID MALAN: Yeah, exactly.
I would think so.
BRIAN YU: And I've also heard people talk about it.
I don't know, actually, if this is true, that you
can use the results of these CAPTCHAs to actually train machine learning
algorithms that when you are choosing which of the images
have traffic lights in them, you're training the algorithms that
are powering, like, self-driving cars, for instance,
to be able to better assess whether there are traffic lights in an image,
because you're giving more and more of this data
that computers are able to draw from.
So I've heard that too.
DAVID MALAN: It's interesting how these algorithms
are so similar to presumably how humans work, because,
like, when you and I learned how to write text, whether it was in print
or cursive, like, the teacher just shows us, like, one canonical letter A or B
or C. And yet, obviously, like, every kid in the room
is probably drawing that A or B or C a little bit differently.
And yet, somehow, we humans just kind of know that that's close enough.
So is it fair to say, like, computers really are just kind of doing that?
They are just being taught what something is
and then tolerating variations, thereof?
BRIAN YU: Yeah, that's probably about it.
One of the inspirations for machine learning
really is that the types of things that computers are good at
and the types of things that people are good at
tend to be very, very different.
But, like, computers can very easily do complex calculations, no problem,
when we might struggle with it.
But a problem, like, identifying that in a picture,
is there a bird in the sky or not, for example?
That's something that for a long time, computers really struggled to do,
whereas, it's easy for a child to be able to look in the sky and tell you
if there's a bird there.
DAVID MALAN: Oh, I was just going to say, I could do that probably.
OK.
So if this is supervised learning, and handwriting recognition's one,
like, what other types of applications fall under this umbrella?
BRIAN YU: Yeah, so handwriting recognition
counts as supervised learning, because it's supervised in the sense
that when we're providing data to the algorithm,
like the handwritten numbers, we're also providing labels for that data, like,
saying, this is the number one-- this is the number two.
That way, the computer is able to learn from that.
But this shows up all over the place.
So, for instance, like, your email spam filter
that detects automatically which emails are spam
and puts them in the spam mailbox, it's trained the same way.
You basically give the computer a whole bunch of emails-- some of which
you tell the computer these are real emails that are good emails.
And here, these are some other emails that are spam emails.
And the computer tries to learn the characteristics and the traits of spam
email so that when a new email comes about,
the computer is able to make a judgment call about,
do I think this is a nonspam, or do I think it's a spam email?
And so you could get it to classify it in that way.
So this kind of classification problem is a big area in supervised learning.
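One common way to learn "the characteristics and the traits of spam" from labeled emails is a naive Bayes classifier over word counts. The sketch below is an assumption for illustration, not a description of how any particular mail provider works; the six training emails are invented, and `train` and `classify` are hypothetical names.

```python
import math
from collections import Counter

def train(labeled_emails):
    # Count how often each word appears in spam and in non-spam ("ham") emails.
    word_counts = {"spam": Counter(), "ham": Counter()}
    email_counts = Counter()
    for text, label in labeled_emails:
        email_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, email_counts

def classify(text, model):
    word_counts, email_counts = model
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_emails = sum(email_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Start from how common the label is, then add evidence word by word.
        score = math.log(email_counts[label] / total_emails)
        for word in text.lower().split():
            # Adding 1 keeps unseen words from zeroing out the probability.
            seen = word_counts[label][word] + 1
            score += math.log(seen / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)

LABELED_EMAILS = [
    ("win free money now", "spam"),
    ("free prize click now", "spam"),
    ("claim your free money", "spam"),
    ("lecture notes for monday", "ham"),
    ("meeting moved to monday", "ham"),
    ("your problem set feedback", "ham"),
]

model = train(LABELED_EMAILS)
print(classify("free money prize", model))         # prints spam
print(classify("monday lecture feedback", model))  # prints ham
```

The labels are what make this supervised: the program is never told which words are suspicious, only which example emails were spam.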
DAVID MALAN: And is that what is happening if you use Gmail,
and you click on an email and report it as spam,
like, you're training Gmail to get better at distinguishing?
BRIAN YU: Yes.
You can think of that as a form of reinforcement learning of the computer
learning from experience.
DAVID MALAN: Good boy.
BRIAN YU: You tell the computer that it got it wrong.
And it's now going to try and learn to be better in the future
to be able to more accurately predict which emails are spam
or not spam based on what you tell it.
And Gmail has so many users and so many emails
that are coming to the inbox every day that you do this enough times.
And the algorithm gets pretty good at figuring out whether an email's spam
or not.
DAVID MALAN: It's a little creepy that my inbox is becoming sentient somehow.
OK, so if there's supervised learning, I presume
there's also unsupervised learning.
Is there?
BRIAN YU: Yeah, there absolutely is.
So supervised learning requires labels on the data.
But sometimes, the data doesn't always have labels.
But you still want to be able to take a data set, give it to a computer
and get the computer to tell you something interesting about it.
And so one common example of this is for when you're doing consumer analysis,
like, when Amazon is trying to understand its customers, for instance,
Amazon might not know all the different categories of customers
that there might be.
So it might not be able to give them labels already.
But you could feed a whole bunch of customer data to an algorithm.
And the algorithm could group customers into similar groups, potentially,
based on the types of products they're likely to buy, for example.
And you might not know in advance how many groups there are
or even what the groups are.
But the algorithm can get pretty good at clustering people
into different groups.
So clustering is a big example of unsupervised learning.
That's pretty common.
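A minimal sketch of clustering, using k-means as one well-known technique. Several things here are assumptions for illustration: the customer data is invented (orders per month, average spend), the starting centroids are naively taken to be the first k points, and unlike the situation described above, plain k-means needs the number of groups `k` chosen up front.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    # Naive initialization: treat the first k points as the starting centroids.
    centroids = [points[i] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Step 2: move each centroid to the average of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Unlabeled customers: (orders per month, average spend per order).
customers = [(1, 10), (2, 12), (1, 15), (9, 80), (10, 95), (8, 90)]

labels, centroids = kmeans(customers, k=2)
print(labels)  # prints [0, 0, 0, 1, 1, 1]
```

No customer was ever labeled "occasional" or "frequent"; the two groups come out of the data alone.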
DAVID MALAN: So how different is that from just, like, exhaustive search
if you sort of label every customer with certain attributes-- what they've
bought, what time they've bought it, how frequently they've bought it
and so forth?
Like, isn't this really just some kind of quadratic problem
where you compare every customer's habits against every other customer's
habits, and you can, therefore, exhaustively
figure out what the commonalities are?
Like, why is this so intelligent?
BRIAN YU: So you could come up with an algorithm
to say, like, OK, how close together are two particular customers, for instance,
in terms of how many things that they've bought in common, for instance,
or when they're buying particular products?
But if you've got a lot of different users
that all have slightly different habits, and maybe some groups of people share
things in common with other groups but then don't share other characteristics
in common, it can be tricky to be able to group an entire user base
into a whole bunch of different clusters that are meaningful.
And so the unsupervised learning algorithms
are pretty good at trying to figure out how you would actually
cluster those people.
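The pairwise "how close together are two particular customers" measure mentioned above could be sketched with Jaccard similarity, which is one assumed choice among many. The customers and purchases are invented for illustration.

```python
from itertools import combinations

def jaccard(items_a, items_b):
    # Shared purchases divided by all distinct purchases across both customers.
    union = items_a | items_b
    if not union:
        return 0.0
    return len(items_a & items_b) / len(union)

purchases = {
    "alice": {"laptop", "mouse", "keyboard"},
    "bob": {"mouse", "keyboard", "monitor"},
    "carol": {"novel", "bookmark"},
}

# Comparing every customer with every other one is the quadratic step David describes.
for first, second in combinations(purchases, 2):
    print(first, second, jaccard(purchases[first], purchases[second]))
```

A table of pairwise scores like this is only a starting point. Turning it into meaningful groups for a whole user base is the part a clustering algorithm handles.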
DAVID MALAN: Interesting.
OK.
And so this is true for things I know in radiology, especially these days, like,
computers can actually not only read film, so x-rays and other types
of images of human bodies.
They can actually identify things, like, tumors now,
without necessarily knowing what kind of tumor they're looking for.
BRIAN YU: Yeah.
So one application of unsupervised learning is, like, anomaly detection:
given a set of data, which things stand out as anomalous.
And so that has a lot of medical applications
where if you've got a whole bunch of medical scans or images, for instance,
you could have a computer just look at all that data
and try and figure out which are the ones that don't quite look right.
And that might be worth doctors taking another look,
because potentially, there might be a health concern there.
You see the exact same type of technology in finance a lot
when you're trying to detect, like, which transactions might be
fraudulent transactions, for instance.
Out of tons of transactions, can you find the anomalies?
The things that sort of stand out as not quite like the others.
And these unsupervised learning algorithms
can be pretty good at picking out those anomalies out of a data set.
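A very simple form of anomaly detection can be sketched with a z-score: flag any value that sits unusually far from the average of the data. This is an assumption for illustration only. The transaction amounts and the threshold of 2 standard deviations are invented, and real fraud systems look at far more than the amount.

```python
import statistics

def find_anomalies(amounts, threshold=2.0):
    # Flag amounts that are more than `threshold` standard deviations from the mean.
    mean = statistics.mean(amounts)
    spread = statistics.pstdev(amounts)
    if spread == 0:
        return []   # every amount is identical, so nothing stands out
    return [amount for amount in amounts if abs(amount - mean) / spread > threshold]

transactions = [12, 15, 11, 14, 13, 12, 16, 950, 14, 13]
print(find_anomalies(transactions))  # prints [950]
```

Nothing here says what a fraudulent transaction looks like. The 950 is flagged only because it is not quite like the others.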
DAVID MALAN: So what kind of algorithm triggers a fraud alert
almost every time I try to use my credit cards for work?
BRIAN YU: That one is, I don't really know what's going on with that.
I know the credit card will often trigger an alert if you're outside
of an area where you normally are.
But the details of how those algorithms are working--
I couldn't really tell you.
DAVID MALAN: Interesting.
Common frustration when we do travel here for work.
OK, so it's funny, as you described unsupervised learning,
it occurs to me that, like, 10, 15 years ago when
I was actually doing my dissertation work for my PhD, which
was, long story short, about security and specifically,
how you could with software detect sudden outbreaks of internet worms,
so malicious software that can spread from one computer to another.
The approach we took at the time was to actually look
at the system calls-- the low-level functions
that software was executing on Windows PCs
and look for common patterns of those system calls across systems.
And it only occurs to me, like, all these years later that arguably,
what we were doing in our team to do this
was really a form of machine learning.
I just think it wasn't very buzz worthy at the time
to say what we were doing was machine learning.
But I kind of think I know machine learning in retrospect.
BRIAN YU: Yeah, maybe.
I mean, it's become so common nowadays just to take anything and just
tack on machine learning to it to make it sound fancier or sound cooler
than it actually is.
DAVID MALAN: Yes.
That, and I gather statistics is now called data science, essentially,
perhaps, overstating though.
So certainly all the rage, though, speaking of trends
is, like, self-driving cars.
In fact, if I can cite another authoritative Reddit photo--
and this one I think actually made the national news.
What is it about AI that's suddenly enabling people to literally sleep
behind the wheel of a car?
BRIAN YU: Well, I don't think people should be doing that quite yet but--
DAVID MALAN: But you do eventually.
BRIAN YU: Well.
So self-driving technology is hopefully going to get better.
But right now, we're in sort of a dangerous middle ground
that cars are able to do more and more things autonomously.
They can change lanes on their own.
They can maintain their lane on their own.
They can parallel park by themselves, for example.
The consumer ones, at least, are certainly not at the place
where you could just ignore the wheel entirely and just
let them go on their own.
But a lot of people are almost treating cars as if they can do that.
And so it's a dangerous time, certainly, for these semi-autonomous vehicles.
DAVID MALAN: And it's funny you mentioned parallel parking.
In a contest between you and a computer, who could parallel park better,
do you think?
BRIAN YU: The computer would definitely beat me at parallel parking.
So I got my driver's license in California.
And learning to parallel park is not on the California driving test.
So I was not tested on it.
I've done it maybe a couple times with the assistance of my parents
but definitely not something I feel very comfortable doing.
DAVID MALAN: But I feel like when I go to California and San Francisco
in very hilly cities, it's certainly common to park diagonally
against the curb so not parallel, per se, partly just for the physics of it
so that there's less risk of cars presumably rolling down the hill.
But I feel like in other flatter areas of California,
I have absolutely, when traveling, parallel parked.
So, like, how is this not a thing?
BRIAN YU: [LAUGHS] I mean, it's definitely common.
People do parallel park.
It's just not required on the test.
And so people invariably learn when they need to.
But pretty soon after I got my driver's license,
I ended up moving across the country to Massachusetts for college.
And so once I got to college, I never really had
occasion to drive a whole lot.
So I just never really did a lot of driving.
DAVID MALAN: I will say, I've gotten very comfortable
certainly over the years, parallel parking,
when I'm parking on the right-hand side of the road,
because, of course, in the US, we drive on the right.
But it does throw me if it's like a one-way street,
and I need to park on the left-hand side, because all of my optics
are a little off.
So I can appreciate that.
So a self-driving car, like a Tesla, is like the oft-cited example these days.
Like, what are the inputs to that problem and like, the outputs,
the decisions that are being made by the car just to make this more concrete?
BRIAN YU: Yeah.
So I guess the inputs are probably at least two broad categories--
one input being all of the sensory information around the car
that these cars have so many sensors and cameras that
are trying to detect what items and objects are around it
and trying to figure all of that out.
And the second input being presumably a human-entered destination
where the user probably is typing into some device on the computer in the car
where it is that they actually want to go.
And the output, hopefully is that the computers or the car
is able to make all of the decisions about when to step on the gas,
when to turn the wheel, and all of those actions
that it needs to take to get you from point A to point B. I mean,
that's the goal of these technologies.
DAVID MALAN: Fascinating.
It would just really frighten me to see someone on the road
not holding the wheel of the car.
This is maybe a little more of a California thing.
Though, other states are certainly experimenting with this.
Or companies in various states are.
So my car is old enough that I don't so much have a screen in the car.
It's really just me and a bunch of glass mirrors.
And it still blows my mind in 2019 when I
get into a rental car or friend's car that even just has the LCD
screen with a camera in the back that shows you, like, the green, yellow,
and red markings.
And it beeps when you're getting too close to the car.
So is that machine learning when it's detecting something and beeping at you
when you're trying to park, for instance?
Well, I guess you wouldn't know.
BRIAN YU: [LAUGHS] My guess is that's probably not machine learning.
It's probably just a pretty simple logic of,
like, try and detect what the distance is via some sensor.
And if the distance is less than a certain amount,
then, beep or something like that.
You could try and do it using machine learning.
But probably, simple heuristics are good enough for that type of thing--
would be my guess.
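The "simple logic" guessed at above might look like the sketch below: a couple of fixed thresholds and no learning at all. The function name, the distances, and the beep levels are all hypothetical, not taken from any real car.

```python
def parking_alert(distance_cm, warn_cm=100, danger_cm=30):
    # Plain if/else rules written by a programmer; nothing is learned from data.
    if distance_cm < danger_cm:
        return "continuous beep"
    if distance_cm < warn_cm:
        return "slow beep"
    return "silent"

for reading in (250, 80, 20):
    print(reading, parking_alert(reading))
```

Every case is easy to spell out in advance, so hand-written rules are likely good enough here.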
DAVID MALAN: So how should people think about the line between software
just being ifs and else ifs and conditions
and loops, versus, like, machine learning,
which kind of takes things up a notch?
BRIAN YU: Yeah.
So I guess the line comes when it would be difficult to formally articulate
exactly what the steps should be.
And driving is a complicated enough task that trying to formally describe
exactly what the steps should be for every particular circumstance
is going to be extraordinarily difficult, if not impossible.
And so then you really need to start to rely
on machine learning to be able to answer questions, like,
is there a traffic light ahead of me?
And is the traffic light green or red?
And how many cars are ahead of me, and where are they?
Because those are questions that it's harder to just program
a definitive answer to just given, like, all the
pixels of what the sensor of the front of the car is seeing.
DAVID MALAN: So is that also true with this other technology that's
in vogue these days of these always listening devices,
so like Siri and hey, Google and Alexa?
Like, I presume it's relatively easy for companies
to support well-defined commands, so a finite set of words or sentences
that the tools just have to understand.
But does AI come into play or machine learning come into play
when you want to support an infinite language,
like English or any other spoken language?
BRIAN YU: Yes.
So certainly when it comes to natural language processing,
given the words that I have spoken, can you figure out what it is that I mean?
And that's a problem that you'll often use
machine learning to be able to try and get at some sense of the meaning for.
But even with those predefined commands, if you
imagine a computer that only supported a very limited number of fixed commands,
we're still giving those commands via voice.
And so the computer still needs to be able to translate the sounds that
are just being produced in the air that the microphone is picking up
on into the actual words that they are.
And there's usually machine learning involved there too,
because it's not simple to be able to just take the sounds
and convert them to words, because different people speak
at different paces or have slightly different accents
or will speak in slightly different ways.
They might mispronounce something.
And so being able to train a computer to listen to that
and figure out what the words are, that can be tricky too.
DAVID MALAN: So that's pretty similar, though, to handwriting recognition?
Is that fair to say?
BRIAN YU: Probably.
You could do it a similar way where you train a computer by giving it
a whole bunch of sounds and what they correspond to in getting the computer
to learn from all of that data.
DAVID MALAN: And so why is it that every time I talk to Google,
it doesn't know what song I wanted to play.
BRIAN YU: [LAUGHS] Well, this technology is definitely still in progress.
There is definitely a lot of room for these technologies to get better.
DAVID MALAN: That's very diplomatic.
BRIAN YU: [LAUGHS] I mean, Siri on my phone, half the time,
it doesn't pick up on exactly what I'm trying to ask it.
DAVID MALAN: Oh, for me, it feels even worse than that.
Like, I can confidently set timers, like set timer for three minutes
if I'm boiling some water or something.
But I pretty much don't use it for anything else besides that.
BRIAN YU: Yeah, I think timers I can do.
I used to try to, like, if I needed to send, like, a quick text
message to someone, I used to try and say, like,
text my mom that I'll be at the airport in 10 minutes.
But even then, it's very hit or miss.
DAVID MALAN: Well, even with you the other day,
I sent you a text message verbally.
But I just let the audio go out, because I just
have too little confidence in the transcription capabilities
of these devices these days.
BRIAN YU: Yeah.
Like, the iPhone now will try to, like, transcribe voicemails for you.
Or, at least, it'll make an attempt to so that you can just
tap on the voicemail and see a transcription of what's
contained in the voicemail.
And I really haven't found it very helpful.
But it can get like a couple of words.
And maybe I'll get a general sense.
But it's not good enough for me to really get any meaning out of it.
I still have to listen to the voicemail.
DAVID MALAN: See, I don't know.
I think that's actually a use case where it's useful enough usually for me
if I can glean who it's from, or what the gist of the message is
so then I don't have to actually listen to it in real time.
But the problem for me when sending outbound messages
is, I don't want to look like an idiot
when they're completely incoherent, because Siri or whatever technology
is not transcribing me correctly.
OK.
But the dream I have, at least-- one of my favorite books
ever was Douglas Adams' Hitchhiker's Guide to the Galaxy,
where the most amazing technology in that book
is called the Babel fish, where it's a little fish that you put in your ear.
And it somehow translates all spoken words
that you're hearing into your own native language, essentially.
So how close are we to being able to talk to another human being who
does not speak the same language but seamlessly chat with that person?
BRIAN YU: I think we're pretty far away from it.
I think Skype, I think, has this feature or, at least,
it's a feature that they've been developing where they can try
to approximate a real-time translation.
And I think I saw a video--
DAVID MALAN: I can't even talk successfully to someone in English
on Skype.
BRIAN YU: Yeah.
So I think the demo is pretty good.
But I don't think it's, like, commercially available yet.
But translation technology has gotten better.
But it's certainly still not good.
One of my favorite types of YouTube videos that I watch sometimes
are people that will, like, take a song and the lyrics of the song
and translate it into another language and translate it back into English.
And the lyrics just get totally messed up,
because this translation technology is, it can approximate meaning.
But it's certainly far from perfect.
DAVID MALAN: That's kind of like playing operator in English where
you tell someone something, and they tell someone something,
and they tell someone-- someone.
And by the time you go around the circle,
it is not at all what you originally said.
BRIAN YU: Yeah, I think I played that game when I was younger.
I think we called it telephone.
You called it operator?
DAVID MALAN: Yeah.
No, actually we probably called it telephone too.
Was there an operator involved?
Maybe you call operator if you need a hint.
Or maybe--
BRIAN YU: I don't think you got hints when I was playing.
DAVID MALAN: No, I think we had a hint feature where you say, operator.
And maybe the person next to you has to tell you again or something.
Maybe it's been a long time since I played this too.
Fascinating.
Well, thank you so much for explaining to me
and everyone out there a little bit more about machine learning.
If folks want to learn more about ML, what would you suggest they google?
BRIAN YU: Yeah.
So you can look up basically any of the keywords that we talked about today.
You could just look up machine learning.
But if you wanted to be more specific, you
could look up reinforcement learning or supervised learning or unsupervised
learning.
If there's any of the particular technologies,
you could look those up specifically, like handwriting recognition
or self-driving cars.
There are a lot of resources available of people
that are talking about these technologies
and how they work for sure.
DAVID MALAN: Awesome.
Well, thanks so much.
This was Machine Learning on the CS50 podcast.
If you have other ideas for topics that you'd love for Brian and I and the team
to discuss and explore, do just drop us an email at podcast@cs50.harvard.edu.
My name is David Malan.
BRIAN YU: I'm Brian Yu.
DAVID MALAN: And this was the CS50 Podcast.