
  • MALE SPEAKER: Welcome, everybody,

  • to one more Authors at Google Talk.

  • Today, our guest speaker is Pedro Domingos,

  • whose new book is called "The Master Algorithm."

  • We have it here and you can buy copies outside.

  • So one definition of machine learning

  • is "the automation of discovery."

  • Our guest, Pedro Domingos, is at the very forefront

  • of the search for the master algorithm, a universal learner

  • capable of deriving all knowledge, past, present

  • and future, from data.

  • Pedro Domingos is a professor of Computer Science

  • and Engineering at the University of Washington.

  • He's the co-founder of the International Machine Learning

  • Society.

  • Pedro received his MS in Electrical Engineering

  • and Computer Science from IST in Lisbon,

  • his Master of Science and PhD in Information

  • and Computer Science from the University of California

  • at Irvine.

  • He spent two years as an assistant professor at IST

  • before joining the faculty of the University of Washington

  • in 1999.

  • Pedro is the author or co-author of over 200

  • technical publications in machine learning, data mining,

  • and other areas.

  • He is the winner of the SIGKDD Innovation Award, the highest

  • honor in data science.

  • He's an AAAI Fellow and has received the Sloan Fellowship

  • and an NSF CAREER Award, a Fulbright Scholarship, an IBM

  • Faculty Award, several best paper awards,

  • and other distinctions.

  • He's a member of the editorial board of "The Machine Learning

  • Journal."

  • Please join me in welcoming Pedro, today, to Google.

  • [APPLAUSE]

  • PEDRO DOMINGOS: Thank you.

  • Let me start with a very simple question--

  • where does knowledge come from?

  • Until very recently, it came from just three sources, number

  • one, evolution-- that's the knowledge that's

  • encoded in your DNA-- number two,

  • experience-- that's the knowledge that's

  • encoded in your neurons-- and number three, culture,

  • which is the knowledge you acquire

  • by talking with other people, reading books, and so on.

  • And everything that we do, right,

  • everything that we are basically comes from these three sources

  • of knowledge.

  • Now what's quite extraordinary is just, only recently,

  • there's a fourth source of knowledge on the planet.

  • And that's computers.

  • There's more and more knowledge now that comes from computers,

  • is discovered by computers.

  • And this is as big of a change as the emergence

  • of each of the previous three was.

  • Like evolution, right, well, that's life on earth.

  • It's the product of evolution.

  • Experience is what distinguishes us mammals from insects.

  • And culture is what makes humans what we are

  • and as successful as we are.

  • Notice, also, that each of these forms of knowledge discovery

  • is orders of magnitude faster than the previous one

  • and discovers orders of magnitude more knowledge.

  • And indeed, the same thing is true of computers.

  • Computers can discover knowledge orders of magnitude

  • faster than any of these things that went before

  • and that co-exist with them and orders of magnitude more

  • knowledge in the same amount of time.

  • In fact, Yann LeCun says that "most

  • of the knowledge in the world in the future

  • is going to be extracted by machines

  • and will reside in machines."

  • So this is a major change that, I think, is not just for us

  • computer scientists to know about and deal

  • with, it's actually something that everybody

  • needs to understand.

  • So how do computers discover new knowledge?

  • This is, of course, the province of machine learning.

  • And in a way, what I'm going to try to do in this talk

  • is try to give you a sense of what machine learning is

  • and what it does.

  • If you're already familiar with machine learning,

  • this will hopefully give you a different perspective on it.

  • If you're not familiar with machine learning already,

  • this should be quite fascinating and interesting.

  • So there are five main paradigms in machine learning.

  • And I will talk about each one of them in turn

  • and then try to step back and see, what is the big picture

  • and what is this idea of the master algorithm.

  • The first way computers discover knowledge

  • is by filling gaps in existing knowledge.

  • Pretty much the same way that scientists work, right?

  • You make observations, you hypothesize

  • theories to explain them, and then

  • you see where they fall short.

  • And then you adapt them, or throw them away

  • and try new ones, and so on.
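The hypothesize-and-refine loop described above can be sketched in deliberately toy form. This is my own illustration, not code from the talk: the candidate rules, the data, and the helper names are all invented for the example.

```python
# Toy "scientist loop": propose candidate rules, test them against
# observations, and keep only the ones that survive.
observations = [(1, 2), (2, 4), (3, 6), (10, 20)]  # (input, output) pairs

# Candidate hypotheses: simple "output = k * input" rules for small k.
candidates = [lambda x, k=k: k * x for k in range(1, 5)]

def fits(hypothesis, data):
    """A hypothesis survives only if it explains every observation."""
    return all(hypothesis(x) == y for x, y in data)

# Discard hypotheses that fall short; what remains is the learned rule.
surviving = [h for h in candidates if fits(h, observations)]
print(surviving[0](7))  # the surviving rule y = 2x predicts 14
```

Real symbolic learners search a far richer hypothesis space than "multiply by k", but the structure of the loop is the same: generate, test against data, refine or discard.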

  • So this is one.

  • Another one is to emulate the brain.

  • Right?

  • The greatest learning machine on earth

  • is the one inside your skull, so let's reverse engineer it.

  • Third one is to simulate evolution.

  • Evolution, by some standards, is actually an even greater

  • learning algorithm than your brain

  • is, because, first of all, it made your brain.

  • It also made your body.

  • And it also made every other life form on Earth.

  • So maybe that's something worth figuring out how it works

  • and doing it with computers.

  • Here's another one.

  • And this is to realize that all the knowledge that you learn

  • is necessarily uncertain.

  • Right?

  • When something is induced from data,

  • you're never quite sure about it.

  • So the way to learn is to quantify that uncertainty using

  • probability.

  • And then as you see more evidence,

  • the probability of different hypotheses evolves.

  • Right?

  • And there's an optimal way to do this using Bayes' theorem.

  • And that's what this approach is.
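The Bayesian updating just described can be shown in a few lines. This is a minimal sketch of my own, not from the talk: two hypothetical hypotheses about a coin, with posteriors reweighted by Bayes' theorem as each flip is observed.

```python
# Bayesian updating sketch: two hypotheses about a coin,
# "fair" (P(heads) = 0.5) vs "biased" (P(heads) = 0.8).
def bayes_update(priors, likelihoods):
    """Return posterior P(h | evidence) via Bayes' theorem."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

priors = {"fair": 0.5, "biased": 0.5}
p_heads = {"fair": 0.5, "biased": 0.8}

# As evidence accumulates, the probability of each hypothesis evolves.
for flip in ["H", "H", "H", "T", "H"]:
    likelihoods = {h: p_heads[h] if flip == "H" else 1 - p_heads[h]
                   for h in priors}
    priors = bayes_update(priors, likelihoods)

print(priors)  # mostly heads, so "biased" gains probability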

  • Finally, the last approach, in some ways,

  • is actually the simplest and maybe even the most intuitive.

  • It's actually to just reason by analogy.

  • There's a lot of evidence in psychology

  • that humans do this all the time.

  • You're faced with a new situation,

  • you try to find a matching situation in your experience,

  • and then you transfer the solution

  • from the situation that you already

  • know to the new situation that you're faced with.
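A common computational form of this analogy-based learning is the nearest-neighbor classifier: find the most similar stored situation and transfer its label. The sketch below is my own illustration with made-up data, not code from the talk.

```python
# Nearest-neighbor sketch: classify a new case by copying the label
# of the most similar remembered case.
import math

# Remembered situations: (feature vector, label) pairs.
memory = [
    ((1.0, 1.0), "spam"),
    ((1.2, 0.9), "spam"),
    ((5.0, 5.5), "ham"),
    ((6.1, 5.0), "ham"),
]

def nearest_neighbor(x):
    """Transfer the solution from the closest known situation."""
    _, label = min(memory, key=lambda ex: math.dist(ex[0], x))
    return label

print(nearest_neighbor((1.1, 1.0)))  # nearest stored cases are "spam"
print(nearest_neighbor((5.5, 5.2)))  # nearest stored cases are "ham"
```

The whole "learning" step is just storing the examples; the work happens at prediction time, when the match to past experience is found.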

  • And connected with each of these approaches to learning,

  • there is a school of thought in machine learning.

  • So the five main ones are the Symbolists, Connectionists,

  • Evolutionaries, Bayesians, and Analogizers.

  • The Symbolists are the people who

  • believe in discovering new knowledge

  • by filling in the gaps in the knowledge

  • that you already have.

  • One of the things that's fascinating about machine

  • learning is that the ideas in the algorithms

  • come from all of these different fields.

  • So for example, the Symbolists, they have their origins

  • in logic, philosophy.

  • And they're, in some sense, the most "computer-sciency"

  • of the five tribes.

  • The Connectionists, their origins

  • are, of course, in neuroscience, because they're

  • trying to take inspiration from how the brain works.

  • The Evolutionaries, well, their origins

  • are, of course, in evolutionary biology,

  • in the algorithm of evolution.

  • The Bayesians come from statistics.

  • The Analogizers actually have influences

  • from a lot of different fields, but probably the single most

  • important one is psychology.

  • So in addition to being very important for our lives,

  • machine learning is also a fascinating thing,

  • I think, to study, because in the process of studying machine

  • learning, you can actually study all of these different things.

  • Now each of these "tribes" of machine learning, if you will,

  • has its own master algorithm, meaning its own general purpose

  • learner that, in principle, can be used to learn anything.

  • In fact, each of these master algorithms

  • has a mathematical proof that says,

  • if you give it enough data, it can learn anything.

  • OK?

  • For the Symbolists, the master algorithm is inverse deduction.

  • And we'll see, in a second, what that is.