TOM SIMONITE: Hi.
Good morning.
Welcome to day three of Google I/O,
and what should be a fun conversation about machine
learning and artificial intelligence.
My name is Tom Simonite.
I'm San Francisco bureau chief for MIT Technology Review.
And like all of you, I've been hearing a lot recently
about the growing power of machine learning.
We've seen some striking results come out
of academic and industrial research labs,
and they've moved very quickly into the hands of developers,
who have been using them to make new products and services
and companies.
I'm joined by three people this morning
who can tell us about how this new technology
and the capabilities it brings are coming out into the world.
They are Aparna Chennapragada, who
is the director of product management
and worked on the Google Now mobile assistant,
Jeff Dean, who leads the Google Brain research group here
in Mountain View, and John Giannandrea,
who is head of search and machine intelligence at Google.
Thanks for joining me, all of you.
We're going to talk for about 30 minutes,
and then there will be time for questions from the floor.
John, why don't we start with you?
You could set the scene for us.
Artificial intelligence and machine learning
are not brand new concepts.
They've been around for a long time,
but we're suddenly hearing a lot more about them.
Large companies and small companies
are investing more in this technology,
and there's a lot of excitement.
You can even get a large number of people
to come to a talk about this thing early in the morning.
So what's going on?
Tell these people why they're here.
JOHN GIANNANDREA: What's going on?
Yeah, thanks, Tom.
I mean, I think in the last few years,
we've seen extraordinary results in fields that hadn't really
moved the needle for many years, like speech recognition
and image understanding.
The error rates are just falling dramatically,
mostly because of advances in deep neural networks,
so-called deep learning.
I think these techniques are not new.
People have been using neural networks for many, many years.
But a combination of events over the last few years
has made them much more effective,
and caused us to invest a lot in getting them
into the hands of developers.
People talk about it in terms of AI winters,
and things like this.
I think we're kind of in an AI spring right now.
We're just seeing remarkable progress
across a huge number of fields.
TOM SIMONITE: OK.
And now, how long have you worked
in artificial intelligence, John?
JOHN GIANNANDREA: Well, we started
investing heavily in this at Google about four years ago.
I mean, we've been working in these fields,
like speech recognition, for over a decade.
But we kind of got serious about our investments
about four years ago, and getting organized
to do things that ultimately resulted
in the release of things like TensorFlow, which
Jeff's team's worked on.
TOM SIMONITE: OK.
And we'll talk more about that later, I'm sure.
Aparna, give us a perspective from the view of someone
who builds products.
So John says this technology has suddenly
become more powerful and accurate and useful.
Does that open up new horizons for you,
when you're thinking about what you can build?
APARNA CHENNAPRAGADA: Yeah, absolutely.
I think for me, these are great as a technology.
But as a means to an end, they're
powerful toolkits to help solve real problems, right?
And for us, building products, and for you guys,
too, there are two ways that machine learning
changes the game.
One is that it can turbo charge existing use cases-- that
is, existing problems like speech recognition--
by dramatically changing some technical components
that power the product.
If you're building a voice-enabled assistant,
as soon as the word error rate that John was talking
about dropped, we actually saw the usage go up.
So the product gets more usable as machine learning improves
the underlying engine.
Same thing with translation.
As translation gets better, Google Translate
scales to 100-plus languages.
And Photos is a great example.
You've heard Sundar talk about it, too,
that as soon as you have better image understanding,
the photo labeling gets better, and therefore, I
can organize my photos.
So it's a means to an end.
That's one way, certainly, that we have seen.
But I think the second way that's, personally, far more
exciting to see is where it can unlock new product use cases.
So turbocharging existing use cases is one thing,
but where can you kind of see problems
that really weren't thought of as AI or data problems?
And thanks to mobile, here-- 3 billion phones-- a lot
of the real world problems are turning into AI problems,
right?
Transportation, health, and so on.
That's pretty exciting, too.
TOM SIMONITE: OK.
And so is one consequence of this
that we can make computers less annoying, do you think?
I mean, that would be nice.
We've all had these experiences where
you have a very clear idea of what it is you're trying to do,
but it feels like the software is doing
everything it can to stop you.
Maybe that's a form of artificial intelligence, too.
I don't know.
But can you make more seamless experiences
that just make life easier?
APARNA CHENNAPRAGADA: Yeah.
And I think in this case, again, one of the things
to think about is, how do you make sure-- especially
as you build products-- how do you
make sure your interface scales with the intelligence?
The UI needs to be proportional to AI.
I cannot believe I said some pseudo formula in front of Jeff
Dean.
But I think that's really important,
to make sure that the UI scales with the AI.
TOM SIMONITE: OK.
And Jeff, for people like Aparna,
building products, to do that, we
need this kind of translation step
which your group is working on.
So Google Brain is a research group.
It works on some very fundamental questions in its field.
But you also build this infrastructure,
which you're kind of inventing from scratch, that makes
it possible to use this stuff.
JEFF DEAN: Yeah.
I mean, I think, obviously, in order
to make progress on these kinds of problems,
it's really important to be able to try lots of experiments
and do that as quickly as you can.
There's a very fundamental difference
between having an experiment take a few hours,
versus something that takes six weeks.
It's just a very different model of doing science.
And so, one of the things we work on
is trying to build really scalable systems that are also
flexible and make it easy to express new kinds
of machine learning ideas.
So that's how TensorFlow came about.
It's sort of our internal research vehicle,
but also robust enough to take something you've done and done
lots of experiments on, and then, when you get something
that works well, to take that and move it into a production
environment, run things on phones or in data
centers, or on TPUs, which we announced a couple days ago.
And that seamless transition from research
to putting things into real products
is what we're all about.
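For a concrete flavor of that workflow, here is a minimal
sketch, assuming today's TensorFlow Keras API rather than
the 2016-era interface: express a small model, run a quick
experiment, then export the same artifact toward production.

```python
# Minimal sketch of the research-to-production loop described
# above, using the modern TensorFlow/Keras API (not the 2016 one).
import numpy as np
import tensorflow as tf

# Toy data standing in for a real experiment.
x = np.random.rand(256, 8).astype("float32")
y = (x.sum(axis=1) > 4.0).astype("int32")

# Express a model idea and iterate on it quickly.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x, y, epochs=5, verbose=0)

# When an experiment works, the same artifact moves toward
# production: export a SavedModel that a server (or, via
# converters, a phone) can load and run.
tf.saved_model.save(model, "/tmp/toy_model")
```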
TOM SIMONITE: OK.
And so, TensorFlow is this very flexible package.
It's very valuable to Google.
You're building a lot of things on top of it.
But you're giving it away for free.
Have you thought this through?
Isn't this something you should be keeping closely held?
JEFF DEAN: Yeah.
There was actually a little bit of debate internally.
But I think we decided to open source it,
and it's got a nice Apache 2.0 license which basically
means you can take it and do pretty much whatever
you want with it.
And the reason we did that is several fold.
One is, we think it's a really good way of making research
ideas and machine learning propagate more quickly
throughout the community.
People can publish something they've done,
and people can pick up that thing
and reproduce those people's results or build on them.
And if you look on GitHub, there's
like 1,500 repositories, now, that mention TensorFlow,
and only five of them are from Google.
And so, it's people doing all kinds of stuff with TensorFlow.
And I think that free exchange of ideas and accelerating
of that is one of the main reasons we did that.
TOM SIMONITE: OK.
And where is this going?
So I imagine, right now, that TensorFlow is mostly
used by people who are quite familiar with machine learning.
But ultimately, the way I hear people
talk about machine learning, it's
just going to be used by everyone everywhere.
So can developers who don't have much
of a background in this stuff pick it up yet?
Is that possible?
JEFF DEAN: Yeah.
So I think, actually, there's a whole set
of ways in which people can take advantage of machine learning.
One is, as a fundamental machine learning researcher,
you want to develop new algorithms.
And that's going to be a relatively small fraction
of people in the world.
But as new algorithms and models are developed
to solve particular problems, those models
can be applied in lots of different kinds of things.
If you look at the use of machine learning
in the diabetic retinopathy stuff
that Sundar mentioned a couple days ago,
that's a very similar problem to a lot of other problems
where you're trying to look at an image
and detect some part of it that's unusual.
We have a similar problem of finding text
in Street View images so that we can read the text.
And that looks pretty similar to a model
to detect diseased parts of an eye, just different training
data, but the same model.
So I think the broader set of models
will be accessible to more and more people.
And then there's even an easier way,
where you don't really need much machine learning knowledge
at all, and that is to use pre-trained APIs.
Essentially, you can use our Cloud Vision API
or our Speech APIs very simply.
You just give us an image, and we give you back good stuff.
And as part of the TensorFlow open source release,
we also released, for example, an Inception model that
does image classification, the same model that underlies
Google Photos.
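As a hedged illustration of the "give us an image, and we
give you back good stuff" path, this is roughly what calling
the Cloud Vision API looks like from Python today, assuming
the google-cloud-vision client library and configured
credentials; "dog.jpg" is a placeholder path.

```python
# Sketch of using a pre-trained API with no machine learning
# expertise: send an image to Cloud Vision and get labels back.
# Assumes `pip install google-cloud-vision` and application
# credentials set up in the environment.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

with open("dog.jpg", "rb") as f:  # placeholder image path
    image = vision.Image(content=f.read())

response = client.label_detection(image=image)
for label in response.label_annotations:
    print(label.description, round(label.score, 3))
```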
TOM SIMONITE: OK.
So will it be possible for someone-- maybe they're
an experienced builder of apps, but don't know much about
machine learning-- they could just
have an idea and kind of use these building blocks to put it
together?
JEFF DEAN: Yeah.
Actually, I think one of the reasons TensorFlow has taken
off is that the tutorials in TensorFlow are actually
quite good at illustrating six or seven important kinds
of models in machine learning, and showing people
how they work, stepping through both the machine learning
that's going on underneath, and also how you express them
in TensorFlow.
That's been pretty well received.
TOM SIMONITE: OK.
And Aparna, I think we've seen in the past
that when a new platform or mode of interaction comes forward,
we have to experiment with it for some time
before we figure out what works, right?
And sometimes, when we look back,
we might think, oh, those first generation
mobile apps were kind of clunky, and maybe not so smart.
How is that process going here, where we're starting
to understand what types of interaction work?
APARNA CHENNAPRAGADA: Yeah.
And I think it's one of the things that's not intuitive
when you start out and rush into a new area,
like we've all done.
For example, when we started working on Google Now,
one thing we realized is that, depending on the product
domain, with some of these black box systems
you need to pay attention
to what we internally call the wow-to-WTH ratio.
That is, there are some delightful, magical moments, right?
But then, if you get it wrong,
there's a high cost to the user.
So to give you an example, in Google Search,
let's say you search for, I don't know, Justin Timberlake,
and we got a slightly less relevant answer.
Not a big deal, right?
But then, if the assistant told you to sit in the car,
go drive to the airport, and you missed
your flight, what the hell?
So I think it's really important to get that ratio right,
especially in the early stages of this new platform.
The other thing we noticed also is
that explainability or interpretability really builds
trust in many of these cases.
So you want to be careful about which parts of the problem
you apply machine learning to.
You want to look at problems that are easy for machines
and hard for humans, the repetitive things,
and then make sure that those are the problems that you
throw machine learning against.
But you don't want to be unpredictable and inscrutable.
TOM SIMONITE: And one mode of interaction that everyone seems
to be very excited about, now, is this idea
of conversational interface.
So we saw the introduction on Wednesday of Google Assistant,
but lots of other companies are building these things, too.
Do we know that definitely works?
What do we know about how you design
a conversational interface, or what the limitations
and strengths are?
APARNA CHENNAPRAGADA: I think, again, at a broad level,
you want to make sure that you can have this trust.
So [INAUDIBLE] domains make it easy.
It's very hard to make a very horizontal system
that works for anything.
But I'm actually pretty excited at the progress.
We just launched-- open sourced-- the sentence parser,
Parsey McParseface.
I just wanted to say that name.
But it's really exciting, because you're starting to see
the beginning of conversational understanding, or at least
natural language sentence understanding,
and then you have building blocks that build on top of it.
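To make "sentence understanding as a building block"
concrete, here is a purely illustrative sketch, in plain
Python rather than the real SyntaxNet interface, of the kind
of dependency analysis a parser like Parsey McParseface
produces and what can be built on top of it.

```python
# Illustrative only: the sort of dependency parse a system like
# Parsey McParseface emits (the real SyntaxNet output format
# differs). Each word gets a head word and a grammatical relation.
parse = [
    # (word,     head,      relation)
    ("Bob",      "brought", "nsubj"),  # subject of the verb
    ("brought",  None,      "root"),   # root of the sentence
    ("the",      "pizza",   "det"),
    ("pizza",    "brought", "dobj"),   # direct object
    ("to",       "Alice",   "case"),
    ("Alice",    "brought", "obl"),    # recipient
]

# A downstream building block can now ask structured questions,
# for example, who did what to what:
root = next(w for w, h, r in parse if r == "root")
subj = next(w for w, h, r in parse if h == root and r == "nsubj")
obj = next(w for w, h, r in parse if h == root and r == "dobj")
print(subj, root, obj)  # Bob brought pizza
```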
TOM SIMONITE: OK.
And John, with your search hat on for a second,
we heard on Wednesday that, I think, 20% of US searches
are now done by voice.
So people have clearly got comfortable with this,
and you've managed to provide something
that they want to use.
Is the Assistant interface to search
going to grow in a similar way, do you think?
Is it going to take over a big chunk of people's search
queries?
JOHN GIANNANDREA: Yeah.
We think of the Assistant as a fundamentally different product
than search, and I think it's going
to be used in a different way.
But we've been working on what we
call voice search for many, many years,
and we have this evidence that people
like it and are using it.
And I would say our key differentiator, there, is just
the depth of search, and the number of questions
we can answer, and the kinds of complexities
that we can deal with.
I think language and dialogue is the big unsolved problem
in computer science.
So imagine you're reading an article
and then writing a shorter version of it.
That's currently beyond the state of the art.
I think the important thing about the open source release
we did of the parser is it's using TensorFlow as well.
So in the same way as Jeff explained,
the functionality of this in Google Photos for finding
your photos is actually available open source,
and people can actually play with it
and run a cloud version of it.
We feel the same way about natural language understanding,
and we have many more years of investment
to make in getting to really natural dialogue systems,
where you can say anything you want,
and we have a good shot of understanding it.
So for us, this is a journey.
Clearly, we have a fairly usable product in voice search today.
And the Assistant, we hope, when we launch
later this year, people will similarly
like to use it and find it useful.
TOM SIMONITE: OK.
Do you need a different monetization model
for the Assistant dialogue?
Is that something--
JOHN GIANNANDREA: We're really focused, right now,
on building something that users like to use.
I think Google has a long history
of trying to build things that people find useful.
And if they find them useful, and they use them at scale,
then we'll figure out a way to actually have a business
to support that.
TOM SIMONITE: OK.
So you mentioned that there are still
a lot of open research questions here,
so maybe we could talk about that a little bit.
As you described, there have been
some very striking improvements in machine learning recently,
but there's a lot that can't be done.
I mean, if I went to my daughter's preschool,
I would see young children learning and using
language in ways that your software can't match right now.
So can you give us a summary of the territory that's
still to be explored?
JOHN GIANNANDREA: Yeah.
There's a lot still to be done.
I think there's a couple of areas
which researchers around the world
are furiously trying to attack.
So one is learning from smaller numbers of examples.
Today, the learning systems that we have,
including deep neural networks, typically
require really large numbers of examples.
Which is why, as Jeff was describing,
they can take a long time to train,
and the experiment time can be slow.
So it's great that we can give systems
hundreds of thousands or millions of labeled examples,
but clearly, small children don't need to do that.
They can learn from very small numbers of examples.
So that's an open problem.
I think another very important problem in machine learning
is what the researchers call transfer learning, which
is learning something in one domain,
and then being able to apply it in another.
Right now, you have to build a system
to learn one particular task, and then that's not
transferable to another task.
So for example, the AlphaGo system that
won the Go Championship in Korea,
that system can't, a priori, play chess or tic tac toe.
So that's a big, big open problem
in machine learning that lots of people are interested in.
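Full task-to-task transfer of the AlphaGo-to-chess kind
remains open, but a limited form is routine today. Here is a
hedged Keras sketch of that limited form, reusing features
learned on ImageNet for a new task; the model choice and
class count are illustrative, not anyone's actual system.

```python
# Sketch of the limited transfer learning that works today:
# reuse a network trained on one domain (ImageNet photos) as
# the starting point for a new task with far fewer labels.
import tensorflow as tf

NUM_CLASSES = 5  # illustrative, e.g. severity grades in a medical task

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False  # freeze what was learned on the first task

# Train only a small new head on the second task's dataset.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(small_labeled_dataset)  # needs far less data than scratch
```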
TOM SIMONITE: OK.
And Jeff, this is kind of on your group, to some extent,
isn't it?
You need to figure this out.
Are there particular avenues or recent results
that you would highlight that seem to be promising?
JEFF DEAN: Yeah.
I think we're making, actually, pretty significant progress
in doing a better job of language understanding.
I think, if you look at where computer vision was three
or four or five years ago, it was
kind of just starting to show signs of life,
in terms of really making progress.
And I think we're starting to see the same thing in language
understanding kinds of models, translation, parsing, question
answering kinds of things.
In terms of open problems, I think unsupervised
learning, being able to learn from observations
of the world that are not labeled,
and then occasionally getting a few labeled examples that
tell you, these are important things about the world
to pay attention to, that's really
one of the key open challenges in machine learning.
And one more, I would add, is, right now,
what you need a lot of machine learning expertise for
is to kind of devise the right model structure
for a particular kind of problem.
For an image problem, I should use convolutional neural nets,
or for language problems, I should use this particular kind
of recurrent neural net.
And I think one of the things that
would be really powerful and amazing
is if the system itself could devise the right structure
for the data it's observing.
So learning model structure concurrently
with trying to solve some set of tasks, I think,
would be a really great open research problem.
TOM SIMONITE: OK.
So instead of you having to design the system
and then setting it loose to learn,
the learning system would build itself, to some extent?
JEFF DEAN: Right.
Right now, you kind of define the scaffolding of the model,
and then you fiddle with parameters
as part of the learning process, but you don't sort of
introduce new kinds of connections
in the model structure itself.
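A small, purely illustrative sketch of that division of
labor: a human fixes the scaffolding (convolutions, because
it is an image problem), and training only adjusts the
weights inside it, never the connections themselves.

```python
# The hand-designed "scaffolding" described above: the structure
# is chosen by a human for an image problem; learning fiddles
# only with the parameters inside it.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),  # chosen by hand
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Training updates weights within this fixed structure; having the
# system introduce new kinds of connections on its own is the open
# problem named above.
model.summary()
```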
TOM SIMONITE: Right.
OK.
And unsupervised learning, just giving it that label,
it makes it sound like one unitary problem, which
may not be true.
But will big progress on that come
from one flash of insight and a new algorithm,
or will it be-- I don't know-- a longer slog?
JEFF DEAN: Yeah.
If I knew, that would be [INAUDIBLE].
I have a feeling that it's not going to be, like,
100 different things.
I feel like there's a few key insights
that new kinds of learning algorithms
could pick up on as to what aspects
of the world the model is observing are important.
And knowing which things are important
is one of the key things about unsupervised learning.
TOM SIMONITE: OK.
Aparna, so what Jeff's team kind of works out, eventually,
should come through into your hands,
and you could build stuff with it.
Is there something that you would really
like him to invent tomorrow, so you can start building
stuff with it the day after?
APARNA CHENNAPRAGADA: Auto-generate emails.
No, I'm kidding.
I do think, actually, what's interesting is, you've heard
about these building blocks, right?
So machine perception, computer vision, wasn't a thing,
and now it's actually reliable.
Language understanding, it's getting there.
Translation is getting there.
To me, the next building block is making machines
do hand-eye coordination.
So you've seen the robot arms video
that Sundar talked about and showed at the keynote.
But imagine these rote tasks that are hard
and tedious for humans.
If you had reliable hand-eye coordination built in,
in a learned system versus the control-system code
that you usually write, which is very brittle,
suddenly it opens up a lot more opportunities.
Just off the top of my head, why isn't there
anything for, like, elderly care?
Like, you are an 80-year-old woman with a bad back,
and you're picking up things.
Why isn't there something there?
Or even something more mundane with natural language
understanding, right?
I'm a mom of a seven-year-old.
Why isn't there something for, I don't know,
math homework, with natural language understanding?
JOHN GIANNANDREA: So I think one of things
we've learned in the last few years
is that things that are hard for people
to do, we can teach computers to do,
and things that are easy for us to do
are still the hard problems for computers.
TOM SIMONITE: Right.
OK.
And does that mean we're still missing some big new field
we need to invent?
Because most of the things we've been talking about so far
have been built on top of this deep learning
and neural network.
JOHN GIANNANDREA: I think robotics work is interesting,
because it gives the computer system an embodiment
in the world, right?
So learning from tactile environments
is a new kind of learning, as opposed to just
unsupervised or supervised learning.
Just reading text is a particular environment.
Perception, looking at images, looking at audio,
trying to understand what this song is,
that's another kind of problem.
I think interacting with the real world
is a whole other kind of problem.
TOM SIMONITE: Right.
OK.
That's interesting.
Maybe this is a good time to talk a little bit more
about DeepMind.
I know that they are very interested in this idea
of embodiment, the idea you have to submerge this learning
agent in a world that it can learn from.
Can you explain how they're approaching this?
JOHN GIANNANDREA: Yeah, sure.
I mean, DeepMind is another research group
that we have at Google, and we work closely with them
all the time.
They are particularly interested in learning from simulations.
So they've done a lot of work with video games
and simulations of physical environments,
and that's one of the research directions that they have.
It's been very productive.
TOM SIMONITE: OK.
Is it just games?
Are they moving into different types of simulation?
JOHN GIANNANDREA: Well, there's a very fine line
between a video game-- a three-dimensional video game--
and a physics simulation environment, right?
I mean, some video games are, in fact,
full simulations of worlds, so there's not really
a bright line there.
TOM SIMONITE: OK.
And does DeepMind work on robotics?
They don't, I don't think.
JOHN GIANNANDREA: They're doing a bunch of work
in a bunch of different fields, some of which
gets published, some of which is not.
TOM SIMONITE: OK.
And the robot arms that we saw in the keynote on Wednesday,
are they within your group, Jeff?
JEFF DEAN: Yes.
TOM SIMONITE: OK.
So can you tell us about that project?
JEFF DEAN: Sure.
So that was a collaboration between our group
and the robotics teams in Google X. Actually, what happened was,
one of our researchers discovered
that the robotics team, actually,
had 20 unused arms sitting in a closet somewhere.
They were a model that was going to be discontinued
and not actually used.
So we're like, hey, we should set these up in a room.
And basically, just the idea of having
a little bit larger scale robotics test environment
than just one arm, which is what you typically
have in a physical robotics lab, would
make it possible to do a bit more exploratory research.
So one of the first things we did with that was just
have the robots learn to pick up objects.
And one of the nice properties that has
is that it's a completely supervised problem.
The robot can try to grab something,
and if it closes its gripper all the way, it failed.
And if it didn't close it all the way,
and it picked something up, it succeeded.
And so it's learning from raw camera pixel inputs
directly to torque motor controls.
And there's just a neural net there
that's trained to pick things up based on the observations it's
making of things as it approaches a particular object.
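A heavily hedged sketch of that training signal follows; the
`arm` and `policy` interfaces below are invented stand-ins,
not Google's robot stack, but the free "did the gripper close
on nothing?" label is the point being made here.

```python
# Hypothetical sketch of the self-supervised grasping loop
# described above; `arm` and `policy` are invented stand-ins.
def collect_grasp_example(arm, policy):
    """One trial: look, try a grasp, label success automatically."""
    image = arm.camera_image()     # raw camera pixels
    command = policy(image)        # net maps pixels to motor commands
    arm.execute(command)
    # The label comes for free: a gripper closed all the way
    # grabbed nothing (failure); a partially closed gripper is
    # holding an object (success).
    success = arm.gripper_opening() > 0.0
    return image, command, float(success)

def collect_dataset(arms, policy, trials_per_arm):
    """Twenty arms gather twenty times the experience of one."""
    data = []
    for arm in arms:               # in reality these run in parallel
        for _ in range(trials_per_arm):
            data.append(collect_grasp_example(arm, policy))
    return data

# A network is then trained on (image, command) -> success, and
# the policy prefers commands the network predicts will succeed.
```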
TOM SIMONITE: And is that quite a slow process?
I mean, the fact that you have multiple arms going
at once made me think that, maybe, you
were trying to maximize your throughput, or something.
JEFF DEAN: Right.
So if you have 20 arms, you get 20 times as much experience.
And if you think about how small kids learn to pick stuff up,
it takes them maybe a year, or something,
to go from being able to move their arm to really be
able to grasp simple objects.
And by parallelizing this across more arms,
you can pool the experience of the robotic arms a bit.
TOM SIMONITE: I see.
OK.
JEFF DEAN: And they need less sleep.
TOM SIMONITE: Right.
John, at the start of the session,
you referred to this concept of AI winter,
and you said you thought it was spring.
When do we know that it's summer?
JOHN GIANNANDREA: Summer follows spring.
I mean, there's still a lot of unsolved problems.
I think problems around dialogue and language
are the ones that I'm particularly interested in.
And so, until we can teach a computer to really read,
I don't think we can declare that it's summer.
I mean, if you can imagine a computer's really reading
and internalizing a document.
So it's interesting.
So translation is reading a paragraph in one language
and writing it in another language.
In order to do that really, really well,
you have to be able to paraphrase.
You have to be able to reorder words, and so on and so
forth. So imagine translating something
from English to English.
So you read a paragraph, and you write a different paragraph.
If we could do that, I think I would declare summer.
TOM SIMONITE: OK.
Reading is-- well, there are different levels of reading,
aren't there?
Do you know--
JOHN GIANNANDREA: If you can paraphrase, then you really--
TOM SIMONITE: Then you think that-- if you
could reach that level.
JOHN GIANNANDREA: And actually understood--
TOM SIMONITE: Then you've got some argument.
JOHN GIANNANDREA: And to a certain extent,
today, our translation systems, which
are not perfect by any means, are getting better.
They do do some of that.
They do do some paraphrasing.
They do do some re-ordering.
They do do a remarkable amount of language understanding.
So I'm hopeful researchers around the world
will get there.
And it's very important to us that our natural language
APIs become part of our cloud platform,
and that people can experiment with it, and help.
JEFF DEAN: One thing I would say is,
I don't think there's going to be
this abrupt line between spring and summer, right?
There's going to be developments that push the state of the art
forward in lots of different areas in kind
of this smooth gradient of capabilities.
And at some point, something becomes
possible that didn't used to be possible,
and people kind of move the goalposts
of what they think of as really, truly hard problems.
APARNA CHENNAPRAGADA: The classic joke, right?
It's only AI until it starts working,
and then it's computer science.
JEFF DEAN: Like, if you'd asked me four years ago,
could a computer write a sentence
given an image as input,
I would have said, I don't think they
can do that for a little while.
And they can actually do that today,
and that's kind of a good example of something
that has made a lot of progress in the last few years.
And now you sort of say, OK, that's in our tool
chest of capabilities.
TOM SIMONITE: OK.
But if we're not that great at predicting
how the progress goes, does that mean we can't see winter,
if it comes back?
JOHN GIANNANDREA: If we stop seeing progress,
then I think we could question what the future's going
to look like.
But today, I think researchers in the field
are excited about this, and maybe the field
is a little bit over-hyped because of the rate of progress
we're seeing.
Something like speech recognition,
which didn't work for my wife five years ago,
now works flawlessly.
Image identification is now working better
than human raters in many fields.
So there are these narrow fields in which algorithms
are now superhuman in their capabilities.
So we're seeing tremendous progress.
And so it's very exciting for people working in this field.
TOM SIMONITE: OK.
Great.
I should just note that, in a couple of minutes,
we will open up the floor for questions.
There are microphones here and here in the main seating area,
and there's one microphone up in the press area, which
I can't see right now, but hopefully you
can figure out where it is.
Sundar Pichai, CEO of Google, has spoken a lot recently
about how he thinks we're moving from a world which
is mobile-first to AI-first.
I'm interested to hear what you think that means.
Maybe, Aparna, you could speak to that.
APARNA CHENNAPRAGADA: I interpret
it a couple different ways.
One is, if you look at how mobile's changed,
how you experience computing, it's
not happened at one level of the stack, right?
It's at the interface level, it's
at the information level, and infrastructure.
And I think that's the same thing that's
going to happen with AI and any of these machine learning
techniques, which is, you'll have infrastructure layer
improvements.
You saw the announcement about TPU.
You'll have a bunch of algorithms and models
improvements at the intelligence and information layer,
and there will be interface changes.
So the best UI is probably no UI.
TOM SIMONITE: Right.
OK.
John, what does AI-first mean to you?
JOHN GIANNANDREA: I think it means
that this assistant kind of layer is available to you
wherever you are.
Whether you're in your car, or whether it's
ambient in your house, or whether you're
using your mobile device or laptop,
that there is this smart assistance
that you find very quietly useful all the time.
Kind of how Google search is for most people today.
I think most people would not want search engines taken away
from them, right?
So I think that being that useful to people,
so that people take it for granted,
and then it's ambient across all your devices,
is what AI-first means to me.
TOM SIMONITE: And we're in the early stages of this,
do you think?
JOHN GIANNANDREA: Yeah.
It's a journey, I think.
It's a multi-year journey.
TOM SIMONITE: OK.
Great.
So thanks for a fascinating conversation.
Now, we'll let someone else ask the questions for a little bit.
I will alternate between the press mic and the mics
down here at the front.
Please keep your questions short,
so we can get through more of them,
and make sure they're questions, not statements.
We will start with the press mic, wherever it is.
MALE SPEAKER: There's nobody there.
TOM SIMONITE: I really doubt the press has no questions.
What's happening?
Why don't we start with the developer mic
right here on the right?
AUDIENCE: I have a philosophical question about prejudice.
People tend to have prejudice.
Do you think this is a step stone
that we need to take in artificial intelligence,
and how would society accept that?
JOHN GIANNANDREA: I'm not sure I understand the question.
Some people have prejudice, and?
AUDIENCE: Some people have the tendency
to have prejudice, which might lead to behaviors
such as discrimination.
TOM SIMONITE: So the question is,
will the systems that the people build have biases?
JOHN GIANNANDREA: Oh, I see.
I see.
Will people's prejudices creep into machine learning systems?
I think that is a risk.
I think it all depends on the training data that we choose.
We've already seen some issues with this kind of problem.
So I think it all depends on carefully
selecting training data, particularly
for supervised systems.
TOM SIMONITE: OK.
Is the press mic working, at this point?
SEAN HOLLISTER: Hi.
I'm Sean Hollister, up here in the press mic.
TOM SIMONITE: Great.
Go for it.
SEAN HOLLISTER: Hi, there.
I wanted to ask about the role of privacy in machine learning.
You need a lot of data to make these observations
and to help people with machine learning.
I give all my photos to Google Photos,
and I wonder what happens to them afterwards.
What allows Google to see what they
are, and is that ever shared in any way with anyone else?
Personally, I don't care very much about that.
I'm not worried my photos are going
to get out to other folks, but where do they go?
What do you do with them?
And to what degree are they protected?
JEFF DEAN: Do you want to take that one?
APARNA CHENNAPRAGADA: I think this
is one of the most important things
that we look at across products.
So even with photos, or Google Now,
or voice, and all of these things.
There's actually two principles we codify into building this.
One is, there's a very explicit, very transparent
contract between the user and the product.
That is, you basically know what benefits
you're getting with the data, and the data
is there to help you.
That's one principle.
But the second is, by default, it's an opt-in experience.
You're in the driver's seat.
In some sense, let's say you're saying,
hey, I do want to get traffic information when
I'm on Shoreline, because it's clogged up to Shoreline
Amphitheater.
You, of course, need the system to know your location,
because you don't want to know how the traffic is in Napa.
So having that contract be transparent, but also
opt-in, I think really addresses the equation.
But I think the other thing to add in here
is also that, by definition, all of these are for your eyes
only, right?
In terms of, like, all your data is yours, and that's an axiom.
JOHN GIANNANDREA: And to answer his question,
we would never share his photos.
We train models based on other photos that are not yours,
and then the machine looks at your photos
and can label them, but we would never
share your private photos.
SEAN HOLLISTER: To what degree is advertising
anonymously-targeted at folks like me,
based on the contents of things I upload,
little inferences you make in the metadata?
Is any of that going to advertisers in any way,
even in aggregate, hey, this is a person who
seems to like dogs?
JOHN GIANNANDREA: For your photos?
No.
Absolutely not.
APARNA CHENNAPRAGADA: No.
TOM SIMONITE: OK.
Let's go to this mic right here.
AUDIENCE: My question is for Aparna:
what is the thought process behind creating a new product?
Because there are so many things that these guys are creating.
So how do you go from-- because it's kind of obvious right
now that if you have my emails,
and you know that I'm traveling tomorrow to New York,
it's kind of simple to look at my calendar
and create an event.
But how do you go from robotic arms trying
to understand how to grasp things, to an actual product?
The question is, what is the thought process behind it?
APARNA CHENNAPRAGADA: Yeah.
I'll give you the short version of it.
And, obviously, there's a longer version of it.
Wait for the Medium post.
But I think the short version of it
is, to echo one thing JG said, you
want to pick problems that are easy for machines
and hard for humans.
So AI plus machine learning is not
going to turn a non-problem into a real problem
that people need solved.
It's like, you can take Christopher Nolan and Ben
Affleck, and you can still end up with Batman Versus Superman.
So you want to make sure that the problem you're solving
is a real one.
Many of our failures, internal and external,
like the frenzy around bots and AI,
come when you kid yourself that the problem needs solving.
And the second one, the second quick insight there,
is that you also want to build an iterative model.
That is, you want to kind of start small, and say, hey,
travel needs some assistance.
What are the top five things that people need help with?
And see which of these things can scale.
JEFF DEAN: I would add one thing to that,
which is, often, we're doing research
on a particular kind of problem.
And then, when we have something we think is useful,
we'll share that internally, as presentations or whatever,
and maybe highlight a few places where
we think this kind of technology could be used.
And that's sort of a good way to inform the product designers
about what kinds of things are now possible that
didn't used to be possible.
TOM SIMONITE: OK.
Let's have another question from the press section up there.
AUDIENCE: Yeah.
There's a lot of talk, lately, about sort of a fear of AI.
Elon Musk likened it to summoning the demon.
Whether that's overblown or not, whether it's
perception versus reality, there seems
to be a lot of mistrust or fear of going
too far in this direction.
How much stock do you put into that?
And how do you win the trust of the public, when
you show experiments like the robot arm thing
in the keynote, which was really cool,
but sort of creepy at the same time?
JOHN GIANNANDREA: So I get this question a lot.
I think there's this notion that's
been in the press for the last couple of years
about so-called super intelligence,
that somehow AI will beget more AI,
and then it will be exponential.
I think researchers in the field don't put much stock in that.
I don't think we think it's a real concern yet.
In fact, I think we're a long way away
from it being a concern.
There are some researchers who actually
think about these ethical problems,
and think about AI safety, and we
think that's really important.
And we work on this stuff with them,
and we support that kind of work.
But I think it's a concern that is decades and decades away.
It's also conflated with the fact
that people look at things like robots learning
to pick things up, and that's somehow
inherently scary to people.
I think it's our job, when we bring products
to market, to do it in a thoughtful way
that people find genuinely useful.
So a good example I would give you is, in Google products,
when you're looking for a place, like a coffee shop
or something, we'll show you when it's busy.
And that's the product of fairly advanced machine learning
that takes aggregate signals in a privacy-preserving way
and says, yeah, this coffee shop is really
busy on a Saturday morning.
That doesn't seem scary to me, right?
That doesn't seem anything like a bad thing
to bring into the world.
So I think there's a bit of a disconnect between the somewhat
overblown hype and the actual use of this technology
in everyday products.
TOM SIMONITE: OK.
Next question.
AUDIENCE: Thank you.
So given Google's source of revenue
and the high use of ad blockers, is there
any possibility of using machine learning
to maybe ensure that the appropriate ads are served?
Or if there's multiple versions of the same ad,
that the ad that would apply most to me
would be served to me, and to a different user,
a different version, and things like that?
Is that on the roadmap?
JEFF DEAN: Yeah.
I think, in general, there's a lot
of potential applications of machine
learning to advertising.
Google has actually been using machine
learning in our advertising system for more than a decade.
And I think one of the things about deciding
what ads to show to users is, you
want them to be relevant and useful to that user.
And it's better to not show an ad at all,
if you don't have something that seems plausibly relevant.
And that's always been Google's advertising philosophy.
And other websites on the web don't necessarily quite
have the same balance, in that respect.
But I do think there's plenty of opportunity to continue
to improve advertising systems and make them better,
so that you see fewer ads, but they're actually more useful.
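A schematic, entirely hypothetical sketch of the decision
rule described here: estimate how likely each candidate ad is
to be useful, and show nothing at all if no candidate clears
the bar.

```python
# Hypothetical illustration of "better to show no ad at all":
# rank candidates by predicted usefulness and apply a quality bar.
def pick_ad(candidates, predict_usefulness, min_usefulness=0.1):
    """Return the most plausibly useful ad, or None to show no ad."""
    if not candidates:
        return None
    best = max(candidates, key=predict_usefulness)
    if predict_usefulness(best) < min_usefulness:
        return None  # an irrelevant ad is worse than no ad
    return best
```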
TOM SIMONITE: OK.
Next question from at the top.
JACK CLARK: Jack Clark with Bloomberg News.
So how do you differentiate to the user
between a sponsored advert, and one that is provided by your AI
naturally?
How do I know that the burger joint you're suggesting
is like a paid-for link, or is it a genuine link?
JEFF DEAN: So in our user interfaces,
we always clearly delimit advertisements.
And in general, all ads that we show
are selected algorithmically by our systems.
It's not like you can just give us an ad,
and we will always show it to someone.
We always decide what is the likelihood
that this ad is going to be useful to someone,
before we decide to show that advertiser's ad.
JACK CLARK: Does this extend to stuff like Google Home, where
it will say, this is a sponsored restaurant
we're going to send you to?
JEFF DEAN: I don't know that product.
JOHN GIANNANDREA: I mean, we haven't
launched Google Home yet.
So a lot of these product decisions are still to be made.
I think we do, as a general rule,
clearly identify when something is sponsored
versus when it's organic.
TOM SIMONITE: OK.
Next question here.
AUDIENCE: Hi.
This is a question for Jeff Dean.
I'm very much intrigued by the Google Brain project
that you're doing.
Very cool t-shirt.
The question is, what is the road map of that,
and how does it relate to the point of singularity?
JEFF DEAN: Aha.
So the road map of-- this is sort of the project code name
for the team that I work on.
Basically, the team was formed
to investigate the use of advanced methods
in machine learning to solve difficult problems in AI.
And we're continuing to work on pushing the state
of the art in that area.
And I think that means working in lots of different areas,
building the right kinds of hardware with TPUs,
building the right systems infrastructure with things
like TensorFlow.
Solving the right research problems
that are not connected to products,
and then figuring out ways in which machine learning can
be used to advance different kinds of fields,
as we solve different problems along the road.
I'm not a big believer in the singularity.
I think all exponentials look like exponentials
at the beginning, but then they run out of stuff.
TOM SIMONITE: OK.
Thanks for the question.
Back to the pressbox.
STEVEN MAX PATTERSON: Hi.
Steven Max Patterson, IDG.
I was looking at Google Home and Google Assistant,
and it looks like it's really a platform.
And it's a composite of other platforms,
like the Knowledge Graph, Google Cloud Speech, Google machine
learning, the Awareness API.
Is this a feature that other consumer device manufacturers
could include, and is the intent and direction of Google
to make this a platform?
JOHN GIANNANDREA: It's definitely
the case that most of our machine learning APIs
are migrating to the cloud platform, which enables people
to use, for example, our speech capabilities in other products.
I think the Google Assistant is intended to be, actually,
a holistic product delivered from Google.
That makes sense.
But it may make sense to syndicate
that to other manufacturers at some point.
We don't have any plans to do that today.
But in general, we're trying to be
as open as we can with the component pieces
that you just mentioned, and make
them available as Cloud APIs, and in many cases,
as open source solutions as well.
JEFF DEAN: Right.
I think one of the things about that
is, making those individual pieces available
enables everyone in the world to take advantage of some
of the machine learning research we've done,
and be able to do things like label images,
or do speech recognition really well.
And then they can go off and build
really cool, amazing things that aren't necessarily
the kinds of things we're working on.
JOHN GIANNANDREA: Yeah, and many companies are doing this today.
They're using our translate APIs.
They're using our Cloud Speech APIs today.
TOM SIMONITE: Right.
We have time for one last quick question from this mic here.
AUDIENCE: Hi.
I'm [INAUDIBLE].
John, you said that you would declare summer
if, in language understanding, a system
were able to translate from one paragraph in English
to another paragraph in English.
Don't you think that making that possible requires
really complete understanding of the world, and everything
that's going on, just to catch the emotional level that
is in the paragraph, or even the physical understanding
of the world around us?
JOHN GIANNANDREA: Yeah, I do.
I use that example because it is really, really hard.
So I don't think we're going to be done for many, many years.
I think there's a lot of work to do.
We built the Google Knowledge Graph, in part,
to answer that question, so that we actually
had some semantic understanding of at least
the things in the world, and some of the relationships
between them.
But yeah, it's a very hard problem.
And I used that example because it's
pretty clear we won't be done for a long time.
TOM SIMONITE: OK.
Sorry, there's no time for other questions.
Thanks for the question.
A good forward-looking note to end on.
We'll see how it works out over the coming years.
Thank you for joining me, all of you on stage,
and thanks for the questions and coming for the session.
[MUSIC PLAYING]