[MUSIC PLAYING]

MALCOLM REYNOLDS: My name is Malcolm and I'm a research engineer at DeepMind.

TAMARA NORMAN: Hi, my name is Tamara and I'm a software engineer at DeepMind.

MALCOLM REYNOLDS: So we're here today to talk about Sonnet and hopefully answer some questions for you: what is Sonnet, why DeepMind has found it useful and why it might be useful to you, and then talk a bit about our plans for the future of the library.

So Sonnet is DeepMind's library for constructing neural networks in TensorFlow. We've been working on it since 2016 and it's been open source for the last couple of years. You can try the TensorFlow 1.0 version with this command line here. But you might be thinking, sounds interesting, why does this exist? And that is a very legitimate question. To answer that, we have to take a little trip down memory lane.

So in 2014, TensorFlow is in development at Brain, but there's no open-source release yet. And unrelatedly, I guess, DeepMind is acquired by Google. So a little bit of code archeology here. Does anybody recognize what system this is written in? Hands up if it looks vaguely familiar. OK, not too many hands. I censored some of this just to make it slightly harder. So now it may be clearer that this is actually Torch7 code, or Lua Torch, and specifically, we're defining a module here. That's what the red arrow is pointing to.

So DeepMind was an entirely Torch7 operation, at least on the research side, when I joined. And it's a really great piece of software. If you're interested in the history of deep learning abstractions, the abstractions in Torch really go back to stuff from the '90s, like Lush.

So fast forward a few years: we have dozens of research projects going on, hundreds of people involved, and being able to quickly share code between the projects was really critical. So if somebody comes up with a new RNN, everyone else can try that out with minimal fuss and no changing of the code. And we thought we had a pretty good workflow and we were pretty happy with it, and then we decided to just get rid of it all and start from scratch. Seems questionable. But the reason is we were transitioning to TensorFlow.

And obviously, there are many great reasons to use TensorFlow. We've been hearing about them for the last few days. Top of the list for us was better alignment and collaboration with our colleagues at Google. TensorFlow was designed for distributed training from day one, which Torch really wasn't, and that was becoming a problem in some of our research. We also wanted to take the best advantage of TPUs, and obviously, TensorFlow is the best way to use them. And we knew from initial assessments it was going to be flexible enough to build whatever we needed on top of.

But we weren't actually starting from scratch. We had some stuff we really wanted to preserve. And it's not that we wanted the same APIs as Lua Torch. That would not have been the right decision. But we wanted the same engineering philosophy. What we mean by this will hopefully become clear through this talk.

So first of all, composable modules using object-oriented programming. Now, it's definitely totally valid to have a purely functional library. But we found that when you start having lots of variable reuse, RNNs, those kinds of features, objects really made a lot more sense. So we definitely wanted to keep that. We also wanted to decouple the model definition from how it's actually trained.
And this is really key, and I will come back to it again shortly. Hackability is really crucial. We don't think we can anticipate everything our research scientists might want years, or even months, ahead of time. But what we can do is have code that they are able to dig down into, hopefully not too far, make the changes they need to, maybe by forking the module, and we're comfortable with that, and then continue with their research.

And it's important to emphasize we're really optimizing for research. You've heard about a lot of great solutions for moving things into production. And that's really important, but for DeepMind research, that happens very rarely. So we prefer to optimize the libraries for research progress, the next paper, that kind of thing. And overall, we're looking for a standard interface with really minimal assumptions that doesn't prevent you from doing anything.

So composability. What do I mean by this? Hopefully, anyone who's used an LSTM would say, yes, you can implement this out of some Linear modules, and they contain the variables, and the LSTM is a thing wrapped around all of that. But you can go many levels up from there. So the differentiable neural computer, which was the first project I worked on at DeepMind, is a module that contains an LSTM plus some other stuff. We worked on that and got the paper out, and then the code is sitting there, and another team could come along and say, hey, we have some system which requires an RNN. It's currently an LSTM, but maybe we'll just try this instead. And it's been used in many places, but I think the most significant was maybe the Capture the Flag work, which had some really cool results, specifically with the DNC controller. And because they could reuse the code, they didn't need to ask us about how it worked. They didn't even need to read the code. They could just say, this is an RNN that conforms to some API, we'll drop it in, and it works.

Next, orthogonality to the training setup. So I did a quick survey. We have roughly two setups for unsupervised learning, four for reinforcement learning, and many more setups for single projects. And to be clear, what I mean here is everything that goes on above the model: how you feed it the data, how you launch experiments and monitor them, what configurability there is for the researchers. All of that is what I'm referring to as the training setup. And this might seem like a ton of repeated effort and duplicated code, but we don't actually view this as a problem. These different setups exist for different research goals. And trying to push everything into a single training setup, even just for RL, has been tried. We don't think it works. So we're happy with lots of different setups that coexist. And the reason it's not a problem, apart from serving different research goals, is that we can reuse Sonnet modules between all of them. And that means we're not redoing the DNC from scratch.

So I've talked a bunch about what we've already done with Sonnet and TensorFlow 1. Tamara is going to talk a bit about the future.
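To make the drop-in reuse that Malcolm describes above a bit more concrete, here is a minimal sketch of the kind of recurrent-core contract involved, written against plain tf.Module. The class and function names here are hypothetical illustrations, not Sonnet or DNC APIs; the point is only that any core sharing the same calling convention can be swapped in without touching the surrounding code.

```python
import tensorflow as tf


class TinyRNNCore(tf.Module):
    """A toy recurrent core: the state is just one hidden vector."""

    def __init__(self, input_size, hidden_size, name=None):
        super().__init__(name=name)
        self.hidden_size = hidden_size
        self.w = tf.Variable(
            tf.random.normal([input_size + hidden_size, hidden_size], stddev=0.05),
            name="w")
        self.b = tf.Variable(tf.zeros([hidden_size]), name="b")

    def initial_state(self, batch_size):
        return tf.zeros([batch_size, self.hidden_size])

    def __call__(self, inputs, prev_state):
        hidden = tf.tanh(tf.concat([inputs, prev_state], axis=-1) @ self.w + self.b)
        return hidden, hidden  # (output, next_state)


def unroll(core, input_sequence):
    """Any object with the same (inputs, state) -> (output, state) contract
    and an initial_state() method can be dropped in here unchanged."""
    batch_size = input_sequence.shape[1]
    state = core.initial_state(batch_size)
    outputs = []
    for inputs in tf.unstack(input_sequence):  # [time, batch, features]
        output, state = core(inputs, state)
        outputs.append(output)
    return tf.stack(outputs), state


# Swapping an LSTM-style core for a DNC-style core is then a one-line change
# at the call site, because both expose the same recurrent interface.
core = TinyRNNCore(input_size=8, hidden_size=16)
outputs, final_state = unroll(core, tf.random.normal([5, 4, 8]))
```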
TAMARA NORMAN: So yeah, cheers, Malcolm. So what will Sonnet look like in TF 2? DeepMind is really excited about TF 2, and is planning once again to invest in changing our workflow as we adopt it.

Eager execution has already been widely tried by researchers, and they really enjoy the improved flexibility, debugging, and simplicity that come from it. And they're looking forward to the benefits of this being part of TF 2. But one thing we've noticed is that Sonnet 1 relied quite heavily on features like variable scopes, and these are going away. This isn't something that we're worried about. We're really excited to follow TF into this much more natural and Pythonic world.

So how do we plan on building Sonnet in this new world? We've built tf.Module as the base of Sonnet 2. This is a stateful container which provides both variable and module tracking. It's been designed in collaboration between many individuals at DeepMind and Brain, learning lessons from both Sonnet 1 and Lua Torch. It's now been upstreamed into TensorFlow, and you can try it out in the alpha that's been released over the course of the Dev Summit. It will soon form the basis of many higher level components in TF, including things like tf.keras.

So what about modules? Modules can have multiple forward methods, addressing a limitation that we found within Sonnet 1. They can be nested arbitrarily, allowing for the creation of complex models, like the [INAUDIBLE] that was talked about earlier. And they have automatic name scoping, which allows for easy debugging.

So what are we trying to do with Sonnet 2? We're aiming to create a library which makes very few assumptions about what users want to do, both with their networks and their training loops. It's been designed in close collaboration with researchers. And we think it's a lightweight, simple abstraction that's as close to the maths as it can be.

So what are the features of Sonnet? Sonnet has the ability to have multiple forward methods, something enabled by tf.Module. But what does that really mean? There will be an easy way to create additional methods on an object that have access to the same state and variables. An example of where this is useful will come in a few slides' time.

One thing that we've found is really crucial with Sonnet 1 is that it's really easy to compose it with other systems. Sonnet 1 works out of the box with Replicator, which is our internal solution to distributed training, even though these libraries were never designed with each other in mind. They don't even have a dependency on each other. They both just use TensorFlow under the hood. It's up to users to define how they want to compose their system together and decide where they want to put things like the all-reduce that ensures the gradients are the same across all the devices. We plan for the same philosophy to be implemented in Sonnet 2. Replicator has been merged with Distribution Strategy, so you'll all be able to try this out. And you can find more details on Replicator in the DeepMind blog post.

So what do we want to do within Sonnet? We also want to continue to provide the flexibility and composability that we've already heard about in v1, such that models can be composed in arbitrary ways and no training loop is imposed. We still don't aim to predict where the research will go.

So what does the code look like? This is the linear layer. It's pretty much the real implementation, just without some of the comments. And one thing we've been really excited by is how simple an implementation we've been able to create on top of tf.Module and TF 2. Research is by nature about pushing boundaries.
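The slide itself isn't reproduced in the transcript, but a linear layer in this style, written directly on top of tf.Module, looks roughly like the sketch below. This is an illustration rather than Sonnet's actual source: variables are created lazily on the first call, and tf.Module.with_name_scope provides the automatic name scoping mentioned above.

```python
import tensorflow as tf


class Linear(tf.Module):
    """A dense layer built directly on tf.Module (illustrative sketch)."""

    def __init__(self, output_size, name=None):
        super().__init__(name=name)
        self.output_size = output_size
        self.w = None
        self.b = None

    @tf.Module.with_name_scope  # ops and variables get this module's name scope
    def __call__(self, x):
        if self.w is None:  # build lazily, once the input size is known
            input_size = x.shape[-1]
            stddev = 1.0 / tf.sqrt(tf.cast(input_size, tf.float32))
            self.w = tf.Variable(
                tf.random.truncated_normal([input_size, self.output_size], stddev=stddev),
                name="w")
            self.b = tf.Variable(tf.zeros([self.output_size]), name="b")
        return x @ self.w + self.b


layer = Linear(output_size=4, name="linear")
y = layer(tf.ones([2, 3]))  # builds w ([3, 4]) and b ([4]) on first use
print([v.name for v in layer.trainable_variables])  # names carry the module's scope
```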
And we think these simple layers with minimal boilerplate will encourage forking and inheritance as a way of trying out variations.

Another thing I just wanted to highlight is that modules don't necessarily have to be layers, or even have state. They just inherit from tf.Module and get the name scoping from that. In Sonnet, we also implement optimizers as modules. They are simply nodes on a graph and benefit from the name scoping of tf.Module, which greatly helps with visualizing graphs and debugging operations later on. The forward method on optimizers isn't really a forward pass at all. It's simply an apply-updates method, which doesn't return anything. It just takes your gradients and your variables and applies them together.

And as promised, here's the example of using multiple forward methods. Making something like a variational autoencoder encapsulate both an encoder and a decoder means they can be trained together whilst exposing the forward methods on both, as a way of testing and using them later on.

So what have we found when implementing Sonnet in TF 2? We've been doing it for a while now, and we've found that we have much simpler, more understandable code than before. Debugging has been vastly improved and is less painful with eager mode. And we have initially seen some significant speed-ups with tf.function. I think this will be a hugely powerful tool going forward.

So when can you try this out? The road map for Sonnet 2 is on the way. tf.Module is already available in the alpha release in TF core. We're currently iterating on basic modules with researchers, with an alpha release on GitHub soon. And hopefully, we'll have a beta release sometime in the summer.

And now over to Malcolm. He's going to talk to you about some of the projects, some of which you may have heard of, which have used and been enabled by features of Sonnet 1 that we're taking forward into Sonnet 2.
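Before moving on to the projects, here is a small sketch of the two ideas Tamara describes above: an optimizer written as a module whose update method returns nothing, and a module with multiple forward methods (encode and decode) sharing one set of variables. This is written against plain tf.Module with hypothetical class and method names, not Sonnet's actual API.

```python
import tensorflow as tf


class SGD(tf.Module):
    """Toy optimizer-as-module: apply() mutates variables and returns nothing."""

    def __init__(self, learning_rate, name=None):
        super().__init__(name=name)
        self.learning_rate = learning_rate

    @tf.Module.with_name_scope
    def apply(self, gradients, variables):
        for grad, var in zip(gradients, variables):
            var.assign_sub(self.learning_rate * grad)


class AutoEncoder(tf.Module):
    """Two forward methods (encode / decode) sharing the same variables."""

    def __init__(self, latent_size, output_size, name=None):
        super().__init__(name=name)
        self.encoder = tf.keras.layers.Dense(latent_size)
        self.decoder = tf.keras.layers.Dense(output_size)

    @tf.Module.with_name_scope
    def encode(self, x):
        return tf.nn.relu(self.encoder(x))

    @tf.Module.with_name_scope
    def decode(self, z):
        return self.decoder(z)

    def __call__(self, x):
        return self.decode(self.encode(x))


model = AutoEncoder(latent_size=2, output_size=8, name="auto_encoder")
optimizer = SGD(learning_rate=0.01, name="sgd")

x = tf.random.normal([16, 8])
with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - x))
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply(grads, model.trainable_variables)  # in-place update, no return value
# After training, encode() and decode() remain usable independently.
```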
MALCOLM REYNOLDS: Thanks, Tamara. So here are just a few samples of what's been done over the past few years in Sonnet and TensorFlow 1.

So this is the generative query network. It's receiving 2D rendered observations at the top, and then it's able to infer what the 3D environment looks like and put the camera in any new position. And you can see here the different probabilistic views of what the whole environment could look like. This is a classic example of several papers' worth of research, which are all modules that then get built on top of. So the original paper was DRAW, back in the Lua Torch days. Then we expanded on that with convolutional DRAW and then the generative query network.

AlphaStar, something quite recent. And Sonnet was really key here because there are no assumptions on the shape of what's coming in and out of each module. Both the observations and actions for the StarCraft environment are extremely complex hierarchical structures. For the action space, you need an autoregressive model to pick whether you're going to move a unit and then where you're going to move the unit to. And the fact that Sonnet had been built with minimal assumptions about what goes in and out meant that this was actually quite easy. Another interesting aspect here is how the model is actually trained. The details are in a blog post on the DeepMind website, but the AlphaStar League is quite a complicated setup with lots of different agents. They train against frozen versions of themselves from the past and try to learn to exploit them, and then you make a new agent that tries to learn to exploit all the previous ones. And this kind of custom setup is a good reason why we don't just try to stick with one training style for everything, because even a year ago, I don't think we could have anticipated what kind of elaborate training scheme would be necessary to make this work.

And finally, BigGAN. I suspect a lot of people are quite familiar with this. But just in case you're not, this is not a real dog, not a real photo. This is randomly generated. And I think the key aspect of Sonnet here was that the underlying GAN architecture was maybe relatively similar to things that had already been in the literature, but there were a few key components which made it really a lot better. One of them was being able to train it on an entire TPU pod, and a key aspect there was cross-replica batch norm. So the researchers involved could just say, here's a normal GAN made out of normal modules, we need to hack the batch norm, completely replace it to do interesting TPU stuff, and then everything else just works as normal.

And obviously, we can't talk about BigGAN without briefly having a look at dogball. So here's a little bit of dogball. And that's probably enough.

So in conclusion, Sonnet is a library that's been designed in collaboration with DeepMind research scientists. We've been using it very happily for the last few years. We're looking forward to using it in the TensorFlow 2 version. It works for us and it might work for you. Thank you very much.

[APPLAUSE]

[MUSIC PLAYING]