TensorFlow AutoGraph (TensorFlow @ O'Reilly AI Conference, San Francisco '18)

  • ALEX PASSOS: My name is Alex.

  • And I'm here to tell you about something cool

  • that we've been working on in the TensorFlow team

  • for the past few months.

  • This is mostly work done by Google Brain in Cambridge.

  • And I'm here presenting because, well, I'm not in Cambridge.

  • So this project is called AutoGraph.

  • And what it is, is essentially, a Python compiler

  • that creates TensorFlow graphs for you.

  • So I'm sure, since you walked into this room,

  • you probably used TensorFlow before.

  • And you might have some idea about how it works.

  • And the way it works, under the hood,

  • is you build this graph to represent your computation.

  • And here, there is a multiply, an add, and a relu.

  • This is a very simple neural network.

  • And the nice thing about having a TensorFlow graph

  • is that once you have this graph,

  • you can deploy it to a server.

  • You can train it on a TPU.

  • You can run it in your browser with TensorFlow.js.

  • You can run it on a Raspberry Pi.

  • There's all sorts of things you can do with it.

  • We have compilers that take these graphs

  • and produce highly-optimized code

  • for all sorts of platforms.

  • The downside of this graph is that you end up

  • writing code like this, where you see five lines of code,

  • one of which has the actual computation that you

  • want to run.

  • And in case you're too far to read, it's the fourth one.

  • Everything else is boilerplate.
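
For reference, here is a minimal sketch of what five lines of graph-style TensorFlow 1.x code of that shape might look like; the names and shapes are assumptions, not the slide's actual code:

    import numpy as np
    import tensorflow as tf

    # Boilerplate: placeholders for the inputs.
    x = tf.placeholder(tf.float32, shape=[None, 10])
    w = tf.placeholder(tf.float32, shape=[10, 5])
    b = tf.placeholder(tf.float32, shape=[5])
    # The one line with the actual computation: multiply, add, relu.
    y = tf.nn.relu(tf.matmul(x, w) + b)
    # Boilerplate: open a session and feed values to run the graph.
    with tf.Session() as sess:
        print(sess.run(y, feed_dict={x: np.ones((2, 10)),
                                     w: np.ones((10, 5)),
                                     b: np.zeros(5)}))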

  • And some of this boilerplate is easy to deal with,

  • like these placeholders.

  • But some of the boilerplate is not.

  • For example, if you want a TensorFlow model that

  • has control flow inside the model,

  • you have to take your mental idea of what code

  • you want to write, which probably looks

  • like the example on the left, where you have a Python "if,"

  • and turn it into the code on the right, which is what

  • we need to generate a graph.

  • And the reason why we need this is

  • that our computation graph, because it can be put on a TPU,

  • and put on a phone, and run independently

  • of the code that produced this graph,

  • has to have information about both branches

  • of your conditional.

  • So we need to somehow tell TensorFlow everything that

  • could possibly happen in this program.

  • And so you end up with code that's

  • very redundant in this way.
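
As an illustrative guess at the two snippets on the slide (the variable names are assumptions), the contrast looks roughly like this:

    import tensorflow as tf

    x = tf.placeholder(tf.float32, shape=[])

    # Left: the code you would like to write (plain Python control flow).
    #     if x > 0:
    #         y = x * x
    #     else:
    #         y = -x

    # Right: the graph-style version, which has to describe both branches
    # up front so the graph can run without the Python that built it.
    y = tf.cond(x > 0, lambda: x * x, lambda: -x)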

  • And we have answers for this.

  • And we're working hard to make your life easier

  • if you want to use TensorFlow.

  • One project I've spent a lot of effort

  • personally on is eager execution.

  • And the idea of eager execution is

  • to make your life easier by saying,

  • you don't have to do graphs at all.

  • And you can take this idea fairly far.

  • But eventually, you hit some limits.

  • Right?

  • With eager execution enabled,

  • you have a very intuitive interface.

  • You just write Python code.

  • You use Python control flow, Python

  • data structures-- all the things you're already familiar with.

  • And it's nice.

  • But you don't get this automatic parallelism and distribution

  • and scale that Frank was telling you

  • about with all the TPU pods, because you still

  • have to run Python code.

  • And processors aren't getting faster.

  • And Python is inherently single-threaded.

  • We have a few ways of getting around this.

  • Like, from eager execution, we have

  • defun-- this primitive that lets you build little graphlets

  • and run them.

  • So within that graphlet, you have all the performance

  • of graph mode.

  • But you still can have eager execution in your outer loop.

  • We can also have py_func, which lets

  • you have full graph code that can run on clusters

  • of multiple GPUs, but with eager code

  • in the middle that's doing some complicated control flow stuff.

  • And these tools are useful.

  • Both defun and py_func are differentiable.

  • They run on all sorts of devices.

  • They give you a lot of flexibility.
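
A rough sketch of those two escape hatches as they existed in TF 1.x (the function bodies are made-up stand-ins, not examples from the talk):

    import numpy as np
    import tensorflow as tf
    import tensorflow.contrib.eager as tfe

    tf.enable_eager_execution()

    # defun compiles this function into a small graph ("graphlet"); the eager
    # outer loop calls into it and gets graph-mode performance inside.
    @tfe.defun
    def dense_relu(x, w):
        return tf.nn.relu(tf.matmul(x, w))

    x = tf.constant(np.random.randn(4, 8).astype(np.float32))
    w = tf.constant(np.random.randn(8, 2).astype(np.float32))
    print(dense_relu(x, w))

    # tf.py_func goes the other way: it wraps arbitrary Python logic as an op,
    # so graph code can call into Python (with eager on, it simply runs).
    print(tf.py_func(lambda a: a * 2, [tf.constant([1.0, 2.0])], tf.float32))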

  • But it's a little unsatisfying that you have

  • these seams in your program.

  • And you need to think about, what's going to be a graph?

  • What's not going to be a graph?

  • What if we could do better than this?

  • And AutoGraph is this tool that we've been working on,

  • in the Google Brain Cambridge office, that

  • tries to do better than this.

  • And what it lets you do is write eager-style Python code

  • and have a compiler turn this into the boring, very

  • complicated, graph-style code.

  • Because if you think with me for a second,

  • if you look at the transformations that

  • need to happen to turn the code on the left

  • into the code on the right, that looks like something

  • that can be automated.

  • And AutoGraph is, indeed, the thing that

  • automates this transformation.

  • It's pretty easy to use now.

  • It's in tf.contrib.

  • And when 2.0 comes out, it will be in Core.

  • You just import AutoGraph from contrib.

  • And you decorate your function with autograph.convert.

  • Then, when you call it, under the hood,

  • AutoGraph will generate and run this more complicated code

  • for you.
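
Concretely, the usage looks something like this; square_if_positive is an illustrative function, not one from the talk:

    import tensorflow as tf
    from tensorflow.contrib import autograph

    @autograph.convert()
    def square_if_positive(x):
        # Ordinary Python; AutoGraph rewrites the "if" into tf.cond when the
        # function is called with a tensor at graph-construction time.
        if x > 0:
            x = x * x
        else:
            x = 0.0
        return x

    with tf.Graph().as_default():
        result = square_if_positive(tf.constant(4.0))
        with tf.Session() as sess:
            print(sess.run(result))  # 16.0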

  • And these are not academic concerns.

  • There's a lot of code out there that has control flow in it.

  • So for example, the implementation

  • of dynamic RNN in TensorFlow is very hard

  • to read if you're not familiar with how TensorFlow works.

  • But I can write one, here, in the slide.

  • And it's pretty easy.

  • If you squint, you have a "for" loop

  • that loops over my sequential data, applies an RNN cell,

  • and has some logic in it to mask the outputs,

  • because the different sequences in the minibatch

  • can have different lengths.

  • But this is just 11 lines of code.

  • There's no magic here.

  • But if you run it through AutoGraph,

  • you get the actual code that you see inside tf.dynamic_rnn,

  • which is substantially harder to read.
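
A hedged reconstruction of the kind of loop the slide showed, simplified to assume a statically known number of time steps; the slide's actual code (which was then fed through AutoGraph) differs in detail:

    import tensorflow as tf

    def simple_dynamic_rnn(cell, input_data, initial_state, sequence_lengths):
        # [batch, time, features] -> [time, batch, features]
        input_data = tf.transpose(input_data, [1, 0, 2])
        state = initial_state
        outputs = []
        for i in range(int(input_data.shape[0])):
            output, state = cell(input_data[i], state)
            # Zero out outputs past each sequence's true length.
            mask = tf.cast(i < sequence_lengths, tf.float32)
            outputs.append(output * tf.expand_dims(mask, axis=1))
        return tf.stack(outputs), state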

  • And not just RNNs--

  • this RNN is a basic building block

  • that we use in all sorts of sequence-to-sequence models,

  • when we're generating text, when we're generating images,

  • when we're turning data into other data.

  • And this is just one example of the kind

  • of control-flow-rich program that you often want to write,

  • but that can be more painful than it needs to be

  • to write in normal graph TensorFlow.

  • So what AutoGraph supports is a subset of the Python language

  • that I hope covers what you most need

  • to run in a TensorFlow program.

  • So we can do print.

  • And I don't know how many of you have ever

  • failed to use tf.print, because I do about once a week.

  • It's really hard, right?

  • Also tf.assert-- it's very easy to have your assertion

  • silently dropped in the void.

  • AutoGraph is also composable.

  • So you can have arbitrary Python code,

  • with classes and functions and complicated call trees.

  • And it will turn the whole thing into a graph for you.

  • You can have nested control flow, with breaks and continues

  • and all sorts of stuff.

  • And we can make that work.

  • And we're still working on it to make it better.
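
For instance, something in this spirit converts cleanly, including the loop over a tensor and the continue; sum_even is an illustrative function, not necessarily the one from the talk:

    import tensorflow as tf
    from tensorflow.contrib import autograph

    @autograph.convert()
    def sum_even(items):
        s = 0
        for c in items:          # becomes a graph loop when items is a tensor
            if c % 2 > 0:
                continue         # desugared for you
            s += c
        return s

    with tf.Graph().as_default(), tf.Session() as sess:
        print(sess.run(sum_even(tf.constant([10, 12, 15, 20]))))  # 42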

  • So another nice example, I think,

  • to visualize what AutoGraph is actually doing, is using what

  • has been unanimously decided to be the best interview

  • question in the history of humanity, which is FizzBuzz--

  • this thing where you loop over the numbers.

  • And you print "Fizz" for some of them, "Buzz" for some of them,

  • "FizzBuzz" for some of them.

  • And otherwise, you just increment your counter

  • and print the number.

  • And you can see this is like 10 lines of Python code,

  • pretty straightforward.

  • But you should try to turn it into TensorFlow code.

  • And we ran it through AutoGraph, because I'm too

  • lazy to write this code myself.

  • It's not pretty.

  • And if you think this is it, this is not.

  • This is it [LAUGHS].
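
A plain-Python FizzBuzz in the spirit of the slide (details assumed), plus the call that prints the generated graph-style code:

    from tensorflow.contrib import autograph

    def fizzbuzz(num):
        counter = 0
        for i in range(1, num + 1):
            if i % 3 == 0 and i % 5 == 0:
                print('FizzBuzz')
            elif i % 3 == 0:
                print('Fizz')
            elif i % 5 == 0:
                print('Buzz')
            else:
                counter += 1
                print(i)
        return counter

    # Dump the "not pretty" graph-building code AutoGraph generates.
    print(autograph.to_code(fizzbuzz))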

  • So this all looks nice.

  • But I think you would be remiss to just believe me without

  • understanding how this works.

  • So I want to spend a few minutes and tell you

  • what we're actually doing to make this possible.

  • And there's many ways of conceptualizing this.

  • But I think, really, the core of AutoGraph

  • is we're extending the notion of what's possible with operator

  • overloading.

  • And I don't know if you're familiar with operator

  • overloading.

  • It's a feature in Python that lets you customize

  • a little bit of the language.

  • And we use this heavily in TensorFlow.

  • So for example, when you write c = a * b in TensorFlow,

  • we don't actually run the Python multiplication operator.

  • What actually runs is the thing on the right,

  • c = tf.multiply(a, b).

  • And the way we turn one-- the code on the left

  • into the code on the right is by having this Tensor class,

  • where a and b are tensors.

  • This Tensor class defines the __mul__ method; when Python

  • sees the * operator, it rewrites it into a call

  • to the multiply method.

  • And then, we can use this multiply method

  • to make whatever code we want run when you type the star.

  • And this is pretty straightforward

  • and shouldn't be too big of a surprise.
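
A minimal, library-free sketch of that mechanism; this toy class is not TensorFlow's actual Tensor:

    class Tensor:
        """Toy stand-in for a graph tensor."""
        def __init__(self, op):
            self.op = op

        def __mul__(self, other):
            # Python rewrites `a * b` into `a.__mul__(b)`, so this method can
            # emit a "multiply" graph node instead of multiplying numbers.
            return Tensor(('multiply', self.op, other.op))

    a, b = Tensor('a'), Tensor('b')
    c = a * b
    print(c.op)  # ('multiply', 'a', 'b')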

  • But what Python doesn't let you do is this.

  • There's no way in Python to override

  • the behavior of the "if" operator

  • to run some code you want to run.

  • So ideally, if Python let us override __if__,

  • we would be able to make all sorts of graph rewrites that

  • take control flow into consideration.

  • Sadly, we can't.

  • So what we're really trying to do

  • is override the syntax that Python

  • doesn't let us override.

  • And the way we do this is we read the Python abstract syntax

  • tree of your code.

  • We process it using Python's ast module.

  • And we just run a bunch of standard compiler passes.

  • We rewrite loops so that they have a single exit.

  • We capture all the variables that you assign

  • in loops and conditionals.

  • We unify the variables that you mutate in either branch

  • of a conditional.

  • We do all these boring transformations

  • that need to happen, so that we can take the code that you want

  • to write, that has advanced control flow,

  • into the code that TensorFlow wants you to write.
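
As a toy illustration of that pipeline (not AutoGraph's actual passes), Python's ast module exposes the syntax tree that this kind of rewriting works on:

    import ast
    import textwrap

    src = textwrap.dedent("""
        def f(x):
            if x > 0:
                x = x * x
            else:
                x = 0.0
            return x
    """)
    tree = ast.parse(src)
    # The If node is what an AutoGraph-style pass would rewrite into a call
    # that builds tf.cond.
    if_node = tree.body[0].body[0]
    print(ast.dump(if_node))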

  • And if you've been following me so far,

  • you might have a question in your head, which

  • is this: you've written TensorFlow programs

  • before AutoGraph existed.

  • You look at your code.

  • It's full of "ifs," and "while" loops, and "for" loops,

  • and things like that.

  • And you probably don't want any of those things

  • to end up in a graph, because a lot of this

  • is just configuration.

  • You're choosing what optimizer to use.

  • Or you're choosing how many layers your network has.

  • And these things are very easy to express in loops.

  • But you might think that if we rewrite

  • every loop and conditional in your code

  • to be a TensorFlow loop or conditional,

  • you're going to end up with all those things in your graph.

  • And you don't.

  • And the way you don't is pretty clever,

  • which is that we use Python's dynamism

  • to help us instead of hurt us.

  • So in our operator-overloading logic

  • for __if__ and __while__,

  • we don't directly call tf.cond.

  • We call something that, in spirit, looks

  • like this function, where if you pass it a tensor,

  • it runs tf.cond.

  • But if you pass it something that's not a tensor,

  • it runs a normal Python conditional.
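
A minimal sketch of that "in spirit" function; the name if_stmt is an assumption, and AutoGraph's real implementation is more involved:

    import tensorflow as tf

    def if_stmt(cond, if_true, if_false):
        if isinstance(cond, tf.Tensor):
            # Tensor condition: stage both branches into the graph.
            return tf.cond(cond, if_true, if_false)
        # Plain Python condition: behave exactly like a normal "if".
        return if_true() if cond else if_false()

    # A Python bool takes the ordinary path at graph-build time.
    print(if_stmt(3 > 2, lambda: 'yes', lambda: 'no'))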

  • So if you have code that's not TensorFlow code,

  • and you run it through AutoGraph,

  • we should preserve its meaning, preserve its semantics.

  • So it should be safe.

  • And you can trust that it will only

  • change the behavior of code that you want the behavior to change

  • because it's doing something that you're not allowed to do,

  • like have an "if" that depends on the value of a tensor

  • at graph build time.

  • And to get there, we had to do a lot of work, and a lot of work

  • that, as I was saying, is the same work

  • that any compiler has to do.

  • We figure out what variables we're using.

  • We rewrite things in static single assignment (SSA) form,

  • so that we can handle things like breaks,

  • and continues, and breaks inside "ifs," and functions

  • with multiple return points.

  • Because the core TensorFlow syntax doesn't let

  • you have those things.

  • But thankfully, that is just syntactic sugar.

  • You can remove all these features by doing

  • simple transformations on the code that

  • do not affect its meaning.

  • And in AutoGraph, we do those transformations for you.

  • So you can really write the code that you want to write.

  • And we'll try to run it as best as we can.

  • So we do break and continue, inline "if" expressions.

  • We're working on list comprehensions.

  • We have basic support for "for" loops.

  • We let you do multiple return statements.

  • We also de-sugar "for" loops into "while" loops,

  • because there's no "for" loop equivalent

  • of tf.while_loop in TensorFlow.

  • But AutoGraph effectively implements one for you.
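
As one hedged illustration of the kind of lowering involved, a loop with a break can be rewritten into a single-exit while loop of the shape that maps onto a graph loop:

    # Before: early exit via break.
    def first_negative(items):
        result = 0
        for x in items:
            if x < 0:
                result = x
                break
        return result

    # After: same meaning, but a single exit and no break.
    def first_negative_lowered(items):
        result, i, done = 0, 0, False
        while i < len(items) and not done:
            if items[i] < 0:
                result, done = items[i], True
            i += 1
        return result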

  • And a good way to think about what we're doing here

  • is that we're adding another stage of processing

  • in your computation.

  • Right now, if you're using TensorFlow from graphs,

  • you write graph-style code in Python.

  • You give this to the TensorFlow runtime to execute it.

  • What AutoGraph lets you do is write

  • eager-style code in Python, which is imperative,

  • which has no control dependencies,

  • and all these things that you don't want to think about.

  • And AutoGraph will go turn this eager-style code

  • into Graph-style code for you and then hand this

  • over to TensorFlow runtime.

  • And again, if you're at all skeptical at this point--

  • you're like, well, I've used TensorFlow before.

  • I've seen what the error messages look like.

  • If you're adding another layer of computation,

  • is that going to become even harder than it already

  • is, to debug what's going on?

  • And the good news is that it doesn't have to.

  • So if you think about it, we have these three stages

  • of your computation now.

  • Stage 0 is AutoGraph processing your code.

  • Then stage 1 is, your code is being run to generate a graph.

  • And stage 2 is, TensorFlow is executing your code

  • to actually do the thing you want.

  • If you get an error in stage 0, when AutoGraph is processing

  • your code, you'll just get a stack

  • trace that points to AutoGraph.

  • You file a bug against us on GitHub.

  • And we'll fix it.

  • If you get an error during stage 1--

  • that is, while running the code that AutoGraph generated

  • from the code you wrote, code you don't know how to read

  • because it's full of that extra boilerplate that we didn't

  • want to deal with in the first place--

  • we actually

  • rewrite the Python stack trace to point,

  • not to the generated code line that has an error,

  • but to the line of code that you actually wrote.

  • So that you can connect your actions with their consequences

  • instead of having this black box to deal with.

  • And if you've used TensorFlow, and you've

  • seen the errors at runtime--

  • we already show you two stack traces,

  • one for when the graph is built and another one for when

  • the session.run was called.

  • And in principle, we can also rewrite the stack traces here,

  • so that the graph-build stack trace shows

  • the code you wrote, not the code AutoGraph generated.

  • And we're working on making this happen.

  • So what's in the future of AutoGraph?

  • We have a public beta.

  • It's in tf.contrib.autograph.

  • We encourage you to try it.

  • It's ready to use.

  • If you find a bug-- and I hope you don't-- but if you find

  • a bug, file something on GitHub.

  • Work with us to get a fix.

  • You might have seen that we are starting to announce

  • a lot of our plans for TF 2.0.

  • And one of our big things in TF 2.0

  • is that eager execution is going to be enabled by default.

  • And AutoGraph will be a key part of making it easier

  • to cross the bridge between eager execution

  • and graph deployability.

  • So we would love to get this tested as much as we possibly

  • can before TF 2.0 comes.

  • Meanwhile, we're working on improving that error handling,

  • but also enhancing our coverage of the Python language,

  • adding more and more and more operations

  • to what AutoGraph supports, so that you

  • have to think less and less and less

  • about how your code is going to be turned into a graph

  • and more about, what do you actually want your code

  • to do in the first place?

  • We also want to factor out the source code transformation bit

  • into its own library, so that if you have some Python project

  • where you want to override __if__,

  • you will be able to reuse our code to do that.

  • Because it's a pretty neat, self-contained little thing

  • that I think can be useful broadly,

  • beyond the TensorFlow universe.

  • So thank you for listening to me.

  • But I strongly encourage you to go open your laptops

  • and do this.

  • Google "autograph colab," and click on the first result.

  • Or open this link here.

  • This will point you to a notebook

  • that's hosted on Google infrastructure.

  • So you can use our GPUs for free.

  • And it has lots of little examples-- like,

  • this is code before AutoGraph.

  • This is code after AutoGraph.

  • This is what an error looks like.

  • These are things we can do.

  • And I think playing around with this can really help give you

  • an idea that's a lot more concrete than what

  • I've been talking about.

  • What are the actual capabilities of this technology?

  • And I'm sure if you try this, you'll have a blast.

  • Thank you.

  • Oh, also, Frank and I are going to be at the TensorFlow

  • booth in the sponsor area if you have any questions, even

  • if you don't want to ask them now.

  • OK.

  • Thank you.

  • [APPLAUSE]
