DAN MOLDOVAN: Thank you all for coming. My name is Dan Moldovan. And today I will talk about some of the internals and functionality of AutoGraph. Now, this is definitely not an introductory talk. And if you would like to learn more about the background or the motivation behind AutoGraph, here are a few resources that I believe can help. The talk will otherwise be fairly fast paced and quite dense. I'm hoping we'll be able to get through all of it in time. But if not, I'm hoping the slides will serve as a good reference, should you decide to come back and look at everything more closely. I should caution, though, that I am oversimplifying a lot of things for the sake of brevity and time. But the essential things are in there.

The talk will be structured in roughly three parts. First, I'll talk about some of the more relevant implementation details, which are useful for understanding some of AutoGraph's behavior. Then I'll describe the various ways in which you can interact with it. And lastly, I'll go through various use cases that highlight what works, what doesn't work, common pitfalls, how to stay away from them, and what our plans are to eventually address them.

So let's begin with the implementation. From a systems perspective, this is roughly what AutoGraph looks like. In broad strokes, we have the following. Going from the bottom to the top, we have an infrastructure for performing source code transformations, with various helpers. And on top of that, we have individual transformations. For instance, there is a separate transformation that handles function calls. Another one handles break statements. And yet another transformation handles if statements. And these transformations are independent and composable. Many of these transformations then replace your code with calls to special AutoGraph functions. We call them overloads or operators. The reason is that they are similar to Python's overloaded operators. Now, among those overloads, the most interesting ones are the ones that specialize in creating TensorFlow ops. And lastly, there's a high-level API that glues them all together. And this is typically what you interact with as a user. One last note I should make is that of all these pieces, only the TensorFlow-specialized overloads and perhaps the high-level APIs are specific to TensorFlow. Everything else is fairly generic and reusable, and we hope to eventually have it in a separate library that can be used for other purposes as well.

So one of the fundamental pieces of AutoGraph is, of course, the source code transformation bit. So let's look at that a bit more closely. Source code transformation is essentially what makes AutoGraph a transpiler. Its unit of work is functions. That is, at runtime, a function is converted into a new function. So let's look more closely at that process. It is, loosely speaking, a five-step process. The first step is to obtain the source code of the function. Now, the standard Python library makes that easy for us. It provides the inspect module, which is built in and lets us do that. This also highlights one of the fundamental requirements of AutoGraph. In order to convert a function, that function must expose its source code. And that's typically true for almost all functions in Python, although there are certain exceptions. Normally, you can test this on your function by calling inspect.getsource.
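For example, a minimal check along these lines (the helper name is mine, not an AutoGraph API) could look like this:

    import inspect

    def has_visible_source(fn):
        # Purely illustrative: if getsource raises -- e.g. for builtins, or for
        # code defined interactively without a backing source file -- then
        # AutoGraph will not be able to convert fn.
        try:
            inspect.getsource(fn)
            return True
        except (OSError, TypeError):
            return False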
If inspect.getsource returns data, then AutoGraph should be fine with it.

The second step in this process is to parse the code into an AST. And once more, there is a standard Python API for this, which is good. We, in fact, use a thin layer on top of that. It's a third-party library called gast. It's practically identical to ast, but it handles all the version differences between Python 2 and Python 3. It's worth mentioning at this point that AutoGraph operates entirely at the AST level. There is no lower-level intermediate representation. And we never interact with the bytecode. And that has some unique advantages.

Now, the third step does the bulk of the work, both literally and figuratively. The standard Python library offers a mechanism that helps us with that as well. The ast module provides a mechanism for visiting and transforming ASTs. That mechanism uses the visitor pattern, and it's sketched here. Basically you get callbacks whenever the visitor encounters different types of nodes. And on top of that, we have built an entire library of such transformations, as we've seen in the previous diagram. These transformations are called in sequence.

Now, once transformed, the AST is unparsed back into source code in the form of a string. There isn't a standard library for doing that. But thankfully there's a third-party library called astor, which does a decent job of it. Essentially, it's lots of string concatenations. There's nothing special about that. Finally, the source code is written out to a file and then loaded using a mechanism that's identical to writing an import statement. Once more, Python helps us with that, with the standard module called imp. The special thing about imp is that it only works with files on disk, hence the need to generate a temporary file. I should also make a slight note that another mechanism we could have used would be exec. And we've been going back and forth between using that and imp. There are pros and cons to each. So we might revisit this in the future.

A few other mechanisms are worth mentioning. One of them is the templating system that we developed to help us with generating code. It essentially lets us write templates in the form of strings, code blocks as strings. They support placeholders, and they let us generate more complex or new ASTs. If you ever poke inside the transformations library, you will see plenty of such templates. Another important piece is the static analysis, which is critical in supporting certain transformations, and we'll see more about that in a bit. The analysis itself can be viewed as just a simple walk over the AST. And it annotates nodes with relevant information. Another important mechanism is caching. Caching itself is completely transparent to the user. But it does help us make sure that every function is converted no more than once, loosely speaking. This cache relies on the key assumption that the conversion process is entirely static. That is, what ends up in the generated code does not depend on any arguments or variables or any other Python state. Basically, if you look at some plain code on paper, you would know exactly what the output code should be.
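Putting the five steps together, here is a rough, standard-library-only sketch of the shape of that pipeline. This is not how AutoGraph itself is implemented -- it uses gast, code templates, astor, and a temporary file loaded with imp -- but the overall flow is similar:

    import ast
    import inspect
    import textwrap

    def convert(fn):
        """Toy pipeline for a plain, undecorated function."""
        # 1. Obtain the source code.
        source = textwrap.dedent(inspect.getsource(fn))
        # 2. Parse it into an AST.
        tree = ast.parse(source)

        # 3. Transform the AST. A do-nothing visitor stands in for
        #    AutoGraph's library of transformations, applied in sequence.
        class NoopTransform(ast.NodeTransformer):
            pass

        tree = NoopTransform().visit(tree)
        ast.fix_missing_locations(tree)

        # 4. Unparse back into source code (Python 3.9+).
        new_source = ast.unparse(tree)

        # 5. Load the generated code and return the new function.
        namespace = {}
        exec(compile(new_source, '<generated>', 'exec'), namespace)
        return namespace[fn.__name__]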
Next, let's talk about some of the actual transformations that are being made. And before I proceed, I should clarify that I'll use the word variable a lot. These are meant to be Python variables, not to be confused with TensorFlow variables, which are not involved here.

One such transformation is simplifying the code by replacing some statements with other, simpler statements. For instance, we replace break statements with variables and additional if statements. This essentially helps us avoid the need to build special handlers in TensorFlow for statements like break. They're just easier to lower into something simpler. And the process-- yes?

AUDIENCE: Does that imply that if I do "while n less than a million: ... break", that's going to be converted very inefficiently, because it will still loop over the maximal range?

DAN MOLDOVAN: It will not loop over the maximal range, because the while statement, as seen in this example, will have its condition augmented. So yes, the overhead is one, maybe two, extra conditionals, not more than that. Yes?

AUDIENCE: Are you guaranteed [INAUDIBLE] variable names?

DAN MOLDOVAN: We have small modules. It's not mentioned here [INAUDIBLE]. We look at the function's globals, its closure. And those indeed depend on the context variables. So if you take a function, you convert it, and you get a certain name. And then suppose you create some other function. And then you run the converted function. It might clash. That's very unlikely. Because if you change the code that we transformed, then the function will be re-converted, right? So I'm not sure it's even possible to get a-- well, you could get a clash in theory, but you would have to work very hard to do that. But, yeah, that's a very good observation. That is one of the deviations from the conversion being entirely static. There are some minor exceptions.

So going back to the lowering, I'm not going to describe the entire process because it's fairly straightforward. I think an example will suffice. For instance, here, notice that the break statement was replaced with a did_break variable. And then we have a few extra conditionals, like, for instance, this one at the bottom, "if not did_break: i *= 2", to protect the code. The conversions for continue and return statements are similar.

Another important type of conversion is for function calls. We do this for all function calls. We replace them with a call to a special wrapper. This wrapper, as its name suggests, might decide to convert the target function at runtime. But it may decide not to do that. Many functions don't need to be converted. But the important part is that we replace all function calls, because statically we do not know what their type is, and we do not know whether we should convert them or not. So we defer that decision to runtime. Another mention that's probably worth making here is that from the graph's perspective, functions are inlined. We don't create any tf.functions. We don't create any graph functions. So from this perspective, AutoGraph is consistent with V1-style graph code.

AUDIENCE: What do you mean by runtime? Do you mean when TensorFlow runs it or when the Python user runs it?

DAN MOLDOVAN: That's a very good question. There's more than one runtime. In this case, I'm referring to the Python runtime.

Next, the quintessential transformation, we might say, is converting control flow. So let's look at if statements first. The transformation itself is actually quite mechanical. For example, the body of the if statement becomes a new function, and we add some return values to it. And we'll see more about that in a moment. And the if itself becomes a call to a special AutoGraph overload.
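As a rough illustration, this is the shape of that lowering. The if_stmt helper below is a stand-in I wrote, not the actual generated code, and it glosses over how modified variables are threaded through:

    import tensorflow as tf

    def if_stmt(cond, body, orelse):
        # Dispatch at runtime: build a tf.cond for tensor conditions,
        # run plain Python otherwise.
        if tf.is_tensor(cond):
            return tf.cond(cond, body, orelse)
        return body() if cond else orelse()

    # Original:
    #   if x > 0:
    #       x = x * x
    #
    # Converted, roughly:
    def f(x):
        def if_body():
            return x * x

        def else_body():
            # The original has no else, so this just passes x through.
            return x

        x = if_stmt(x > 0, if_body, else_body)
        return x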
On the slide, I'm, of course, omitting the else block for the sake of brevity. But it's handled equivalently to the main body of the if statement. Once more, all the if statements are converted in this way. And here we have an example for an if statement. Note that there's nothing out of the ordinary here. The body of the if becomes the body of a new function. And the if statement is replaced with a function call.

Now, loops are ever so slightly more complicated because they use state variables, but not by much. Once more, the body of the loop is transformed into a new function. This time the function has arguments representing the loop variables. The conditional also becomes a function this time, because it depends on the loop variables. And once more, the statement itself is replaced with a function call. Now, the more interesting question is, how do we decide which of the variables in your program are loop variables? We could of course take all the variables in scope and make them loop variables. But that would be quite inefficient. The heuristic we use to do that is actually quite simple. It relies on static analysis. And in short, a loop variable must satisfy both of these two conditions. First, it has to be modified inside the loop, which is quite evident. If the loop doesn't modify it, then it's invariant to it. And the second condition is that it has to be either live into or out of the loop. Now, what live means is-- live in means that the loop depends on the value of the variable before it entered the loop.

AUDIENCE: It seems like if something is a loop condition, it wouldn't have to be live into or out of the loop to be a loop variable. So is it either of these conditions?

DAN MOLDOVAN: If it's a loop condition, then the variable would be live into the loop, because it's read before anything else. I'll show an example that hopefully clarifies that a bit. Live out of the loop is similar. If the variable is used after the loop, then it's live out.

So here we have an example. To Josh's remark, a is both modified by the loop but also live into the loop, because once you enter the loop, the first thing that happens is a is read. So the value of a before the loop is definitely relevant. If a starts positive, then the loop will cycle. If a is 0, then it will not. So a is live into the loop. b, on the other hand, is not modified by the loop. So we can leave that one out. And c is also interesting, because it is modified inside the loop, but it is not live. That's because as soon as you enter the loop, the c variable is immediately overwritten. So the fact that it had the value 3 before the loop is completely irrelevant to the loop, because that value is destroyed regardless. And this is a sketch of the resulting code. I'll leave it as an exercise to verify that it is indeed correct.

Next, as I mentioned, the conversion process is entirely static. And all the statements are being converted. All the function calls are being converted. And that means that the overloads must handle any type or value verification at runtime, once more, at the Python runtime. Yes?

AUDIENCE: So when you say modified in the loop, I guess it's not only things that are assigned. A function within the loop can also modify variables, is that true?

DAN MOLDOVAN: That's a great observation. And I will talk about that in a bit more detail.
In order for AutoGraph to correctly convert code-- this is only when it transforms the loop into a TensorFlow loop-- the side effects, such as these modifications, have to be visible. So if you build a function that hides that modification, then AutoGraph will not detect it. That's an excellent observation. I'll get to a specific example of that in a moment.

So as I was mentioning, there's the dynamic dispatch that's handled by all operators. An interesting observation here is that if we were to convert pure Python code in this way with AutoGraph, it would become quite a bit slower, because every if statement will do an isinstance or some type of [INAUDIBLE]. So you can imagine it would be much slower than normal Python code. However, in the case of TensorFlow, for our purposes, this overhead is peanuts compared to the cost of creating ops. So it doesn't really bother us in the case of building graphs.

So far, when describing the process, I ignored an important piece of Python, and that is variable scoping. So let's look at an example of that issue with a simple conditional, which just increments a value in an if statement. Now, naively copying this block inside a function won't work, because due to Python's scoping rules, x would become a variable local to the if_true function. So any modification that you make to it would be lost to the Python runtime. In fact, you actually get an error here, because inside if_true, x is a local variable, and by incrementing it, you're trying to access an undefined variable. So the way we solved that was by renaming the variables inside the function body. And essentially what we're after is avoiding modifying the x variable directly, because that's what causes Python to consider it a variable local to the function.

Now, a quick note on mutating variables. What I just showed was valid for simple variables, like x = 1, and so on and so forth. Mutating them, like this statement, where we say x.a = something, is fine, because that will not cause x or x.a or anything to become local to the function. So x still points to the correct object, and we're safe. So mutating objects, in this case, doesn't bother us. But--

AUDIENCE: But with the x.a example in there, you have the problem that, because we have to execute both branches of the function, we're going to unconditionally mutate x, right?

DAN MOLDOVAN: That's exactly so, yes. And that's exactly what I'm going into, with a bit more detail on these mutations. And, as Alex alluded, you have to be careful about the effects of tracing when running TensorFlow statements. By "you" I mean we, AutoGraph. So we have to handle that case.

So one of the complications with mutation is probably best explained with this simple example. Suppose we have a method that mutates its object. It makes some changes in a loop to one of its properties, nothing out of the ordinary. With a naive transformation of this loop, not doing anything special is fine. It is correct if that loop is executed as a Python loop. And the effective code that executes looks kind of like this. This is not what's generated. That while loop at the bottom is, in fact, an AutoGraph overload. But the effective code that runs is this. So we have a loop body function and a loop cond function. And there's the while loop overload, which calls them, as you might expect. All this works fine if it's a Python loop. However, it no longer works fine if it's a TensorFlow loop. Why is that?
Well, I'll leave it as an exercise to think about what the value of self.a is as this statement executes and after this tf.while_loop runs, and also what happens to the tf.while_loop at runtime. But I'll just go to a possible solution for this. And that is to create a temporary-- a special loop variable corresponding to self.a itself. And if we do this with a bit of careful handling, so that self.a points to the correct value both inside the body and inside the cond and after we have executed the while loop-- if we do this, the tf.while_loop will execute correctly, and you will get the result that you would expect.

The problem with that is that it breaks Python semantics. And that's something we do not want to happen. To show that, let's consider that a is not a trivial property, but has a custom setter that you defined in your code. And let's consider that that custom setter has some side effects. For instance, it prints some messages. Now, with this equivalent code, what would happen when it ran as a Python loop? Well, there are too many assignments to self.a, and that will trigger the side effects in your custom property. So you will see way too many prints, in this case. And that's definitely something we do not want to happen, because we want to preserve Python semantics. I'm sorry. There was a question?

AUDIENCE: It seems like this transformation is just wrong, though. Shouldn't all the self.a's be replaced with self_a's?

DAN MOLDOVAN: That's a very good question. If we did go ahead and replace all those self.a's with self_a's, then if you called any method-- suppose you called a method that itself, behind the scenes, did some more modifications to self.a-- then that method would not capture the value of self.a, right? So we have to make sure to put the proper values inside self.a, because some other code might need it. So the way we solve this problem is to put these external modifications into separate functions. We call them get_state and set_state because, in a way, they capture the state of the Python execution at runtime during tracing. What these allow us to do is make this kind of modification in the case of tf.while_loops. And then in the case when we run regular Python loops, we just don't call these methods. So both paths are happy now.

Next, if there are no additional questions-- I know this is definitely one of the trickier parts of AutoGraph itself. But if there are no questions, let's go to the second part and talk about the APIs that users can interact with. First and foremost, the absolute recommendation is that if you can use tf.function, just use that. It has certain advantages. It can add automatic control dependencies. There are some types of APIs that only work inside tf.function. It also caches the graph, among other things. And it has additional smarts to handle exceptions. So tf.function is definitely recommended. But if you really, really can't use tf.function, you can call AutoGraph directly. There's a specialized API. But keep in mind that it does not add automatic control dependencies. And it's also less user friendly. Other APIs that you can use to tweak things include the do_not_convert annotation, which has the obvious effect. Although, even in that case, it's still preferable to use tf.function with autograph=False. Also, if you'd like to have a look at the generated code, there's an API for that.
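In TF 2.x, the entry points mentioned here roughly map to the following (square_if_positive and helper are just illustrative functions):

    import tensorflow as tf

    def square_if_positive(x):
        if x > 0:
            x = x * x
        return x

    # Recommended: wrap in tf.function; AutoGraph is applied automatically.
    fast = tf.function(square_if_positive)

    # Opting out of AutoGraph while keeping tf.function:
    no_autograph = tf.function(square_if_positive, autograph=False)

    # Excluding a single function from conversion:
    @tf.autograph.experimental.do_not_convert
    def helper(x):
        return x + 1

    # Peeking at the generated code:
    print(tf.autograph.to_code(square_if_positive))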
And as we experiment with new transformations, there is a feature enumeration that you can use to enable them, to enable transformations that are not stable enough to release into production.

Now, a few words on debugging. By far, the best way to debug code is to run it Eagerly. And there is a function in TensorFlow that helps you do that. Or you can just remove the tf.function decorator. But this one lets you do it without changes to the code. And that causes tf.functions to run Eagerly. And you can use PDB and everything else inside there. If running Eagerly is not an option, then one way to peek inside what AutoGraph is doing is to crank up the verbosity. There's an API for that. Increasing the verbosity is also useful when filing bugs. There is a caution I should make. Of course, increasing the verbosity can cause quite a bit of log spam. But it will also dump data in addition to code. It will log function arguments and things like that. So please be careful when sharing verbose logs.

Now, if you do drop into PDB inside AutoGraph code using, for instance, pdb.set_trace, that will not crash. It will work in some way. Just remember that set_trace is a function and, like every other function call, will be routed through an AutoGraph overload. And what that means is that PDB will land somewhere inside the AutoGraph API. At the time of this talk, you have to step out twice to get back into the generated code. And the other caveat is that, of course, you will land inside generated code.

Now, a note on this generated code. It definitely contains lots of boilerplate. It is designed to be robust, not pretty. And ideally, in a perfect world, you should never have to deal with it. You should never have to see that code. And we're working towards achieving that. But until then, if you do end up in a situation where you have to deal with generated code, whether you see it by accident or actually have to work with it, please file a bug so that we can work towards avoiding that kind of exposure.

In the next section, I want to mention some of the semantics related to AutoGraph, because these dictate what you should and should not expect of it. Now, rather than a detailed explanation, I'll just list some broad guiding principles. And the first such principle is that we intend AutoGraph to preserve the semantics of existing well-behaved code. By well-behaved I mean, in general, that it runs without raising an exception. So traditional pure Python code should be functionally unchanged under AutoGraph. And the same should be valid for existing TF v1 graph code. Now, to be clear, if you give such code to AutoGraph, it will transform it. But you should not expect the functionality of the transformed code to change. Then, with respect to Eager code, AutoGraph obviously supports a subset of Eager. But for the parts that it does support, those should also preserve functionality as they go back and forth between Eager and AutoGraph. That means that AutoGraph code should not change its functionality when executed Eagerly. And that essentially is what lets you remove the tf.function annotation without having to modify the code in any way. So at least in theory, when you remove tf.function, the behavior of the code should not change. Yes?

AUDIENCE: Does this actually happen at all, Eagerly executing AutoGraph code? I guess I sort of assumed that we just disabled tf.function at the same time we disabled AutoGraph.

DAN MOLDOVAN: Yeah, you disable everything.
Basically, you run the code exactly as it looks.

AUDIENCE: So this is something that could happen. You could execute--

AUDIENCE: You said it had a flag that turned off.

DAN MOLDOVAN: Yes, there is a flag to turn it off, or you could remove the tf.function annotation. Either of those should not change the behavior of the code.

AUDIENCE: So that flag still does the AutoGraph transformation. It just runs it Eagerly?

DAN MOLDOVAN: It doesn't do the AutoGraph transformation anymore.

AUDIENCE: OK. So there is a mode where we can do the AutoGraph transformation and then still run Eagerly?

DAN MOLDOVAN: There is, but not with tf.function. With tf.function, it's either running graph with AutoGraph or running Eager without AutoGraph.

AUDIENCE: Well, like the explicit convert API, if I ran that--

DAN MOLDOVAN: Yes, with that one, you could potentially create some graph-like code and execute it Eagerly. And that one should also preserve its functionality. But that's, of course, in the Eager semantics, right? Eager should execute graph code as normal, right?

AUDIENCE: But there's no reason to do that, right?

DAN MOLDOVAN: No, this is just for kicks, I guess.

AUDIENCE: So if you want to maybe debug an issue with AutoGraph, right?

AUDIENCE: It's true, yeah.

AUDIENCE: Theoretically, you could.

DAN MOLDOVAN: You could. But even then, keep in mind that there's code that runs in graph mode and there's code that runs Eagerly, so just make sure that those two are truly consistent, right?

An implication of these guiding principles is that code which does not depend on TensorFlow objects will not run as a TensorFlow statement. And that should hopefully make it easy to reason about code. So let me show you an example. Of these for statements, for loops, the one at the top is legal Python code. It doesn't depend on any tensors. Therefore it will run as a Python loop. The other three will run as TensorFlow loops because, well, one depends on tf.range, another one depends on a dataset, and the third one depends on a distribution strategy.

In this last section, I'll go through some usage examples which I believe are most interesting from a user's perspective. Now, I will focus a bit on use cases which are illegal in AutoGraph, because we have lots and lots of samples of fairly complex code that works, but we have fewer examples of code that doesn't work. So here they are.

I'll begin with control flow. And as a warm-up, I'll show the ideal code for AutoGraph. This is definitely code that AutoGraph can handle. And it's the code that we like most. And that's because it does its operations in plain sight, no hidden side effects, no hidden triggers. Everything is plain. Another example that works well is using statements like break. As we've seen, these are lowered, so AutoGraph can deal with them fine as well.

Now, here's an example of code that has certain limitations when running as TensorFlow control flow. This if statement depends on a tensor, therefore it will run as a tf.cond. However, notice that this x variable is only initialized in the if branch and not in the else branch. And TensorFlow, as we know, does not have the notion of None values or undefined variables. So we cannot just do this in a tf.cond. And instead we raise an error. And this is actually one of the better error messages that we raise, where we explain that you have to initialize x in the else branch.

AUDIENCE: So here you could do this by making x into an optional value.
DAN MOLDOVAN: That is true, yes. And that's why I'm very excited about optionals.

AUDIENCE: What if x is local to that branch? What happens then?

DAN MOLDOVAN: Then it's fine, yes. If it's local, then it's fine. That's why we go through all that pain to do liveness analysis and the renaming, just so that local variables don't trip it.

Now, the same restriction can extend to things that you might not expect. For instance, a return in only one branch would cause the if statement to deal with an undefined value. This is also illegal. And the error message is pretty nice in this case as well. It tells you that you have to return a value from the else branch as well. Another example of the same limitation, this time involving object properties. In this case, the error message is a bit confusing, and we're working to fix it. The error message is, in my opinion, very confusing, because it's about a tf.cond that would execute the else branch. But there's no else branch. So when is it trying to access it? It's definitely a confusing error. But it will hopefully be much more friendly soon.

One quick note: these limitations around None or undefined symbols can easily be avoided by initializing your variables early. So if you initialize, for instance, our x at the top with some default value, then everything would work nicely once more. So they're fairly easy to prevent. Now, if I could be pedantic for just a moment: when you have to deal with situations where you have some default values, I definitely recommend that you have a separate Boolean to represent the state of "not initialized" or "not valid", rather than using magic values. Doing that can save you a world of pain later on. And this is totally unrelated to AutoGraph. It's just a recommendation of good practice in general.

Now, a more significant limitation in TensorFlow control flow is around hidden side effects, as we actually had a question alluding to earlier. So let's look at this example where we have a simple helper method that mutates the state of self, and then there is another method that calls this helper. When converting this larger method, this method f, AutoGraph's static analysis, when looking at the variables for that if statement, has no way of seeing that self.a is being modified, because that's hidden inside the helper method. And we do not do cross-function analysis. Well, not yet, at least. So that means that the tf.cond will ultimately end up not accounting for that modification to self.a, and you get this rather obscure error message, which is quite confusing. Once more, the error message can and hopefully will be more helpful. But the limitation itself remains. Solutions are definitely possible. And I think they would make a nice future project. But for now, they are a matter of future development.

Now, a good defense against these kinds of patterns is to use a more functional style in your code. Functional style means that if your function modifies a value, it returns it. And that helps AutoGraph. For instance, in this example, if we modify our code to return the new value that should go into self.a, and then do the assignment in plain sight inside the converted function, then things are happy once more, and everything works. And this is my last bit of pedantry, I hope, for this talk. In general, functional style tends to be loved by machines.
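A minimal sketch of that rewrite, with illustrative names (the actual example on the slide differs), might look like this:

    import tensorflow as tf

    class Model(object):

        def __init__(self):
            self.a = tf.constant(1)

        def next_a(self, a):
            # Functional style: take the current value in, return the new one,
            # instead of assigning to self.a behind the scenes.
            return a * 2

        @tf.function
        def f(self, x):
            a = self.a
            if x > 0:              # becomes a tf.cond when x is a tensor
                a = self.next_a(a)
            self.a = a             # the assignment happens in plain sight
            return a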
Compilers and static analyzers have a much easier time dealing with functional code, code that takes its inputs as arguments and returns everything that's modified. And sometimes it can help the code become more readable, too.

Next up, a few examples around datasets, which are quite satisfying, in my opinion, because they show that the underlying tf.data and distribution strategy APIs are powerful enough to facilitate these kinds of conversions. So the first example is that iterators, the tf.data iterators, work in almost all cases. And we'll see the few exceptions in a second. But essentially, you can take the iterator, you can go through part of it, you can break out of the loop, and then you can resume it. And everything works as you would expect. And if you're curious, the implementations of for loops over tf.data iterators and datasets are quite an interesting read. I think they're quite a nice feat. The code, with all those callbacks, might be difficult to follow. But in my opinion, it's quite interesting. And you can find it in the specialized control flow operators of AutoGraph. For datasets especially, to handle a for loop we actually end up applying three operations in sequence: scan, take_while, and reduce. And I think that's pretty nice. Consuming an iterator with the next function also works. And just to be clear, this is code that runs as graph code.

Next, let's talk about handling runtime exceptions. And since we were just talking about iterators, let's talk about a common pattern in Python. It's generally considered Pythonic to just try things and catch an exception if they don't work. So instead of a pattern where you say, if I can do foo, do foo, the pattern is: try to do foo, except I can't do it, right? And one of the most common uses of such a pattern is the use of iterators, where you have a loop, and inside the try/except block you just try to call next. Now, for pure Python control flow under AutoGraph, this works just fine. It works the way you would expect. Unfortunately, that doesn't work for TensorFlow control flow. And that's for a twofold reason. One of them is that there is no exception catching in graph mode. There is no op that catches TensorFlow runtime exceptions. Now, on the other hand, you could conceive that AutoGraph could lower exceptions. I mean, we lower return statements, therefore we should be able to lower exceptions as well. However, that could make the code prohibitively complex and slow, for instance, because any statement could conceivably raise an exception. You would have to wrap each line with an if statement. So the lowering would not look very pretty, at least not in the trivial case.

The implication of this is that if you have a TensorFlow control flow statement and you wrap a next call into such a try/except block, AutoGraph will not complain about it. It will leave the try/except in the code. It will not transform it. But if you think about it, the effective code will completely ignore it. In terms of runtime TensorFlow exceptions, there is no catching inside the graph. Any runtime error will bubble all the way through the TensorFlow runtime. So essentially what this means is that if you put a try/except inside graph code, the exception will not be caught. It will just fly past the except statement. It also means that if you do end up trying to catch exceptions, you should do that in Eager mode, outside of the tf.function.
Because in Eager mode, you can catch them, right? The runtime exception bubbles through the TensorFlow runtime. And then it's captured by tf.function and re-raised as a Python exception. So you can catch it, just not inside the graph.

AUDIENCE: Should we add try/catch to TensorFlow?

DAN MOLDOVAN: I would love that. I don't know what the implications for optimization and XLA are.

AUDIENCE: I think it's a little scary because of the unknown semantics of what ops could be running in parallel at the same time when only one of them generates an error. So it's [INAUDIBLE] the other ones get canceled or what.

DAN MOLDOVAN: That's true.

AUDIENCE: We can't actually use real exceptions to implement this, because we build without them.

AUDIENCE: Or we wouldn't use C++ exceptions. It would have to be a new TensorFlow runtime language feature.

DAN MOLDOVAN: Exception tensor.

AUDIENCE: You could have an op that takes three function [INAUDIBLE] and calls another [INAUDIBLE] an exception, then always calls the third one to simulate the behavior of try/catch/finally. That would have the consequence that Josh pointed out, that cancellation in TensorFlow is very non-deterministic about what actually gets executed in the presence of an error. But, yeah. It's a separate discussion. I just thought I'd bring it up.

DAN MOLDOVAN: We've definitely stirred the hornet's nest on this. Yes?

AUDIENCE: So a question about the previous slides. How can you tell whether a loop is over a TensorFlow iterator and whether a loop is just over a Python iterator?

DAN MOLDOVAN: That's an excellent question. In this particular case, I'm just implying that the stop condition returns a tensor. In general, we look at the condition of the loop. If the condition object is a tensor, then it will be a tf.while_loop. If it's a Python Boolean or whatnot, then it will be a plain Python loop, and it will be unrolled. Does this answer your question?

So, going past adding exception catching to TensorFlow, there is good news about this. And that is that in most code, you can avoid having to catch exceptions altogether. For instance, with datasets, you can transpose the computation a bit. Instead of having a while loop, you can have a for loop over the dataset, and that will make sure that the loop stops when the dataset is exhausted. And then you move the condition inside an if statement, because we do support break statements. So these two pieces of code, the one on the right and the one on the left, are functionally equivalent. And if you squint, I think it's actually even shorter. So I dare say it's actually cleaner.

Next up, let's discuss collections a bit. Again, in normal pure Python code, this code snippet is, as you'd expect, very common: you just take a list and operate on it. We do that in Python a lot, right? And as with any other pure Python code, this works just fine under AutoGraph. But if the loop is a TensorFlow loop, things don't work anymore. And once more you get a rather obscure error message that we will hopefully fix soon. But in general, the rule is that you cannot operate on a Python collection from TensorFlow control flow. That just doesn't fit, and it's not supported by AutoGraph. Instead, it's a good practice to use specialized TensorFlow collections, like, for instance, TensorArray. And it's a good idea to do that even when you work in Eager mode, because it means you don't have to modify the code if you ever want to go to AutoGraph.
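For instance, a minimal sketch of that pattern (the function name is illustrative) could be:

    import tensorflow as tf

    @tf.function
    def collect_squares(n):
        # The TensorArray plays the role of a Python list, but it works
        # inside TensorFlow control flow.
        ta = tf.TensorArray(tf.int32, size=0, dynamic_size=True)
        for i in tf.range(n):
            ta = ta.write(ta.size(), i * i)
        return ta.stack()

    # collect_squares(tf.constant(5)) -> [0, 1, 4, 9, 16]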
AUDIENCE: You don't want to do that transformation yourself, to switch it to a TensorArray?

DAN MOLDOVAN: The problem with that transformation is that it's difficult to do.

AUDIENCE: It's sort of retroactive [INAUDIBLE].

DAN MOLDOVAN: The main problem, if we look back at this slide, is that there is this l being initialized as an empty Python list. First and foremost, we don't know the type of that list. And at this line, it's unclear whether the user even wanted a Python list or a TensorFlow list. So we'd be forced to make assumptions that would violate the semantics of normal Python code. It's definitely a challenge. That's why we resorted to the rule that, OK, if you want a TensorFlow list, please be explicit about it. And we'll offer as much syntactic sugar around that as possible. But you have to explicitly request it.

Let's see a few other examples that need special attention, this time around loops that change the type or shape of their variables. So first off, a probably familiar example: a TensorFlow while loop. You are probably already aware that TensorFlow limits the degree to which tensors can change shape or dtype inside the loop itself. And you typically get an error message of this kind, saying that some tensor enters the loop with one shape and has a different shape after one iteration. Now, there are ways to deal with this, one of which is specifying shape invariants for the loop. And we're working to add support for that in AutoGraph as well, with a special function call that lets you specify them. Another thing that we're working to address is making the error message more useful as well. For example, here it would be nicer if the error message said something about the variable a, rather than some obscure cond_0.

A more subtle effect of changing types in a loop is shown here. This variable a starts as a plain Python value. And then inside the loop, it becomes a tensor. Now, according to AutoGraph's dispatch rules, when we execute the first iteration, it would appear that it's a Python loop, because a is a Python scalar, therefore, Python loop. But after the first iteration, it would appear that the loop is, in fact, a TensorFlow loop. So we're working to improve the error message here as well, to be explicit that that happens. But in the future, we hope to actually just deal with this directly. For instance, you can envision that we could cycle through the loop a couple of times, and if we decide that the Python loop should become a TensorFlow loop, we just do that. And then we'd only have a few unrolled iterations before the loop.

Now, going back to exceptions and errors, let me show you a few examples of how AutoGraph modifies them so that they don't point to generated code. First, graph construction errors are modified by expanding their error message. And that is purely the error message itself. We don't touch the traceback of that error. That will still point to generated code. However, the error message has this stack trace-like message that helps you locate the cause of the error in the original source code. And if you think about it, this is very similar to the way TensorFlow runtime errors have a second stack trace showing you the location where the op was created. Now, this augmentation of the error message is done using a mechanism that wraps the original exception into a new one. And unfortunately, we don't have time today to discuss a lot of the details of how it's done.
But what's important to mention is that most of the time, the type of the error does not change. So if your op raises a ValueError, then users will see a ValueError as well. It's just that its message will be changed. However, sometimes, when we cannot replace errors using the same type, you might see a StagingError instead. And that typically happens when you have constructors that are complex, and we cannot figure out how to call the constructor in a way that keeps the data. Most errors just have a simple init constructor with just a string message. And there we can just create a new error of the same type with an expanded message. But where we can't do that, you will see this StagingError. The original type of the exception can still be recovered. So you can still inspect the exception to find the original exception type and its original message and so forth.

Now, lastly, runtime errors are modified in a similar fashion so that they don't point to generated code. In this case, the error message is not actually changed from what was originally raised. We simply replace the references. So here-- I ran this code in IPython, and that's why you see this ipython-input reference. That's the reference to the cell. What's important is that you don't see a reference to some temporary file that contains generated code. Now, once more-- I probably repeated this quite a bit in the talk-- we try as much as possible to remove the references to generated code from error messages. If you do see any messages where that's still not the case, please do file a bug.

One last example that I want to show is how decorators are handled. In general in Python, decorators are just syntactic sugar. They are just higher-order functions that get executed when the code is loaded. That's also why decorators are actually difficult to detect: they're not materialized in the AST. At least most of the time they're not. Anyway, when you convert decorated functions with AutoGraph, the decorator will be converted. And that's the reason why you will usually see the source code of the wrapper. For instance, if you have this decorator that just replaces the function with a wrapper, and you try to convert the decorated function, you will see the source code of the wrapper instead. That should be no cause for alarm, because the recursive conversion will step into the wrapper and convert the target function as well. So things do work as expected.

That's it for now. That's all I had for today. There are a lot of other topics that I didn't cover for lack of time. And I hope these and many other examples will be discussed in more detail in a more comprehensive reference guide that is currently in the works. And that's it. Thank you very much.

[APPLAUSE]