
  • FRANCOIS CHOLLET: So last time, we

  • talked about a bunch of things.

  • We talked about the functional API

  • for building graphs of layers.

  • We talked about features that are specific to the functional

  • API--

  • things like static input compatibility checks

  • across layers every time you call

  • a layer, whole-model saving, model plotting,

  • and visualization.

  • We talked about how masking works in the functional API

  • and about how masks are propagated.

  • So for instance in this example, the Embedding layer is going

  • to be generating a mask here because you passed this

  • argument mask_zero=True.

  • And this mask is going to be passed

  • to every subsequent layer.

  • And in particular, to layers that consume the mask,

  • like this LSTM layer here.

  • However, in this LSTM layer, because it's

  • a sequence reduction layer, it's not

  • going to be returning the full sequence, but only

  • the last output.

  • This is going to destroy the mask,

  • and so this layer is not going to see a mask anymore.

  • So it's really a way to handle masking

  • that works basically magically.
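
A minimal sketch of the kind of model being described here; the vocabulary size, embedding width, LSTM units, and the final Dense head are illustrative assumptions, not taken from the slide:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Variable-length sequences of integer token ids; 0 means "padding".
inputs = keras.Input(shape=(None,), dtype="int32")

# mask_zero=True makes the Embedding layer generate a mask marking
# which timesteps are real data and which are padding.
x = layers.Embedding(input_dim=10000, output_dim=64, mask_zero=True)(inputs)

# The LSTM consumes the mask implicitly and skips masked timesteps.
# With the default return_sequences=False it returns only the last
# output, so the time dimension -- and the mask -- is gone after it.
x = layers.LSTM(32)(x)

# This Dense layer no longer sees any mask.
outputs = layers.Dense(1)(x)
model = keras.Model(inputs, outputs)
```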

  • Masking is pretty advanced, so most people who need it

  • are not actually going to understand

  • very well how it works.

  • And the idea is to enable them to benefit

  • from masking by just doing, like, hey, this Embedding

  • layer, I want the zeros to mean this is masked,

  • and then everything in their network

  • is going to magically know about this as long as they

  • are using built-in layers.

  • If you're an implementer of layers,

  • you actually need to understand how it works.

  • First of all, you need to understand

  • what you should be doing if you're

  • writing a layer that consumes a mask, like an LSTM layer,

  • for instance.

  • It's very simple.

  • You just make sure you have this mask argument in the call

  • signature, and it expects a structure of tensors

  • that's going to match the structure of your inputs.

  • So here, because the input is a single tensor,

  • the mask is going to be a single tensor as well.

  • And the single tensor is going to be a Boolean tensor, where

  • you have one mask entry per timestep per sample.

  • So it's typically a 2D Boolean tensor.
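
As a hedged sketch of what such a mask-consuming layer might look like (this particular layer is made up for illustration, it is not one of the built-ins):

```python
import tensorflow as tf
from tensorflow import keras

class MaskedTemporalMean(keras.layers.Layer):
    """Averages over the time dimension, ignoring masked timesteps."""

    def call(self, inputs, mask=None):
        # inputs: (batch, timesteps, features)
        # mask:   (batch, timesteps) Boolean tensor -- one entry per
        #         timestep per sample, matching the structure of inputs.
        if mask is None:
            return tf.reduce_mean(inputs, axis=1)
        mask = tf.cast(mask, inputs.dtype)[:, :, tf.newaxis]
        total = tf.reduce_sum(inputs * mask, axis=1)
        count = tf.maximum(tf.reduce_sum(mask, axis=1), 1.0)
        return total / count
```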

  • If you have a layer that can safely pass through a mask,

  • for instance, a dense layer or in general any layer that

  • does not affect the time dimension of its inputs,

  • you can just enable your layer to pass through its mask

  • by saying it supports_masking.

  • It is opt-in because a lot of the time,

  • layers might be affecting the time dimension

  • of the inputs, in which case the meaning of the mask

  • would be changed.
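
A minimal sketch of the opt-in pass-through case, assuming a made-up layer that leaves the time dimension untouched:

```python
from tensorflow import keras

class ScaleFeatures(keras.layers.Layer):
    """Scales feature values without touching the time dimension,
    so the incoming mask still means the same thing afterwards."""

    def __init__(self, factor=2.0, **kwargs):
        super().__init__(**kwargs)
        self.factor = factor
        # Opt in: the default compute_mask now passes the mask through.
        self.supports_masking = True

    def call(self, inputs):
        return inputs * self.factor
```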

  • And if you do have a layer that changes the time

  • dimension or otherwise a layer that creates a mask from input

  • values, it's going to need to implement this compute_mask

  • method that is going to receive the inputs and the mask.

  • If, for instance, you have an Embedding layer,

  • it's going to be doing this--

  • tf.not_equal(inputs, 0).

  • So it's going to be using the input values

  • to generate a Boolean mask.

  • If you have a concatenate layer, for instance,

  • it's not going to be looking at the input values,

  • but it needs to look at the masks--

  • the two masks that are being passed--

  • and concatenate them.

  • And if one of the masks is None, for instance,

  • we're going to have to generate a mask of ones.
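
A hedged sketch of both cases of compute_mask, simplified compared to the real built-in Embedding and Concatenate layers:

```python
import tensorflow as tf
from tensorflow import keras

class ZeroMasking(keras.layers.Layer):
    """Embedding-style masking: builds the mask from the input values,
    treating integer value 0 as 'this timestep is masked'."""

    def call(self, inputs):
        return inputs

    def compute_mask(self, inputs, mask=None):
        return tf.not_equal(inputs, 0)


class ConcatAlongTime(keras.layers.Layer):
    """Concatenate-style masking: combines the incoming masks instead of
    looking at the input values."""

    def call(self, inputs):
        return tf.concat(inputs, axis=1)

    def compute_mask(self, inputs, mask=None):
        if mask is None:
            return None
        # If one of the incoming masks is None, substitute a mask of ones
        # ("nothing masked") of the right shape before concatenating.
        masks = [
            m if m is not None else tf.ones(tf.shape(x)[:2], dtype=tf.bool)
            for x, m in zip(inputs, mask)
        ]
        return tf.concat(masks, axis=1)
```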

  • OK, so that's the very detailed view.

  • Yeah?

  • AUDIENCE: So just maybe a little bit more detail about masking--

  • so if you say supports_masking is true like

  • in the lower-left-hand corner, is that just using some default

  • version of--

  • FRANCOIS CHOLLET: Yes, and the default is pass through.

  • Yeah, so it is.

  • If you set this, it enables your layer

  • to use a default [INAUDIBLE] compute_mask, which

  • just says return mask.

  • So it gets the inputs and a mask, and just

  • returns the mask unchanged.

  • AUDIENCE: So that assumes that the mask

  • is like the first or second dimension gets masked?

  • FRANCOIS CHOLLET: The first dimension

  • gets masked, if zero is the batch dimension.

  • AUDIENCE: And then where does this mask argument come from?

  • Like if I look at the previous slide,

  • it's not clear to me at all how this mask is being [INAUDIBLE]..

  • FRANCOIS CHOLLET: So it is generated

  • by the Embedding layer from the values of the integer inputs.

  • AUDIENCE: Right, so Embedding layers has a compute_mask

  • function?

  • FRANCOIS CHOLLET: Yeah, which is exactly this one actually.

  • AUDIENCE: And it returns a mask.

  • FRANCOIS CHOLLET: Yes.

  • AUDIENCE: So and somehow, the infrastructure

  • knows to call-- because you enabled masking,

  • it knows to call the compute_mask [INAUDIBLE]..

  • FRANCOIS CHOLLET: Yes.

  • AUDIENCE: The mask gets generated,

  • but I don't know where it gets put.

  • FRANCOIS CHOLLET: Where it gets put--

  • so that's actually something we're

  • going to see in the next slide, which is

  • a deep dive into what happens.

  • When you're in the functional API, you have some inputs.

  • You've created them with the Keras Input call.

  • And now, you're calling a layer on that.

  • Well, the first thing we do is check

  • whether all the inputs are actually symbolic inputs,

  • like coming from this Input call.

  • Because there's two ways you could use a layer.

  • You could call it on the actual value tensors,

  • like EagerTensors, in which case you're just

  • going to run the layer like a function

  • and return the outputs.

  • Or you could call it symbolically,

  • which is what happens in the functional API.

  • Then you run pretty extensive checks

  • about the shape and the type of your inputs

  • to raise helpful error messages

  • in case of a mistake made by the user.

  • Then you check if the layer is built. So the layer being built

  • means its weights are already created.

  • If the layer was not built, you're

  • going to use the shape of the inputs to build the layer,

  • so you call the build method.

  • And after you've done that, you'll

  • actually do a second round of input compatibility checks,

  • because the input spec of the layer

  • is quite likely to have changed during the build process.

  • For instance, if you have a dense layer, when

  • you instantiate it, before it knows its input shape,

  • its input spec is just the fact that its inputs

  • should have rank at least two.

  • But after you've built the layer then

  • you have an additional restriction,

  • which is that now the last dimension of the inputs

  • should have a specific value.
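
A small sketch of how that plays out for a Dense layer; the exact InputSpec representation may differ between versions, but the idea is that the spec gets tighter after build:

```python
import numpy as np
from tensorflow import keras

layer = keras.layers.Dense(4)
print(layer.input_spec)   # before build: only requires rank >= 2

layer.build(input_shape=(None, 16))
print(layer.input_spec)   # after build: also pins the last axis to 16

# The second round of input compatibility checks now rejects inputs
# whose last dimension does not match the built weights.
try:
    layer(np.zeros((2, 8), dtype="float32"))
except ValueError as err:
    print("incompatible input:", err)
```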

  • Then the next step is you are going

  • to check if you should be--

  • if you are in this case, if this layer expects a mask argument.

  • And if it does, you're going to be fetching the masks generated

  • by the parent layer.

  • via its compute_mask method.

  • AUDIENCE: So--

  • FRANCOIS CHOLLET: And--

  • AUDIENCE: And so where does that come from?

  • I mean, like, somehow there is a secret communication generated.

  • FRANCOIS CHOLLET: Yes, which

  • is a bit of metadata set on the tensor itself.

  • There's a _keras_mask property,

  • which holds the mask information.

  • So it is the least error-prone to co-locate the mask

  • information with the tensor that it refers to.
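
A small sketch of where that metadata lives; _keras_mask is a private attribute whose name or presence can change between versions, so this is only for poking around:

```python
from tensorflow import keras
from tensorflow.keras import layers

inputs = keras.Input(shape=(None,), dtype="int32")
x = layers.Embedding(input_dim=1000, output_dim=8, mask_zero=True)(inputs)

# The mask produced by the Embedding layer rides along on the symbolic
# tensor itself as a private attribute.
print(getattr(x, "_keras_mask", None))
```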

  • AUDIENCE: So if you were to do something like--

  • I don't know, like a [INAUDIBLE] or something

  • that's not a layer.

  • FRANCOIS CHOLLET: Yes?

  • AUDIENCE: Would-- you-- so you get a tensor that doesn't

  • have a Keras mask attribute.

  • But then you say-- but I guess you also were going

  • to wrap it into a Lambda layer.

  • FRANCOIS CHOLLET: So what happens

  • when you call ops that are not layers

  • is that they get retroactively cast into layers, essentially.

  • Like, we construct objects that are layers,

  • and they are going to be internally calling these ops.

  • But these automatically generated layers in the general case

  • do not support masking.

  • So if you do this, you are destroying the mask,

  • and you're going to be passing your mask

  • to a layer that does not support it, which is an error.

  • So it's not going to--

  • AUDIENCE: It's not a silent--

  • FRANCOIS CHOLLET: It's not a silent failure.

  • It's an error.

  • If you pass a mask to a layer which

  • is one of these automatic layers,

  • in this case, that does not support masking,

  • it's going to yell at you.

  • AUDIENCE: So, but wait a minute.

  • I feel like lots of layers are going to be

  • like trivial pass-throughs.

  • Like, if there's a mask, we want to pass it through,

  • but if there's not a mask, that's fine.

  • FRANCOIS CHOLLET: Yeah.

  • So it has to be opt-in, again, because any change to the time

  • dimension of the inputs would need a smarter mask

  • computation.

  • And we cannot just always implicitly pass through

  • the mask, because you don't actually know what the layer is

  • doing.

  • AUDIENCE: If you-- couldn't you choose to implicitly pass

  • through the mask if the shape of the outputs

  • matches the shape of the inputs?

  • AUDIENCE: But what about something like a [INAUDIBLE]??

  • AUDIENCE: That has the same mask.

  • That should actually respect the mask.

  • FRANCOIS CHOLLET: That-- I think it's a reasonable default

  • behavior.

  • It's not going to work all the time, actually.

  • You can think of adversarial counterexamples.

  • AUDIENCE: [INAUDIBLE]

  • FRANCOIS CHOLLET: But they're not like,

  • common counterexamples.

  • But yeah.

  • So currently masking with the functional API

  • is something that's really only useful with built-in layers.

  • So it's not really an issue we've run into before.

  • I like the fact that currently it's

  • opt-in, because this actually saves us

  • from precisely people generating a mask in some layer

  • and then passing it to a custom layer

  • or an automatically generated layer that does not support it.

  • And that could potentially do things you don't expect, right?

  • So it's better to--

  • AUDIENCE: And is the mask supported

  • with an eager execution?

  • FRANCOIS CHOLLET: Yes, it is.

  • So in eager execution, essentially

  • what happens is that the call method, which

  • is programmatically generated for one of these functional API

  • models, is basically going to call both--

  • [INAUDIBLE] the call method of the layer and its compute_mask

  • method, and is going to call the next layer

  • with these arguments.

  • So essentially, very much what you

  • would be doing if you were to use

  • masking using subclassing, which is basically,

  • you call your layer.

  • You get the outputs of your sublayer.

  • I don't have a good example in here.

  • So you call a layer, get these outputs.

  • Then you generate the mask using compute_mask explicitly.

  • And for the next layer, you're going to be explicitly passing

  • these [INAUDIBLE].
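
A hedged sketch of that subclassing pattern (the layer sizes are illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

class MaskedClassifier(keras.Model):
    def __init__(self):
        super().__init__()
        self.embedding = layers.Embedding(1000, 16, mask_zero=True)
        self.lstm = layers.LSTM(32)
        self.out = layers.Dense(1)

    def call(self, inputs):
        x = self.embedding(inputs)
        # Generate the mask explicitly with compute_mask...
        mask = self.embedding.compute_mask(inputs)
        # ...and pass it explicitly to the next layer via the mask=
        # keyword argument. Nothing is implicit here.
        x = self.lstm(x, mask=mask)
        return self.out(x)
```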

  • AUDIENCE: And is it just a keyword argument, mask equals?

  • FRANCOIS CHOLLET: Yes, that's right.

  • AUDIENCE: So all these layers [INAUDIBLE]..

  • So if I feel like a mask is going to [INAUDIBLE]

  • they can always fix it by passing an explicit one?

  • FRANCOIS CHOLLET: Yes, that's correct.

  • You can pass arbitrary Boolean tensors -- well, 2D Boolean

  • tensors -- as the mask argument.

  • So, in general, if you are doing subclassing,

  • nothing is implicit, and you have

  • freedom to do whatever you want, basically.

  • AUDIENCE: Question about masking in the last--