[MUSIC PLAYING]
MAGNUS HYTTSTEN: Hi there, everybody.
What's up?
My name is Magnus, and you're watching Coding TensorFlow--
the show where you learn how to code in TensorFlow.
[MUSIC PLAYING]
All right.
In this episode, we'll talk about saving and loading
models.
So why do we want to talk about this?
Well, first of all, whenever you train
a model of any significant complexity,
the training can take a long time.
Most of the models in this Getting Started series
will just take a minute or so to train,
where real-life models can take days or even weeks to train.
So if you were to hit Control-C on your training job
after it's been running for a day or so,
all your model weights and values will be lost,
and you would have to restart training from the beginning
and be a very sad camper.
But if you saved your model every so often,
you can always resume training from that point,
making you a happy camper.
Another benefit is that you can take your model
and transfer to another computer, where
you can continue training.
But I'm pretty sure you already guessed that I
was going to bring that up.
That's enough talking for now.
Check out the links below to locate the code,
because that's what we're going to do now.
Check out the code!
Oh, finally!
We get to check out the code!
That's awesome!
Let's go and check out the code!
All right.
Let's start by checking out the awesome licenses here
at the top.
Then install packages for HDF5 and YAML support.
And here we do some imports, and print the TensorFlow version.
It's totally OK if you have a later version than me here.
We use the MNIST data set to demonstrate
model loading and saving.
Then reshape the images into batches of 28-times-28-element
arrays-- 28 by 28 pixels is the size of MNIST images--
and normalize all pixel values to be between 0 and 1.
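The reshape-and-normalize step can be sketched like this. To keep the snippet self-contained, random arrays stand in for the real MNIST data, which the notebook fetches with tf.keras.datasets.mnist.load_data():

```python
import numpy as np

# Synthetic stand-in for MNIST (assumption): the real notebook
# downloads actual images via tf.keras.datasets.mnist.load_data().
train_images = np.random.randint(0, 256, size=(1000, 28, 28), dtype=np.uint8)

# Flatten each 28x28 image into a 784-element vector and
# scale pixel values from [0, 255] into [0, 1].
train_images = train_images.reshape(-1, 28 * 28).astype("float32") / 255.0
```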
Next is the model definition, which
is defined in the create_model function.
This is a very basic model, which
is totally OK, because in this screencast
we're interested in learning how to load and save models,
not creating the best model for the MNIST dataset.
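A create_model function in this spirit might look like the following. The exact layer sizes are an assumption; the point is just a small, compiled classifier we can save and restore:

```python
import tensorflow as tf

def create_model():
    # Layer sizes here are a plausible sketch, not necessarily
    # the exact model used in the video.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```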
And here, we finally get to see how a model can be saved.
checkpoint_path will be the path of the saved model.
A model checkpoint callback object
is created with this path.
We also specify that only the weights of the model
should be saved, and that we want debug output
when the saving is performed.
Finally, we perform the model training
by calling the fit method and providing this callback.
As you can see, this will cause a model
to be saved once every epoch has been completed.
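A minimal sketch of the checkpoint callback, again with random data in place of MNIST. Note one assumption about file naming: the video uses a TensorFlow-format ".ckpt" path, while recent Keras releases require weights files to end in ".weights.h5", which is what this sketch uses:

```python
import os
import numpy as np
import tensorflow as tf

# Synthetic data in place of MNIST (assumption).
x = np.random.rand(200, 784).astype("float32")
y = np.random.randint(0, 10, size=(200,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

checkpoint_path = "training_1/cp.weights.h5"
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    save_weights_only=True,  # only the weights, not the architecture
    verbose=1)               # print a message each time a save happens

# The callback saves the weights at the end of every epoch.
model.fit(x, y, epochs=3, callbacks=[cp_callback], verbose=0)
```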
And if we look at the checkpoints directory,
we can now see three files.
The cp.ckpt.data file contains all the weight values.
Its file name carries a shard sequence suffix,
because multiple shard files could potentially be used
if we have a lot of weights.
The cp.ckpt.index file specifies which partition file
contains which weights.
And finally, the checkpoint file is a text file that
points to the latest model.
In our case, we only have one data file,
but shortly, we'll see an example
where we have saved multiple versions of the model.
All right.
So now when we have our saved model,
let's try out loading it.
First, let's just create a model from scratch and try it out.
Since it hasn't been trained, you
can see that the accuracy really sucks.
And now for the magic.
If we call the method load_weights
with our checkpoint path, our model
gets initialized with the previous training state,
and has much better accuracy.
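The load_weights round trip can be sketched as follows: save weights from one model, load them into a freshly initialized copy, and confirm the two now agree. The ".weights.h5" file name is an assumption for compatibility with recent Keras releases:

```python
import numpy as np
import tensorflow as tf

def create_model():
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(784,)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    return model

x = np.random.rand(10, 784).astype("float32")

trained = create_model()
trained.save_weights("cp.weights.h5")

fresh = create_model()               # random initial weights
fresh.load_weights("cp.weights.h5")  # now matches the saved state

# The two models now produce identical predictions.
assert np.allclose(trained.predict(x, verbose=0),
                   fresh.predict(x, verbose=0))
```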
OK.
That's the basics to save and load models.
Let's look at some more options we have.
One option is to provide the period parameter
when creating the model checkpoint object.
In this case we use the value 5, which as you can see
saves a new model every five epochs.
Observe that in this case, we also use a parameterized file name
based on the epoch.
This means a unique file is saved every time.
That's also why we can see multiple files when looking
at the checkpoint directory.
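The epoch-parameterized file name looks like this. One caveat: the period argument the video uses was removed in recent TensorFlow releases in favor of save_freq (measured in batches), so this sketch simply saves every epoch; the directory and layer sizes are assumptions:

```python
import os
import numpy as np
import tensorflow as tf

x = np.random.rand(100, 784).astype("float32")
y = np.random.randint(0, 10, size=(100,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# The epoch number is templated into the file name, so each
# save produces a unique file (cp-0001, cp-0002, ...).
# The video's period=5 argument is gone in newer TensorFlow;
# save_freq is the replacement, so here we save every epoch.
checkpoint_path = "training_2/cp-{epoch:04d}.weights.h5"
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path, save_weights_only=True, verbose=1)

model.fit(x, y, epochs=3, callbacks=[cp_callback], verbose=0)
print(sorted(os.listdir("training_2")))
```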
We can also use a function called
tf.train.latest_checkpoint that will return the latest model,
which was saved--
in our case, the one with index 50.
This function looks into the file with the name checkpoint
to find the latest checkpoint.
Remember that the checkpoint file is a text file,
so you can actually check the file content yourself.
And now we can load the model using the load_weights function
like we did before, providing the value returned
by tf.train.latest_checkpoint.
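Because newer Keras releases no longer write TensorFlow-format checkpoints from the callback, here is a sketch of tf.train.latest_checkpoint using the lower-level tf.train.Checkpoint API, which works the same way: each save updates the "checkpoint" text file, and latest_checkpoint reads it back. The single tracked variable is a stand-in for a full model:

```python
import tensorflow as tf

# A tiny variable to checkpoint; the video checkpoints whole Keras
# models, but tf.train.Checkpoint handles any tracked variables.
step = tf.Variable(0)
ckpt = tf.train.Checkpoint(step=step)

for _ in range(3):
    step.assign_add(1)
    ckpt.save("ckpt_dir/cp")   # writes cp-1, cp-2, cp-3 ...

# Reads the 'checkpoint' text file to find the newest save.
latest = tf.train.latest_checkpoint("ckpt_dir")
print(latest)  # e.g. ends in "cp-3"

step.assign(0)
ckpt.restore(latest)  # the variable comes back with its saved value
```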
Another way of saving models is to call the save method
on the model.
This will create an HDF5-formatted file.
Remember that we specified save_weights_only
to true last time we saved a model.
In addition to only saving variables,
the save method saves additional data,
like the model's configuration and even the state
of the optimizer.
A model that was saved using the save method can be loaded with
the function keras.models.load_model.
And as you can see, we have the accuracy of a trained model.
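A sketch of the full save / load_model round trip to an HDF5 file, with synthetic data standing in for MNIST. Since the whole model (architecture, weights, optimizer state) is in the file, the restored copy needs no create_model call:

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(784,)),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# Synthetic data in place of MNIST (assumption).
x = np.random.rand(50, 784).astype("float32")
y = np.random.randint(0, 10, size=(50,))
model.fit(x, y, epochs=1, verbose=0)

# Save architecture + weights + optimizer state to one HDF5 file.
model.save("my_model.h5")

# Recreate the entire model directly from the file.
restored = tf.keras.models.load_model("my_model.h5")
assert np.allclose(model.predict(x, verbose=0),
                   restored.predict(x, verbose=0))
```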
In addition to everything we've looked at,
TensorFlow also has a very important file format,
called SavedModel.
This is a file format that allows
to exchange models between many different parts of TensorFlow,
like TensorFlow Python, TensorFlow.js,
and also TensorFlow Lite.
We are currently building out first-class support
for SavedModel in Keras, and you can check out the links below
to read more about it.
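A minimal SavedModel export can be sketched with a plain tf.Module, which keeps the example stable across TensorFlow versions; exporting a full Keras model works similarly but its exact API has shifted between releases:

```python
import tensorflow as tf

class Adder(tf.Module):
    """Toy module (assumption): adds a learned bias to its input."""
    def __init__(self):
        super().__init__()
        self.bias = tf.Variable(1.0)

    @tf.function(input_signature=[tf.TensorSpec([None], tf.float32)])
    def __call__(self, x):
        return x + self.bias

m = Adder()
# Writes a SavedModel directory: graph, variables, and signatures,
# consumable by TensorFlow Serving, TF Lite converters, etc.
tf.saved_model.save(m, "saved_adder")

loaded = tf.saved_model.load("saved_adder")
out = loaded(tf.constant([1.0, 2.0]))  # traced __call__ is restored
print(out.numpy())  # → [2. 3.]
```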
And that's it for this episode of Coding TensorFlow.
Make sure to subscribe to the channel
to get more videos like this.
Now it's your turn to go out there and create
some great models.
And don't forget to tell us all about it.
[MUSIC PLAYING]