Build an image classifier (ML Zero to Hero, part 4)

♪ (intro music) ♪
Hi, everybody,
and welcome to the fourth and final video
in this series of Zero to Hero with TensorFlow.
I'm Laurence, and today we're going to look back
at the very first problem that we spoke about.
And then we'll see how we can build a machine-learned model to solve it.
Remember this?-- all the way back in episode one
where we showed a scenario of rock, paper, and scissors
and discussed how difficult it might be to create an application
that recognizes hands of different shapes,
sizes, ethnicities, decorations, and more.
We discussed how difficult it would be to write code to detect and classify these
even for something as simple as a rock, paper, or scissors.
But since then, you've looked into machine learning
and you've seen how to build neural networks,
first to detect patterns in raw pixels to classify them,
and then to detect features using convolutions
to have a convolutional neural network trained to spot the particular features
that make up an item, like the soles of a shoe.
Let's put all that together, and in this video,
we'll see how to create a neural network
that is trained on data of rock, paper, and scissors
to detect and spot them.
We'll start with the data.
There's a dataset here that has several hundred images
of rock, paper, and scissors poses.
We'll train a neural network with this data.
So, first of all, we have to download the zip files containing the data.
The code to do that is here.
One file has the training set,
the other has a testing and validation set.
In Python, you can unzip a file with the zip file library,
and we unzip them to a temp directory like this.
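The unzip step can be sketched with Python's standard zipfile module. This is a minimal illustration, not the notebook's code: extract_zip and make_demo_zip are hypothetical helper names, and the tiny archive built here (empty files, made-up file names) only stands in for the real downloaded zip files.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_zip(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest)
    return dest


def make_demo_zip(zip_path):
    """Build a tiny stand-in archive (hypothetical names, empty files)."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for label in ("rock", "paper", "scissors"):
            archive.writestr(f"rps/{label}/{label}01.png", b"")


# Work in a temp directory, as the video does.
work_dir = Path(tempfile.mkdtemp())
demo_zip = work_dir / "rps.zip"
make_demo_zip(demo_zip)

extracted = extract_zip(demo_zip, work_dir / "data")

# One sub-folder per category appears under the extracted root.
class_dirs = sorted(p.name for p in (extracted / "rps").iterdir() if p.is_dir())
print(class_dirs)
```

With the real archives, only the call to extract_zip would be needed, pointed at each downloaded file.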
This creates folders with sub-folders of each of our categories.
When training in TensorFlow using an image data generator,
you can automatically label the images
based on the name of their parent directory.
So, we don't need to create labels for the images.
It's a really nice shortcut.
So I'll achieve that with this code.
This creates an image data generator that generates images for the training
from the directory that they were downloaded to.
We can then set up something called a training generator
which, as its name suggests, creates training data from that.
We can do exactly the same for the test set with this code.
Later, when you see the model.fit, you'll see that we pass these in
as the training and validation parameters.
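The idea of labeling images by their parent directory can be illustrated in plain Python. This is a sketch of the concept only, not how the image data generator is implemented; labels_from_directories is a hypothetical helper, the demo folder layout is made up, and assigning class numbers in sorted folder order is an assumption of this sketch.

```python
import tempfile
from pathlib import Path


def labels_from_directories(root):
    """Infer labels from the sub-folders of root.

    Returns (class_indices, samples): class_indices maps each sub-folder
    name to an integer assigned in sorted order, and samples is a list of
    (file_path, class_index) pairs.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_indices = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for file_path in sorted((root / name).iterdir()):
            if file_path.is_file():
                samples.append((file_path, class_indices[name]))
    return class_indices, samples


# Hypothetical layout standing in for the unzipped training folder.
demo_root = Path(tempfile.mkdtemp())
for label in ("rock", "paper", "scissors"):
    (demo_root / label).mkdir()
    (demo_root / label / f"{label}01.png").touch()

class_indices, samples = labels_from_directories(demo_root)
print(class_indices)
for path, index in samples:
    print(path.parent.name, "->", index)
```

No label file is ever written: the folder name is the label, which is why the images only need to be sorted into the right folders.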
Now let's look at our neural network definition.
This is very similar to what you saw in the last video--
just with more layers.
One reason is that the images are more complicated
than the grayscale clothing you saw previously
and another is that they're bigger.
You can see that our input is now 150x150.
Our images are certainly bigger than they were before.
And our output is a layer of three neurons.
Why would that be?
Because there are three classes: rock, paper, and scissors.
Between these, the code is very similar to what you saw previously,
just more of it.
So we have four layers of convolutions-- each with MaxPooling
before feeding into a dense layer.
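To see why the stack of convolutions and pooling matters for a 150x150 input, here is a small worked calculation of how the image size shrinks through four such blocks. The 3x3 unpadded convolutions and 2x2 pooling are assumptions made for illustration; the video does not state the filter or pool sizes.

```python
def conv_output_size(size, kernel=3):
    """Spatial size after an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_output_size(size, pool=2):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // pool


def trace_sizes(input_size=150, blocks=4):
    """List the width/height after each convolution and each pooling step."""
    sizes = []
    size = input_size
    for _ in range(blocks):
        size = conv_output_size(size)
        sizes.append(size)
        size = pool_output_size(size)
        sizes.append(size)
    return sizes


sizes = trace_sizes()
print(sizes)  # [148, 74, 72, 36, 34, 17, 15, 7]
```

Under these assumptions, the dense layer sees a 7x7 grid of features rather than the full 150x150 image.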
The dropout is a little trick
to help a neural network generalize
by randomly throwing away some of the neurons during training.
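As a rough picture of what dropout does to a layer's outputs during training, here is a plain-Python sketch. It uses the common "inverted dropout" formulation, where surviving values are scaled up so the expected sum is unchanged; the function name, the rate of 0.5, and the sample values are assumptions for illustration, not taken from the notebook.

```python
import random


def dropout(activations, rate, rng):
    """Zero each value with probability `rate`; scale survivors by 1/(1-rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)      # seeded so the run is repeatable
activations = [1.0] * 10    # made-up outputs of ten neurons
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)
```

A different random subset is dropped on every training step, so the network cannot lean too heavily on any single neuron.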
We'll compile the neural network as before with this code
and then we can fit the data with the model.fit call.
Note that we don't have labels-- that's because we're using the generator.
It's inferring the labels from the parent directories
of both the training and the validation datasets.
When you run this, you'll probably get accuracy of about 100%
on the training data quite quickly,
with the validation data getting to about 87% accuracy.
This is something called over-fitting,
which happens when the model gets really good at spotting
what it has seen before,
but it's not so great at generalizing.
Think about it this way:
If, all your life, the only shoes that you had ever seen
were hiking boots,
you probably wouldn't recognize high heels as shoes.
You would be over-fitting yourself.
There are a number of methods to avoid this
and one of them is called image augmentation.
And I've put the code for this into the notebook for you to try yourself
to see how it helps avoid over-fitting.
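Image augmentation means training on slightly altered copies of each image, so the model sees more variety than the dataset contains. The sketch below shows two simple alterations on a tiny grid of pixel values. It is an illustration in plain Python, not the notebook's augmentation code, and the function names and the choice of transformations are assumptions.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def shift_right(image, pixels, fill=0):
    """Shift the image right by `pixels` columns, padding the left with `fill`."""
    return [[fill] * pixels + row[: len(row) - pixels] for row in image]


def augment(image, rng):
    """Return a randomly flipped and shifted copy of the image."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    out = shift_right(out, rng.randint(0, 1))
    return out


image = [[1, 2, 3],
         [4, 5, 6]]
flipped = horizontal_flip(image)
shifted = shift_right(image, 1)
augmented = augment(image, random.Random(0))
print(flipped)
print(shifted)
```

Each altered copy keeps its original label: a mirrored hand showing scissors is still scissors.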
Once your model is trained, you can then call model.predict
to see how well it spots rock, paper, or scissors.
This code will take an image, reformat it to 150x150--
which the model is trained for-- and it will then return a prediction.
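With three output neurons, a prediction comes back as three scores, one per class, and the highest score is the model's answer. The sketch below turns such a result into a class name. The score values are made up, decode_prediction is a hypothetical helper, and the class order (alphabetical by folder name) is an assumption.

```python
# Assumption: classes are numbered in alphabetical order of their folder names.
CLASS_NAMES = ["paper", "rock", "scissors"]


def decode_prediction(scores, class_names=CLASS_NAMES):
    """Return (class_name, score) for the highest-scoring class."""
    if len(scores) != len(class_names):
        raise ValueError("expected one score per class")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best], scores[best]


# Made-up output for a single image, one score per class.
label, score = decode_prediction([0.02, 0.95, 0.03])
print(label, score)
```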
And here are a few examples that I ran
so that you can see that it's actually predicting quite well.
But the best thing to do is to try it for yourself.
I've put a link to the notebook in the description below,
and you can use this code to train a neural network
to recognize rock, paper, and scissors images.
That's it for this short series of videos.
I hope it was useful for you to see the new programming paradigm
that is machine learning,
and through these examples of computer vision,
how you can get yourself on the path
to become an Artificial Intelligence Engineer.
If you have any questions, please leave them in the comments below,
and don't forget to subscribe for more great content.
Thank you.
♪ (music) ♪