  • [WHISTLE]

  • Hello.

  • And welcome to another video using Posenet and ML5.js.

  • But in this video, what I'm going

  • to do is take the output of the Posenet pre-trained model,

  • and feed that into an ML5 neural network to train

  • a pose classifier to recognize when

  • I'm making certain motions like a Y, an M, a C, and an A.

  • Before I begin coding, let me quickly mention

  • something I added between the last video and now.

  • I'm mirroring the image so that when I raise my left hand,

  • it's mirrored to me what I'm seeing on the screen in front

  • of me over there.

  • This is important for interactivity.

  • It makes it feel much more intuitive and natural to see

  • yourself mirrored.

  • You might recall that the ML5 has

  • a specific function called flipImage that will do it for you.

  • But I actually found, because I'm

  • drawing all this other stuff, that it's easier for me

  • to just write the code for it myself,

  • which involves a translate and a scale.

  • In other words, typically if I'm drawing an image, it's at 0,0,

  • I'm drawing it right here, and the image gets painted across

  • the canvas.

  • But if I call scale(-1, 1),

  • it sets the x-axis going in the other direction.

  • So positive pixels go this way.

  • And if I translate over to here and put 0,0 here and draw

  • the image this way, it will appear reversed-- inverted,

  • flipped--

  • to the viewer.

  • So that's what's happening in these three steps right here.
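
The three steps can be sketched roughly as below. This is a minimal sketch, assuming a p5 instance `p` that exposes `push`, `translate`, `scale`, `image`, and `pop`; the helper names `drawMirrored` and `mirrorX` are hypothetical, and the `push`/`pop` pair is an addition that keeps the flip from affecting later drawing.

```javascript
// Hypothetical helper: draw `video` flipped horizontally on a p5 instance `p`.
function drawMirrored(p, video) {
  p.push();                     // isolate the transform from later drawing
  p.translate(video.width, 0);  // step 1: move the origin to the right edge
  p.scale(-1, 1);               // step 2: make the x-axis point left
  p.image(video, 0, 0, video.width, video.height); // step 3: draw at 0,0
  p.pop();                      // restore the previous coordinate system
}

// The same transform applied to a single x coordinate.
function mirrorX(x, width) {
  return width - x;
}
```

With a 640-pixel-wide canvas, a point at x = 100 ends up at `mirrorX(100, 640)`, which is 540.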

  • The two videos that I'm assuming are prerequisites

  • here are the previous one, where I covered

  • all of the code for this particular Posenet example

  • that you're seeing running right here in the web editor,

  • as well as this train your own neural network set

  • of videos that covered the basics of how

  • the ML5 neural network function works to train a model

  • to play musical notes based on where the user clicks

  • their mouse in a canvas.

  • To get started, I could really begin with either one

  • of these sketches.

  • For example, I could go and get my Posenet code

  • and bring it into this particular sketch.

  • Or I could take the neural network code from this sketch

  • and bring it into the Posenet one.

  • I think I want to continue working from the Posenet sketch

  • itself.

  • And the first thing that I want to do

  • is create an object to store the neural network.

  • So I'm going to call that Brain.

  • And then after I initialize the Posenet model,

  • I'll say Brain is a new ML5 neural network.

  • And you might recall that anytime

  • you create a neural network, you can

  • specify a set of options for how you

  • configure that neural network.

  • All of the options for how to configure

  • an ML5 neural network, you can find on the documentation

  • page for the reference.

  • I'm just starting with these four basic properties-- inputs,

  • outputs, task, and debug.

  • So let's come over here to the whiteboard.

  • And let's diagram out what's going on.

  • Now remember, we're starting with the Posenet machine

  • learning model.

  • We're sending an image into that model as the input.

  • The Posenet model then takes that image

  • and does Pose estimation, making a guess

  • as to where all the key points are on the human body

  • that it sees.

  • And all of those points come in the form of xy pairs,

  • coordinates.

  • Here's my elbow.

  • Here's my shoulder.

  • Here's my ear.

  • It doesn't have an ear--

  • whatever-- nose, there's 17 of them.

  • All of this data is what I want to send

  • in as the input to my ML5 neural network.

  • ML5 neural network will take all these xy pairs

  • and classify them into a given pose that has a label.

  • It's a dab pose, or a Saturday Night Fever pose.

  • I don't know what kind of poses I'm going to make.

  • I'll do YMCA.

  • Why not?

  • This now tells me how I want to configure my neural network.

  • I want to send it 17 pairs of numbers.

  • That's 34 inputs.

  • And I want it to classify those 34 numbers

  • into one of four labels.

  • That is four outputs.

  • 34 inputs, four outputs, the task is classification.

  • And I do want to see debugging as I'm training the model.

  • And I have to give those options to the ML5 neural network

  • itself.
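
The four options worked out on the whiteboard can be written down as a plain object. This is a sketch only: `createBrain` is a hypothetical wrapper, and `ml5` is passed in as a parameter so the snippet does not assume a global.

```javascript
// 17 keypoints, each with an x and a y, gives 34 inputs.
const KEYPOINT_COUNT = 17;

const options = {
  inputs: KEYPOINT_COUNT * 2, // 34 numbers per pose
  outputs: 4,                 // one per label: Y, M, C, A
  task: 'classification',
  debug: true,                // show the training graph
};

// Hypothetical wrapper around ml5.neuralNetwork(options).
function createBrain(ml5) {
  return ml5.neuralNetwork(options);
}
```

In a sketch where ml5 is loaded globally, this would be called as `createBrain(ml5)` after PoseNet has been initialized.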

  • This is where things get kind of complicated because I

  • need to call brain.addData.

  • That's the way I add training data to my neural network.

  • So somewhere I have to have some kind of interaction.

  • Maybe I press a key.

  • I'll press the key Y and then it will wait a little bit.

  • And it'll know after five seconds,

  • for when I come over here, to start collecting pose data

  • for a certain amount of time.

  • Then it will stop.

  • And then I'll come back over here, and press a button,

  • and do something else.

  • So this requires a lot of thoughtfulness

  • in terms of how I might build the interaction around this.

  • I'm just going to try to do it in a simple way

  • that I can get it to work right here right now in this room.

  • For a much nicer example around interaction and collecting pose

  • data, you can take a look at Google Creative Lab's Teachable

  • Machines.

  • So I've made video tutorials about training image models

  • and sound models that can actually be imported into ML5.

  • At this moment, you cannot import the pose model into ML5.

  • That's something that we're working on.

  • And I'm hoping that this video tutorial

  • will lead the way to that.

  • But essentially, what I'm building

  • is a pose teachable machine.

  • I just won't do as thoughtful of an interaction as here

  • in the actual Teachable Machine project.

  • You can see here in Teachable Machine, for example,

  • there's a button that I can press.

  • And it's going to give me a 10-second countdown.

  • And then when I come over here, after 10 seconds

  • it's going to start collecting my poses.

  • So this is a much nicer example.

  • I encourage you to look at it for inspiration.

  • Of course, that was terrible training data.

  • But now I'm going to go back to my code

  • and try to implement my own version of this.

  • To keep track of the flow of the sketch,

  • let me add a variable called state.

  • And I'll just initialize it to waiting.

  • And then I will add the keyPressed function.

  • And when I press the key, I want to say state equals collecting.

  • Only, I don't want to start collecting immediately.

  • I want to wait a little bit because it's

  • going to take me some time to walk over there and get

  • into my pose.

  • So I'll use setTimeout for a delay.

  • setTimeout is a built-in function in JavaScript,

  • not part of P5, that will execute a function

  • after a certain amount of time.

  • And maybe I want to execute this function after a certain amount

  • of milliseconds.

  • So I can put a little function inside here.

  • I could use the arrow syntax.

  • There's a variety of ways I could approach this.

  • Let's just say 10 seconds later.

  • Right?

  • So when I press the key, 10 seconds later,

  • set the state equal to collecting.

  • I'll also have a variable called targetLabel.

  • And I'll set the target label equal to the key that

  • was pressed.

  • All right, so I have this function going.

  • When I press the key, whatever key I press

  • is the target label.

  • I want to see that in the console.

  • And then 10 seconds later, I want

  • to see it say that it's starting to collect.

  • Let me make it one second later so I

  • don't have to wait as long.

  • All right, and I'm going to press the Y key.

  • Y, collecting-- perfect.

  • So this is the right idea.

  • Once that state switches to collecting,

  • I want to call the ML5 neural network addData function.

  • Where I want to call the addData function

  • is right here when I have a pose.

  • So when I have a pose, I want to say brain.addData, and then

  • the inputs and the targets.

  • The inputs are all of the xy locations of the pose itself.

  • There's 34 of them.

  • I mean, I have kind of an issue where

  • the camera can't see my legs.

  • So I probably should ignore some of them.

  • But I'm just not going to worry about that.

  • I could also consider using the confidence scores.

  • Like maybe the confidence score, the neural network

  • could learn when it's a low confidence score

  • to kind of ignore that point.

  • But I'll ask you to try all that stuff if you're making

  • your own version of this.

  • I'm just going to use these 17 xy pairs.

  • So I need them to be in a plain old array.

  • And if you recall, they're not in a plain old array.

  • They're in this pose.keypoints array,

  • where each element is an object with a position.x and a position.y.

  • So I need to flatten the data.

  • Whatever format the data is in, I

  • want to just put it into a plain array.

  • So I'm going to grab this loop.

  • I'm going to create an empty array called Inputs.

  • And I'll just say inputs.push(x), inputs.push(y).

  • So this is me going through the entire pose,

  • getting all the xy's, putting them

  • in an array, which is the input to the neural network.

  • And what's the target?

  • It also wants an array.

  • But in this case, it's one thing, just the label.

  • So I can take the target label, put it in an array.

  • And that's what I'm giving the addData function.

  • You might recall in my previous neural network examples,

  • I was making objects that I passed in with named inputs

  • and outputs.

  • So this is just showing you that you can do it either way.

  • If I want to have names for all the inputs and outputs,

  • I can build an object with properties.

  • If I just want a big array of numbers,

  • I can just make it a plain array.
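
The flattening step can be sketched like this. It assumes each entry of `pose.keypoints` has a `position` object with `x` and `y`, as described above; `flattenPose` and `addExample` are hypothetical helper names, and `brain` stands for the ML5 neural network created earlier.

```javascript
// Turn pose.keypoints into a plain array: [x0, y0, x1, y1, ...].
function flattenPose(pose) {
  const inputs = [];
  for (const keypoint of pose.keypoints) {
    inputs.push(keypoint.position.x);
    inputs.push(keypoint.position.y);
  }
  return inputs;
}

// Hypothetical helper: add one labeled example to the network.
function addExample(brain, pose, targetLabel) {
  const inputs = flattenPose(pose);
  const target = [targetLabel]; // the target is also an array, holding one label
  brain.addData(inputs, target);
}
```

For a pose with 17 keypoints, `flattenPose` returns 34 numbers, which matches the `inputs` option of the network.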

  • But there's a new problem.

  • The new problem is once I start collecting the data,

  • I'm going to strike the pose.

  • And maybe I'll collect the pose for a little while.

  • I've got to stop collecting the pose.

  • So let's go back up to where I started collecting the pose.

  • I'm going to do something awful.

  • This is so painful.

  • I don't want to do it.

  • Let's just do it and then we'll revisit it later.

  • We will.

  • [MUSIC PLAYING]

  • I'm going to call setTimeout again right inside here.

  • Because a second later or 10 seconds later,

  • I want to stop collecting.

  • This might be some of the worst code I've ever written.

  • It's really awful to look at.

  • It's what's informally known as callback hell.

  • And there's a variety of ways I could approach this differently

  • by using promises, and async, and await.

  • But in this case, really all I want to do

  • is set the state to collecting in 10 seconds.

  • Then 10 seconds later, set it back to waiting.

  • And I think this will work for me.

  • Let's give it a try.

  • I need to first press Y. One one thousand, two one thousand, three one thousand,

  • four one thousand, five one thousand-- collecting.

  • 10 seconds later it should say not collecting.

  • All right.

  • OK, that worked.

  • What I'm doing here, quite poorly I might add,

  • is implementing a state machine.

  • So it might be nice for me to, in a separate video which,

  • if I can ever get around to making it,

  • talk about a more proper way of implementing a state machine.

  • But this works.

  • I set this state variable to collecting.

  • 10 seconds later, set it back to not collecting.

  • And during that time, I am adding data

  • to the ML5 neural network.
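
The nested-timeout idea can be sketched as follows. The names `createCollector` and `schedule` are hypothetical; `schedule` defaults to `setTimeout` and is only a parameter so the timing can be replaced when experimenting.

```javascript
// Hypothetical collector: waits, collects for a while, then waits again.
function createCollector(delayMs = 10000, schedule = setTimeout) {
  const collector = {
    state: 'waiting',
    targetLabel: null,
    keyPressed(key) {
      collector.targetLabel = key;
      // First delay: time to walk over and get into the pose.
      schedule(() => {
        collector.state = 'collecting';
        // Second delay: how long to keep collecting.
        schedule(() => {
          collector.state = 'waiting';
        }, delayMs);
      }, delayMs);
    },
  };
  return collector;
}
```

Calling `createCollector(1000).keyPressed('y')` would switch the state to `collecting` after one second and back to `waiting` one second after that.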

  • [DING]

  • Sorry for a second.

  • I'm coming to you from almost weeks-- several weeks,

  • a month later, a really long time,

  • look how much my beard has grown--

  • to issue a correction.

  • I have made a very significant error in this video

  • that you're watching.

  • And I don't correct it any time throughout the course

  • of this video.

  • And the error is that I forgot to actually

  • include an if statement here.

  • In the gotPoses event, when I receive a pose,

  • I should only actually call addData

  • when the state is collecting.

  • I was just doing it anyway.

  • Sure, I set the state to collecting,

  • and waiting, and then to collecting, and waiting.

  • But I didn't actually include a conditional.

  • So I had a lot of messy extra noise in the data.

  • So I just redid it now in the time

  • that I'm talking to you right now,

  • and collected the data again, retrained the model,

  • and it performed so much better than it actually

  • does in the video.

  • And the code that's released has that small correction in it.

  • Amazingly, it kind of worked anyway, as you'll see,

  • as we continue watching.

  • But just note that correction when you go look at the code.

  • It's an important detail.

  • Thanks, and enjoy the rest of this video.
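
The correction amounts to one conditional. The sketch below is an assumption about how it could look, not the released code: it assumes PoseNet delivers an array whose first element has a `pose` property, and it bundles the sketch's variables into a hypothetical `app` object with `state`, `targetLabel`, and `brain`.

```javascript
// Hypothetical pose handler with the missing conditional added.
function gotPoses(poses, app) {
  if (poses.length === 0) return;         // nothing detected in this frame
  if (app.state !== 'collecting') return; // the correction: skip unless collecting

  const pose = poses[0].pose;
  const inputs = [];
  for (const keypoint of pose.keypoints) {
    inputs.push(keypoint.position.x, keypoint.position.y);
  }
  app.brain.addData(inputs, [app.targetLabel]);
}
```

Without the second `return`, every detected pose would be added to the training data, including the frames recorded while walking into position.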

  • Immediately what I want to do right now

  • is add a function to save the data.

  • Because I do not want to do this many, many, many times.

  • So in keyPressed, I'm going to say if the key is S--

  • brain.saveData.

  • So let me quickly try collecting that one pose again

  • and make sure it can save to a JSON file

  • that I can reload later.

  • Press Y, I've got 10 seconds now.

  • Collecting, not collecting.

  • So I can come back over here.

  • I can press S. And I now have a JSON file that was saved.

  • Let's take a look at that file.

  • And this is what that file looks like.

  • For every single pose it's got xs, those are the inputs.

  • There should be 34 of them, 0 through 33.

  • Then it's got the ys, which is one label, Y.

  • So these are all of my poses saved here

  • in this big JSON file.
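
A quick way to check such a file is to count the examples per label. The shape below is an assumption based on the description above (an `xs` object keyed by index and a `ys` object holding one label); the top-level `data` key and the numbers are made up for illustration, and only two of the 34 inputs are shown.

```javascript
// Assumed shape of the saved file, shortened to two inputs per example.
const savedData = {
  data: [
    { xs: { 0: 312.4, 1: 120.9 }, ys: { 0: 'Y' } },
    { xs: { 0: 298.1, 1: 131.5 }, ys: { 0: 'M' } },
    { xs: { 0: 305.7, 1: 118.2 }, ys: { 0: 'Y' } },
  ],
};

// Count how many examples were collected for each label.
function countByLabel(file) {
  const counts = {};
  for (const example of file.data) {
    const label = example.ys[0];
    counts[label] = (counts[label] || 0) + 1;
  }
  return counts;
}
```

A very uneven count between labels would be a hint to collect more data for the smaller ones before training.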

  • Great, I can now train them--

  • I can now collect the data for all four of those poses.

  • Let's see if I can manage to do that.

  • First, Y. Collecting.

  • Not collecting, OK.

  • Now I'm going to do M. Collecting.

  • Not collecting, OK.

  • Should I really do all of these?

  • C-- really noisy data.

  • One more, A. OK.

  • Now we save the data.

  • Save.

  • OK, stop.

  • Stop the sketch.

  • I've got the data.

  • Two megabytes-- so that was a large file but not ridiculous.

  • I'm going to rename it to YMCA.

  • I'm going to now upload it to my sketch.

  • And then I'm going to comment out all the data collection

  • stuff.

  • Because I'm just going to consider myself

  • done with data collection.

  • And I'll actually duplicate the sketch.

  • And let me call this one Data Collection.

  • I'll duplicate it and call this one now Model Training.

  • The next step is when the sketch runs,

  • to load the existing data.

  • So now I don't need to collect the data.

  • I could load the data, collect more data.

  • There's so many ways you could do this.

  • But I just want to load what I collected previously

  • and then, when the data is ready--

  • when the data is ready, when it's loaded,

  • then I can call the train function.

  • There's a lot of options I could configure train with.

  • But I just wanted to go for 10 epochs.

  • That's running through all of the data 10 times.

  • I might need a lot more.

  • When it is finished, I want to just console log

  • that it's been trained and save the model to my Downloads

  • folder so that I have it saved.

  • Let's see if this works.

  • So, I see the graph pop up that would show me

  • the loss while it's training.

  • But it never went down at all.

  • Let's try giving it 100 epochs.

  • This is not a good sign.

  • [DING]

  • Guess what I forgot to do?

  • Something very important.

  • What is the data that I'm collecting?

  • These xy values are on my P5 canvas,

  • which has a resolution of 640 by 480.

  • So they are large values.

  • They need to be normalized down to a standard between 0 and 1.

  • So I'm going to let the ML5 library take

  • care of that for me by just adding the normalizeData

  • function.

  • So right here in dataReady before I train the model,

  • I can call normalizeData.

  • Let's run it one more time.

  • Let's just go down to 50 epochs.

  • And there we go.

  • Now I see a loss going down the way I had hoped it would.

  • And the model is trained and presumably saved

  • to my Downloads folder.
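
Put together, the training sketch follows a load, normalize, train, save sequence. The wrapper `trainFromFile` and the file name in the usage note are hypothetical; `brain` is the ML5 neural network, and the calls used on it are `loadData`, `normalizeData`, `train`, and `save`.

```javascript
// Hypothetical wrapper around the load -> normalize -> train -> save sequence.
function trainFromFile(brain, dataPath, epochs) {
  brain.loadData(dataPath, function dataReady() {
    brain.normalizeData(); // scale the pixel coordinates before training
    brain.train({ epochs: epochs }, function finished() {
      console.log('model trained');
      brain.save();        // saves the trained model files
    });
  });
}
```

Assuming the collected data was uploaded as `ymca.json`, the call would be `trainFromFile(brain, 'ymca.json', 50)`.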

  • Let's go take a look.

  • Because I was doing this multiple times,

  • I have a mess of files down here in the Downloads folder.

  • But the one that was most recent is number 5.

  • So I'm going to get rid of everything

  • and just rename these back to the default names.

  • And now I can upload the model that I trained back up

  • to the P5 web editor.

  • Let's duplicate the sketch one more time.

  • I'm going to call this Posenet Deploy.

  • Let's create a folder, call it Model.

  • Add the model files.

  • I can see the files over here.

  • And now instead of loading the data,

  • I can load the trained model.

  • But if you recall from the previous videos,

  • there are three files for the model.

  • So I need to create an object to store all three file names.

  • The format for how that has to be,

  • I can find on the reference page for ML5 neural network.

  • Copy this to clipboard.

  • Bring it over here and put in my path,

  • which is just called Model.
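
The object listing the three files might look like the sketch below. The file names are an assumption (the default names ML5 is believed to use when saving a model), and `modelFiles` and `loadBrain` are hypothetical helpers; the reference page remains the source for the exact format.

```javascript
// Build the three-file description for a model stored in folder `dir`.
// The file names are assumed defaults, not confirmed by the video.
function modelFiles(dir) {
  return {
    model: `${dir}/model.json`,
    metadata: `${dir}/model_meta.json`,
    weights: `${dir}/model.weights.bin`,
  };
}

// Hypothetical loader: `brain` is the ML5 neural network.
function loadBrain(brain, dir, brainLoaded) {
  brain.load(modelFiles(dir), brainLoaded);
}
```

Keeping the folder name in one place makes it harder to end up with a missing quote or a mismatched path in one of the three entries.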

  • Let's run the sketch and see if I can just get model ready here

  • in the console.

  • Oh, oh, I'm just missing a quote.

  • Thank goodness.

  • And I'm inconsistently using single and double quotes.

  • Let's fix that.

  • Oh, it's not neural network.

  • I called it Brain.

  • So close.

  • Posenet ready, Posenet ready.

  • Huh?

  • Oh, I have a callback for Posenet

  • for when it's ready called modelLoaded.

  • So this needs to be--

  • I'll call this brainLoaded.

  • So remember, we're using two machine learning models here,

  • Posenet, which is doing the pose estimation.

  • Then the outputs of that model are

  • going into my own neural network that I've trained called Brain.

  • Let's try this one more time.

  • Pose classification ready, Posenet ready.

  • Wonderful.

  • Incidentally, the classification model

  • was loaded first because Posenet actually

  • has to reach out to the cloud and download

  • the model from Google servers.

  • We are so close to being done with this.

  • Just a couple more steps.

  • Once my brain is loaded, I can actually ask it to classify.

  • So I can say brain.classify(inputs).

  • And when you've got a result, call gotResult.

  • The question is, what are those inputs?

  • These are those inputs.

  • The same thing I did when I was collecting the training data--

  • grabbing the xy pairs, flattening them into an array--

  • I need to do that for the inputs that I'm

  • sending in when I'm deploying the model

  • and asking it to guess, asking it to do that classification.

  • Here's the loop from the data collection

  • where I flattened it into a single array.

  • I can grab that and I can bring that in here.

  • But I'm going to, once I get the result, want to do this again.

  • So really, I should take all of this,

  • write a new function called classifyPose,

  • make sure there is a pose in the first place.

  • Then once the brain is loaded, just call classifyPose.

  • I need the gotResult callback, which has two arguments,

  • error and results.

  • And I'm just going to say console.log

  • results[0].label.

  • Let's console log the whole results as well.

  • So in theory, the first pose I get should be--

  • if there is a pose when the brain is loaded, or there

  • won't be.

  • So I'm going to have to--

  • I'm going to put the recursive loop call in here

  • and call classifyPose again.

  • So the idea is when the brain is loaded, classify a pose.

  • If there's a pose, call brain.classify.

  • But what if there's no pose?

  • I'll never get a result. Otherwise, hmm--

  • this is really silly, but I'm anticipating an issue

  • that I'm going to have.

  • So what I'm going to do is if it doesn't detect a pose,

  • let's just say hey, in a little bit--

  • why don't you wait like 100 milliseconds

  • and call classifyPose again?

  • So that way it will continue to check.

  • So at some point eventually, there will be a pose.

  • It'll call brain.classify when that's done.

  • It will call it again.

  • If it ever stops detecting a pose,

  • it won't stop.

  • It will actually continue and just every 100 milliseconds,

  • keep trying again.
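
The classify-and-retry loop can be sketched like this. The names `createClassifier`, `getPose`, `onLabel`, and `schedule` are hypothetical; `brain.classify(inputs, callback)` is assumed to call back later with `(error, results)`, where `results[0]` carries the most likely `label`.

```javascript
// Hypothetical factory for the classification loop.
// getPose() returns the latest pose or null; onLabel receives each new label.
function createClassifier(brain, getPose, onLabel, schedule = setTimeout) {
  function classifyPose() {
    const pose = getPose();
    if (!pose) {
      schedule(classifyPose, 100); // no pose yet: check again in 100 ms
      return;
    }
    const inputs = [];
    for (const keypoint of pose.keypoints) {
      inputs.push(keypoint.position.x, keypoint.position.y);
    }
    brain.classify(inputs, gotResult);
  }

  function gotResult(error, results) {
    if (error) {
      console.error(error);
      return;                      // this sketch stops the loop on an error
    }
    onLabel(results[0].label);
    classifyPose();                // ask for the next classification
  }

  return classifyPose;
}
```

The returned function is what would be called once the brain has loaded; after that the loop keeps itself going, either through the result callback or through the 100 millisecond retry.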

  • All right, I think--

  • there's no way this is going to work, right?

  • Let's give it a try.

  • All right, pose ready.

  • Y A, Y--

  • C, C-- A, Y, M. What?

  • This actually worked?

  • Well, let's just throw caution to the wind

  • and draw the label in big letters on the canvas.

  • I'm going to set it equal to Y so that I see it

  • right there at the beginning.

  • And at the end of Draw, I'm going to say fill 255,

  • 0, 255, no stroke.

  • OK, there's the Y--

  • oh, it's going to be reversed.

  • It's going to be flipped because of the--

  • so I'm going to also add push here.

  • It doesn't matter, Y is symmetrical.

  • Y and the C won't work.

  • And then pop here so that the Y is always in the center.

  • OK, now when I get a result, set it equal to that label--

  • I kind of want to see the confidence.

  • So I'll log the confidence to the console.

  • Because I want to see how well it-- how sure

  • it is about that particular label.

  • All right, here we go.

  • [HUMMING "YMCA"] One, two, three, four, one.

  • It's fun to stay at the YMCA.

  • It's fun to stay at YMCA.

  • All right, we need these to be capital letters.

  • [HUMMING "YMCA"] One, two, three, four, five.

  • [HUMMING "YMCA"]

  • Let's go slower.

  • YMCA.

  • So, thanks for watching this.

  • I'm kind of shocked it works.

  • There's so much more that could be done with this.

  • First of all, a question came up, which is you

  • forgot to normalize the data during this classification

  • process.

  • And ML5, one of the nice things it does,

  • is it saves the normalization minimums and maximums

  • from the training process and then reapplies them later.

  • So I don't have to--

  • I don't have to call normalizeData again.

  • That's happening behind the scenes.
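
The idea of storing minimums and maximums and reapplying them can be illustrated with plain min-max scaling. This is a generic sketch, not ML5's internal code; `fitMinMax` and `applyMinMax` are hypothetical names.

```javascript
// Find the minimum and maximum of each input column in the training rows.
function fitMinMax(rows) {
  const mins = rows[0].slice();
  const maxs = rows[0].slice();
  for (const row of rows) {
    row.forEach((value, i) => {
      mins[i] = Math.min(mins[i], value);
      maxs[i] = Math.max(maxs[i], value);
    });
  }
  return { mins, maxs };
}

// Rescale one row into the 0 to 1 range using the stored values.
function applyMinMax(row, { mins, maxs }) {
  return row.map((value, i) => {
    const range = maxs[i] - mins[i];
    return range === 0 ? 0 : (value - mins[i]) / range;
  });
}
```

If the training rows span 0 to 640 in x and 0 to 480 in y, a new point at (320, 120) is rescaled to (0.5, 0.25) using the stored values, with no second pass over the training data.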

  • Otherwise, you know, what other kinds of labels,

  • what kinds of other outputs-- you know, could

  • I play different drumbeats based on a pose?

  • What other kinds of things could you classify?

  • If I collected a much larger data set,

  • if I was more thoughtful about how I was collecting the data,

  • can I get this to be much more accurate?

  • Give this a try.

  • Make something.

  • What I definitely also want to do,

  • which I'll need to make another video about,

  • is turn this into a regression.

  • So could I take that example where I play a frequency--

  • [HUMMING] --and alter that frequency

  • by changing my motion, my movements?

  • And in fact, I could actually have the output

  • of that regression be color.

  • That might be something to explore.

  • I got this idea from a viewer named Darshawn

  • who submitted a community contribution where,

  • instead of painting with sound, you're painting

  • with color with a regression.

  • And that's really interesting because it

  • requires you to have three different outputs--

  • an R, a G, and a B. So take a look at that project.

  • And maybe I'll think about doing something

  • with poses and color output in the next video tutorial.

  • [HUMMING "YMCA"]

  • I have an idea.

  • What if I only update the label if the confidence is

  • above a given threshold?

  • Let's take 75%.

  • Maybe this will eliminate some of the noise.

  • [HUMMING "YMCA"]

  • It totally helps, right?

  • Because it's not flickering as much because it's

  • only altering it if it's really confident about what

  • the pose is.
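
The threshold check is small enough to show in full. This is a sketch with a hypothetical helper name, `nextLabel`; it assumes each result has a `label` and a `confidence` between 0 and 1, and it also upper-cases the label for display.

```javascript
// Keep the current label unless the top result is confident enough.
function nextLabel(currentLabel, results, threshold = 0.75) {
  const best = results[0];
  if (best.confidence > threshold) {
    return best.label.toUpperCase();
  }
  return currentLabel;
}
```

Raising `threshold` reduces flicker further, at the cost of the label updating less often when two poses look alike.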

  • And I can maybe make that threshold even higher.

  • [HUMMING "YMCA"]

  • The M and the A are so similar.

  • You're all right about that.

  • It is able to get it though.

  • M is something slightly different.

  • I'm a little out of breath, but one more thing I want to say.

  • Again, unlike with my previous examples where

  • I've trained models that do something similar based off

  • of images and pixels, in this case,

  • the pose data is generic enough that I think,

  • if you go to the URL for this sketch, which

  • is in this video's description, you can make those poses

  • and hopefully it'll recognize them.

  • So give that a try.

  • And I can't wait to see you in a future coding video.

  • Goodbye, have a great day.

  • [WHISTLE]

ml5.js: Pose Classification with PoseNet and ml5.neuralNetwork()

Posted by 林宜悉 on January 14, 2021