  • Hello and welcome to another beginner's guide

  • to machine learning with ml5.js video on pose estimation

  • and PoseNet.

  • So this is the third, the last one that I'll do in this series

  • here about PoseNet.

  • First I looked at just what PoseNet is and how it works

  • and how you can get the key points of a human skeleton.

  • Then I took the output of the PoseNet model, all

  • those key points, and fed them into another neural network

  • to do pose classification, to recognize different poses

  • that I made with my body.

  • And in this grand finale pose video,

  • I will do exactly what I did in the previous video

  • with pose classification,

  • but perform a regression.

  • So the final output instead of being a classifier,

  • am I making a Y, M, C, or A pose, I will make a regression.

  • What do I mean by that exactly?

  • So to review, the setup I have is as follows.

  • [MUSIC PLAYING]

  • The system starts with an image.

  • It sends that image into the pre-trained PoseNet

  • machine learning model.

  • That model performs pose estimation and gives as its

  • output 17 x,y pairs.

  • Wrist, elbow, shoulder, shoulder, elbow, wrist,

  • et cetera, et cetera, et cetera.

  • And then I take all of those and feed them

  • into another neural network, an ml5 neural network, which

  • then classifies those key points as Y, M, C, or A.

  • So that's the process that I've built in the first two videos.
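
The hand-off between the two models, turning PoseNet's 17 keypoints into a flat list of 34 numbers, might look roughly like this. It is a minimal sketch that assumes each keypoint has the shape `{ position: { x, y } }`; `flattenPose` and `fakePose` are hypothetical names used only for illustration.

```javascript
// Flatten a pose's keypoints into [x0, y0, x1, y1, ...].
// Assumes each keypoint looks like { position: { x, y } }.
function flattenPose(pose) {
  const inputs = [];
  for (const keypoint of pose.keypoints) {
    inputs.push(keypoint.position.x);
    inputs.push(keypoint.position.y);
  }
  return inputs;
}

// Stand-in pose with 17 made-up keypoints (not real PoseNet output).
const fakePose = {
  keypoints: Array.from({ length: 17 }, (_, i) => ({
    position: { x: i * 10, y: i * 5 },
  })),
};

console.log(flattenPose(fakePose).length); // 34
```

The same 34 numbers feed the second network whether it classifies or performs a regression; only the outputs change.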

  • I want the final output to no longer be categorical.

  • It's not one of four options.

  • The final output is any number.

  • So you could think of it as the final output

  • is going to control a slider.

  • And that slider is going to have some sort of range.

  • So what I did previously in other examples of regression

  • in this full series if you go back,

  • I used a neural network to output a frequency value

  • to play a musical note.

  • So I certainly could do that here.

  • I could train the machine learning model

  • to play the note [SINGING] for this pose

  • and [SINGING] for this pose.

  • And I could actually have something that output like

  • [SINGING].

  • So I could go and do that.

  • And boy wouldn't that be fun to watch?

  • But I want to do something different.

  • That I'll leave as an exercise to you.

  • Make a gesture- or pose-based musical instrument.

  • I am going to control color.

  • And this comes from a project that I referenced inspired

  • by a viewer, Darshawn, who made a project that does an output.

  • Because specifically what I want to demonstrate

  • here is that the regression output doesn't

  • have to be a single number.

  • In this case, I want to have three values.

  • And I'm going to think of those values as an R for red,

  • a G for green, and a B for blue.

  • So I can say things like, and the training can be,

  • this pose is this particular color.

  • This pose is this particular color.

  • And then this pose is this other particular color.

  • And then as I move, it will interpolate

  • between those colors by trying to guess the value according

  • to the regression.
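
As a rough intuition for "interpolating between colors", here is a plain linear blend of two RGB colors. This is only an analogy: a trained network learns its own mapping from poses to colors and does not necessarily blend linearly. `blendColors` is a hypothetical helper, not part of ml5 or p5.

```javascript
// Linearly blend two [r, g, b] colors; t = 0 gives a, t = 1 gives b.
function blendColors(a, b, t) {
  return a.map((channel, i) => channel + (b[i] - channel) * t);
}

const red = [255, 0, 0];
const blue = [0, 0, 255];

// Halfway between the two trained colors.
console.log(blendColors(red, blue, 0.5)); // [127.5, 0, 127.5]
```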

  • Now I'm ready to start implementing this in code.

  • So I'm not going to write everything again.

  • I'm going to start from the pose classifier.

  • And the first thing that I need to do

  • is adjust the configuration of the neural network.

  • The difference is, instead of four categorical outputs, Y, M, C,

  • or A, I just need three continuous outputs.

  • So I could actually just change this number to three.

  • Because it's still a number of outputs but the task

  • is now regression.
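
The configuration change described here could be sketched as follows. The option names follow ml5's `neuralNetwork()` options as I understand them; the variable names `options` and `brain` and the `debug` setting are assumptions, since the transcript does not show the code itself.

```javascript
// Configuration for the second network: 17 keypoints * 2 = 34 inputs,
// 3 continuous outputs (red, green, blue).
const options = {
  inputs: 34,
  outputs: 3,
  task: 'regression',
  debug: true,
};

// Only create the network when ml5 is loaded (for example in the browser sketch).
let brain;
if (typeof ml5 !== 'undefined') {
  brain = ml5.neuralNetwork(options);
}
```

The number of outputs drops from four to three, and the task changes from classification to regression.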

  • The other thing I really need to do

  • is think about during the training process,

  • how am I going to create these target values?

  • And this is going to be really tricky.

  • So maybe this color scenario isn't the best one.

  • I'm only one person here.

  • But I think to demonstrate this idea,

  • the best way would be for me to make these literal sliders.

  • So I'm going to adjust the sliders

  • and make the target outputs based

  • on the position of the sliders.

  • And then when I actually deploy the model,

  • the model will control the sliders themselves

  • and I'll see the color.

  • I think that's going to work.

  • So this target label is no more.

  • I don't have a target label, there's no categorical output.

  • Instead I'm going to have sliders.

  • So let's comment this out and say, three sliders for red,

  • green, and blue.

  • They're all going to have a range between 0 and 255

  • with some default value, in this case 0.

  • And I'll have the sliders start with red at 255 and G

  • and B at 0.

  • So we can see these are the sliders

  • that I'm now going to control.

  • And match their positions with a given pose.

  • Now if you recall, I had this horribly awkward,

  • for a variety of reasons, interface.

  • As in, no interface at all with just key presses

  • to set a label.

  • And then I'd have this, like, callback

  • hell with nested setTimeouts.

  • Let me improve this a little bit for this round.

  • So one thing that I can do to improve this,

  • and I haven't been using this throughout this video series,

  • I've been staying away from it.

  • But I'm going to replace this with something called async

  • and await.

  • These are key words that operate in JavaScript.

  • They're part of ES8, which is a newer version of JavaScript

  • that allows me to have asynchronous events happen much

  • more sequentially in the code.

  • And I've covered this previously in several videos.

  • If you haven't seen that, you'll want

  • to go watch those or read up about promises and async and

  • await somewhere else.

  • But what I'm actually going to do

  • is I'm just going to go get the code

  • from a very specific video where I wrote this delay function.

  • I'm going to bring that in here.

  • And then I'm going to change key press to use async and await

  • with that delay function.

  • And let me just do that and then explain what I mean.

  • [MUSIC PLAYING]
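
The code written during this stretch is not in the transcript, but a delay helper used with async and await might look something like this. It is a minimal sketch: `delay`, `state`, and `runCollection` are assumed names, and the wait times are parameters rather than the exact values used in the video.

```javascript
// Resolve a promise after `ms` milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

let state = 'waiting';

// Wait to get into position, collect for a while, then stop.
// Each step reads top to bottom instead of nesting setTimeout callbacks.
async function runCollection(prepMs, collectMs) {
  await delay(prepMs);
  state = 'collecting';
  await delay(collectMs);
  state = 'waiting';
}

// Example: 10 seconds to prepare, 10 seconds of collecting.
// runCollection(10000, 10000);
```

An `async` function always returns a promise, so the caller is not blocked while the delays run.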

  • Oh, it is so lovely, look at it.

  • Look at this nice sequential code that's, like,

  • set the target label, console log it, wait 10 seconds,

  • then do this.

  • Then wait 10 more seconds, then do this.

  • Isn't this lovely?

  • It is really worth taking some time

  • to read up and explore async and await so that you can have

  • some much more readable code.

  • This is all still happening asynchronously.

  • JavaScript, everything happens asynchronously.

  • This is just sweet syntactic sugar

  • to make our lives a little bit more joyful today.

  • But, ah, that's not really the content of this video.

  • That's not the topic.

  • The topic is, I don't have a target label anymore.

  • What I have is--

  • and actually, let's just change this to if key equals--

  • like, I'm no longer going to be collecting

  • a particular key press.

  • So let's just have the collection

  • moment happen when I do--

  • so D for data.

  • And then I'm going to have a target color.

  • And it's going to be an array with the values of all

  • the sliders.

  • [MUSIC PLAYING]
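
Reading the three slider values into a target array could be sketched like this. It assumes each slider exposes a `value()` method, as p5's slider elements do; `readTargetColor` and `fakeSlider` are hypothetical names that let the sketch run outside of p5.

```javascript
// Read the current value of each slider into [r, g, b].
function readTargetColor(rSlider, gSlider, bSlider) {
  return [rSlider.value(), gSlider.value(), bSlider.value()];
}

// Stand-in sliders so the sketch runs without p5.
const fakeSlider = (v) => ({ value: () => v });

const targetColor = readTargetColor(
  fakeSlider(255),
  fakeSlider(0),
  fakeSlider(0)
);
console.log(targetColor); // [255, 0, 0]
```

Each entry must read from its own slider; copying the red slider's line three times would produce a gray target no matter where the green and blue sliders sit.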

  • So the idea is that when I press the D key,

  • I'm going to pull the values from the sliders.

  • I'm going to set that to a target color.

  • I'm going to wait 10 seconds so I can get in position.

  • And then start the collecting process,

  • collect for 10 seconds, and then jump out.

  • Now it would be much better interaction wise

  • if I could manipulate the sliders

  • while I'm making the pose.

  • And if I could just, like, open the magic door

  • and have a volunteer come in and help me with this,

  • that might make more sense.

  • But I guess I didn't think of that in advance

  • so I'll do that another time.

  • I also think that I'm going to be able to get

  • into position a little faster.

  • So let me change this to 3,000.

  • But I haven't done the important part.

  • This target color needs to replace the target label

  • when I collect the data.

  • That's happening right here.

  • So previously I had this target label

  • that was a character that I put into an array.

  • And then passed it into addData.

  • I think I can get rid of this now and just say target color.

  • So this should be good.
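
The change to the data collection step might be sketched as below. The `brain` object here is a stand-in that only records what it receives, so the example runs without ml5; in the real sketch it would be the ml5 neural network, whose `addData(inputs, target)` method takes the same two arguments. `addSample` is a hypothetical name.

```javascript
// Stand-in for the ml5 network: records whatever addData receives.
const brain = {
  rows: [],
  addData(inputs, target) {
    this.rows.push({ inputs, target });
  },
};

let targetColor = [255, 0, 0]; // set when the D key is pressed

// Before: the target was a one-element array holding a label character.
// After: the target is the array of three slider values.
function addSample(inputs) {
  brain.addData(inputs, targetColor);
}

addSample([0.1, 0.2]); // shortened input list, for illustration only
console.log(brain.rows[0].target); // [255, 0, 0]
```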

  • OK, dare I say that I can collect this data now?

  • Oh, the chat thankfully is pointing out

  • that I missed adjusting these to G

  • and B. Oh, that would have really gotten me later,

  • thank you.

  • So I think also I just want to collect

  • data for, like, 3 seconds.

  • Because I'm going to do things like set the color,

  • set the color.

  • I'm going to move my arms maybe like this.

  • And then just set a lot of different colors

  • with lots of in-between states.

  • That'll really show, I think, the regression more clearly.

  • Let me also console log what color is there just so I see it.

  • I'm going to start with the sliders

  • in their original position.

  • And press D. One, two, three.

  • Collecting.

  • OK, I got some data.

  • Now let me adjust the slider a little bit.

  • Let me add some of this color.

  • I really should pick something where I could see what it is.

  • Oh well, next time.

  • Add, press D.

  • Wait, what happened to my pose?

  • Uh-oh, I have a bug.

  • Bug, bad bug.

  • Bad, bad, bad bug.

  • I re-declared target color.

  • I'm making it a global variable so that I can use it across.

  • I mean, there's ultimately a nicer way to organize the code.

  • But I want it to be a global variable.

  • So I set it here and then when I'm adding it I get it here.

  • That was the problem, OK.
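
The bug described here, redeclaring a variable that was meant to be global, can be reproduced in a few lines. This is a reconstruction of the kind of mistake described, not the actual code from the video; the function names are hypothetical.

```javascript
let targetColor; // global, shared by key handling and data collection

function setTargetShadowed(values) {
  // Bug: `let` declares a new local variable, so the global stays undefined.
  let targetColor = values;
  return targetColor;
}

function setTargetGlobal(values) {
  // Fix: assign to the existing global instead of redeclaring it.
  targetColor = values;
}

setTargetShadowed([255, 0, 0]);
console.log(targetColor); // undefined

setTargetGlobal([255, 0, 0]);
console.log(targetColor); // [255, 0, 0]
```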

  • Now, let's collect some data.

  • Collecting, OK.

  • Now, let me move the sliders around.

  • I really should visualize the color.

  • But what are you going to do?

  • I'll just add a little green and take away a little bit of red.

  • I don't know.

  • And press D again.

  • And, where was I?

  • I'll go like this.

  • Really make this pretty arbitrary.

  • Oh, it really would be good for me to see what I'm doing.

  • I'll make this pose.

  • Let's do this.

  • So you, following this along, if you're

  • going to try to build the same thing,

  • think about how you might really thoughtfully make

  • a bunch of colors with a defined set of poses

  • that means something to you.

  • I'm doing this somewhat arbitrarily

  • just to see if we get some results.

  • Now I could hit S to save the data.

  • And I have a nice JSON file, this default name

  • that downloaded.

  • Let me change this to color poses.

  • Let's take a look at it in Visual Studio Code

  • just to make sure it makes sense.

  • Looks like it does.

  • It's got a bunch of X's, 34.

  • It's got some Y's.

  • The Y's are the outputs, and it's an R, G, and B value.

  • So I could have done the thing where I named the outputs.

  • If I wanted to have names show up in the data

  • I could change this to--

  • [MUSIC PLAYING]

  • So ml5, the neural network is just dealing with numbers.

  • But ml5 will allow you to specify names of the output

  • so that when you get them back later

  • you can figure out which is which.

  • But I'm just going to remember there's three

  • and they're in the order, red is 0, green is 1, blue is 2.

  • That'll be simpler right now.
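
The named-output alternative mentioned here is not shown in the transcript. One way it might look is sketched below, assuming ml5 accepts an array of names for `outputs` and an object keyed by those names as the training target. Treat both as assumptions to check against the ml5 documentation; `toNamedTarget` is a hypothetical helper.

```javascript
// Hypothetical alternative: name the three outputs instead of using indices.
const namedOptions = {
  inputs: 34,
  outputs: ['red', 'green', 'blue'],
  task: 'regression',
};

// Convert an [r, g, b] array into an object keyed by those names.
function toNamedTarget(color) {
  const [red, green, blue] = color;
  return { red, green, blue };
}

console.log(toNamedTarget([255, 128, 0])); // { red: 255, green: 128, blue: 0 }
```

With plain numeric outputs, the order has to be remembered instead: index 0 is red, 1 is green, 2 is blue.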

  • Now I can go to the training the model stage.

  • So the truth of the matter is, I could add

  • in keyPressed another option.

  • I press T, it trains the model.

  • But the way I made my classifier,

  • I did those in three separate sketches.

  • Collect data, train the model, deploy the model.

  • So I'm going to keep going in that way.

  • I'm going to open up the model training sketch from before.

  • I duplicated it and renamed it to regression.

  • The only thing that I need to change here

  • is the outputs are three, the task is regression.

  • And then I need a new data file.

  • So I'm going to delete other data files I have

  • and upload my new file.

  • Load that file.

  • And then everything is the same.

  • I'm just running the train function.

  • And then when it's done, save the model.

  • So there's very little that I need to change here, just

  • a different configuration.

  • Load a different data file and train the model.
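
The training sketch's flow (load the data file, train, save when done) might look roughly like this. `loadData`, `normalizeData`, `train`, and `save` are methods of ml5's neural network; the file name, the epoch count, the normalization step, and the names `trainFromFile` and `fakeBrain` are assumptions, since the transcript does not show them.

```javascript
// Load saved samples, normalize them, train, then save the model.
// `brain` is expected to offer loadData, normalizeData, train, and save.
function trainFromFile(brain, dataPath, epochs) {
  brain.loadData(dataPath, () => {
    brain.normalizeData();
    brain.train({ epochs }, () => {
      brain.save();
    });
  });
}

// Stand-in brain that records each call so the order is visible without ml5.
const calls = [];
const fakeBrain = {
  loadData(path, done) { calls.push('loadData'); done(); },
  normalizeData() { calls.push('normalizeData'); },
  train(trainOptions, done) { calls.push('train'); done(); },
  save() { calls.push('save'); },
};

trainFromFile(fakeBrain, 'color_poses.json', 50);
console.log(calls.join(' -> ')); // loadData -> normalizeData -> train -> save
```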

  • I hope this works.

  • I really hope this works.

  • If you're watching this right now,

  • you don't know how many times I've tried to get this to work

  • where it hasn't.

  • That's promising.

  • A little bit of wonky stuff going on, but it trained.

  • The loss went down.

  • I think I've gotten some results here.

  • And it looks like those files have downloaded.

  • And we could see those files there

  • in my downloads directory, which means

  • I can go to the last sketch, the one where I load the trained

  • model and deploy it.

  • So I've opened that sketch, I've duplicated it,

  • and now I just want to delete the model files

  • and upload my new ones.

  • Files are uploaded.

  • Adjust the configuration of the network.

  • I'm going to delete some old code that's

  • no longer being used.

  • And I don't have a label anymore.

  • This shouldn't be called classify pose anymore.

  • Let's just call it predict, predict pose or predict color.

  • Call it predict color.

  • And this should be the brain.

  • Because I'm doing regression I shouldn't

  • call the classify function.

  • I should call the predict function.

  • This changes to predict color.

  • And now the main work here is I need

  • to change what happens when I get the results.

  • So before I was looking at a confidence score

  • and getting a label.

  • Now I just want the raw red, green, and blue values.

  • So this should change to predict color.

  • Let me just console log the results.

  • So let's see what the results look

  • like in the case of a regression.

  • They'll be formatted differently than when

  • it came as the classification.

  • It's no longer a sorted list of labels

  • ordered by confidence score, sorted by confidence score.

  • Let me also make sure to comment out this pose label, which

  • no longer gets drawn.

  • We're just going to look at the console now.

  • So in theory the first pose that it sees,

  • I should get an output here that has

  • red, green, and blue values.

  • Uh-oh, I don't see any values.

  • What happened?

  • Oh, for some reason the path says model 2.

  • I've been messing around with this code

  • and that says model 2.

  • Weird that it didn't give me an error like it couldn't find it.

  • Is it in the--

  • oh, yeah.

  • It's saying failed to load here.

  • I don't know whose fault this is, whether the web

  • editor should be showing this or ml5 didn't log it correctly.

  • But that's definitely the problem.

  • Path is model.

  • Let's try this again.

  • No pose, no pose, no pose, no pose, predicting.

  • And I'm seeing, oh, three objects, R, G, and B. Let's

  • take a look inside.

  • An R, G, and a B. An R, G, and a B.

  • So I should be able to use those values now

  • to set the positions of sliders.

  • Oh, I got to put the sliders in.

  • And then also I could just draw the color.

  • I kind of want to see the sliders move, though.

  • I think it would be fun.

  • [MUSIC PLAYING]

  • So I have three sliders there.

  • So now when I get the result, I can assign it to the slider.

  • So I can say R equals results index 0 dot value.

  • Pretty sure if I go back and look

  • at what was in that object, you'll

  • see there was an array of three objects.

  • And the red value was in a property called value.

  • And then there's 1 and 2, so there were three.

  • Then I should be able to set the slider's position

  • to these values.
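
Pulling the three values out of the regression result could be sketched as follows. It assumes, as described above, that `results` is an array of three objects that each carry a `value` property, in the order red, green, blue. `resultsToColor` and `fakeResults` are hypothetical names, and the numbers are made up.

```javascript
// Pull red, green, and blue out of a regression result.
function resultsToColor(results) {
  return results.slice(0, 3).map((result) => result.value);
}

// Made-up result, shaped like what was logged in the console.
const fakeResults = [{ value: 200 }, { value: 40.5 }, { value: 10 }];

const [r, g, b] = resultsToColor(fakeResults);
console.log(r, g, b); // 200 40.5 10

// In the p5 sketch, the sliders could then be moved with, for example:
// rSlider.value(r); gSlider.value(g); bSlider.value(b);
```

A regression output is not guaranteed to stay inside 0 to 255, so clamping the values before using them as a color may be worth considering.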

  • And then I also might as well add something

  • to draw the color.

  • So here before, when I was getting a label,

  • I drew it as text.

  • Let's draw a background overlay on the video

  • with a little bit of alpha.

  • Let's grab the values from the sliders which were set.

  • And set that as the background with some alpha.

  • Did I get this?

  • Let's run it.

  • OK, look at the colors moving.

  • The sliders are sliding based on whatever pose I'm making.

  • If only I could remember what it was that I did.

  • But anything is going to give me a predicted value.

  • I'm controlling sliders with my body.

  • It's all very arbitrary.

  • But hopefully you can see that if you

  • did this in a thoughtful way, maybe color

  • isn't the output that you want.

  • Maybe three isn't the number of regression outputs

  • that you want.

  • Maybe it's music and frequencies or this or that.

  • You must have a creative idea.

  • But you can see that if you can, as you're moving your body,

  • match the position of your body with some set of numbers

  • you could then train a model to learn

  • all of those relationships.

  • And then interpolate between them

  • as you move your body around.

  • So I imagine that there's a very creative, exciting, fun, unique

  • way of doing this.

  • And I hope that you will explore it.

  • So if you do, please share it with me.

  • There's a variety of different ways to do it.

  • You can find the page on TheCodingTrain.com

  • for this particular video.

  • Ask your questions in the chat, on social media,

  • all of the above.

  • We have a new Discord, which I'll just

  • happen to mention in this particular video.

  • Coding Train has a Discord, you can find the link

  • to that in this video's description.

  • That's another way you can join the community and share what

  • you've done.

  • So thank you so much for sticking with me.

  • I don't know how easy this was to follow

  • or if this makes sense because I used so much

  • of the previous code in it.

  • So if you didn't watch those previous videos,

  • hopefully those would fill in some gaps for you.

  • But let me know.

  • I can always revisit this in a future video.

  • And thanks for watching. Goodbye.

  • [MUSIC PLAYING]

ml5.js: Pose Regression with PoseNet and ml5.neuralNetwork()

  • Posted by 林宜悉 on January 14, 2021