  • What's going on, everybody, and welcome to part four of the ML in Halite III tutorials.

  • Where we left off, we were creating, for real, hopefully this time, some training data.

  • I'm going to be pretty angry if this is not correct.

  • But we checked it last time, and I think I'm all set.

  • Anyway, what I've got here is actually over 50,000 files and about 50 gigabytes of training data.

  • I think we're all set.

  • So I'm going to go ahead and stop all of these from running, close them, pop into here, come into training data, and sort by size. It looks like they're all between 2,004 and 2,006.

  • I want to make sure none of them are just, like, empty.

  • And then we can also sort by the name.

  • Looks like the largest is just under 5,500. But pretty quickly, I mean, we only have a handful of games

  • where the AI somehow collected over 1,000 halite.

  • Which is pretty impressive because, recall, we're only making one ship. So that starts at 5,000, one ship goes down to 4,000, and we set this threshold to save at just basically 4,100.

  • So, did that ship collect any halite?

  • Now what we want to do is find out where we should draw the line.

  • So, for example,

  • let's come into this. This is just the data.

  • I don't really care about that.

  • Let's go over here.

  • Okay.

  • So I guess we could take testing grounds.

  • I think we'll just modify testing grounds. There's a couple of things I want to check.

  • So the first thing would be, let's do import os.

  • And then all_files = os.listdir("training_data").

  • Don't forget to put this in quotes. And what I'd like to do is import matplotlib.pyplot as plt.

  • Actually, this machine probably doesn't have matplotlib.

  • Let's go ahead and grab that.

  • Let me make sure pip is tied to the Python that we're using.

  • Hurry up.

  • Okay.

  • pip install matplotlib.

  • So that will grab matplotlib.

  • Make sure we can.

  • We should be good, though.

  • I just want to make sure there are no errors.

  • What I'd like to do is plot, like, a distribution of the scores, kind of for two reasons.

  • One is I'm just curious right now.

  • But also, what we could do is keep this exact same threshold.

  • And then after we've trained a model, we can see: is the distribution exactly the same? Or, almost, like, what's the average score of the games?

  • You also could do an average, or both.

  • I'm curious to know, after we train a model, how much better is our new model?

  • So, coming back over here.

  • all_files = os.listdir("training_data").

  • Let's just print len(all_files).

  • Run that really quick.

  • Let's see how quickly we can get through that. Pretty, pretty darn fast.

  • So then, for f. Because if we use file, I think that's...

  • I thought file was a keyword.

  • Anyway, I'm going to stick with f: for f in all_files.

  • So that should be the file name.

  • So then we just want to say halite_amount = f.split.

  • We'll split by the dash, and then we'll go with the zeroth element.

  • So if we print halite_amount and just break after the first one. Break. Boom, we can see 4100.

  • So then I'll just do halite_amounts here.

  • Then we will delete this here: halite_amounts.append(halite_amount).

  • Then print.

  • Let's do print(len(halite_amounts)). Also, just so I can dev this to start, let's just do, like, 500.

  • Cool.
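The loop described above can be sketched roughly as follows. This is a minimal sketch, not the exact script from the video: the function name, the example file names, and the underscore in the training_data directory name are assumptions for illustration. The only thing taken from the transcript is that the halite amount is whatever comes before the first dash in the file name. The amounts are converted to integers up front so they can be treated as numbers afterwards.

```python
import os


def halite_amounts_from_names(file_names, limit=None):
    """Return the leading halite amount of each file name as an int.

    Assumes names look like "<halite>-<anything>", e.g. "4100-1541.npy".
    `limit` mirrors the "just do, like, 500" trick for quick development runs.
    """
    amounts = []
    for name in file_names[:limit]:
        # Everything before the first dash is the halite amount.
        amounts.append(int(name.split("-")[0]))
    return amounts


# Hypothetical file names, only for illustration.
example_names = ["4100-1541.npy", "4250-1542.npy", "5400-1543.npy"]
print(halite_amounts_from_names(example_names))

# With a real directory (name assumed here), list it first.
if os.path.isdir("training_data"):
    all_files = os.listdir("training_data")
    print(len(all_files))
```

Passing limit=500 keeps the first 500 names only, which is handy while the script is still being developed.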

  • Okay, so now we want to plot a histogram.

  • I believe it's plt.hist in matplotlib, and then for a histogram you just have to pass the x values.

  • So we want to pass halite_amounts, and then it's, like, bins.

  • It might be bins or n_bins.

  • We're going to find out in a second.

  • I'm going to say five, and then plt.show(). Run that.

  • Whoo!

  • It worked.
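To make the bins idea concrete, here is a small sketch of what asking for five bins is expected to do with numeric data: split the range between the smallest and largest value into five equal-width ranges and count the values in each. The helper equal_width_bin_counts and the sample numbers are made up for illustration. The plotting function uses the real plt.hist call with the bins keyword, but it is only defined here, not run, and it assumes matplotlib is installed.

```python
def equal_width_bin_counts(values, bins=5):
    """Count how many values fall into each of `bins` equal-width ranges."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # degenerate case: every value is identical
        else:
            # Clamp so the maximum value lands in the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def plot_halite_histogram(values, bins=5):
    """Draw the histogram; needs matplotlib (pip install matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily so the helper above has no dependencies

    plt.hist(values, bins=bins)
    plt.show()


# Made-up halite amounts, only for illustration.
sample = [4100, 4100, 4150, 4300, 4600, 5100]
print(equal_width_bin_counts(sample, bins=5))  # → [3, 1, 1, 0, 1]
```

Calling plot_halite_histogram(sample) should draw five bars with those heights, provided the values are numbers rather than strings.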

  • Okay, so it's only 4100 because, as we generate, I think we're just going in order.

  • So let's do 1,500.

  • Let's do 1,500 first to see. OK, so we can see clearly

  • the most common is 4100, and it kind of goes down.

  • Let's do them all.

  • I think that's looking pretty darn, pretty darn good.

  • Yeah.

  • I'm actually not sure why that was so difficult.

  • Why would that be so challenging for it?

  • Because we said we only wanted so many bins.

  • So why is the display so challenging for it?

  • Well, I'm curious.

  • What have I done wrong?

  • Um, I feel like it didn't actually put them in the bins.

  • Right.

  • Uh, hello.

  • Let me pull this up real quick.

  • I don't know what I've done wrong here.

  • Let's do, uh, matplotlib histogram. pyplot.hist.

  • That's correct.

  • See, if we say, you know, five bins,

  • it really should only have, like, five categories, five bins. But clearly it wanted to label, like, a bajillion.

  • Could you please not take five years to load and just, I don't know, load any of them?

  • These are all matplotlib.org. What about this one?

  • This is a matplotlib one.

  • Yeah.

  • There we go.

  • Yeah.

  • How come

  • their bins look normal?

  • num_bins, x numbers.

  • How come mine is so ugly?

  • Let me do

  • bins=5, just to make sure bins is the keyword.

  • It looks like that passed.

  • It's still going to be a pain,

  • isn't it? Weird.

  • I don't know what I've done.

  • I thought with the bins, it would just show, it would be, like, the range, almost.

  • But that doesn't appear to be the case.

  • Yeah, we've got, like, all these million tick marks. Someone comment below what I've done wrong.

  • It's kind of a bummer.

  • I'll keep these around.

  • I'll keep the script.

  • The other thing we could do is, I think it's from... is it statistics that has the mean operation? from statistics import mean.

  • For now, I'm just going to comment this out, I guess, because I don't think that's really working.

  • Then print(mean(halite_amounts)).

  • What?

  • I was wrong.

  • Oh, dude.

  • Okay.

  • So I bet I know what I've done wrong up here.

  • Let's see.

  • So right now, these are not... this needs to be an integer, because right now it's a string.

  • We'll see if that fixed it.

  • Whoa.

  • Wow, man, what a dumb mistake.
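The mistake is easy to reproduce in isolation. str.split returns strings, so the list holds "4100" rather than 4100. statistics.mean refuses strings outright, which is what exposed the problem, and matplotlib most likely treated each distinct string as its own category, which would explain the flood of tick marks. The values below are made up for illustration.

```python
from statistics import mean

# What f.split("-")[0] hands back: strings, not numbers.
raw_amounts = ["4100", "4200", "4300"]

try:
    mean(raw_amounts)
except TypeError as err:
    # statistics.mean cannot average strings.
    print("mean() on strings fails:", err)

# Converting each amount to int fixes both the mean and the histogram.
halite_amounts = [int(amount) for amount in raw_amounts]
print(mean(halite_amounts))  # → 4200
```

In the loop from the video, the equivalent fix is to wrap the split result in int() before appending it.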

  • Okay.

  • Okay.

  • So this sort of helps.

  • I'd probably cut it off at about the 10,000 mark just to see what the real distribution is.

  • But yes, you can see it's pretty significant, or even maybe the 2,000 mark or something.

  • But you can see we got very few of these games, and it tapers off real fast.

  • Okay, cool.

  • So then the other thing we could do now is this.

  • And we get the mean, which is 4,202. So that's the average.

  • So later, after we train a model, we can come back and see what the new mean is.

  • Okay, so now that we've done that,

  • what I'm going to do is come in here. training_data_old,

  • this is from before.

  • So over time, I'll probably keep at least the previous training data set.

  • Obviously, at 50 gigabytes, at some point

  • we can't be doing that.

  • And soon we're going to have multiple ships, and we're just going to have huge data sets, so we can't keep too many of them.

  • Normally I probably won't have 50,000 games, though. That's a lot of games.

  • So really, we would just raise the threshold, probably.

  • But for now, I do want to see if we can improve the mean game from one training example.

  • So, testing grounds. Instead of calling that testing grounds, now what

  • I'd like to call it... I'm not sure why I had to save it.

  • Let me just test it real quick

  • here.

  • Cool.

  • What

  • we're going to call this instead...

  • Now we're going to just copy, paste.

  • I don't...

  • Why do we have another test?

  • Oh, this is the output.

  • Okay.

  • Coming back over here, I'm gonna get rid of test.

  • It's just taking up space.

  • Also sentdebot-2.

  • I think we only needed that for testing as well.

  • I'm going to delete that.

  • This, I'm going to say, let's call this training data distribution.

  • Cool.

  • Okay, so now let me just copy and paste either of these, and we'll call this sentdebot-train.

  • I don't know.

  • Anyway, basically it's going to train.

  • We'll use this to be our training script.

  • Okay, so to train, I'm going to use TensorFlow.

  • You could also use straight Keras. I'm using Keras from TensorFlow.

  • But you can do whatever you want.

  • If you don't know much about Keras and TensorFlow and all that, you could always go to pythonprogramming.net.

  • Once it finally loads, you could type in Keras and then come down here and learn all about it. Cool.

  • Okay, so, wow, we still can't get to matplotlib. This has been happening for, like, years, where I just can't access matplotlib.org sometimes.

  • I really wish they would figure out what the heck's going on there.

  • Anyway, I think we're done there.

  • We'll open up a browser if we need it later.

  • So let's get started.

  • import tensorflow as t...

  • I hate that Sublime

  • does that to me.

  • import tensorflow as tf. We're going to use os. We're probably going to use numpy as np. We're going to import time.

  • We are probably going to need random.

  • We need random to, like, shuffle.

  • To my knowledge, there is no shuffle for secrets.

  • You can modify secrets to an extent to have a shuffle, but that's not really the point of the secrets module.

  • So they don't have one.

  • We're also going to do from tqdm import tqdm, which I don't have, so let's go pip install tqdm.

  • This just gives us a nice progress bar as you're doing things.

  • Since we have so many files, as we iterate it would be nice to know: where are we in this freaking process?
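As a rough illustration of the shuffle and progress-bar pieces: random.shuffle shuffles a list in place, and tqdm wraps any iterable to show progress. On the secrets point, the module indeed has no shuffle function of its own, though it does expose SystemRandom, whose shuffle method works the same way. The file names are made up, and the fallback tqdm defined here is only a stand-in so the sketch still runs when the real package is not installed.

```python
import random
import secrets

try:
    from tqdm import tqdm  # pip install tqdm
except ImportError:
    # Stand-in so the sketch runs without tqdm: no progress bar, same loop.
    def tqdm(iterable, **kwargs):
        return iterable

# Hypothetical training file names, only for illustration.
file_names = [f"{4100 + i}-{i}.npy" for i in range(10)]

# random.shuffle reorders the list in place, so shuffle a copy.
shuffled = list(file_names)
random.shuffle(shuffled)

# secrets has no shuffle function, but secrets.SystemRandom provides one.
secure = list(file_names)
secrets.SystemRandom().shuffle(secure)

# Wrapping the iterable in tqdm shows how far along the loop is.
processed = 0
for name in tqdm(shuffled):
    processed += 1

print(processed, "files visited")
```

For shuffling training data, the plain random module is the usual choice; the secrets variant is shown only to clarify what the module does and does not offer.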

  • Um, okay.

  • Now, from tensorflow...

  • I'm not sure I can do this any faster than I can write it, but I might be able to, so let's go to pythonprogramming.net.

  • Let's go back to that

  • Keras series, because you always just need, like, the same freaking imports.

  • So let's go down to how to use your trained model.

  • Optimizing models. Maybe let's do part three here.

  • So, convolutional neural networks.

  • Let's see what the import statements are.

  • Who did these awesome frickin pictures?

  • Yeah, basically all of this.

  • These, like, Sequential, Dense, Dropout, Activation,

  • Flatten, Conv2D,

  • MaxPooling2D.

  • We need all that.

  • And I don't really see any point in typing all that out.

  • Really?

  • Sure.

  • This is faster.

  • Okay, so the only other thing that we want is going to be from tensorflow

  • .keras.callbacks,

  • we want to import TensorBoard.

  • This is so we can display the beautiful, beautiful graphs as we go.

  • Okay, so I think we're ready.

  • So now, as we build out this model, we have a few things that we need to do.

  • Basically, the data set is too big to load into memory.

  • First of all, we can chunk it down, probably, and actually load the entire set onto a GPU, at least in the setting that it's in right now.

  • But I know if anybody's following along and you don't have, say, a 24 gigabyte GPU, or a 12 or whatever, and you can't fit it in memory, it's important that we batch it in.

  • Now, I know Keras

  • has some sort of generator, and you can load it in, and I've done stuff with that before.

  • But then when I need to start customizing stuff, that's where it starts getting unbelievably complicated.

  • So I just don't do it that way.

  • If you want to use the generator and post how you would do it, go for it.

  • I just find it adds undue complexity to a problem.

  • And like in this case, I'm just trying to do like testing here.

  • I'm not trying to spend five days figuring out how I can create a custom generator that loads batches of files.

  • So anyway, not gonna do it, but yeah, if you want to do it, go for it.

  • Have fun.
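For anyone who does want to experiment with batching, the general idea can be sketched with a plain Python generator over file names. This is not the approach taken in the video and not the Keras generator API. The names batched and batch_size and the example file names are assumptions for illustration, and actually loading each file is left out.

```python
def batched(items, batch_size):
    """Yield consecutive slices of `items`, each holding at most `batch_size` entries."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical training file names, only for illustration.
file_names = [f"{4100 + i}-{i}.npy" for i in range(7)]

for batch in batched(file_names, batch_size=3):
    # In a real script, this is where the files in the batch would be loaded and trained on.
    print(len(batch), batch[0])
```

Only one batch of names is handled at a time, so memory use depends on the batch size rather than on the size of the full data set.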

  • So the first thing I'm going to say is load_train_files, and I'm going to set that to be False.

  • So basically, True

  • if we already have batch train files. I guess I'll call this train files.

  • This will make a little more sense when we get there, but basically, we're going to take these files because keep in mind, these are only