Running our Reinforcement Learning Agent - Self-driving cars with Carla and Python p.5

  • What's going on, everybody? Welcome to part five of the self-driving cars with Carla, TensorFlow, Keras and other things tutorial series.

  • Basically, we're doing reinforcement learning here.

  • Where we left off, we created the CarEnv and we created the agent, and now we're just going to tie everything together and actually, hopefully, start the training process.

  • So with that, we're gonna make a couple of imports here real quick.

  • import tensorflow as tf, import keras.backend.tensorflow_backend as backend, and then from threading we'll import Thread (capital T). With those imports, we're gonna go to the very bottom, and we're gonna start our main there.
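
  • A minimal sketch of those imports (os, random, time and numpy are assumed to already be in the script from the earlier parts; module paths assume Keras 2.x on the TensorFlow 1.x backend):

      import os
      import random
      import time
      from threading import Thread

      import numpy as np
      import tensorflow as tf
      import keras.backend.tensorflow_backend as backend
      from tqdm import tqdm  # imported a bit later in the video, gathered here for clarity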

  • I call it a loop, but really it's an if statement: if __name__ == '__main__'. So if this is the script that's running, we're gonna just set, we'll say FPS is 60.

  • You'll see why in a moment. We'll probably wind up having to tweak this; it's a "we wish" kind of number. We're not gonna get 60, but anyway.

  • And then, yep, ep_rewards, we'll just prime that with a -200 to begin.

  • random.seed(1). We're gonna just set some random seeds, basically just for repeatability: np.random.seed(1), and then tf.set_random_seed(1) as well, so we get things as repeatable as possible, so we can kind of compare and debug.
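
  • Roughly what that main-block setup looks like so far (TF 1.x API for the seed call):

      if __name__ == '__main__':
          FPS = 60             # wishful target frame rate; he settles on 20 later
          ep_rewards = [-200]  # prime the episode reward history

          # Seeds for repeatability, so runs can be compared and debugged
          random.seed(1)
          np.random.seed(1)
          tf.set_random_seed(1)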

  • So now, what we want to do is... most people probably don't have to do this, but maybe you're running multiple things on your GPU, or you're like me and for some reason your GPU really wants to allocate more memory than it possibly has.

  • We're gonna do gpu_options. Also, if you want to run multiple agents, you have to do this.

  • We're gonna do tf.GPUOptions, and then we're gonna say per_process_gpu_memory_fraction, and we're gonna set that to whatever MEMORY_FRACTION is. So scroll up to the top here; mine is set to 0.8.

  • I'm actually gonna downgrade that to 0.6, even. For you, you can probably say 1.0, or 0.9, or, you know, whatever.

  • Especially if you want to continue using your computer while an agent trains, you can set it even lower than that, and then you might not even notice that you've got training going on in the background, and you could try to play some video game or something.

  • So anyway, that's how you can tell TensorFlow. You know, it's pretty obvious, but this is per process, so basically per model: each model will, in my case, use 60%.

  • So if you wanted to have, like, five models, each model would have to use less than 0.2 for your memory fraction.

  • So Okay, cool.

  • Coming down here.

  • That's the GPU options.

  • Now we need to set that, so we'll say backend.set_session(tf.Session(config=tf.ConfigProto(...))).

  • ConfigProto is title-cased. And then we're just gonna say gpu_options=gpu_options.

  • Cool.

  • So tensorflow now will hopefully listen to me.
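
  • As a sketch of those two lines (MEMORY_FRACTION being the constant he mentions setting near the top of the script):

      # Cap how much GPU memory this process may allocate (TF 1.x API)
      gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=MEMORY_FRACTION)
      backend.set_session(tf.Session(config=tf.ConfigProto(gpu_options=gpu_options)))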

  • Okay, Now that we've done that, we're almost ready to start training.

  • But we need somewhere for our models to go, so we'll say if not os.path.isdir... come on... 'models', so if that path doesn't exist: os.makedirs('models'). Done.

  • Then we're gonna define agent = DQNAgent(), and then we need env, and that's gonna be CarEnv().

  • Cool.
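
  • Putting that together (DQNAgent and CarEnv are the classes built in the previous parts):

      # Make sure there's somewhere for saved models to go
      if not os.path.isdir('models'):
          os.makedirs('models')

      # Create the agent and the environment from the previous parts
      agent = DQNAgent()
      env = CarEnv()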

  • So once we have that, we're ready to initialize the trainer thread: trainer_thread = Thread(target=agent.train_in_loop, daemon=True), and then trainer_thread.start(). And then, while not agent.training_initialized:

  • So until that flag is True, we're just going to time.sleep for a minimal amount of time.

  • Basically, just wait here until we're ready to go.

  • And then, if you recall, in the training loop we do a model.fit to kind of prime the fitment, because the first one's gonna take so long, and then we engage in our infinite loop.

  • We kind of want to do the same thing with the predictions, because your first prediction will also take an exceptionally long amount of time.

  • So we're just going to say agent.get_qs(np.ones(...)), and we're going to say env.im...

  • Actually, now, hold on. This might be different in my notes. Let me go to the tippy top here.

  • No, we do... if we have... well, we've got CarEnv, so env.im_height, and then comma, env.im_width. But I still see image_height in places, which has me a little cautious at the moment.

  • I think we've got an error there: image_height. So we've got a couple of those image_height references, and we don't have any definition for those, so that would have been a problem.

  • So what we want to do is: I'm going to Ctrl+H, I'm gonna search for image_height, and I'm gonna replace that with im_height, replace all, and then I'm gonna do the same thing for width.

  • So image_width needs to be im_width. Cool.

  • So that would have bitten me. Don't worry, we'll have other errors that we'll hit, I'm positive.

  • So, im_height and im_width, and then 3.

  • So again, it's just gonna get some Q values for something that is totally irrelevant and not important.

  • Anyway, I totally just dribbled.

  • You saw nothing.

  • Um okay.
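
  • A sketch of the trainer thread and the warm-up prediction (agent.train_in_loop, agent.training_initialized and agent.get_qs come from the previous part; the 0.01-second sleep stands in for the "minimal amount of time" he mentions):

      # Run the training loop in its own daemon thread so playing and training overlap
      trainer_thread = Thread(target=agent.train_in_loop, daemon=True)
      trainer_thread.start()

      # Wait until the agent has done its priming .fit() and set its flag
      while not agent.training_initialized:
          time.sleep(0.01)

      # Prime the prediction path too; the first .predict() is always slow
      agent.get_qs(np.ones((env.im_height, env.im_width, 3)))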

  • Now what we want to do is basically start iterating over however many episodes we have, so we're gonna say for episode in tqdm(...). Did I import tqdm?

  • No. From tqdm we're going to import tqdm. Back to the bottom.

  • Okay. for episode in tqdm(range(1, EPISODES + 1)), and then we're gonna say ascii=True.

  • This is for our Windows brothers, so it looks pretty. If you're on Linux or something, you don't have to say ascii=True.

  • And then we're gonna say unit='episodes'. I forget what the default unit is, but at least it'll make sense this way. Okay, then.

  • What we want to do is say env.collision_hist is going to be an empty list. I'm trying to decide where I want to put this... no, here.

  • So this is where our collision sensor is gonna start slapping in information.

  • The next thing we're gonna say is agent.tensorboard.step = episode.

  • So you've got steps, and normally what would happen is every frame would, in theory, be a step to TensorBoard. Well, really it would even be every fit, but anyway, it would at least be a step every frame.

  • Let me put it that way.

  • And we don't want that.

  • That's way too absurd.

  • So instead, we're actually saying Okay, every episode is a step.

  • So now TensorBoard will show us basically a plot per episode instead of intra-episode.

  • Then we're gonna say episode_reward is equal to zero, and then step = 1. Let me fix that.

  • And then we're gonna say current_state = env.reset(), which gives us our first observation. done will be False to start, and then episode_start = time.time().

  • I'm getting over a cold and the brain is still a bit foggy, but okay. So for every single episode, we at least reset these values.
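
  • The episode loop and per-episode reset described above, as a sketch (EPISODES is assumed to be a constant near the top; he runs 100 for this test):

      for episode in tqdm(range(1, EPISODES + 1), ascii=True, unit='episodes'):
          env.collision_hist = []           # collision sensor will append events here
          agent.tensorboard.step = episode  # one TensorBoard step per episode

          episode_reward = 0
          step = 1
          current_state = env.reset()       # first observation
          done = False
          episode_start = time.time()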

  • And now we start the actual while True, which basically is "while not done", pretty much. But when the done flag gets triggered, we're gonna do a couple of things and then exit immediately, so we want to dictate exactly where we exit.

  • So anyway, we're just gonna use a while True, and then, if done at a specific point, we break.

  • So, while True: if np.random.random() is greater than whatever the epsilon value currently is, we're gonna say action = np.argmax(agent.get_qs(current_state)).

  • But if that's not the case, we're going to say action = np.random.randint(0, 3), and then we're gonna say time.sleep(1/FPS).
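
  • A sketch of that epsilon-greedy choice (epsilon itself is assumed to be a variable initialized near the top of the script, as in the earlier reinforcement learning series):

      while True:
          # Epsilon-greedy: exploit the model, or explore with a random action
          if np.random.random() > epsilon:
              action = np.argmax(agent.get_qs(current_state))
          else:
              action = np.random.randint(0, 3)
              # Sleep so a random action takes roughly as long as a model prediction
              time.sleep(1 / FPS)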

  • So So what's the deal there?

  • So again, we're not gonna actually get 60 frames per second.

  • We're probably gonna get, like, I don't know, 20 to 30 on a really nice GPU, and then four on a bad GPU, or really probably none.

  • You can't run Carla on a bad GPU, unfortunately. But so, when epsilon is really high, you're gonna take a lot of random actions.

  • Well, taking a random action, we get to avoid the agent.get_qs, so we get to avoid doing a .predict; we get to avoid using the neural network at all, which is what's gonna take time.

  • Whereas np.random.randint is a nearly instant operation; it's gonna happen really quick.

  • So this is problematic, because whether the agent takes a random action or takes a model action, we want the time that it takes to be about the same, right? Because frames per second really matters to the model.

  • So the higher the frames per second, the better the models probably going to behave.

  • And if you take a random action that is way quicker than normal, but a split second later you take a model action, that's just not good.

  • We definitely want the frames per second of either one of these to be as close as possible.

  • So the way you can eventually figure out what frames per second you're getting is: at least wait until your replay memory is filled up and you're actually making predictions, then set your epsilon to, like, zero and read your frames per second.

  • That's how many frames per second you'll get if you predicted all the time, and then you can set this value. You can also guess.

  • I mean, it's probably unfair of me to say 60 at all to even start with.

  • No one's gonna get 60.

  • I don't think I would get 60.

  • So maybe put 20 or something like that.

  • I'll just stick with 20. I actually might get 20 on this machine, but it's unlikely. But anyway, I'll go with that.

  • That's fine.

  • So, uh, all right, once we've done that.

  • So while true, we take our action.

  • The next thing we want to do is actually do: new_state, reward, done, _ (because we don't have a value for that) = env.step(action), based on whatever action we decided to take. episode_reward += whatever the reward was. Then we want to update...

  • Yes, I'm happy with that.

  • Thank you.

  • Okay, we want to update, so we'll say agent.update_replay_memory, and we're gonna update that with our transition, basically.

  • And that transition is (current_state, action, reward, new_state, done).

  • Okay, so then once we've done that, step += 1, and then finally, if we are done, break. Cool.

  • And then we would come back out here, and then: for actor in env.actor_list: actor.destroy(). Okay, so let me break this part down.

  • So this is what we'll do during the episode.

  • So we're just updating the replay memory, and also take note: here is where we're making our actual predictions.

  • And then we're updating the replay memory for, like, the target model and all that.

  • And then you might be wondering what's going on with the target model.

  • Well, don't forget, we have train_in_loop running here for that actual target model.

  • So that's off doing its own thing.

  • It's taking in information from... well, from our games, where we're actually updating the replay memory.

  • So basically, this is doing everything at this point that we need to do to both play and train; so predict and train at the same time.
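
  • Continuing the loop body as a sketch (the current_state = new_state line isn't called out in the video, but is assumed here so the next prediction uses the latest observation):

          # Still inside the while True from the epsilon-greedy sketch above
          new_state, reward, done, _ = env.step(action)
          episode_reward += reward

          # Hand the transition to the training thread via the replay memory
          agent.update_replay_memory((current_state, action, reward, new_state, done))

          current_state = new_state  # assumed: carry the observation forward
          step += 1

          if done:
              break

      # Episode over: clean up every Carla actor we spawned
      for actor in env.actor_list:
          actor.destroy()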

  • Okay, Now, uh, we want to track some stats, but I think I'm gonna not type that out.

  • I think we've typed it out before, and it would waste time.

  • So we do have some stats that we want to collect, and I'm trying to decide if there's anything else here in the text-based version.

  • Did I not post it? I must not have posted... oh no, I know what that is. So here I am.

  • I suppose we could take... because we need to take this, and we also need to take our TensorBoard script, so I'm gonna grab both of those things.

  • So I put a link in the description for both of those things.

  • And then basically, when you get there to part five, in the description there will be a link that says, like, text-based version; it'll be "sample code and text-based tutorial" or something along those lines.

  • The first little block is the code that we started with.

  • But if you continue scrolling down... usually it's at the very bottom, but actually at the very bottom I've got a play script for us.

  • So go to the very bottom, skip that first script, boom, there's your second script, and honestly, I'm just gonna first copy this bit.

  • So the actor destroy we've already done. I'm gonna come over here, paste.

  • When I tab this over... I'm a little cautious as to why that was where it was.

  • That's probably why we're not actually accepting anything at the moment.

  • So that's why that was tabbed over.

  • So anyway, yeah, let's talk about this rather than just copying and pasting and saying nothing.

  • But this is all code that you've definitely seen before. Same thing here, because everyone watching has done the reinforcement learning series.

  • But basically we're just collecting information, like: what was the lowest reward, what was the highest reward, and what was the average reward?

  • So we're kind of hoping average reward ticks up over time. In this case, max reward should also tick up over time.

  • Well, we've got a long way to go before our max reward will ever max out. And then min reward: this should also hopefully go up over time, so even our worst episode improves, right?

  • But something tells me min reward is going to be really challenging, because sometimes it's just gonna have a bad spawn and crash immediately.

  • So I don't know. Also, cars... I've seen it happen where we'll spawn and then an NPC, if we use NPCs, drops right on top of it; there's nothing you could have done.

  • So you still have to deal with that.

  • And here again, save based on any metric you want.

  • So in this case, it's just: as long as the minimum reward is greater than some value.

  • I actually probably would go with average reward now, but anyway, you can set this to be whatever you want, so you could say average reward is greater than such and such, and min reward is greater than some other value.

  • Whatever.
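
  • He doesn't type this section out; a sketch of what the stats-and-save block from the text-based version looks like (AGGREGATE_STATS_EVERY, MIN_REWARD, MODEL_NAME, EPSILON_DECAY and MIN_EPSILON are assumed constants carried over from the reinforcement learning series):

          # Log aggregate stats every so many episodes
          ep_rewards.append(episode_reward)
          if not episode % AGGREGATE_STATS_EVERY or episode == 1:
              recent = ep_rewards[-AGGREGATE_STATS_EVERY:]
              average_reward = sum(recent) / len(recent)
              min_reward = min(recent)
              max_reward = max(recent)
              agent.tensorboard.update_stats(reward_avg=average_reward,
                                             reward_min=min_reward,
                                             reward_max=max_reward,
                                             epsilon=epsilon)

              # Save only when the chosen metric (min reward here) is good enough
              if min_reward >= MIN_REWARD:
                  agent.model.save(f'models/{MODEL_NAME}__{max_reward:_>7.2f}max_'
                                   f'{average_reward:_>7.2f}avg_{min_reward:_>7.2f}min__'
                                   f'{int(time.time())}.model')

          # Decay epsilon toward its floor
          if epsilon > MIN_EPSILON:
              epsilon *= EPSILON_DECAY
              epsilon = max(MIN_EPSILON, epsilon)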

  • Well, pretty much set that however you want; it'll save the model. And then, before we exit, the other thing we definitely want to do is basically these last three lines.

  • So, um, so this is going to run right for every episode that we have.

  • And then when we're completely done, we want to clean up to some degree.

  • So I'm gonna copy and paste in these three lines and what's happening here.

  • So here we have agent.terminate = True; this will terminate our agent's training loop.

  • Then this completes, basically, the threading: trainer_thread.join().

  • And then finally, here we save the model, regardless of whether or not we hit that save condition.
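
  • Those last three lines, as a sketch (the filename pattern is illustrative):

      # All episodes done: stop the training thread and save a final copy regardless
      agent.terminate = True
      trainer_thread.join()
      agent.model.save(f'models/{MODEL_NAME}__{int(time.time())}.model')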

  • So the first time I actually ran this script, it actually looked pretty good.

  • And I didn't save it, because this condition wasn't met.

  • Well, we're only gonna run 100 episodes, just for testing purposes, to make sure the script actually runs.

  • Anyway, when I ran it the first time, I didn't save the model, and I was like, dang, that was actually a pretty decent-looking model.

  • Actually, I think I saved... yeah, I'll put both up; I'll show the TensorBoards in here, and we'll talk about that in a second. But the first thing I want to do is actually make sure this script can run at all.

  • So, one: make sure you've got Carla up and running, because if you don't have that, it's not gonna work very well.

  • And then, two, we need to run the script that we're actually working on.

  • So: examples, and tutorial5.py is what we're doing. py -3.7 tutorial5.py... whoops.

  • Okay, let's run that. Of course.

  • Oh, right.

  • Okay.

  • Yeah.

  • So there have been quite a few errors that people pointed out leading up to me actually, finally, recording this one.

  • In this case, we have "a positional argument is required".

  • Anyway, it's basically that input_shape needs to equal that. So let's do base_model... input_shape equals that, save, and let's try again.
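
  • If you hit the same error in your model-creation code, the change he's describing is roughly this (assuming the Xception base model and IM_HEIGHT/IM_WIDTH constants from the previous part):

      # Pass the input shape to the base model explicitly
      base_model = Xception(weights=None, include_top=False,
                            input_shape=(IM_HEIGHT, IM_WIDTH, 3))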

  • Is it Thread? Actually, that was not an unexpected error.

  • It's probably Thread; that makes a little more sense. But let me confirm that.

  • Yeah, it's just Thread. Okay, now to fix... threading.

  • Let me find where I actually typed Threading... oh, I imported Thread.

  • But then I came down here... oh, did I actually type Thread correctly?

  • Yes, I did.

  • Okay.

  • Carrying on.

  • Hey, we're getting some information here.

  • It's kind of going off screen.

  • My apologies.

  • I'm sure we'll hit an error at some point, right? Surely we've got to hit an error.

  • I thought there were more errors than just this.

  • Oh, we never did our modified TensorBoard.

  • Okay, so that is definitely in the text-based version. Let me just find one... here we go.

  • So, from keras.callbacks import TensorBoard.

  • I don't remember if we brought that in or not already... I don't see it, so, whoops, paste that in.

  • And then again, I'm just gonna copy and paste this.

  • This is yet another thing that is coming from the reinforcement learning tutorial series.

  • I did explain it once already.

  • I'll explain it again.

  • But yes, the reason why we're doing this is because TensorBoard by default will create a log file per .fit, while we have a .fit for every frame that we process, and for every agent, so that would just get absurd.

  • And then every fit would also be a step, and we don't really want that.

  • We just want a step per episode. Anyway, a custom TensorBoard object is kind of required for reinforcement learning, at least for DQNs.
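
  • He pastes this class from the text-based version rather than typing it; a sketch of what that custom TensorBoard looks like for Keras on the TF 1.x backend (details may differ slightly from his exact version):

      from keras.callbacks import TensorBoard

      class ModifiedTensorBoard(TensorBoard):
          """Keep one log writer and one step counter across every .fit() call."""

          def __init__(self, **kwargs):
              super().__init__(**kwargs)
              self.step = 1
              self.writer = tf.summary.FileWriter(self.log_dir)

          # Stop Keras from creating a default log writer per model
          def set_model(self, model):
              pass

          # Log with our own step counter instead of restarting at 0 every .fit()
          def on_epoch_end(self, epoch, logs=None):
              self.update_stats(**(logs or {}))

          # We train one batch at a time; nothing extra to do per batch
          def on_batch_end(self, batch, logs=None):
              pass

          # Don't close the writer when one .fit() finishes
          def on_train_end(self, _):
              pass

          # Custom metrics (reward_avg, epsilon, ...) written at the current step
          def update_stats(self, **stats):
              self._write_logs(stats, self.step)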

  • Okay, let's run that again.

  • I'm pretty confident we have a few more errors that we'll run into. Let's see what happens... it's taking forever.

  • There we go.

  • Okay.

  • blueprint_library not defined.

  • Probably because it's self.blueprint_library, is my guess.

  • Yes: self.blueprint... oh, my gosh... self.blueprint_library.

  • All right, let's try again.

  • While we wait:

  • Shouts out to recent new channel members.

  • Master of none, a Sushant bird, a Lance Campbell, Vincent Simon in Liam ends B and a name I can't pronounce, but feel free to tell me in the comment section how to pronounce that.

  • Thank you guys very, very much for your support.

  • You guys are awesome.

  • Now, Will it work this time?

  • Probably not.

  • I'm sure we have other errors.

  • I'm trying to recall all the errors, because somebody was following along in a... no, actually, somebody was following along in an IPython notebook and was posting the errors in the Discord, because you run every cell, right?

  • im_height: "CarEnv has no attribute im_height."

  • What do you mean, it doesn't? Sure it does... im...

  • Oh, it's under the lower-cased part.

  • Uh, I'm trying to decide how I wanna handle that.

  • Actually, I think that's popping up on the get_qs.

  • Yeah. And then, is that going to continue? Let me search get_qs... env...

  • So it happens here, and then on the current state.

  • So, yeah, the only time we actually use that is here, and in this case it's actually lowercase im_height and im_width. All right, all right, all right.

  • I think we're coming down the home stretch of issues.

  • I think this time it's gonna work.

  • I have faith.

  • I believe I'm hope while we wait.

  • Um, I am running a live stream.

  • You can actually see it in the background.

  • But if you go to twitch.tv/sentdex...

  • Probably every day for the next few days.

  • I don't know when you'll see this video, but, um, I will be putting up the stream.

  • I'll explain what my... whoa: an error about predict and state.shape.

  • Okay, this one is a little more complicated, but this is why you don't do unpacking like this.

  • I'm blaming Daniel for this one. So, state.shape.

  • I got too busy talking about how that actually works.

  • What I meant to do is reshape(-1, *state.shape): if you do that, it unpacks the values of state.shape.

  • But instead, what I did is I forgot the comma, because of Daniel; it's all his fault.

  • And instead I did -1 * state.shape.

  • So what we need to do is return self.model.predict(... blah blah, comma, *state.shape). Running that one more time.
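
  • The corrected get_qs, as a sketch (the /255 normalization is assumed from the earlier parts):

      def get_qs(self, state):
          # Reshape with a comma so *state.shape unpacks into the tuple (-1, h, w, 3)
          return self.model.predict(np.array(state).reshape(-1, *state.shape) / 255)[0]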

  • This is this is it.

  • I'm telling you guys, it's gonna work.

  • Anyway, this is the live stream of the agent running, of course.

  • This is... I wouldn't expect that you're gonna be able to do this; this is currently running eight agents.

  • We'll talk about that in a moment, but this has been running now for about a day and maybe 12 hours or something like that.

  • It does pretty good. It can at least stay in the lane.

  • Okay.

  • Wait, hold on. I have other things running. I wonder if it's... oh, my goodness.

  • We're just about ready to hit... self.rgb. How many errors are we gonna hit?

  • Okay, that's gonna be rgb_cam.

  • Yep.

  • So I'm gonna do this.

  • Paste, paste, paste. Let's see what happened here.

  • I don't even know how I made that mistake.

  • Okay, okay, okay.

  • Okay.

  • I know I've said it before, but this time, this is the time it's gonna work.

  • I mean, how many more errors could we have?

  • Not many.

  • I mean, at some point, there's a finite number of errors that we could possibly hit.

  • Anyway, I'm gonna run this pretty much every day around 1 p.m. Central time, maybe a little earlier.

  • Basically, at night we'll set the epsilon quite high, let the epsilon come down to about 10%, and then that's where rewards will hopefully back-propagate, the agent will actually learn something and make most of its improvements, whereas overnight it's just gonna be doing a lot of exploration.

  • So I probably won't stream it at night; I'll just stream it kind of during the day.

  • But as you can see, I mean, it's doing OK.

  • I mean, it learns how to do some stuff; I've seen it make some really good turns and stuff like that.

  • Like, right now what you're looking at is 10% epsilon. It still has a good way to go, and we're at about 100,000 episodes.

  • So that should tell you that it will indeed take quite a while to train a model like this.

  • So here I'm running 100 episodes; we're on episode two out of 100.

  • It is going to take some time, but no more errors.

  • So you guys can trust me when I say there will be no more errors.

  • So, okay, I think what I'll do is pause it... in fact, I don't see any reason to pause it, because I at least have some imagery that I can show you guys of the model training.

  • So this was the first one that I trained on my own, like while I was writing the text-based tutorial.

  • This was the TensorBoard that we saw.

  • And what I'd like to do is make it big.

  • Yeah.

  • So this was our accuracy.

  • Obviously, this is epsilon over time. Again:

  • It's only 100 episodes, but we can already kind of learn a couple of things from this.

  • And I'll talk more about this in the next tutorial.

  • But, you know, accuracy for the model is about what you would expect.

  • The fact that it hits 100% accuracy a couple of times is a little bit of a red flag.

  • Uh, it's not good.

  • Epsilon decay: I mean, it did what it was supposed to do.

  • Reward average: I mean, maybe it started going up.

  • It's too early to tell, um, and then all the rewards stuff.

  • So there's really not much we can see here, other than that already, within 100 episodes, loss has exploded twice to astronomical numbers, and again, in the next tutorial I'll talk more about that stuff.

  • But that's a bad thing; we don't want loss to explode like that.

  • Like, loss can explode to, you know, maybe double digits, but we'd like to avoid this.

  • Um, this is bad.

  • This tells us something is going wrong.

  • And then, because the model didn't save, I ended up actually running it again, and this was the image.

  • So let me just open that again in a new tab.

  • And this actually... these look better; things were going in the right direction. Again, we still peaked at a perfect 100% accuracy.

  • Red flag.

  • Um, our loss looked a little more under control and at least eventually kind of came down.

  • But again, the scale here is not really a scale that we want. Well, it's a better scale, right?

  • That one was up to 10 to the power of nine; this one's only up to 10 to the power of five.

  • Um, so it's better, but not quite the scale we want for loss.

  • So something's going on there, and again, I'll talk a little bit about that.

  • But as for reward average and max and min: I mean, it was improving, but when I went to run the model, it was actually just doing circles.

  • It would only do one action, and from there you could do a couple of things to try to alleviate that.

  • So let me zoom back.

  • Yeah.

  • So I want to say this is the second... yeah, this will be the second agent in action.

  • You can see I'm getting 18.5 frames per second, but yeah, you can see all it wants to do is turns.

  • And then here you can see the actual... these are the Q values; we can see that they are changing and kind of interacting with the environment.

  • And then sometimes, actually, this value is higher than that value. This one's always the highest, right?

  • So this is our turn-right, so it always wins.

  • But sometimes this one is higher than this one over here.

  • So that does sometimes happen; in this video it didn't, but it did, just trust me.

  • And so we could, like, weight these or something like that.

  • But, um, we already knew something was wrong with loss anyways, so I'm gonna talk about that in the next tutorial.

  • But if you want to kind of tinker with your model, train the model for a little longer, try a different neural network, all that, you can use the following script here, and this would be our play.py, basically.

  • So whatever the name of the code that you've been working on so far is, that's what you would put here.

  • So tutorial5_code is just kind of my placeholder.

  • Um, sometimes I put dashes, and I've been doing that on the previous tutorials.

  • Obviously, you can't import a module with dashes, so replace those with underscores or whatever. And then MODEL_PATH: you'd have to modify that to be whatever the name of your model is when it gets saved.

  • Ah, but then you can run this and actually see your model playing in action.

  • Unfortunately, nothing was too exciting here, but that's part of it: you can actually see what the model is actually doing if we use zero epsilon, and you'll see that... oh, it's terrible.
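
  • A sketch of what that play script might look like; the module name, model path and loop structure here are assumptions based on the environment built in the earlier parts:

      # play.py (sketch): run a saved model with zero epsilon and watch it drive
      import numpy as np
      from keras.models import load_model

      from tutorial5_code import CarEnv  # placeholder: your training script as a module

      MODEL_PATH = 'models/your_saved_model.model'  # placeholder: whichever model got saved

      if __name__ == '__main__':
          model = load_model(MODEL_PATH)
          env = CarEnv()

          while True:
              current_state = env.reset()
              env.collision_hist = []
              done = False

              while not done:
                  # Always act greedily from the model (epsilon = 0)
                  qs = model.predict(np.array(current_state).reshape(-1, *current_state.shape) / 255)[0]
                  action = np.argmax(qs)
                  current_state, reward, done, _ = env.step(action)

              # Clean up spawned actors between runs
              for actor in env.actor_list:
                  actor.destroy()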

  • So anyway, in the next video we'll talk a little bit more about the findings here.

  • I'd like to run this a little longer before stating all the changes that we made and really noting the impacts of those changes.

  • So far it's only done one real full back-propagation of the rewards from one decay.

  • So hopefully by the end of today we'll kind of have a better idea of how much longer I can actually train this model before it's probably at its max, and then we can start to decide: do we want to continue pushing Carla? Do we want to continue, maybe give more actions?

  • So this agent currently has to go full throttle.

  • We could give it control of the throttle. Actually, we've been referring to this one as Charlotte, so it's a she, so we could give her access to the throttle, whether or not she wants to have full throttle or no throttle; I don't think we're going to do any sort of PID or anything.

  • And then also maybe access to the brake, so she could slow down.

  • So maybe we'll do that, and we might actually find that it does better.

  • The other thing I wouldn't mind testing is cycling different cars, stuff like that, because the dream would be to move this to Grand Theft Auto V.

  • And before I could do that, we would definitely need the camera position to be somewhat dynamic in the training data.

  • Otherwise, what's likely to occur is, as soon as you change that even just a little bit, the model just loses its mind.

  • So we definitely want to do that.

  • It might even be a good idea to change field of view, like a dynamic field of view for the camera sensor, stuff like that.

  • So, anyway, uh, yeah, we'll see what happens.

  • It's still kind of up in the air. I've definitely learned quite a few things, but I think I'll package that up for the next video.

  • But yeah, as you can see, it does pretty good in a straight line.

  • I mean, eventually he decides... she, it's so hard; I'm so used to it being Charles.

  • But, ah, yes, it does a pretty good job staying in a straight line, and I've seen it take some turns; most of the time it still plows right into a wall.

  • But yeah, I'll show some more examples of things in the next video.

  • So questions, comments, concerns, suggestions, whatever.

  • Feel free to leave those below.
