  • What's going on, everybody?

  • Welcome to part three of the self-driving cars with Carla tutorials. In this tutorial, and probably at least two-plus more to come,

  • we're going to talk about doing reinforcement learning, specifically deep Q-learning, with the Carla environment.

  • Now, to do this, we have to change the architecture of how we're approaching this problem a little bit.

  • We want to approach this problem with everything set up to work well with the concept of reinforcement learning.

  • There's a million ways that we could do this.

  • But with the introduction of OpenAI into the world (now known as "ClosedAI"), they kind of set the standard for how to approach and work with environments to do reinforcement learning. That methodology has persisted and tends to be how everybody handles reinforcement learning, whether or not they're actually using the OpenAI environments themselves, like Gym or Retro, and I forget the other one, Atari maybe. Anyway, not important.

  • But basically, you want to have some sort of environment that has a couple of methods.

  • So we're going to keep some of this code here.

  • We're going to have to get rid of a lot of stuff, because we can't really run this as a script anymore.

  • We probably need to use object-oriented programming.

  • So we are probably done with all of this.

  • I do want to keep process_img.

  • We're just going to convert that to a method rather than a function.

  • But that's okay.

  • As for IM_WIDTH and IM_HEIGHT, we'll probably be moving those.

  • I'm just going to delete those, along with most of this block of text.

  • Basically, the standard for reinforcement learning environments is to have some sort of .step() method for the environment, where you pass an action to that step.

  • And maybe if the syntax highlighting was working, it would look better.

  • So anyway, you'll have some sort of step method where you pass an action.

  • You'll do something with that action and then collect any information, like determining what the reward was.

  • And then when you're done, you return the next observation, the reward, and whether or not we're done.

  • That last one is a Boolean flag, so it will be True or False:

  • whether we've reached the end of the environment, like we were successful in what we tried to do, or we ran out of time, or we died for whatever reason.

  • And then finally, usually you just see it as an underscore, but it's extra info.

  • If there happens to be extra info, it goes there. Again, this is just kind of the standard.

  • We almost always care about the observation, the reward, and whether or not things are done.

  • But in any environment, we tend to throw this in as well. The reason we want to keep using this standard is that there are different reinforcement learning models you can work with, and whether someone built theirs with OpenAI Gym or some other environment,

  • you can easily swap these models around to all kinds of different environments. Some environments might want to return some extra info, for who knows why, so we just leave that there so that it's possible.

  • So some sort of extensibility is possible. But anyway, this is what we want to do.

  • So we need a step method, and then we also want a reset method.

  • This is called either at the very beginning of the environment or after we've returned a done flag.

  • If we want to run another episode, so to speak, we'll reset and run another episode.
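
The step/reset convention described above can be sketched with a toy environment. This is only an illustration of the interface shape, not the Carla environment: `ToyEnv`, `MAX_STEPS`, and the reward rule are hypothetical names and values made up for the example.

```python
import random


class ToyEnv:
    """Hypothetical stand-in environment; not part of Carla or Gym."""

    MAX_STEPS = 5  # assumed episode length, for illustration only

    def reset(self):
        # Called at the very beginning and after every finished episode.
        self.steps = 0
        return self._observe()

    def step(self, action):
        # Apply the action, then report what happened.
        self.steps += 1
        reward = 1 if action == 1 else -1   # made-up reward rule
        done = self.steps >= self.MAX_STEPS  # Boolean "episode over" flag
        extra_info = None                    # usually ignored as "_"
        return self._observe(), reward, done, extra_info

    def _observe(self):
        return self.steps  # the "observation" is just the step counter


env = ToyEnv()
obs = env.reset()
done = False
total = 0
while not done:
    action = random.choice([0, 1, 2])
    obs, reward, done, _ = env.step(action)
    total += reward
```

Because any agent only ever calls `reset()` and `step(action)`, the same agent code could in principle be pointed at a different environment that honors the same four-value return.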

  • Okay, so that's what we want to do.

  • Now what we need to do is actually do it.

  • So what I'm going to do is just come down.

  • Again, I'm going to leave this process_img code here because I'm pretty much going to need it. I'm just going to add a bunch of self. in front of everything, so I'm going to leave that there.

  • But we're going to start some new code here. At the top of this script we're going to have a bunch of starting variables and constants, but for now I'll just have SHOW_PREVIEW, and I'm going to set that to False.

  • This is whether or not we want to display the actual camera from Carla. You wouldn't always want to display it.

  • If you have a worthy machine, you'd want to run many agents at the same time,

  • but displaying that preview is going to hog computing resources, CPU and/or GPU. I believe OpenCV purely uses the CPU, so it's going to lock up CPU.

  • Either way, it's going to use resources nonetheless, and we're probably going to be maxing things out, so we'd rather not.

  • But if we want to use it for debugging purposes, we can turn this flag to True just to see a couple of previews and then turn it back off.

  • Anyway, I'm just going to set it to False for now, but you definitely want to be able to see what's happening sometimes.

  • So then we're going to have a class, and we're going to call this CarEnv, for car environment.

  • Now I'm going to set some of our initial values.

  • First we're going to have SHOW_CAM.

  • For now, that'll just equal SHOW_PREVIEW.

  • We might also add something like a "show cam every": even if SHOW_PREVIEW is set to True,

  • maybe it only shows every, I don't know,

  • 10,000 episodes. No, not 10,000,

  • that's a big number.

  • Every 100 episodes or something like that.

  • So SHOW_CAM equals SHOW_PREVIEW for now. Then we're going to have STEER_AMT.

  • We're going to set that to 1.0.

  • Basically, there are three actions we can take.

  • They'll be 0, 1, or 2.

  • Or actually, in steering terms, it'll be -1, 0, or 1.

  • We can either steer fully left, go straight, or steer fully right.

  • But later, you might want to make this cumulative, so that the steering wheel slowly turns one way,

  • and then if you want to go the other way, you slowly turn the other way. Depending on what kind of frames

  • per second we get, we could probably do much more fancy stuff there.

  • But for now, we're going to do full turns every single time.

  • Then we're going to throw in an IM...

  • In fact, I guess we'll just toss this up here: IM_WIDTH, which I totally just deleted along with all this stuff, is 640, and IM_HEIGHT is 480.

  • I was wondering how that was getting typed in there.

  • Okay, I see.

  • Cool.

  • So we've got those.

  • And then, for now, I guess we can set these to be equal to those as well.

  • So im_width we'll set equal to IM_WIDTH,

  • and im_height we'll set to IM_HEIGHT.

  • Okay, then I think we'll just say front_camera, and for now we'll say that's None.

  • And I think that's it.

  • Those are basically all the initial values that we want to set.
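
The initial values listed above can be laid out as a small skeleton. This is a sketch reconstructed from what is spoken in the video, so the exact identifier spellings (`SHOW_PREVIEW`, `STEER_AMT`, `im_width`, and so on) are assumptions about how the names were typed on screen.

```python
# Module-level settings (spellings assumed from the spoken names).
SHOW_PREVIEW = False  # set to True to display the camera feed for debugging
IM_WIDTH = 640
IM_HEIGHT = 480


class CarEnv:
    SHOW_CAM = SHOW_PREVIEW  # whether to show the camera preview window
    STEER_AMT = 1.0          # full-lock steering on every turn, for now
    im_width = IM_WIDTH
    im_height = IM_HEIGHT
    front_camera = None      # filled in once the first frame arrives
```

Keeping these as class attributes means every environment instance shares the same defaults, while an individual instance can still override one (for example `env.SHOW_CAM = True`) without touching the others.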

  • Now we're going to define __init__,

  • with self here.

  • This is basically all that starting code to connect, which I deleted,

  • but it's not too much anyway. We need to connect to the server.

  • So self.client is going to be carla.Client,

  • and we're going to connect to localhost on port 2000. Then self.client.set_timeout,

  • and we're going to set that to be two seconds. Then self.world is equal to self.client.get_world(), and blueprint_library equals self.world.get_blueprint_library().

  • So okay, we've got the blueprint library.

  • Now we want to grab our car, so self.model_3 is equal to blueprint_library.filter,

  • and we're going to filter for "model3".

  • Actually, it's just "model3", no underscore there, and then the zeroth index.

  • With that, we probably have everything we need when we first initialize,

  • because we just want to have access to our blueprint library.

  • Well, at least in this case, we just grab the car.

  • In fact, I think I should make this self.blueprint_library, because we're probably going to need to access that elsewhere, since we need the sensors as well.

  • So I'll just do self.blueprint_library.
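
The connection steps above can be sketched as follows. This is a minimal sketch, assuming the CARLA Python API calls named in the video (`Client`, `set_timeout`, `get_world`, `get_blueprint_library`, `filter`). Passing the module in as `carla_module` is a choice made for this example only, so the code can be read and exercised without the simulator installed; the video simply imports `carla` at the top of the script.

```python
class CarEnv:
    def __init__(self, carla_module, host="localhost", port=2000):
        # carla_module is the imported `carla` package (an assumption of
        # this sketch; it is injected so the class has no hard dependency).
        self.client = carla_module.Client(host, port)
        self.client.set_timeout(2.0)  # seconds
        self.world = self.client.get_world()
        # Kept on self because the sensors are looked up from it later.
        self.blueprint_library = self.world.get_blueprint_library()
        # "model3" with no underscore; take the first matching blueprint.
        self.model_3 = self.blueprint_library.filter("model3")[0]
```

With a CARLA server running locally, usage would look like `import carla` followed by `env = CarEnv(carla)`.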

  • Okay, so with that, we're ready to do our reset method.

  • So we're going to define reset, and again here we just pass self.

  • Nothing else needs to be passed; step is what's going to take in the action.

  • So at the beginning of every reset of the environment, what do we need?

  • Well, we're going to say self.collision_hist.

  • That'll be an empty list, because the collision sensor returns a list of these collision events.

  • Basically, if we have any collision event, we're going to go ahead and reset.

  • At least that's what we're going to do

  • to start. Later, I might customize that. I haven't looked too deeply into the collision sensor.

  • The collision sensor basically reports some sort of magnitude value, and sometimes, if you just drive over a pretty insubstantial curb, where you're totally fine,

  • even though in theory that might throw the car out of alignment, it says that was a collision.

  • And I've even seen it call a collision if you simply go uphill really quickly.

  • Maybe the bumper scraped the ground because you went too fast.

  • I'm not really sure, but I've seen it register

  • that as a collision as well, so we might want to require the magnitude to be above some value.

  • But for now, if anything is in this list,

  • so if any collision is detected, we're just going to say, "Hey, you failed."

  • Then self.actor_list is also going to be an empty list.

  • And again, we always want to track actors so we can clean them up at the end.

  • Then we're going to say self.transform equals random.choice(self.world.get_map().get_spawn_points()).

  • Okay, don't forget your open and close parentheses.

  • get_map is a method, and get_spawn_points is again

  • a method.

  • Okay, so we've got the transform. Now self.vehicle equals self.world.spawn_actor, and we want to spawn self.model_3 from above, and we're going to spawn it at self.transform.

  • So we've just spawned an actor.

  • What do we need to

  • do? We need to self.actor_list.append(self.vehicle).
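
The first half of reset can be sketched as a standalone helper. This is a minimal sketch, assuming the CARLA calls named above (`get_map`, `get_spawn_points`, `spawn_actor`); the function name `begin_episode` and its return shape are hypothetical, since in the video these lines live inside the `reset` method and store everything on `self`.

```python
import random


def begin_episode(world, vehicle_blueprint):
    """Spawn the car at a random spawn point and start fresh bookkeeping."""
    collision_hist = []  # filled by the collision sensor callback
    actor_list = []      # every spawned actor, so it can be cleaned up later

    # Both get_map and get_spawn_points are methods, hence the parentheses.
    transform = random.choice(world.get_map().get_spawn_points())
    vehicle = world.spawn_actor(vehicle_blueprint, transform)
    actor_list.append(vehicle)
    return vehicle, actor_list, collision_hist
```

Tracking every actor in one list is what makes end-of-episode cleanup a simple loop over `actor_list`.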

  • Okay, so now that we've done that, we want to get our RGB camera.

  • So that's the next thing

  • I'm going to go ahead and add: self.rgb_cam equals self...

  • No, we're going to use the blueprint library,

  • so self.blueprint_library,

  • and then we're going to say .find and then "sensor.camera.rgb". It's secretly actually an RGB-alpha camera.

  • Then self.rgb_cam.set_attribute, and we're going to say "image_size_x" is the f-string of self.im_width. Cool. Then I'm going to copy this and paste it twice more.

  • So then y:

  • you've got your x, and your y is your height, self.im_height.

  • And then this will be the field of view, "fov", and I'm just going to hard-code this to be 110 for now.

  • Okay, so just like with our vehicle,

  • what do we need to do?

  • We need to specify the transform.

  • So I'm going to say transform equals carla.Transform. That's a capital T.

  • Did I lowercase "transform" somewhere?

  • Where's the other transform?

  • Okay, I don't really see it. Anyway,

  • carla.Transform with a capital T.

  • I'm just curious why I wanted to lowercase that T,

  • but I'm sure when we go to run this we'll see errors. And then carla.Location.

  • This is, I believe, a relative position,

  • so it's just relative to wherever we attach it.

  • We want to move this, so: transform.

  • Then self.sensor is equal to self.world.spawn_actor with self.rgb_cam,

  • so it spawns the actor, and then transform, and then attach_to equals self.vehicle.

  • So now we've got this camera on the front of our car.

  • We've got a new actor.

  • What do we do?

  • Well, we append to the actor list. So copy, paste,

  • and in this case it's not self.vehicle, it is self.sensor.
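
The camera setup above can be sketched as one helper. This is a minimal sketch, assuming the CARLA calls named in the video (`find`, `set_attribute`, `carla.Transform`, `carla.Location`, `spawn_actor` with `attach_to`). The helper name `attach_rgb_camera`, the injected `carla_module` parameter, and the mount offset `x=2.5, z=0.7` are assumptions of this example; the transcript does not state the offset values.

```python
def attach_rgb_camera(carla_module, world, blueprint_library, vehicle,
                      actor_list, im_width=640, im_height=480, fov=110):
    """Configure an RGB camera blueprint and attach it to the vehicle."""
    rgb_cam = blueprint_library.find("sensor.camera.rgb")
    # Blueprint attributes are set as strings, hence the f-strings.
    rgb_cam.set_attribute("image_size_x", f"{im_width}")
    rgb_cam.set_attribute("image_size_y", f"{im_height}")
    rgb_cam.set_attribute("fov", f"{fov}")

    # Position relative to the vehicle it is attached to.
    # The offset values here are illustrative assumptions.
    transform = carla_module.Transform(carla_module.Location(x=2.5, z=0.7))

    sensor = world.spawn_actor(rgb_cam, transform, attach_to=vehicle)
    actor_list.append(sensor)  # track it for cleanup, like the vehicle
    return sensor
```

Inside the class from the video, the equivalent lines would store the results on `self.rgb_cam` and `self.sensor` instead of returning them.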

  • So once we've done that, the next thing we want to do is say self.sensor.listen, and we'll do lambda data: self.process_img(data).

  • "Lambda data"... what have I done?

  • Oh, okay.

  • lambda data: self.process_img(data).

  • So then we need to convert process_img to be a method inside of our environment, which we will

  • do.

  • I'm just going to finish what we're doing here, and then we can do that pretty quickly.

  • So once we've done that, we've got our car.

  • We've spawned the car.

  • We've got our camera, we've spawned the camera.

  • We've got all that.

  • And whenever you create this car and spawn it, I don't know why they did this, but they did

  • (thanks, Carla people): when you spawn a car, it actually falls from the sky.

  • Maybe they did that because it was the easiest way to overcome issues where you spawned clipping a little bit,

  • like maybe your tires were clipping and you couldn't move.

  • I don't know.

  • I'm sure there's a great reason for it.

  • But you fall from the sky, and when you fall from the sky, one issue is you actually can't drive yet.

  • The other issue is when you hit the ground:

  • sometimes the collision sensor registers that as a collision, because you just fell from the sky.

  • And then also, initially, your RGB camera hasn't started yet.

  • I don't know if it's initializing or what, but it doesn't actually start pulling in data.

  • So sometimes, as soon as you try to grab imagery from the RGB camera, you cannot.

  • It just returns None.

  • And this is for a variable amount of time.

  • I couldn't figure it out. Sometimes it would immediately return imagery,

  • and other times it would take some time.

  • I don't know why, but it's a thing.

  • So we have to handle that sort of nonsense occurring.

  • So we're listening.

  • Then we're going to say self.vehicle.apply_control.

  • I left out one more thing, too:

  • the duration that it takes for the car to actually start doing things is also variable.

  • So, like with the camera, there are just weird things that happen.

  • I didn't discover this myself.

  • Daniel seemed to discover that if you just apply some control, even if you don't do anything, if you just send in the command to control, it makes the car

  • react more quickly.

  • So I'm going to throw this in.

  • I have no words.

  • Anyway, it's carla.VehicleControl; you already know how to do vehicle control.

  • So we're just going to say throttle equals 0.0 and brake equals 0.0.

  • Definitely more research is required here to figure out exactly how to feed Carla

  • what it wants to make it act as quickly as possible,

  • because the faster we can run through episodes, the faster we can train these models.

  • So it's really important that we can get this thing going as quickly as possible.

  • We'll have to continue to work to figure out how to make it go as quickly as possible.

  • For now:

  • time.sleep for four seconds.

  • Considering that episodes are going to be about 10 seconds, a full sleep of four seconds on top of those 10-second episodes stinks.

  • I really wish we could get things rolling a little quicker.

  • So anyway, Carla people, if you're listening (but you're not, I'm sure), how can we do this quicker?

  • Now, we've got the car, we've spawned the car,

  • we've wiggled the car.

  • Well, we haven't wiggled it, but we've sent some command, so hopefully the car is paying attention.

  • Now, the last thing that we want to do is the collision sensor. colsensor equals self...

  • What was it?

  • Blueprint library: self.blueprint_library.find,

  • and we are going to find "sensor.other.collision".

  • Great.

  • So that's our collision sensor.

  • The next thing that we want to do is self.colsensor is equal to self.world.spawn_actor,

  • and we're going to spawn colsensor with transform,

  • and then attach_to equals self.vehicle.

  • As I write this, I wonder if transform even matters. I would almost assume you wouldn't need it; I don't think it's going to matter where we've added this collision sensor.

  • I just kind of realized that we were doing that.

  • Anyway, the collision sensor works like that,

  • so I'm going to leave it. You probably don't have to adjust it, but we're just going to reuse transform.

  • Okay, so we've spawned an actor.

  • What do we need to do?

  • You guys know the deal: self.actor_list.append(self.colsensor).

  • Cool.

  • Okay, so now that we've done that, we want self.colsensor.listen,

  • and again we're going to use a lambda. We'll say event this time:

  • lambda event: self.collision_data(event).

  • So we'll also need the collision_data method,

  • but again, that's a really simple one.

  • At this point, basically, we're done.

  • But in order for this agent to actually run, reset has to return an observation.

  • Well, our observation is the front-facing camera.

  • So if after four seconds that camera still is not ready, we are going to say while self.front_camera is None

  • (it stays None until the camera starts returning things),

  • time.sleep(0.1). Now, we could possibly do all the handling for sensors and maybe a few other bits of information first, and then comment this out; that way we're doing the least amount of waiting possible.

  • So, yeah, there's definitely much more optimization that we can do here, but for now we're going to go with this.

  • We just want to make sure things work first, and then we can start to optimize other things.

  • Once that's done,

  • it means the episode has actually started.

  • So we're going to say self.episode_start equals time.time(), because we want to run episodes for 10 seconds.
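
The "wait until the camera produces a frame, then start the clock" step can be sketched without the simulator. This is a minimal sketch: `FrameHolder`, `wait_for_first_frame`, and the `threading.Timer` that stands in for Carla's sensor callback are all hypothetical, and the 0.1-second poll interval is taken from the value spoken in the video.

```python
import threading
import time


class FrameHolder:
    """Hypothetical stand-in for the environment object."""
    front_camera = None  # stays None until the first frame arrives


def wait_for_first_frame(holder, poll_interval=0.1):
    # Poll until the sensor callback has stored a frame.
    while holder.front_camera is None:
        time.sleep(poll_interval)
    return holder.front_camera


holder = FrameHolder()
# Simulate the camera callback delivering a frame a little later.
threading.Timer(0.05, lambda: setattr(holder, "front_camera", "frame-0")).start()

first_frame = wait_for_first_frame(holder)
episode_start = time.time()  # the episode clock starts only once we can see
```

Starting the clock after the first frame, rather than at spawn time, keeps the variable sensor start-up delay from eating into the fixed episode length.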

  • Otherwise, we would pretty much run infinitely.

  • So, yeah, I think we're just going to say episodes are 10 seconds, fixed.

  • We could later maybe even remove the episode time limit,

  • so it runs longer. But the problem is, agents will sometimes find things they can do that you didn't expect them to figure out.

  • For example, they might find that they can drive in circles (that was one of the things we discovered), and they'll just do that indefinitely.

  • So sometimes it can be better to actually end it at some point. But when it's first starting out, it's going to crash constantly,

  • so you might not even really need to worry about this. But anyway, we're going to track the episode start and length regardless.

  • Okay, so then we're going to issue one more of these self.vehicle.apply_control calls.

  • Again, it just seemed like throwing these at Carla could get things running.

  • We really don't

  • have any definitive, testable findings here yet.

  • We're trying to figure out how we can get things to just work quicker.

  • If you're just trying to run a car one time, you almost don't even notice the issues.

  • But when you're trying to run 100,000-plus episodes, you start realizing there are strange things happening here.

  • So anyway: self.vehicle.apply_control.

  • Great.

  • And then we will return self.front_camera.

  • Okay, so we've done that.

  • Now we need to add those two methods below here.

  • So we're going to

  • say define collision_data, and this will take self and then event.

  • What we'll do here is just self.collision_hist.append(event).

  • So if we have an event, we'll just append it to the history, and we'll handle it in the actual step method. Then we need our process_img method.

  • So now I'm going to take this, cut it,

  • move it up here, and paste.

  • Hey.

  • Well, I appreciate this.

  • Thank you.

  • So process_img will take self and image.

  • i equals blah, blah, blah. Then here, if self.SHOW_CAM... I want to say, yeah, SHOW_CAM.

  • So if self.SHOW_CAM, we will show the camera.

  • Dude, you're killing me.

  • Okay.

  • Oh,

  • and then we need to fix this reshape, which will be self.im_width, self.im_height.

  • Actually, height and then width. I almost made that mistake.

  • I told you that was going to happen.

  • self.im_height,

  • (maybe I'm just stupid)

  • self.im_width. Or both.

  • I can't rule that one out.

  • Okay.

  • Here's what we want to do

  • after the if self.SHOW_CAM check.

  • We'll just do this:

  • we'll say self.front_camera

  • equals i3. And then we'll just have to remember to scale that down,

  • dividing by 255. We'll do that at the end.

  • Okay, I think that's good.

  • I'm sure I've got problems, but I think we're good there.

  • So now the last thing we need to do is the step method.

  • Well, it's not the last thing overall, but it's the last thing we need to do for the Carla environment.

  • I was tempted to do that in the next video, but I think we'll just throw it in really quickly.

  • It's pretty basic.

  • So define step,

  • and then we'll pass self and action.

  • It takes in a specific action, and then we're going to have if action == 0 (I totally misspoke),

  • pass; then elif action == 1, and we're not actually going to pass, but for now...

  • And then else...

  • Actually, I'd rather this be elif action == 2, pass.

  • So the actions are 0, 1, and 2. We wouldn't want to do -1, 0, and 1; we'd rather these be like the argmax of the output.

  • So 0, 1, 2. Zero is left,

  • one will be go straight, and two, let's say, turn right.

  • So if action is 0, we say self.vehicle.apply_control, and then it's carla.VehicleControl.

  • Every time, we're going to set the throttle to full throttle, and then we're going to say steer is equal to -1 times self.STEER_AMT.

  • So this works out to -1 times 1, right?

  • Steer will be -1. And again, later

  • we might change STEER_AMT to do other things.

  • So for now this will be fine. That will be go left.

  • Then this will be go straight, and this one will be go right:

  • rather than -1, it will be 1.

  • And here, in theory, you would say 0 times STEER_AMT, but I'm just going to put in a 0 for legibility's sake.
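
The action-to-steering mapping described above can be sketched as a pure function. This is a minimal sketch: `action_to_control` is a hypothetical helper that returns a plain dictionary instead of calling Carla, so it can be read and run on its own.

```python
STEER_AMT = 1.0  # full-lock steering, as set on the environment class


def action_to_control(action):
    """Map a discrete action (0, 1, 2) to throttle and steer values."""
    if action == 0:
        steer = -1 * STEER_AMT  # steer fully left
    elif action == 1:
        steer = 0.0             # go straight
    elif action == 2:
        steer = 1 * STEER_AMT   # steer fully right
    else:
        raise ValueError(f"unknown action: {action}")
    # Throttle is always full in this first version.
    return {"throttle": 1.0, "steer": steer}
```

In the environment itself, these values would be passed along the lines of `self.vehicle.apply_control(carla.VehicleControl(throttle=1.0, steer=steer))`. Using 0, 1, 2 rather than -1, 0, 1 lets the action be the argmax index of a network's output directly.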

  • Okay, then.

  • As this thing goes on, you could give it a reward whenever it's not colliding with something.

  • But instead, I'm going to use speed, only because I know the car will figure out that it can just drive in circles,

  • and we don't want that.

  • So we're going to say v, for velocity, equals self.vehicle.get_velocity().

  • And because the raw velocity is not really super useful, we're going to convert that to kilometers per hour.

  • We're going to say kmh equals the value of 3.6 times math.sqrt

  • (I might not have imported math by now, but that's okay) of v.x squared,

  • Hey, bro.

  • plus v.y

  • squared, plus v.z to the power of two.

  • Okay, cool.

  • So that should be our kilometers per hour.
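
The speed conversion above can be sketched as a small function. This is a minimal sketch, assuming the velocity components are in meters per second; `speed_kmh` is a hypothetical helper name, and in the environment the inputs would come from the `x`, `y`, and `z` fields of the velocity vector.

```python
import math


def speed_kmh(vx, vy, vz):
    """Speed in km/h from velocity components in m/s."""
    # Magnitude of the velocity vector, then m/s -> km/h (factor 3.6).
    return 3.6 * math.sqrt(vx**2 + vy**2 + vz**2)
```

The factor 3.6 comes from 3600 seconds per hour divided by 1000 meters per kilometer, so 10 m/s corresponds to 36 km/h.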

  • And I don't think I imported math.

  • I'm sure that's why it's

  • angry.

  • Yeah.

  • So let's go up to the top, import math, and then go back down to the bottom. Okay, so then what we're going to say is: if the len of self.collision_hist does not equal zero. Later, we could do something much more fancy with the events that exist in the collision history.

  • But right now, if we've registered

  • any collisions, we're going to say done is equal to True,

  • and then we're going to set the reward equal to -200.

  • Again,

  • I'm just pulling these numbers out of nowhere.

  • You might want to make that penalty much larger,

  • maybe -500 or

  • -1000.

  • I don't know.

  • I'm really just making these up as I go.

  • So elif kmh, our kilometers per hour, is less than 50, then what we're going to say is done equals False.

  • We're still running.

  • Everything's happy.

  • And instead, we're going to say reward is equal to -1.

  • So that's for any steps we take, any frames that go by, when we're not going faster than 50 kilometers per hour, which, honestly, is not very fast.

  • But I used 50 because reaching 50 kilometers per hour while just driving in a circle is kind of hard.

  • So that's why I went with 50. Again,

  • just kind of making it up as I go.

  • So it'll be -1, but we didn't crash, so it doesn't need to be a huge penalty.

  • You also could set this to be zero.

  • I don't know the answer to that.

  • We'd have to research to find out.

  • Else, we're going to say done equals False and the reward equals 1.

  • So if it is above 50, we'll keep adding one to the total reward.

  • Okay, so we've got our actions handled here.

  • We've got our calculation for reward, but the other thing we want to do is limit the length of these episodes.

  • So we're going to say if self.episode_start

  • plus SECONDS_PER_EPISODE

  • (to start, we'll just say 10 seconds, so we can come up to the top here

  • and say SECONDS_PER_EPISODE;

  • we want 10-second episodes)

  • is less than time.time(), then what we're going to say here is done equals True.
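
The reward rules and the episode time limit described above can be sketched as one function. This is a minimal sketch: `reward_and_done` and its `now` parameter are hypothetical (the `now` argument exists only so the timeout can be checked without waiting), while the numbers -200, -1, 1, 50 km/h, and 10 seconds are the ones given in the video, which are themselves described there as made up.

```python
import time

SECONDS_PER_EPISODE = 10


def reward_and_done(collision_hist, kmh, episode_start, now=None):
    """Return (reward, done) for one step of the environment."""
    now = time.time() if now is None else now

    if len(collision_hist) != 0:
        done, reward = True, -200  # any registered collision ends the episode
    elif kmh < 50:
        done, reward = False, -1   # still running, but too slow
    else:
        done, reward = False, 1    # at or above the speed threshold

    # Independently of the reward, cap the episode length.
    if episode_start + SECONDS_PER_EPISODE < now:
        done = True
    return reward, done
```

Note that the timeout check only flips `done`; the reward for that final step still comes from the collision and speed rules.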

  • Cool.

  • So then, finally, the last thing that we're going to do here is return self.front_camera, reward, done,

  • so that's observation, reward, done,

  • and then, for now, None.

  • We don't have any extra information that we want to pass.

  • Okay.

  • I think I'm happy with that.

  • I'm sure I've got issues here,

  • so stay tuned for when we actually run all of this stuff to fix any errors.

  • If you see something, feel free to comment below, but I'll find it.

  • We've also written a lot of code here. Right now, I don't have the text-based version of this tutorial up yet, but by the time you see this, it probably will be up,

  • so I'll have links to that in the description.

  • If you want to make sure your code is the same, feel free to do that. With that, shout-outs to my channel members; these are people who have been with me for one month:

  • Carter Babin, Blackhawk 3003 Jesse Jones Pain Max Philip Wagner Not Brian Bradley.

  • And vis around Kumar.

  • Thank you all very much for your support.

  • You guys are amazing individuals.

  • Okay.

  • Questions, comments, concerns, whatever:

  • feel free to leave them below.

  • Come hang out with us in the Discord.

  • That's discord.gg/sentdex.

  • If you want to support the channel, click that beautiful blue join button.

  • Otherwise, that's it for now.

  • I will see you guys in the next video, where we will be working on the agent itself.

  • And then hopefully after that, we can roll everything together and actually start running the agent.

  • But it's going to take a long, long time to do the learning.

Reinforcement Learning Environment for Car Agent - Self-driving cars with Carla and Python p.3

  • Posted by 林宜悉 on January 14, 2021