  • What's going on, everybody?

  • Welcome to part two of our poking around with neural networks tutorial series.

  • In this tutorial, what we're gonna do is build on the last one.

  • In the last one, our input data was basically a Shakespeare play, and we got the neural network to produce something that looked very much like it.

  • And then we wondered: could we do this with Python code?

  • So that's what we're gonna try in this tutorial.

  • So the first thing we're gonna do is go into the data directory and make a new directory.

  • There's tinyshakespeare here already.

  • I'm gonna call the new one pycode, and then inside pycode I'm going to add a new file, and I'm gonna call that, I don't know, standard lib compile, and make it a .py file.

  • And then let's open it with Sublime.

  • So first of all, to get our sample data, we have to have some sample code.

  • And what better place to get that sample code than from the standard library? Depending on your operating system and where you've installed Python, this could all be different.

  • But if you just import sys and then print sys.path, and run that, you can see basically all the places that Python believes your path to be. All you're really looking for is your Lib path.

  • In my case, it's C:/Python36/Lib, and in there is my standard library.

  • site-packages is where third-party libraries go.

  • But this one right here is where things like, you know, time and all that are stored, and sys, for example. So I'm gonna go ahead and pull this back down, and we're gonna grab code from our Lib directory.
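
A minimal sketch of that lookup, assuming a standard CPython install. The video only prints `sys.path` and picks the Lib folder out by eye; the `sysconfig` call is an addition here, as one other way to ask Python where its standard library lives.

```python
import os
import sys
import sysconfig

# Print every location Python searches for modules, as done in the video.
for entry in sys.path:
    print(entry)

# Assumption (not from the video): sysconfig reports the same standard
# library folder you would otherwise pick out of sys.path by eye.
stdlib_dir = sysconfig.get_paths()["stdlib"]
print("Standard library:", stdlib_dir)
print("Exists on disk:", os.path.isdir(stdlib_dir))
```

On Windows the printed path uses backslashes, which is why the path has to be written with doubled backslashes if you paste it into a normal string literal.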

  • Now, the way we're gonna do that is to start over: we're gonna import os, then set a standard lib location, and in my case again, that was right here.

  • So I'm just gonna copy and paste. If you're on Windows, make sure you double the backslashes there, and let's make sure we don't have that extra quote. Then, just to keep the size decent, we can specify the maximum number of files we want.

  • In my case, I'm going to say 1000.

  • Actually, you might want to say 100.

  • That would probably produce something close to what we're looking for.

  • Well, we'll see.

  • So now are you mad at me for this?

  • Now what we want to do is just count.

  • We'll have a simple counter here, and then we'll say with open, then input.txt, with the intention to append, and with the encoding as utf8.

  • So with that open as f, what do we want to do?

  • What we're gonna do is: for path, directories, files in os.walk, and we wanna walk through the standard lib location.

  • So it's gonna give us the path, all of the directories inside it, and then the actual files in there. Basically, os.walk is just gonna let you recursively iterate through everything under that location.

  • Once we do that, we're gonna say for file in files, count plus equals one.

  • Let me zoom in a little bit, maybe.

  • So: for file in files, count plus equals one.

  • If that count is greater than our max files, let's just go ahead and break out of this. Elif ".py" in file, we're going to use it.

  • Let's do a try/except: except Exception as e, print str(e). And what we're gonna try is with open, os.path.join.

  • We're gonna join the path and the specific file.

  • path is just the path all the way to that file's directory, and then you have the actual file name itself.

  • So we join the path and the file, and we open that with the intention to read, as data_f.

  • And we're gonna say contents equals data_f.read(), and then we've got that.

  • So now we're gonna do f.write of our contents.

  • Actually, we probably should have called this input_f as well, or we can just call it f.

  • Okay, we'll just do f.write(contents), and then another f.write where we just throw in a new line.

  • So what that's gonna do is hopefully produce an input file for us.
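
The steps described above can be sketched roughly as follows. This is not the exact script from the video: the function name `compile_sources` and its parameters are hypothetical, and a few details are deliberate changes, each marked in a comment (only files that were actually written are counted, the whole walk stops at the limit, and `endswith` replaces the looser `".py" in file` check).

```python
import os


def compile_sources(source_dir, output_path, max_files=100):
    """Append up to max_files .py files found under source_dir to output_path.

    Returns the number of files that were written.
    """
    written = 0
    # "a" opens for appending, so repeated runs keep growing the file.
    with open(output_path, "a", encoding="utf8") as out_f:
        # os.walk yields (path, directories, files) for every folder below source_dir.
        for path, directories, files in os.walk(source_dir):
            for file in files:
                # Change from the video: stop the whole walk, not just the inner loop.
                if written >= max_files:
                    return written
                # Change from the video: endswith avoids matching names like "x.pyc".
                if not file.endswith(".py"):
                    continue
                try:
                    # Assumption: source files are read as UTF-8.
                    with open(os.path.join(path, file), "r", encoding="utf8") as data_f:
                        contents = data_f.read()
                except Exception as e:
                    # Some files will not decode cleanly; report it and move on.
                    print(str(e))
                    continue
                out_f.write(contents)
                out_f.write("\n")
                # Change from the video: count only files that were written.
                written += 1
    return written
```

Calling it with your own Lib path as `source_dir` and `"input.txt"` as `output_path` should give a training file like the one in the video. An error message or two printed during the run is expected and harmless.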

  • So let's go ahead and just run it real quick and see what happens.

  • That's fine.

  • So at least one of them didn't work.

  • But our input now is clearly a bunch of Python code.

  • Very well.

  • So now what we can do is attempt to train on this data.

  • What I'm gonna do is come back to our train.py file, and basically we're just gonna change one thing.

  • We'll come back to this main directory here, where train.py is, open the command prompt in there, and say python train.py.

  • And then the data dir needs to change: data_dir equals, and in this case, it's data/pycode.

  • The data dir is just whatever contains your input.txt.

  • So for us, that's data/pycode.

  • So let's go ahead and run this one.

  • Now, did I not hit enter?

  • I just wanna make sure it works.

  • And then I'll do the same thing I did before: I'll just pause while it trains.

  • So I'm gonna pause now while that's training, and then we'll pick back up.

  • I'm not sure I'm gonna do all 50,000 steps.

  • We might pause a little earlier than that, but we'll see.

  • All right, so we're at about 35,000 steps.

  • Here is the current training run.

  • The blue one is the model that we're training right now.

  • Interestingly enough, the loss went significantly lower than the other one, and I'm not really sure why that would be, because we used the same parameters, just different data.

  • Interesting. I guess it's an easier problem to learn.

  • So what I'm gonna do is let this hit 36,000, then stop it, and then we're gonna sample.

  • Okay, so that's saved.

  • So I'm gonna cut it.

  • And now let's just do python sample.py, and then n equals 1000.

  • Let's see what we've got.

  • Wait for it.

  • Cool.

  • Okay, so first of all, this is a little hard for us to read in the console like this.

  • At least right now, it looks pretty good, but one option we have is to just output it to a file.

  • So let's do 2500 and then send it out to out.txt.

  • Actually, we should have used out.py. I don't know why I didn't do out.py.

  • I don't want to stop it, and it doesn't really matter, so I'm just gonna rename it to out.py.

  • Now that that's run, let me open up... let's see, where are we? pycode is here, and the output should be here. Yeah, there it is.

  • So now let's open that in Sublime. Cool.

  • So we have some quote issues here.

  • That might be common. I might redo it.

  • For example, I bet if we fix... let's zoom out real quick.

  • Somewhere a quote didn't get closed.

  • Let's delete this.

  • No, that still doesn't want to do it.

  • Let me redo it. Let's just redo it to out.py. This one looks really messy.

  • I've seen it do much, much better than that.

  • Obviously, it's not gonna get the syntax perfect.

  • Also, our sequence length was only 50, so we'd probably want to make that a little larger, and we'd have a little more success.

  • Having only 50 is kind of a problem.

  • This looks a little better.

  • It looks like it never closed off its docstring, though, and again, part of the problem is our sequence length probably isn't long enough.

  • I mean, we can see that at least it started to figure stuff out.

  • It looks like it thinks it's in some sort of class.

  • Also, these extra lines are kind of confusing.

  • I think the problem is we're probably getting a return and a new line.

  • So what I would do from here is... let's see.

  • I think we should change that. I want to get rid of those.

  • So maybe what we can do is just edit it.

  • Let's edit sample.py.

  • Here we print the data, decoded as utf8.

  • Let's also do a with open... oh, you can't even see what I'm doing.

  • In fact, it's easier for me to do this in Sublime.

  • So let me just bring this up in Sublime.

  • All right, so here, the print with .decode, that's fine. Then with open out.py with the intention to write, as f, f.write.

  • And what we want to write is basically data.decode utf8, and then we're gonna do a .replace.

  • Where is this not closed off?

  • Cool.

  • It looks like we have a situation where, rather than just a new line, we have this return plus new line.

  • So we're gonna replace all instances of return-new-line with just a straight-up new line.

  • So we'll f.write that.

  • And that should be good.

  • So now we'll save that.

  • Let's come back to where we were sampling, and we don't need to send the output to out anymore.

  • The script now does it for us.

  • Hopefully there aren't any errors. We'll see.

  • Let's move this aside and come over here. It should rewrite this file for us.

  • We'll see if it does that.

  • It also printed it out for us.

  • Please tell me.

  • Yes, it did rewrite it for us.

  • Part of me thinks that we should just do a longer one.

  • But anyway, I think we're gonna have a lot of issues with, like... I wonder why this line is an unexpected indent.

  • Really? Why is that an unexpected indent? I feel like that's a valid indent.

  • I must just be stupid, I guess, because this is flagged as an error.

  • I wonder if I did this... oh, this needs to be closed off, and suddenly it goes away.

  • Okay, it's weird that it would call that an unexpected indent.

  • But I guess it's because this wasn't fully done.
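
Rather than relying on the editor's highlighting, one option is to ask Python itself whether the generated text parses. This is a small sketch that is not part of the video; `syntax_problem` is a hypothetical helper built on the standard `ast.parse`. An unclosed string or docstring shows up as a `SyntaxError`, and the line it is reported on may be some way from the real cause, much like the confusing indent message above.

```python
import ast


def syntax_problem(source):
    """Return None if source parses as Python, else a short error description."""
    try:
        ast.parse(source)
    except SyntaxError as e:
        # IndentationError is a subclass of SyntaxError, so it is caught here too.
        return f"line {e.lineno}: {e.msg}"
    return None


if __name__ == "__main__":
    broken = 'def f(self):\n    """This docstring never closes\n    return 1\n'
    print(syntax_problem(broken))
    print(syntax_problem("x = 1\n"))
```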

  • Anyway, we can see that it's clearly Python code.

  • But the problem is it's really, really messy.

  • And we obviously have a lot of instances where we've got these docstrings that never close off.

  • The reason for that is that 50 characters is over really quickly, so I think it very quickly forgets that it was ever in a docstring to begin with.

  • So my line of thinking here is that we actually want to train with a sequence length that is much longer than the stock 50.
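
One rough way to sanity-check that hunch is to measure how long the docstrings in the training data are compared with the sequence length. The sketch below is an illustration only, not something done in the video: `docstrings_longer_than` is a hypothetical helper, and the regex is crude (it only looks at triple-double-quoted blocks and can be fooled by unusual quoting).

```python
import re

# Crude assumption: a docstring is anything between two runs of three double quotes.
DOCSTRING_PATTERN = re.compile(r'"""(.*?)"""', re.DOTALL)


def docstrings_longer_than(source, seq_length):
    """Return (count longer than seq_length, total count) for docstrings in source."""
    # Measure each block including its opening and closing quotes.
    lengths = [len(match.group(0)) for match in DOCSTRING_PATTERN.finditer(source)]
    too_long = sum(1 for length in lengths if length > seq_length)
    return too_long, len(lengths)


if __name__ == "__main__":
    example = 'def f():\n    """' + "a" * 100 + '"""\n\ndef g():\n    """short"""\n'
    print(docstrings_longer_than(example, 50))
    print(docstrings_longer_than(example, 250))
```

If most docstrings in your input file are longer than 50 characters, a training window rarely contains both the opening and the closing quotes, which fits the explanation given above.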

  • So what I would suggest is to go back to training and use a much longer sequence length. Depending on your GPU, you may or may not be capable of doing this.

  • The following is the command that I've already run, so I'm not going to rerun it.

  • But it was this: python train.py.

  • I changed the size of the network.

  • I actually don't think that helps much in this case.

  • I didn't notice that it necessarily helps.

  • The real thing that seems to be helping is sequence length.

  • Okay, so I actually already trained this model.

  • If you want this model, I trained it for 97,000 steps, which is a lot, at a sequence length of 250, and I just trained it basically overnight. Here's just a little snippet.

  • I'm just gonna copy and paste it.

  • But again, I have hosted this model, so it's on pythonprogramming.net.

  • It's in the text-based version of this tutorial.

  • Just go down to the bottom, and the model is there for you.

  • So if you wanna see it, play with it, train it further, whatever.

  • Awesome.

  • But this is the result, or at least a sample.

  • As you can see, it actually could figure out, oh, I'm still in a docstring, and finish the docstring.

  • Interestingly enough, this looks like more than 250 characters.

  • Maybe it was able to figure out that it needed to close it because of, like, the returns, and then it realized, okay, I'm done with the docstring, and closed it.

  • That looks like more than 250. Anyway, this looks really good. Obviously there are some errors here, but that's clearly Python to me.

  • I just think this is super cool, because it's definitely learned so many things. Even just here, you can tell.

  • First of all, it knows it's a built-in method, with this underscore here.

  • And at least it thinks it's a method, so it has self, and it's passing all these things. And the docstring: I just think it's hilarious that it learns docstrings. It clearly knows about self and stuff like that.

  • Then, unfortunately, you've got some things that are being passed to this method and not being used, which is funny.

  • And then here, things are being used that weren't passed.

  • But what we do have is, first of all, it's honoring...

  • Is it honoring PEP 8 there, too?

  • I think it is.

  • Let's see. Is this... ah, yes.

  • Oh, this is mad because this is too long.

  • But you can see here it's actually put the parameters on a new line, because it was gonna violate PEP 8.

  • Probably.

  • Anyway, that's funny.

  • But it learned all kinds of things.

  • Like def, then a name, parentheses; oftentimes that first parameter, if it's a method, is gonna be self; and then the colon and a new line. And it learned all of the white space.

  • Just the fact that it learned white space is pretty cool.