  • What's going on, everybody?

  • Welcome to part five of our chatbot with Python and TensorFlow tutorial series.

  • In this tutorial, what we're gonna be doing is hopefully actually putting things into our database.

  • So, uh, let's go ahead and get started.

  • So basically, where we left off was here, and our final bit of code was finding out whether or not the data was acceptable.

  • The data here, in this case, is going to be the body text.

  • Basically, the text of the comment: should we even consider it?

  • So, um, initially, I've always actually put this right before all the inserts, because that's like the order of operations in my head, but actually the score check is super cheap, so we might as well do that first.

  • So if the score is greater than two, sure.

  • But then what we're gonna do is say, if acceptable, uh, body, then let's do these other operations.

  • I think that's a better way to do it rather than go through all of these operations, uh, look for the existing score,

  • go through all of this, only to find that the comment isn't acceptable.

  • Although most comments are probably more than acceptable, so I don't know, not really sure what the best way to go about it is to be honest.

  • Anyway, we'll do that.

  • It should be pretty cheap to do it anyway, so that's all right.

  • And really, after the, uh, after the score is high enough,

  • we're pretty much gonna insert. The question is, are we gonna insert this as a new row, or are we gonna insert this as an update?

  • So no matter what, we're gonna insert, and that means no matter what, we're gonna ask this question.

  • So I suppose this could save us, in theory, a little bit of processing in the long run.

  • Um, well, I guess in this case it would, like, if this was not true.

  • Whatever.

  • Anyways, um, the show must go on.

  • So if the new score is greater than the existing score, and the comment itself happens to be acceptable,

  • then what do we want to actually do?

  • Well, if this is all the case, then we're gonna go ahead

  • and, um, we would like to do SQL insert replace comment.

  • Okay.

  • And for now, I'm just gonna leave that there.

  • Um, I'm just gonna put this here so I can quickly find it.

  • Now, if there isn't an existing comment score, what we're gonna want to do is come down here and we're going to say else.

  • Um, and we've already checked if it's acceptable, so then we're gonna ask if parent data. So if there is a parent that we have data for, then what we want to do is SQL insert has parent, and then otherwise, else, SQL insert no parent.

  • Okay, so those are basically the three different functions that we're gonna work with now.
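
The branching described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: `choose_insert`, its parameters, and its string return values are hypothetical stand-ins for the three SQL insert functions, and the `min_score=2` threshold with `>=` is an assumption about how the score check is written.

```python
def choose_insert(score, body_ok, existing_score, parent_data, min_score=2):
    """Return which insert path a comment takes, or None if it is skipped."""
    # Cheap checks first: score threshold, then whether the body is acceptable.
    if score < min_score or not body_ok:
        return None
    if existing_score is not None:
        # A reply for this parent is already stored: replace it only if
        # the new comment scores higher.
        return "replace" if score > existing_score else None
    if parent_data:
        # We have the parent's text, so this row forms a full parent/reply pair.
        return "has_parent"
    # No parent text available; store the comment anyway so it can serve
    # as some later comment's parent.
    return "no_parent"


for args in [(5, True, 3, "parent text"), (5, True, None, "parent text"), (5, True, None, None)]:
    print(args, "->", choose_insert(*args))
```

Returning a label instead of calling the insert functions directly keeps the decision easy to check on its own; in the real script each branch would call the matching function.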

  • Chances are there's probably a better way to do it, but that's what I'm gonna do.

  • So, um, so now we're gonna pass the information to these.

  • So basically, if we're gonna be replacing something, that means we're gonna take comment ID.

  • We're gonna take parent ID.

  • We're probably gonna take the parent, uh, data, because we've got it, and the body.

  • So that's gonna be the new comment.

  • This will be the parent.

  • This will be the reply.

  • Um, and then subreddit, created UTC, and then the score.

  • So that's if we do have, um, we have an existing comment already in place, and we're gonna update it because our new score is higher.

  • Now, if we do have that parent in our database, so we have information on it, we want to go ahead and insert with that information that we do have.

  • So in this case, it's gonna be, again, comment ID.

  • And in fact, I'm trying to think, it should be the exact same.

  • I'm pretty sure. Take the body.

  • It should be exactly the same

  • data as here, because we have all that information.

  • It's really only this one here, with no parent, where we don't have any parent body information to throw in.

  • So this one would be comment ID. We do have a parent ID, because everything has a parent ID.

  • If it's a top-level comment and it has no parent comment,

  • the actual parent is the thread itself, the Reddit thread.

  • So anyway, parent ID.

  • We don't have parent data, but we do have body, subreddit, created UTC, and score.

  • So why would we insert these if we don't have parent information to go with them?

  • Well, because this comment might still be some other comment's parent that we want to get the data on.

  • So that's why we actually still want to store that information.

  • Um, yeah.

  • So now what we have to do is actually build all three of these, um, inserts.

  • So they're all three pretty similar.

  • Uh, part of me wants to build them all.

  • I guess we'll just build them all together.

  • Yeah, let's just, uh, just throw it down here.

  • Like I said, there's probably a better way to do this than create them like this, but I'm gonna go this way.

  • So define SQL insert replace comment.

  • Um, we've got comment ID, we've got parent ID.

  • We've got parent, comment, subreddit, time, and score.

  • And then again, in this case, let's say, um, just in case we hit an issue, we'll try, except Exception as e.

  • And then in this case we'll say print replace comment, and then whatever e was. I don't know if you have to str it; I'm not gonna go and check. I don't think we have any other e's besides the stuff that we commented out. I've been playing in Go too much recently.

  • I can't remember if you have to throw a str on it or not.

  • Anyways, um, yes.

  • So we want to have that information there.

  • So now we're gonna say SQL equals, and again, we'll just use triple quotes here.

  • And then we're gonna say update, update parent reply, set.

  • I'm undecided

  • if I really want to write all this out or if I just want to paste it. I think I'm just gonna copy and paste this.

  • I'm not sure what gain we're gonna have by writing out all these queries.

  • Um, I'm gonna copy and paste, so I'll put a link in the description to the text-based version of the tutorial. If I forget, someone remind me.

  • But even if I do forget, it'll be live on pythonprogramming.net, so you should be able to find it.

  • Um, yeah.

  • I just don't see any benefit to writing all this out.

  • So anyway, here we have the three functions.

  • So basically, what's happening is in this case, uh, first I want to update this.

  • This should be, uh, yes, I don't know, update.

  • And then we'll call this one parent.

  • No parents.

  • Okay?

  • So basically, what this is gonna do is, it's just gonna overwrite.

  • Basically.

  • So what we want to do is overwrite, um, all this information, basically, where the parent ID

  • was whatever that comment's parent ID was. So basically, any time we've got that parent ID, any reply to that parent comment,

  • we want to make sure that's the new comment that has a better score. And then

  • SQL insert has parent.

  • Basically, what this one's doing is, we're just inserting where there was a parent ID, or basically what we're saying is,

  • we're inserting a new row where we have the parent ID,

  • but we also happen to have the data for that parent.

  • So we're inserting information about that parent body, basically. And then with this one, we're inserting where

  • there was no parent, but we wanna have the parent ID just in case, somehow, maybe it was out of order. But mainly we're inserting this one

  • so we have parent information for another comment whose parent might be this comment.
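
Since the pasted queries are not shown in the transcript, here is a hedged sketch of what three such functions could look like. It is not the code from the video: the `parent_reply` column names are assumptions, the statements use `sqlite3` parameter placeholders and run immediately instead of being handed to the transaction builder, and the database is an in-memory one for illustration.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
c = connection.cursor()
# Assumed schema; the column names are guesses made for this illustration.
c.execute(
    "CREATE TABLE parent_reply (parent_id TEXT, comment_id TEXT, parent TEXT, "
    "comment TEXT, subreddit TEXT, unix INT, score INT)"
)


def sql_insert_replace_comment(comment_id, parent_id, parent, comment, subreddit, time, score):
    # Overwrite the reply stored for this parent with the higher-scoring one.
    c.execute(
        "UPDATE parent_reply SET comment_id = ?, parent = ?, comment = ?, "
        "subreddit = ?, unix = ?, score = ? WHERE parent_id = ?",
        (comment_id, parent, comment, subreddit, int(time), score, parent_id),
    )


def sql_insert_has_parent(comment_id, parent_id, parent, comment, subreddit, time, score):
    # New row where the parent's body text is known: a full parent/reply pair.
    c.execute(
        "INSERT INTO parent_reply (parent_id, comment_id, parent, comment, "
        "subreddit, unix, score) VALUES (?, ?, ?, ?, ?, ?, ?)",
        (parent_id, comment_id, parent, comment, subreddit, int(time), score),
    )


def sql_insert_no_parent(comment_id, parent_id, comment, subreddit, time, score):
    # New row without parent text; kept so this comment can be looked up
    # later as the parent of some other comment.
    c.execute(
        "INSERT INTO parent_reply (parent_id, comment_id, comment, "
        "subreddit, unix, score) VALUES (?, ?, ?, ?, ?, ?)",
        (parent_id, comment_id, comment, subreddit, int(time), score),
    )
```

Placeholders are used here because comment bodies routinely contain quote characters that would break a query assembled by string formatting.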

  • Okay, um, yeah, we just saved probably 15 minutes doing it that way. So anyways, if you have any questions or whatever, feel free to ask.

  • But it's all pretty simple

  • SQL queries there.

  • So now the last thing I want us to go ahead and do, so we can actually press go on this script to make sure it works, is define the transaction builder.

  • So, up to this point, um, we've been building these queries, but as you can see here, we're using the transaction builder and passing the SQL.

  • So now what we're gonna do is add a final little helper function, and that's going to be define...

  • True.

  • Not in all caps, though.

  • And in fact, let's just copy that: define transaction builder.

  • It takes in an SQL statement, and then here's something that's gonna piss some people off: we're going global.

  • The SQL transaction, global; we're globaling this variable here.

  • So we're going global that way because we're gonna be stuffing things into the SQL transaction, but eventually we want to actually clear it out.

  • So we're gonna global for that reason.

  • So now what we want to do is, um, well, come down here.

  • And so we're gonna take in some SQL, and

  • basically, what we're gonna do is we're just gonna keep building this transaction until it's over a certain size.

  • Um, so what we're gonna do is we're just going to say SQL transaction dot append, the SQL statement.

  • So we just keep appending these SQL statements to the transaction.

  • And then we'll say if the length of the SQL transaction is greater than 1000. You could choose different numbers; a thousand is

  • not going to be all that much slower than 10,000, but 1000 is gonna be a whole hell of a lot faster than one or 10 or something like that.

  • Um, anyway, c dot execute.

  • And then to do this, to insert like a bulk statement, you need to begin transaction.

  • Here we go.

  • Um, so we start the transaction that way, and then we're just going to say for s in SQL transaction.

  • So for each of those little SQL statements, what do we want to do?

  • Well, we're gonna try to c dot execute

  • s.

  • Otherwise, we're just gonna except and, um, commit the unholy sin of pass, and then we're all done.

  • Once we've executed all this stuff, we're gonna run the commit. You also could execute commit,

  • but I'm gonna go and just use connection dot commit, because there's a method for us.

  • And then once we've done all that, what we want to do is SQL transaction equals nothing.

  • We want to empty it out.

  • Whew.

  • Okay, um, looks good.

  • Very good.

  • So now we should be ready.

  • Let's go and save that. We should be ready to actually run this code

  • and, um, see what we have.

  • So let me pause this, and then I'll pull up where the database should be.

  • All right, If I did everything right, which I'm sure I didn't, um this database here should start to be populated, so let's go ahead.

  • I'm just gonna press F5 to run this. Um, you know, we've done everything else that we really wanted to do.

  • Um, I guess there's one thing I want to add before we get too deep, because I'm not gonna run this whole thing on video.

  • I just really want to make sure it works.

  • Um, it's basically in this for loop; we'll just come down here.

  • I think that's right.

  • Should be.

  • Ah, what?

  • One, two, three tabs over. Then what we'll say is if row counter modulo 100,000 equals zero.

  • So for every 100,000 rows, it's going to print, um, total rows read, paired rows, time, and then we'll go ahead and format

  • uh, the row counter,

  • paired rows.

  • We also didn't do the code for the paired rows; anyway,

  • we'll add that in a moment. And then str datetime dot now. So a paired row is basically the only time...

  • You know, if we're updating an existing comment, um, that's not a new pair, so we don't really need to do it

  • there. Also,

  • are we... okay,

  • we are incrementing the row counter.

  • Um, anyway, if there is parent data, um, that means we're inserting, and this will be the first reply we've got.

  • So really, it's it's after this one that we definitely want to say paired Rose plus equals one.

  • Uh, but then here, when there's no parent, that's not a pair.

  • So we don't really need anything else there either.
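
The progress report and paired-row counting described above might look roughly like this. It is a sketch under assumptions: `process` and its boolean input are hypothetical stand-ins for the real loop over the comment dump, and the exact wording of the printed line is a guess.

```python
from datetime import datetime


def process(rows, report_every=100000):
    """Count rows and pairs; each item in rows says whether parent data was found."""
    row_counter = 0
    paired_rows = 0
    for has_parent_data in rows:
        row_counter += 1
        if has_parent_data:
            # Only the has-parent insert creates a new parent/reply pair;
            # updates and no-parent inserts do not.
            paired_rows += 1
        if row_counter % report_every == 0:
            print("Total Rows Read: {}, Paired Rows: {}, Time: {}".format(
                row_counter, paired_rows, str(datetime.now())))
    return row_counter, paired_rows


process([True, False, True, False], report_every=2)
```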

  • So we really just need to throw it in that one spot.

  • Okay, let's run this and see what errors we get.

  • We've got an invalid syntax.

  • Not a surprise.

  • Uh, here.

  • That's not something we can say.

  • So it should be a double equals: assignment versus comparison.

  • Let's go again.

  • Do we... wow.

  • I made the same exact mistake.

  • Okay.

  • Comment ID is not defined.

  • So I got this error here: SQL insert has parent, comment ID.

  • Did we not do an underscore?

  • Possibly.

  • Oh, we just simply never defined the comment ID. Um, interesting.

  • So let's just say comment underscore ID equals row,

  • and this one was kind of funny.

  • I think it was row name.
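
As a small illustration of that fix, assuming each line of the dump is parsed as JSON into `row` and that the comment's own ID lives under the `name` key as the speaker recalls, the assignment could look like this. The sample line and its values are invented.

```python
import json

# One made-up line in the general shape of the Reddit comment dump.
line = ('{"name": "t1_abc", "parent_id": "t3_xyz", "body": "hello", '
        '"score": 5, "created_utc": "1430438400", "subreddit": "test"}')

row = json.loads(line)
comment_id = row["name"]       # the comment's own ID (the missing definition)
parent_id = row["parent_id"]   # ID of whatever this comment replies to
print(comment_id, parent_id)
```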

  • So let's try this again.

  • Whoops, uh... And it does appear that the database is growing in size.

  • I think I'll just wait until the very first 100,000 rows.

  • Um, sorry, viewers, I forget what this program for viewing the database is called. Anyway.

  • Okay, so for the first 100,000 rows, we've paired 3220 comments.

  • That sounds about right.

  • Especially because that first batch is not gonna have too many parents, for sure, historically.

  • But as we continue to build, we should get a few more pairs per 100,000 rows.

  • Uh, anyways, I'm gonna go ahead and stop this and make sure our database looks appropriate.

  • Open that up.

  • Okay, uh, let's browse the data.

  • Okay?

  • That's a correct comment ID.

  • That's good.

  • So, as you can see here, um, this is just our database.

  • Basically, um, handheld.

  • What's a Vape, huh?

  • Why not reduce the volume of the tank night here?

  • This one should make a little more sense.

  • Hot.

  • Have you seen her pick?

  • She's supposedly 31.

  • Probably a meth head.

  • Uh huh.

  • Like a red.

  • Okay, um, don't tell me when to sleep.

  • You're not real.

  • Okay.

  • Anyway, clearly this is working. Subreddit.

  • Okay, uh, and then score and all that. So, sure enough, all of these scores are higher than one.

  • Uh, anyway, so we could keep running that, keep building a massive database of parent-reply information.

  • Basically, what you should do is at least run that for the entire May 2015 data set.

  • But if you really want to have a good chatbot, you'd probably want to continue to build that even larger than that. For example, the current chatbot was built off, like, 20 million pairs.

  • So quite a bit larger of a data set

  • than what

  • you're going to get off just this one, um, this one dump. The newer stuff, the other model that I'm gonna be working on, those might require a little less data.