  • What's going on, everybody?

  • Welcome to a video where we're gonna be going through some of our Google Takeout data.

  • If you're not familiar with Google Takeout, it is a way you can go to your Google profile and download all of the data that Google has on you, at least through Google services.

  • So if you want to do that, it can take a while to actually get the archive.

  • But if you log in, there's a blue account icon at the top right; click on that.

  • Then on the left-hand side there's Data and personalization.

  • You can click on that and then from there, scroll down just a little bit.

  • And then you can download your data and then you're gonna pick the things that you want.

  • Now you might go through this list and you might think you know what those things mean.

  • But you probably don't so just get it all.

  • So you get a good idea of all the things that Google has on you because you would actually be kind of surprised.

  • Or at least I was like there were some things that I thought I understood, like I thought I knew, like what kind of data Google had on me, but I didn't.

  • And so it's just a good idea.

  • Just grab it all.

  • Just so you have a full idea of all the information that they have.

  • So it's even little stuff. They've got purchase history, for example. You might not shop on Google, but Google parses your emails, and from parsing your emails they've got purchase history from confirmation emails, hotel booking emails, all that kind of stuff.

  • They're just extracting that by parsing your emails.

  • So stuff like that, you might not realize they have.

  • So anyway, I definitely think you should just get it all.

  • But mainly we're going to be focusing on the My Activity section, at least to start.

  • But later on, if you guys wanna suggest stuff that you want to see us go over, feel free so we might do other stuff.

  • So finally you'll create your archive, and this will take a while; at least for me, it took a few hours. You'll get a confirmation email and stuff like that, for security.

  • I can't think of a much scarier email to receive than "You've requested your Google Takeout data" if you weren't expecting that email.

  • So anyway, cool.

  • So that's how you get the data.

  • And then once you get it, you extract it and you'll have a Takeout directory kind of like this.

  • This is not the full one for me, but just some of the stuff I wanted to point out.

  • So there's that Purchases and Reservations thing I was talking about.

  • Location History: this is you, tracked everywhere you go based on your phone, but not just everywhere you go.

  • It's not just coordinates.

  • It's Are you walking?

  • Are you staying still?

  • Are you in a car?

  • I didn't think they were doing that, but they are, and based on that, you can then extrapolate information.

  • So based on that, you know, where is the person sleeping at night?

  • Basically, that's probably home.

  • Where do they go every day, Monday through Friday, at similar times?

  • That's probably work, or school, or something like that. And then based on the coordinates, they can pretty easily find out what's at those coordinates and stuff like that.

  • So anyway, I remember one of the times I moved, I went to the same store like twice in a row or something, and Google just assumed that's where I worked.

  • It was really weird. My phone was like, "You need to go to work soon" one day, and I was like, "What are you talking about?"

  • I have a job.

  • So anyway, yeah, we're interested in My Activity. Inside here, there's stuff like Android.

  • Again, one of the things I didn't think they tracked was every app you open. Look, they just track it.

  • Um, all right, cool.

  • And as time goes on, like all this stuff, like on its own or alone doesn't really sound that nefarious.

  • But like then you start realizing like all this stuff adds together to basically be your whole life, and it's kind of creepy.

  • So anyway, on that note, we're gonna be going through our entire search history. First of all, just as a forewarning: you guys don't get to comment on my search history unless you post your search history.

  • So no making fun of me.

  • So anyway, this is what it looks like.

  • It's an HTML file, which is kind of weird, because other stuff, for example the location data, isn't.

  • Let's just pull that up. That's a straight-up JSON file, so I'm not really sure why this one is an HTML file and some other stuff is JSON. But anyway, in this case we're gonna have to parse this HTML file, which is kind of a pain, whereas getting through the location history is a breeze.

  • You just import json and you go. In this case, you could use Beautiful Soup or something, but we're just gonna do some stupid splits.

  • I think that's the way to go.
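
To illustrate why a JSON export is "a breeze" compared with HTML, here is a minimal sketch. The key names and values below are made up for illustration; they are not the real Takeout location schema.

```python
import json

# Hypothetical stand-in for a JSON export. The key names are invented
# for this sketch and are NOT the real Takeout schema.
raw = '{"locations": [{"lat": 30.27, "lon": -97.74, "activity": "WALKING"}]}'

data = json.loads(raw)        # one call turns the text into dicts and lists
first = data["locations"][0]  # after that it is plain indexing
print(first["activity"])
```

With a file on disk, `json.load(f)` on an open file handle would do the same job; HTML has no equivalent one-liner in the standard library, which is why the video falls back to string splits.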

  • Um, so the first thing we're gonna do is just something really, really basic.

  • We're just gonna run through all our search queries, split by word, and then just start looking at what the most frequent words are.

  • And then what we can do is, you know, you could do that overall, but there's like, five years of search history.

  • So then you could do like a daily moving one year window.

  • So, like every day along the way, what was the previous year's worth of search history?

  • And what are the most common top 10 words, for example?

  • And that should probably give us a good overall, macro understanding of major interests of ours, major life changes, and stuff like that.

  • And then we could go smaller, like a month window or a week window, and figure out micro things that were going on just at that time.

  • So anyways, pretty cool.

  • So that's the plan.
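
The moving-window idea described above could be sketched roughly like this. The function name `top_words`, the `(unix_time, word)` row layout, and the sample rows are all assumptions for illustration, not code from the video.

```python
from collections import Counter

SECONDS_PER_DAY = 86400

def top_words(rows, end_ts, window_days=365, n=10):
    """Return the n most common words whose timestamp falls in the
    window (end_ts - window_days, end_ts]."""
    start_ts = end_ts - window_days * SECONDS_PER_DAY
    counts = Counter(word for ts, word in rows if start_ts < ts <= end_ts)
    return counts.most_common(n)

# Hypothetical (unix_time, word) rows, one per searched word.
rows = [
    (0 * SECONDS_PER_DAY, "python"),
    (10 * SECONDS_PER_DAY, "python"),
    (20 * SECONDS_PER_DAY, "sqlite"),
    (400 * SECONDS_PER_DAY, "tensorflow"),
]

# A one-year window ending on day 30 sees the first three rows only.
print(top_words(rows, end_ts=30 * SECONDS_PER_DAY))
# A one-year window ending on day 400 has slid past them.
print(top_words(rows, end_ts=400 * SECONDS_PER_DAY))
```

Calling this once per day gives the "daily moving" series; shrinking `window_days` to 30 or 7 gives the month and week views.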

  • Later we could do more advanced stuff, like word vectors, and get general concepts that I'm interested in, or you are, whatever. We could do lots of really cool stuff.

  • One of the other things that I saw that I absolutely have to try is they also have all of your Google Assistant data.

  • So your actual transcription is mapped to the audio, which is basically a text-to-speech data set waiting to happen, and also a speech-to-text one. So pretty cool, but mainly so you can create a text-to-speech that sounds like you.

  • Also take note: Google could do the same.

  • Great.

  • So anyway, let's jump in.

  • So first of all, I'm just gonna make a new folder.

  • I'm gonna call this G data.

  • I'm gonna put Takeout into G data, and then I am going to figure out why my mouse isn't showing up.

  • Hopefully. Well, I can't see my mouse.

  • Okay.

  • Cool.

  • Um, all right.

  • And then what I'm gonna do is File, Save, and I am gonna put that in Desktop, G data.

  • And then for now, I'm just gonna call this search database dot py... search data.

  • Awesome.

  • So that's all we're gonna do.

  • We're gonna parse through this, save the times.

  • We're gonna convert those times to Unix time, because the date stamp is gonna be kind of a pain to work with.

  • Um, we're gonna split by word and then save that into our database and scratch all of our itches.
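
Converting a date stamp to Unix time might look something like the sketch below. The video imports `dateutil.parser` for this; to stay within the standard library, this version uses `datetime.strptime` instead. The sample string, its format, and the choice to treat it as UTC are assumptions.

```python
from datetime import datetime, timezone

# Hypothetical date stamp; the exact format in the export is assumed here.
stamp = "Jan 5, 2019, 3:04:05 PM"

# Parse the text into a naive datetime, then mark it as UTC (an assumption).
parsed = datetime.strptime(stamp, "%b %d, %Y, %I:%M:%S %p")
parsed = parsed.replace(tzinfo=timezone.utc)

# Seconds since 1970-01-01 UTC, as a float that is easy to store and compare.
unix = parsed.timestamp()
print(unix)
```

A plain number like this is much easier to filter and sort on in a database than the original text.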

  • Okay, let's get started.

  • So, first of all, we're gonna need imports.

  • sqlite3 for the database.

  • We're going to... well, actually, in this file we probably don't need collections.

  • I was gonna bring that in.

  • And let's go ahead and bring in tqdm: from tqdm import tqdm.

  • And then we're gonna do import dateutil.parser, and we're not gonna need json.

  • I'm sorry, guys.

  • I got, like, allergies or something.

  • Um, I don't know.

  • We'll just get started.

  • So first of all, let's get the search activity location.

  • So I'll just do Takeout, My Activity, Search; take that, copy, paste, then slash MyActivity.html. I don't know if these backslashes will actually create a problem in this case, but I'm gonna just fix those manually really quick.

  • Then after this, we're gonna create our table.

  • Or actually, first we need to make the database, which is actually super simple with SQLite.

  • You just connect, and if it doesn't exist, boom, it's created.

  • So let's do that.

  • So we're gonna say conn equals sqlite3.connect, and we're gonna save this as mylife.db, and then c equals conn.cursor().
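
As a rough sketch of the connect step: the names `conn` and `c` are guesses at what is typed on screen, and `":memory:"` is swapped in for the database file name so the example leaves nothing on disk.

```python
import sqlite3

# ":memory:" keeps this sketch from touching disk. Passing a file path
# instead creates that file on first connect if it does not exist yet.
conn = sqlite3.connect(":memory:")
c = conn.cursor()

# Quick sanity check that the connection works.
c.execute("SELECT sqlite_version()")
version = c.fetchone()[0]
print(version)

conn.close()
```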

  • Awesome.

  • Define make_table.

  • Now what we're gonna do is c.execute, and we're gonna CREATE TABLE IF NOT EXISTS. The table will be called words.

  • And basically words is gonna contain... I think we'll have a primary key. Why not?

  • id INTEGER; that will be a primary key.

  • Then we're gonna have unix. That will be a REAL. We could also go with INTEGER, but I'll just say REAL.

  • And then the actual word itself, which will be a TEXT type.

  • Um awesome.

  • OK, so that's our table.
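
The table described above (an integer primary key, a REAL Unix timestamp, and a TEXT word) could be sketched as follows. The exact SQL typed in the video is not visible in the transcript, so treat the statement, the in-memory database, and the sample row as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the real file
c = conn.cursor()

def make_table():
    # IF NOT EXISTS makes this safe to call on every run of the script.
    c.execute(
        "CREATE TABLE IF NOT EXISTS words("
        "id INTEGER PRIMARY KEY, unix REAL, word TEXT)"
    )
    conn.commit()

make_table()
make_table()  # second call is a no-op rather than an error

# Hypothetical row; id is filled in automatically by SQLite.
c.execute("INSERT INTO words (unix, word) VALUES (?, ?)", (1546700645.0, "python"))
conn.commit()
rows = c.execute("SELECT id, unix, word FROM words").fetchall()
print(rows)
```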

  • Can I please... what's my indentation?

  • I'm so glad that this thing is like... that's violating PEP 8.

  • No tabs allowed.

  • Let's fix that really quick and then use spaces.

  • Fix this stupid idiot.

  • Awesome.

  • Okay, so make_table.

  • So let's go ahead and we'll run make_table().

  • We don't actually need to run that now, though.

  • And now what we want to do is work on search_data.

  • So define search_data.

  • What do we wanna do here?

  • The first thing is we're gonna split by word, but we don't want all the words. Words like "a" and "the" are words we don't care about. In the NLP community, those are called stop words.

  • These are just words we want to kind of toss out.

  • So the first thing I'm gonna try is "stop word list"... NLTK's list of English stop words.

  • Heck, yeah.

  • That is not the format I wanted, but... here we go.

  • "Here's a list," and then, "If you want the above list as an array, here you go."

  • Here you go.

  • Yeah.

  • Thank you.

  • Because the other guy obviously isn't a programmer.

  • Union, huh?

  • This looks like a great list.

  • I'm taking it.

  • Boom.

  • Copy.

  • Nice.

  • Big long list.

  • So we're going to say stop_words equals, bang.

  • It's kind of... it's not the greatest list of stop words.

  • "Throw" has meaning.

  • "Wonder" also has meaning.

  • "Seriously" has meaning.

  • Can it get?

  • I think I'll use this list, but it depends on what you're trying to do.

  • I mean, some of these are actually meaningful words.

  • I'm not sure you'd want to toss them out.

  • Um, in our case, I think it will be fine, but it's interesting.

  • Some of these.

  • I'm just not sure I agree, but the show must go on.
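
Filtering a query against a stop word list can be sketched like this. The tiny set below is a hypothetical stand-in (the list pasted in the video is much longer), and `meaningful_words` is a name invented for the example.

```python
# A tiny, hypothetical stop word set; the real list is far longer.
stop_words = {"a", "the", "and", "of", "to", "in", "is", "how"}

def meaningful_words(query):
    """Lowercase the query, split on whitespace, and drop stop words."""
    return [w for w in query.lower().split() if w not in stop_words]

print(meaningful_words("How to parse the HTML in Python"))
```

Using a set rather than a list keeps each membership check cheap, which matters once you run this over years of search history. As noted above, what belongs in the set depends on what you are trying to do.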

  • So now what we want to do is open... we've already closed it, but we want to open that HTML file.

  • We want to split by "Searched for". I gotta open it, because I gotta figure out what it is.

  • We gotta split by that, because we don't care about this; we care about "Searched for". So we want to split by that.

  • And then we want to parse out each word in that link.

  • So pretty easy task.

  • So now what we want to do is open that file.

  • So what we're gonna say is with open(search_activity, 'r'), with the intention to read, as f; contents will be equal to f.read().

  • Then we want to split those contents and iterate over them.

  • So what we're going to say is for item in contents.split, and we want to split by "Searched for" with a capital S.

  • Okay, cool.

  • Well, let's just print item, and then we'll break, because we're gonna do some development here.

  • We don't want to... well, we made the table, but we probably didn't call search_data.

  • We did not. search_data(). Try one more time here.

  • charmap, blah blah blah, blah, blah.

  • Okay, fine.

  • We'll open it with some encoding.

  • I wish this was somehow easier to get back over here.

  • Here: encoding equals, and we'll make that utf8. Beautiful.

  • Not that beautiful, but this is what we wanted.

  • Oh, so when we split by "Searched for", the zeroth element we don't actually care about.

  • So let's do [1:].

  • Try one more time.

  • Beautiful.

  • Exactly what we were hoping for.
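
The read-and-split step above might look roughly like this. The sample markup is a hypothetical stand-in for the Takeout HTML (the real file's structure may differ), and a temporary file is used so the sketch runs on its own.

```python
import os
import tempfile

# Hypothetical stand-in for the Takeout HTML; the real markup may differ.
sample = (
    "<html><body>header we do not care about"
    '<div>Searched for <a href="https://www.google.com/search?q=sqlite+tutorial">'
    "sqlite tutorial</a><br>Jan 5, 2019, 3:04:05 PM UTC</div>"
    '<div>Searched for <a href="https://www.google.com/search?q=caf%C3%A9">'
    "café</a><br>Jan 6, 2019, 9:00:00 AM UTC</div>"
    "</body></html>"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "MyActivity.html")
    with open(path, "w", encoding="utf8") as f:
        f.write(sample)

    # An explicit encoding means the platform default codec (for example
    # a charmap codec on Windows) is not used to decode the file.
    with open(path, "r", encoding="utf8") as f:
        contents = f.read()

# Element 0 is everything before the first "Searched for", so skip it.
items = contents.split("Searched for")[1:]
for item in items:
    print(item[:60])
```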

  • So now we're gonna do some pretty ugly splitting here. I'm going to say search_string equals item.split by that, and escape that there.

  • And then it'll be the first element, then .split by the closing link tag here.

  • Split by, boom.

  • And in that case, we want the zeroth.

  • So now let me print search_string.

  • Beautiful.

  • Just beautiful.
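
The "ugly splitting" could be sketched as below. The transcript does not show the exact delimiters used on screen, so splitting on `">` and `</a>` is an assumption, as is the sample `item`.

```python
# Hypothetical chunk, as it might look after splitting on "Searched for".
item = (
    ' <a href="https://www.google.com/search?q=sqlite+tutorial">'
    "sqlite tutorial</a><br>Jan 5, 2019, 3:04:05 PM UTC</div>"
)

# Take what follows the end of the opening <a ...> tag (element 1),
# then keep only what comes before the closing </a> (element 0).
search_string = item.split('">')[1].split("</a>")[0]
print(search_string)
```

This only holds up while the markup keeps this exact shape; an HTML parser such as Beautiful Soup would be the sturdier choice if the format varies.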

  • Now we want to parse out the date.

  • Where is the date?

  • Here.

  • So it should be after the first <br>.

  • Someone's gonna be like, "This is so ugly," in the comments.

  • Whatever, bro.

  • date equals item.split by <br>.

  • "You did this so stupid.

  • Look, you could have done it this way."