  • All right.

  • Hello, world.

  • This is CS50 Live, where we do all kinds of things.

  • We program things from scratch.

  • We look at the technical concepts.

  • We looked at Kali Linux last week.

  • Yeah, that was super exciting.

  • Time ago when he was holding.

  • This is CS videos.

  • Nick.

  • Want to go over what we're talking about today?

  • Yeah.

  • So today I think, as advertised, we're talking about images and machine learning and classifying: some of the three, like, key buzzwords.

  • Images are just kind of a fun non-buzzword, but machine learning is out there, and so is classification.

  • We'll actually be doing a little bit of unsupervised learning, as opposed to supervised, and things like that, I think.

  • Previously on the stream we talked about a binary classifier that could tell you whether something was, like, a cartoon.

  • That was your first one, the very first thing we did at that time.

  • The episode like four fire my, uh, time out.

  • I think the goggle joke is back on the people who actually put up a website.

  • That's fantastic.

  • Somebody asked: are we going to have captions for this video? We are in the process of captioning

  • all of our streams, actually.

  • So this one will have captions eventually.

  • Maybe within a couple of weeks or so. And shout-out to all the people that are here currently; we really appreciate it.

  • Jets very popping way.

  • Captured my incessant laughing say is Ha ha.

  • Laugh and be like your shirt, by the way, is very ethereal.

  • Thank you.

  • Yeah, I feel very light today.

  • You know, like, it's kind of a nice ish day.

  • There's a really nice and Sunday was awesome.

  • True.

  • True reason.

  • It has been very wise to me.

  • 70 messages is freezing, Dad, by the way, what I want?

  • Yeah, but sorry.

  • Sorry.

  • Oh, it is beautiful.

  • Uh, so we talked about classification, and that ties into this in sort of a way.

  • Right.

  • So we classified kind of between two

  • different groups that were completely non-intersecting.

  • And now we're asking: given a set of data, can we figure out how many classes there should be?

  • More or less.

  • Is this acceptable?

  • Leading to it.

  • We sort of publicized k-means clustering.

  • Yes. So k-means, for those who are kind of unfamiliar with the term, is a form of unsupervised machine learning.

  • Unsupervised meaning

  • I don't tell it what the right answer is.

  • It just figures out what its guess as to the right answer is.

  • And in this case, that manifests itself as k clusters, if you will, of data.

  • And so let's say I have my entire data set.

  • And actually, we'll just talk about the data set we use today, which is I think, around 40 or 50 images.

  • So really not that much.

  • Not much data, but it is 40 or 50 images.

  • And there are, I think, four or five classes, and those four or five classes are the different streams that we've done that had the same little box.

  • So you took multiple screenshots of one screen throughout one stream.

  • So within each cluster, there should be around 10 images.

  • And those 10 images should be all from the same stream.

  • Um, so essentially, what a k-means classifier will do is say: okay, I have this large data set, and to start, I'm going to just grab a bunch of random pieces from the data set, and those will be my initial clusters.

  • So the initial means are pretty random.

  • Are they all roughly the same initially?

  • No, it depends on the distribution,

  • but they're usually pretty reasonable.

  • So we just kind of grab some random images, throw them into a cluster, and we say: all right, now tell me which image...

  • or, I guess,

  • actually, for each image, which cluster is that image closest to, by some definition of distance?

  • So on the stream today we'll probably use Euclidean distance.

  • It's pretty common.

  • It's really easy.

  • Geometrically, in x-y, I just do x minus x-naught, squared,

  • plus y minus y-naught, squared,

  • and I take the square root of those summed together, and that works as a distance formula.
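
As a minimal sketch of the distance formula just described (assuming NumPy is available; the function name `euclidean_distance` is illustrative and not something from the stream):

```python
import numpy as np

def euclidean_distance(a, b):
    """Square root of the summed squared differences between two points."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.sum((a - b) ** 2)))

# Two points in the x-y plane that form a 3-4-5 right triangle.
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

The same function works unchanged for longer vectors, which is what makes it usable on flattened images and not just on x-y points.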

  • Right?

  • The linear just your favorite.

  • It's the same concept as, for example, collision detection.

  • Yeah, like checking circles in a game.

  • Yeah, same, right.

  • Same concept. And so we'll basically be doing that on every iteration of the algorithm.

  • It says: hey, for each image, what cluster are you closest to? And then I'm going to stick you in that cluster.

  • And once you're in that cluster, I'm then going to recompute the new mean of the cluster. And so things that were kind of naturally closer to some set of images are going to end up clustering themselves, and those means will get closer to the images that are all in that cluster

  • as you iterate over the algorithm. And one of the benefits of k-means clustering is that it actually doesn't take that many iterative steps.

  • You can usually do it in a handful, like 10 or so.

  • So you really just kind of iterate over it, you know, on the order of 10 times.

  • And after that, your loss doesn't really get any better.

  • You tend to stick around.

  • It depends on your situation.

  • It depends on how you built it and on the randomization parameters you have in there.

  • But in general, you don't need that many iterations to get a pretty accurate representation of what things look like.
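
The loop described above (pick random starting means, assign each item to its closest mean, recompute the means, repeat about 10 times) can be sketched roughly as follows. This is a toy version under stated assumptions, not the code written on the stream: the names `nearest_mean` and `kmeans` are made up here, and the data is a hypothetical stand-in where each "image" is a single brightness value.

```python
import numpy as np

def nearest_mean(data, means):
    """For each row of data, the index of the closest mean (Euclidean)."""
    # dists has shape (n_rows, k): distance from every row to every mean.
    dists = np.linalg.norm(data[:, None, :] - means[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(data, k, n_iters=10, seed=0):
    """Toy k-means: returns (means, labels) after a fixed number of passes."""
    rng = np.random.default_rng(seed)
    # Initial means: k distinct rows picked at random from the data set.
    means = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        labels = nearest_mean(data, means)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old mean
                means[j] = members.mean(axis=0)
    return means, nearest_mean(data, means)

# Hypothetical data: 20 "dark" and 20 "light" one-pixel images.
dark = np.linspace(0.0, 1.0, 20)[:, None]
light = np.linspace(9.0, 10.0, 20)[:, None]
data = np.vstack([dark, light])

means, labels = kmeans(data, k=2)
print(means.ravel(), labels)
```

Which cluster ends up as index 0 depends on the random starting picks, so only the grouping is meaningful, not the label numbers themselves.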

  • One of the kind of cool parts of k-means is that I have this center of a cluster, and so I generally just refer to it as, like, the mean image for that cluster.

  • And those mean images are often representative of what is in that cluster.

  • So let's say one cluster is from our Kali Linux stream; then if I take whatever the mean, or the center, of that cluster is, it should be pretty representative of what we, on average, looked like in that stream.

  • Now, we haven't tested these, right? We're pretty sure that it'll work reasonably well, but we left ourselves a little untested to keep things kind of fun.

  • And it gives you guys some room to experiment, I guess.

  • And we get to experiment and show you guys cool things.

  • So you will see some kind of entertaining stuff with our K means clustering.

  • But our prediction is roughly that.

  • You should see what we looked like on average on a given stream, and given a new image, it'll tell us which cluster it should belong to.

  • And it'll then classify it for us.
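
Classifying a new image then amounts to finding the closest cluster mean. A minimal sketch, assuming the means are already computed (the function name `predict_cluster` and the two example means are hypothetical):

```python
import numpy as np

def predict_cluster(means, new_point):
    """Index of the cluster mean closest to new_point (Euclidean distance)."""
    dists = np.linalg.norm(means - new_point, axis=1)
    return int(dists.argmin())

# Two hypothetical cluster means in a 2-D feature space.
means = np.array([[0.0, 0.0], [10.0, 10.0]])
print(predict_cluster(means, np.array([9.0, 8.5])))  # 1
```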

  • Do we usually manually choose the start of the clusters, or do we just pick them randomly? How do we get to the beginning?

  • So that's a great question.

  • A lot of times we actually initialize it with just completely random beginnings.

  • So we just say, Hey, here's my data set.

  • I'm going to use, like, a NumPy function to randomly pick some set of images, or randomly subdivide the images into these clusters, so our initial clusters can be random.

  • But there is another version of this that is called semi-supervised learning.

  • So let's say you were able to label about half of your data before your research intern quit. And by "you labeled your data,"

  • I mean your intern labeled half your data and then they quit.

  • So you have this data set of, you know, a million images, and 500,000 of them have been labeled manually by some poor college student, and the other 500,000 have not been labeled.

  • So if everything was labeled originally, if all million images had been labeled, then we would call that supervised learning, because we know the answer.

  • And then we have some test data set, or some way of accumulating things that we want to predict upon, and those are unlabeled.

  • But we're going to use our trained model to then predict on those. Now, in semi-supervised learning,

  • we have this case where maybe half our data is labeled and half of it is not.

  • And so we take maybe the half that's labeled and we assign them to the right clusters.

  • So this is where we actually start with.

  • Like, um, all of this cluster.

  • We know this is the right class.

  • All of this cluster is all in the right class and so on, but the other 500,000 images, we don't know which cluster they belong to.

  • So we'll then iterate our algorithm over those images and say: hey, which cluster are you closest to? And so that can be a way of, like, tightening up what your k-means classifier will actually do.

  • But it gives it more data, which allows it to spread over, you know, different kinds of variants or whatever the underlying distribution of those things are.
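
One way to sketch the semi-supervised variant described here: start each cluster mean from the labeled examples of that class, then let the unlabeled examples join whichever cluster they are closest to. This is an illustration under assumptions, not the stream's code; the name `seeded_kmeans` is made up, and it assumes every class has at least one labeled example.

```python
import numpy as np

def seeded_kmeans(labeled, labels, unlabeled, k, n_iters=10):
    """k-means whose starting means come from already-labeled rows."""
    # Start each mean from the labeled examples of that class.
    means = np.stack([labeled[labels == j].mean(axis=0) for j in range(k)])
    for _ in range(n_iters):
        # Assign every unlabeled row to its closest current mean.
        dists = np.linalg.norm(unlabeled[:, None, :] - means[None, :, :], axis=2)
        guesses = dists.argmin(axis=1)
        for j in range(k):
            # Labeled rows stay in their known class; unlabeled rows may move.
            members = np.vstack([labeled[labels == j], unlabeled[guesses == j]])
            means[j] = members.mean(axis=0)
    return means, guesses

# Tiny hypothetical data set: four labeled points, four unlabeled ones.
labeled = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 9.0]])
labels = np.array([0, 0, 1, 1])
unlabeled = np.array([[1.0, 0.0], [9.0, 10.0], [0.5, 0.5], [11.0, 10.0]])

means, guesses = seeded_kmeans(labeled, labels, unlabeled, k=2)
print(guesses)  # [0 1 0 1]
```

Because the starting means come from known labels, cluster index `j` keeps its meaning as class `j`, which is not the case with purely random initialization.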

  • Sorry, I was actually... We had a bug with our Facebook deployment, so I was just focused on getting us streamed to Facebook, which we now are. I guess as we were initially publishing this, OBS sort of bugged out and broke the Facebook stream.

  • No, no, I was sorry.

  • I was like zoomed in on getting that focus.

  • But he s air here.

  • She asked another question earlier.

  • Which was, um where was it?

  • I'm proficient in Python, but know nothing about ML.

  • Is this stream for me?

  • What do you say?

  • Yeah, I think this is a very reasonable stream.

  • Even if you're not super great in Python, or if you're, like, very adequate (I would count myself as, like, decent in Python),

  • this would still be pretty reasonable for you.

  • So our goal, at the beginning,

  • is to do this kind of high-level approach.

  • Like, what are we going to go through?

  • What are the concepts?

  • But we're gonna repeat those concepts pretty much throughout the stream so that you really get a sense of what we're actually talking about.

  • So, for instance, if you want to do this yourself, then you can code along.

  • We should be going at a reasonable speed, and you can always pause the video, go back, and reiterate.

  • That should be easy for you.

  • If that's how you learn.

  • If you learn through listening to what we say, then maybe a different iteration will help you.

  • If this one doesn't work.

  • If this makes perfect sense to you, then that's great.

  • There are many different ways we're going to try to attack the same problem. And I saw somebody said: please also talk about limitations of k-means clustering.

  • As in, if there are two concentric circles of data, then your k-means clustering might have some problems with identifying which one is which. So, for example:

  • Let's say I have some data set, and then another data set that is a subset of the original data set.

  • This clustering method doesn't really work super well for classifying between those two data sets, right?

  • So let's say I wanted to classify between, like, all kinds of dogs versus, like, poodles, for example, right?

  • It might be a little bit tricky for me and actually that might not be the best example, but it's kind of the idea, which is that I have the subset of the data that I want to kind of classify as separate from the rest of the data.

  • And that could be really difficult for k-means, because k-means by definition is really just looking at, like, what mean image

  • can I get out of some data set?

  • And how can I find things that are closest to that mean, by some measure of distance? And so for something where, on average, the mean image of the entire data set is very, very close

  • to the mean image of the subset data set, k-means can't distinguish the two.
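
The concentric-circles case from the chat question can be checked numerically. In this small sketch (made-up data, assuming NumPy), two rings with different radii share a center, so their means land on essentially the same point and a nearest-mean rule has nothing to separate them with:

```python
import numpy as np

# Evenly spaced angles around a full circle.
angles = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
unit_ring = np.column_stack([np.cos(angles), np.sin(angles)])

inner = 1.0 * unit_ring  # ring of radius 1
outer = 5.0 * unit_ring  # ring of radius 5, same center

inner_mean = inner.mean(axis=0)
outer_mean = outer.mean(axis=0)

# Distance between the two means: essentially zero, up to rounding error.
gap = np.linalg.norm(inner_mean - outer_mean)
print(gap)
```

The rings are trivially separable by radius, but that information is invisible to a method that only compares distances to a cluster mean.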

  • But maybe you could look at other feature discrimination algorithms, so it's a deep field.

  • There's a lot of things, a lot of complexity.

  • It's very interesting, but I'm very excited to learn.

  • I'm actually going to transition to your laptop.

  • And while you're doing that: at the very end of the stream,

  • we will hopefully have a little bit of time to kind of sneak-peek

  • what will go into next week, which is using an actual neural net

  • to not only classify images and understand which parts go where, but also generate new ones based on what it knows.

  • And that's what we call a generative model, and then potentially a generative adversarial network.

  • Yes, and eventually we will hopefully get... I guess not

  • eventually; after that, maybe, we will hop into generative adversarial networks.

  • So this won't be just a single-part stream.

  • There will be many, many parts.

  • Yeah, I think it will help people to get a groundwork understanding of a lot of the pieces.

  • Yeah. And so I think it's a very good point to recognize that there are limitations to this, and the next stream will kind of cover why there aren't necessarily as many.

  • This is a beautiful screensaver, by the way. Is it brand new?

  • This is cmatrix.

  • So on OS X, I piped it through lolcat.

  • lolcat rainbow-ifies things.

  • cmatrix makes things...

  • It makes kind of like a Matrix-style screensaver.

  • I'm a huge fan of using it.

  • I don't know who created it, but I guess we're kind of doing free advertising for him.

  • They're really cool.

  • It's a cool project.

  • I'm a huge fan of using it.

  • I think it looks techie, for anybody who wants to look like a true

  • fantasy hacker.

  • You know, you don't have to code; just leave that on and people

  • think you do.

  • So what is the first step?

  • I guess, in the actual coding?

  • What are we gonna look at?

  • Right.

  • So, you know, let's actually look at our data set.

  • Just exciting about school, too.

  • Thank you.

  • Actually, these are all from Unsplash.

  • All of my desktop images usually are. I'm learning React, and we used the Unsplash

  • API. It's nice.

  • Very, very clean.

  • So we have this kind of four classes of...

  • Maybe I can make this big.

  • Is that a root beer? Important question.

  • Is that cream soda?

  • Oh, this is a Snapple.

  • So we're also advertising for Snapple.

  • Um, this is raspberry tea.

  • I pretty much every stream grab a Snapple, because a lot of times I'm running over, and it helps me sound a little less awful.

  • Get parts on this tiny little bit of Cappie.

  • Exactly.

  • Just like it's, you know?

  • Yeah, in case I'm not pathetic enough, I'm not really Uh yeah.

  • I also do love root beer, but I'm trying to avoid drinking too many sodas. Anyway, that was a kind of interesting tangent from our k-means classifier. So we have these four different classifications of data.

  • Sorry, I thought it

  • might have been five.

  • There are only four. Basically, there's "all," but "all" is not a separate class; it's like an umbrella

  • cluster, I guess. It's kind of like everything.

  • And so we basically we know the answers, but we're not going to use the answers, so to speak.

  • We're going to do fully unsupervised learning to start.

  • But if you kind of look through some of these images... yeah, we look great in every image.

  • I unfortunately did not have the foresight to select for appealing images.

  • But there are images of us, and so you do have many different images pulled throughout a stream. And are they the same...

  • These are all the exact same dimensions?

  • Yes.

  • Yeah.

  • So I normalized across dimensions to make our lives a little easier, but I didn't normalize across how much of the frame we occupy.

  • Sure.

  • So I did very minimal preprocessing, in that I literally only made sure they were the same size and the same, like, pixel quality, so we have the same resolution on all of them.

  • But other than that, nothing else crazy was done.
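
One practical reason identical dimensions make life easier is that every image can then be flattened into a vector of the same length, so distances and means between images are well defined. A small sketch of that step, using random arrays as hypothetical stand-ins for the screenshots (the helper name `images_to_matrix` is made up for illustration):

```python
import numpy as np

def images_to_matrix(images):
    """Stack same-shaped images into an (n_images, n_pixels) float matrix."""
    shapes = {img.shape for img in images}
    if len(shapes) != 1:
        raise ValueError(f"images must share one shape, got {shapes}")
    return np.stack([img.astype(float).ravel() for img in images])

# Three hypothetical 4x6 RGB "screenshots" filled with random pixel values.
rng = np.random.default_rng(0)
fake_images = [rng.integers(0, 256, size=(4, 6, 3)) for _ in range(3)]

matrix = images_to_matrix(fake_images)
print(matrix.shape)  # (3, 72)
```

Each row of `matrix` is one image, and the mean of a set of rows, reshaped back to the image shape, gives the kind of mean image discussed above.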

  • Now, these first two are fairly easy to distinguish.

  • We're wearing very different colored shirts.

  • Um, and we're in different positions.

  • However, these last two, you'll notice that we're both wearing black.

  • Uh, wow.

  • And the background is not particularly easy to discriminate against. So

  • we kind of picked two things that are pretty easy.

  • They'll be standout outliers,

  • almost; their means should be pretty different.

  • For these ones, the means don't look that different. Now, we move around a lot,

  • and how means translate, as far as machines are concerned, is a little different.

  • But you'll notice that in the CTF stream and our Kali Linux stream, we were both wearing black.

  • With the exception of the white, we're gonna offer enough of the machine of my four.

  • Yeah, it might literally be that way.

  • And your glasses on that.

  • Same for human beings.

  • There's Colton has glasses.

  • There's a very shiny quality to I think both of our foreheads in the one, um and I have a logo on my shirt that definitely helped.

  • The hair is pretty distinct.

  • I suppose today it looks like trash.

  • Eh, everyone has those days. And someone asks: any suggestions of books about this?

  • That's a good question.

  • Um, so there is What is the name of that book?

  • Um, I have a textbook on my desktop somewhere that I really like.

  • It is called...

  • Yeah.

  • Now you're going to see all of my other weird things.

  • So there's this:

  • Bishop, Pattern Recognition and Machine Learning,

  • Springer, 2006 textbook.

  • I have a pdf version of it, but I think that that's a pretty great book for kind of going through.

  • It requires a decent amount of math background, like stats background.

  • But if you're comfortable with, like, what a CDF is, how to convert between CDFs and PDFs,

  • what, like, standard CDFs and PDFs might look like, how to go through different distributions,

  • how some basic linear algebra concepts work,

  • kind of like different matrix operations;

  • if you're familiar with, like, The Matrix Cookbook by MIT,

  • then this book is probably pretty useful for you.

  • If that all sounded really esoteric, then that's fine.

  • I would go online and kind of look at some kind of basic tutorials.

  • Medium is a good place to start with, like some very simple tutorials.

  • Kind of like what we're gonna do right now.

  • Wow, the Internet is amazing.

  • You guys just found that so quickly.

  • And so there are those places. Someone in chat says: try scholarly articles for ML on Google Scholar.

  • That's definitely a great place to start.

  • Be aware that there is a lot of terminology and symbolism, or notation, really, that can be really intimidating

  • at first. Try not to be intimidated by it.

  • It eventually just takes time.

  • But I would definitely recommend kind of looking through what things are out there.

  • There's all sorts of cool papers all the time, on different things.

  • And a lot of times there are articles about those papers that are even easier to read. And someone in chat is asking:

  • So what do you guys do?

  • We didn't really introduce ourselves to our new viewers.

  • CS50 is Harvard's intro to computer science, taught by David Malan, who is not physically present here in the stream but might pop in a little bit later.

  • My name is Colton Ogden. I work here full time as a technologist,

  • and I also do this Twitch streaming and a bunch of other stuff, programming and other related things.

  • And then I actually was on CS50 staff for two years or so.

  • And that's how I ended up working with these guys.

  • And then the rest of my time is spent being a student.

  • I'm just a full-time student.

  • Remind me, I don't remember 100%: you're doing CS as your concentration, right?

  • Yeah.

  • I actually have a joint, or a double, major in bioengineering and computer science.

  • I wasn't sure; it was very difficult to remember which one.

  • So I'm doing both, and, uh, yeah, most of my time is spent studying.

  • I think if I didn't have to spend all of my time just doing psets constantly

  • (problem sets, for those who aren't familiar),

  • then I would probably produce much higher-quality preparation for these streams.

  • But that's okay.

  • I think it's still pretty pretty entertaining.

  • And you've taken an ML course here?

  • Yeah, and I'm also currently in an ML course, actually.

  • Okay. So I've actually taken

  • mostly, uh, like, systems courses.

  • So I focus a lot on, like, systems.

  • I really like

  • visualization, but I haven't gotten a chance to

  • take a course in it.

  • Actually.

  • Wow, I've really focused on systems, so I'm fairly familiar with systems.

  • If you want a really bad version of my C programming, go watch R C.

  • Perkins.

  • The point of that stream was not my C programming, but we did end up, uh, kind of going through some C programming.

  • It's pretty bad.

  • But, you know, at least that way you get to see me code very poorly.

  • If you think that this is really good.

  • If you think this is really poor, don't go look at that.

  • Really?

  • Uh, yeah, I really don't know. I think I just kind of got interested in ML, actually, in CS50.

  • I was like, Oh, machine learning seems really cool.

  • My TF was like: yeah, you should

  • go for it, you know?

  • And I tried some very high-level libraries, not enough to understand much about the theory, and from there I hopped into a bunch of articles and I started building little examples myself.

  • Um, yeah, and I think I built, like, a small room camera that, upon someone from my family entering my room, kind of identifies who they are and notifies me, you know, just for fun.

  • A lot of times you just build projects, and as you go, things kind of pop out of the woodwork.

  • But yeah, the question was: do you have any hot takes on whether the hype for Golang in the future of ML is justified? The viewer is just beginning to learn the language.

  • Yeah.

  • Yeah.

  • So it's kind of an appropriate question.

  • I actually really like Golang, and I'm not super familiar with it yet, but I do...

  • I do intend on becoming much more familiar with it. So, Golang:

  • a language invented by Google, sometimes referred to as just Go, not to be confused with the board game. And Golang is a very interesting language in that it still allows you to have access to low-level things.

  • C++ style, C style:

  • I can still touch memory and access primitives, but it has a much nicer... Well, a lot of people believe... this will set off a firestorm somewhere.

  • But many people believe that it has a much nicer interface.

  • Kind of more

  • like Python and JavaScript

  • than, like, C++ and C.

  • It also has a garbage collector, doesn't it?

  • It does.

  • It has a very nice garbage collector.

  • So for those of you who aren't familiar, garbage collection is something that's really popular in, like, Java, for example; really

  • any, like, modern-ish language,

  • with the exception of, like... Basically, it kind of takes care of certain things for you.

  • There's a life cycle of a given object. So for, like, object-oriented languages, you create an object, instantiate it,

  • and maybe all the references to that object disappear.

  • There's a garbage collector that prevents that object from just floating around forever in active memory, or main memory.
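
The object life cycle described here can be observed in Python, which is used for illustration since it is the language of the stream (Go's collector works differently internally). In this sketch the class name `Frame` is made up, and a weak reference is used to watch the object without keeping it alive:

```python
import gc
import weakref

class Frame:
    """Stand-in for any object we create and later stop referring to."""

obj = Frame()
watcher = weakref.ref(obj)   # observes obj without keeping it alive
print(watcher() is obj)      # the object is still reachable here

del obj                      # drop the last strong reference
gc.collect()                 # ask the collector to run; CPython has usually freed it already
print(watcher() is None)     # the object has been reclaimed
```

Nothing in the program had to free the object explicitly; once no references remained, the runtime reclaimed it.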

  • Golang has a nice garbage collector.

  • Go also has a really nice feature that I haven't explored a whole lot, in its version of threads and threading. They're not as heavyweight as, like, a C++ thread, where

  • I spin that up and it's got its own stack.

  • It's doing all these things, and this is pretty heavy.

  • It's heavy in memory usage, heavy in CPU usage,

  • whereas Golang

  • threads are actually a lot more lightweight.

  • It's a language very much designed for distributed systems, so that makes sense.

  • It's, like, number one of their whole design considerations.

  • So they do a lot of very cool things in Golang, and actually, yeah, I would recommend going and exploring it.

  • Maybe, if you were looking at, like, Python, Golang, and C++,

  • you'd probably see a lot of the differences and similarities. If you wanted to kind of paradigm-shift and test your ability to switch

  • how you think, then maybe look at something like Python versus something like OCaml or Clojure,

  • which are more functional. And actually, in industry, or at least the industries that I've worked in,

  • I've generally preferred functional paradigms because they make it really easy to test, whereas with object-oriented paradigms, it's not that they're difficult to test, but I generally find that people are a little less stringent in their testing practices, and that tends to lead to more brittle practices.

  • And eventually no one can amend your code.

  • I mean, we talked about it.

  • I really like Clojure.

  • I want to get, like, really good at Clojure.

  • Like I would too, you know, growing popularity.

  • I mean, I think functional programming has only been growing since its inception.

  • I think languages have, recently especially, started to adopt functional features, even Java, which is kind of crazy to think. C# has been great with that too, with, like, LINQ queries. But I'm more involved in game development, of course.

  • In game development, I've recently been considering the idea of making a game completely in a functional language, like in Clojure.

  • Yeah, which could be fascinating, when you're dealing with either the idea of taking your entire frame and performing mutations on it as your state object, or taking, like in a React app or

  • just a general application, all your entities and important information, treating that as your state object, and performing operations on that, and also having sort of sequential rendering logic.

  • So I think it would be an interesting project. Clojure is about the best-paying language to know.

  • According to the new SM, Insides knew about that too.

  • That doesn't happen to be the main reason that I want to learn Clojure, but it's a nice little plus, you know, a nice bonus for it, to get to hop into a nice, high-paying industry.

  • But what you're saying is true.

  • I mean, like, dealing with state in object-oriented programming with really large applications, and games are a prime

  • example: you don't know what's manipulating what, especially when you have so many side effects.

  • The nice thing about functional programming is that it scales to multi-threaded systems incredibly easily, because there's no state.

  • So you can just, like, ship operations out to your entire cluster and back. And to the person's point about Golang,

  • I think that might be an important reason as to why it has a future in ML:

  • being able to spread something out very easily, and very performantly.

  • It takes the whole pipeline and puts it out on many machines.

  • You could essentially model a neural network with just your machines.

  • So I think it's actually pretty important; you probably want to stay abreast of some of these kind of heftier languages that are sticking around.

  • Um, I mean, Python's a good example, because it's so multipurpose.

  • But if you're looking to maybe have some more performant systems... and people would be like, oh, you can transpile Python into C, there are all these things you can do. That's true.

  • But, I mean, if you wanted to, just out of the box, have a slightly more performant,

  • well, actually, immensely more performant system,

  • it'll always be worth your while to learn C++.

  • But Golang has many of the properties that I think will end up being important in the future,

  • such as distributed systems, and, well, actually, really just this distributed systems paradigm, where we can spin up many different systems, and

  • that kind of surpasses our hardware limitations.

  • Yeah, Moore's law, at the transistor level, is not going to hold; it's not really continuing.

  • What they're saying is Moore's law will continue, but it will continue at scale.

  • It's going to be a horizontal scaling issue rather than individual cores, which is really it is a very interesting shift that we're starting to see, and we're all we're all in it.

  • So yeah, also true?

  • Kinesis asked: are closure and Clojure the same? They do sound the same.

  • Sorry. A closure is, like, you have function closures.

  • It's more of a concept related to programming across several languages,

  • whereas Clojure, it's a play on the term.

  • Yeah, it is a play on that.

  • It also uses Closure:

  • ClojureScript uses the Google Closure compiler and library.

  • So there's all these.

  • Yeah, Google one or both,

  • add "programming language" at the end, and you'll figure it out.

  • Uh, yeah, Islamic Knight says.

  • Functional programming makes it very easy to verify things, and that's pretty much an exact... well, I was going to say side effect, of the fact that there are no side effects, right?
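
To make the "no side effects" point concrete, here is a small Python sketch contrasting a function that mutates its input with a pure one. Both function names are invented for illustration; the pure version is easy to test because its result depends only on its arguments.

```python
def record_frame_impure(frames, frame):
    """Appends in place: the caller's list changes as a side effect."""
    frames.append(frame)
    return frames

def record_frame_pure(frames, frame):
    """Returns a new list and leaves the caller's list untouched."""
    return frames + [frame]

history = ["a", "b"]
updated = record_frame_pure(history, "c")
print(history)  # ['a', 'b']
print(updated)  # ['a', 'b', 'c']
```

With the impure version, any other code holding a reference to `history` would see it change, which is the "you don't know what's manipulating what" problem raised earlier in the conversation.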

  • It's just super used.

  • And there's been articles written about this too.

  • Now, like theoretic witnesses, the reason for a software blow in what this Have you ever worked in industry?

  • And you've seen a piece of code where you're like: why does this work? Why does it... where are the unit tests? Why are there

  • no unit tests?

  • Who wrote this?

  • Chances are that they wrote it in OOP and then didn't test it.

  • And they're like, that'll probably work. Until it's, like, not robust.

  • It doesn't know no standard behavior is all of it's undefined behavior is undefined and not documented in late stage pain on Dime.

  • Actually, functional

  • languages tend to pride themselves on being self-documenting.

  • So, I don't know how true that really is, but it is kind of a nice feature of the language that, as you write the code, it documents what it does, if you follow some standards and each function is written with some clarity. And React is actually very functional, especially with Redux.

  • It kind of adopts that sort of functional paradigm.

  • But being able to write out explicitly how, like the operation that I think it's basically the verbs versus now his argument right with program, which is as a student of CS and you're switching between all these languages.

  • Julian what it gets the light is everyone write things like this.

  • Yeah.

  • Yeah.

  • You honestly have an entire stream?

  • I'm just discussing.

  • These isn't turning.

  • It's just way.

  • We'll hop back into k-means

  • after a couple of questions.

  • How old are you guys?

  • I'm 28.

  • Yeah, I'm 20.

  • We're getting on.

  • Nick's already better at coding than I am.

  • So that probably going does go.

  • Ling have parallel programming, I'm inclined to say almost certainly.

  • Yes.

  • Yes.

  • That is kind of its pride.

  • It is one of its greatest features.

  • Eyes.

  • It's very solid.

  • Parallel programming.

  • All right, cool.

  • Yeah.

  • Let's, uh, let's get we're gonna happen to meet you.

  • Notice I created a little file here.

  • There are some things in the file.

  • I promise.

  • It's not much, um, and we'll go from there, so I think we d'oh!

  • Yeah, that's the one.

  • And I'm gonna use Visual Studio Code. For those of you who have watched streams before, I've kind of amped up my Visual Studio Code a little bit, like The Rocky Horror Picture Show.

  • Sort of pass everything there every time I make this little bigger.

  • I was actually doing this a little bit last night with my glasses.

  • Well, actually, I couldn't wear my glasses.

  • I had, like, stuffing my face, and I had to, like, make this an enormous screen.

  • I couldn't read it.

  • I was like, oh, this will be the same size everyone sees tomorrow. So.

  • Okay, So you're gonna have anything on the right side of your editor?

  • Probably not.

  • Actually, I'll keep things off the right side of it.

  • That way, people can see what's going on.

  • So we have this import numpy as np, a very standard paradigm in Python.

  • We also import matplotlib's pyplot.

  • I'm a huge fan.

  • pyplot is very easy, simple to use.

  • These are not necessarily machine-learning-specific tools, but tools for numerical work generally.

  • Yeah, computation.

  • Another one you might see is SciPy. That one's pretty common too. There are a lot of standard libraries like these that are very easy to use, and you should become familiar with them.

  • The other one is pandas, for data science.

  • scikit-learn as well.

  • Yes, that's another big ML one.

  • And if you want to hop into even lower level, well, not lower level, but other ML: PyTorch is a really common one, and TensorFlow. Will we be using TensorFlow?

  • It's something.

  • Yes, yeah, at the end of this stream we'll see something that's built entirely in TensorFlow, and I'll kind of walk through what the code does. I'm pointing to the guy who built it.

  • I didn't actually build that one, but that's okay.

  • And and then next stream will build a version of that ourselves that is a little bit more customizable.

  • And then this is just kind of an example of how I can load in an image and display it, so that people are a little bit familiar with that.

  • And then we're going to hop into generalizing this, and then we'll walk through my steps for when I'm working with images.

  • Shout-out to David J. Malan in the chat.

  • If anybody is brand new to the stream and doesn't know, David is the instructor for CS50 here at Harvard University.

  • So shout him out.

  • If you are new, and if you're not new and you've been here before, um, still say hi.

  • Yeah, thank you for thank you for stopping by.

  • And I see up there journal.

  • Flavor says, Have you guys been coding since you were kids?

  • Since I was, like, 14.

  • 15?

  • Yeah, I actually, I started here.

  • I started in CS 50.

  • So about 2.5 years ago, or so we're getting to, like, three.

  • We're getting a little, uh, but yeah, I messed up a lot of hardware, though.

  • As a kid, I really like, like putting transistors, that he's got the spark.

  • Yeah, and that was kind of a start.

  • And then I was introduced to cryptography in secrets and puzzles.

  • And then I was like, Well, you know what?

  • I'll take CS50.

  • And now it's half my degree.

  • Like David style.

  • He was like, you know, like computers and stuff.

  • But he wasn't, like, settled CS till they came right.

  • And then now he's instructor and many end up teaching.

  • Things are it's a wild time.

  • It's always interesting to ask people their stories for changing.

  • All right, so we grab an image. I'm going to use my own standard naming conventions.

  • But if you have a problem with them, feel free to let us know.

  • I mean, who wants to write out matplotlib?

  • This'll looks like Leah's Check on.

  • I stick to conventions instead of just my own conventions.

  • But like people conventions, they're really not conventional.

  • Ironically, you see, that's getting height, width and something channel.

  • Yeah, okay, So I don't think people actually repeat that A.

  • C very often.

  • I did come, Lizzy.

  • Generally, don't use single-letter variable names unless it's extremely obvious, like i for an index, which is pretty clear.

  • I was thinking color, but I mean, color and channel are kind of related topics, so it kind of works. But if you want to be really clear, you type out channel, and with tab completion

  • It's really not that bad, but I'm super lazy.

  • And then h and w, you probably should leave those; height and width, those are pretty standard.

  • Pretty standard, and you can intuit it.

  • We're talking about images.

  • OK, so this grabs a single image.

  • I use the matplotlib library to pull up that image.

  • It's a PNG, and then I grab its shape and print that out so we can see what it looks like, at least as far as shape goes.

  • And then I wanted to demonstrate NumPy's reshaping, because that's extremely useful to us.

  • A lot of times, when you're working with images, it's not super convenient to use, like, the 28 by 28 by 5 image dimensions.

  • It's much simpler to just say every row is an entire image, and it's laid out in a way that's meaningful to the machine but not to us.

  • You know, we don't care.

  • So this reshapes it into exactly that.

  • It's just a single string, or I should say a single list or NumPy array, of every piece of data that was in the original image.

  • And everything has been laid out in a single-dimensional list? Yep.

  • And then this just demonstrates putting that back into the original shape.

  • And then we display the thing that has been reshaped back. So I can run this with python3 on the k-means script. I run this.

  • You'll notice that for this image the original shape is 298 pixels by 632 by 4. So that last parameter: matplotlib loads its channels in last.

  • Using another library, sometimes they load channels in first.

  • It kind of depends on how it's organized, but just keep in mind what your data looks like every time you use it.
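
To make the channels-last versus channels-first point concrete, here is a minimal sketch. It uses a synthetic array in place of the PNG from the stream (the zeros array and the variable names are assumptions for illustration), and `np.moveaxis` to switch between the two layouts.

```python
import numpy as np

# Synthetic stand-in for an image loaded channels-last: (height, width, channels).
image_hwc = np.zeros((298, 632, 4), dtype=np.float32)

# Move the channel axis to the front to get (channels, height, width).
image_chw = np.moveaxis(image_hwc, -1, 0)

# Moving it back restores the channels-last layout.
restored = np.moveaxis(image_chw, 0, -1)

print(image_hwc.shape)  # (298, 632, 4)
print(image_chw.shape)  # (4, 298, 632)
print(restored.shape)   # (298, 632, 4)
```

Printing the shape right after loading is a cheap way to confirm which layout a given library handed back.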

  • So the image was originally loaded in like that.

  • I kind of linearize this image.

  • That's what the second print statement shows: we have just a single NumPy array of, what is that, 753,344 entries.

  • And then when we reshape that, we get back to the original shape. Then I displayed that using matplotlib, and you can see that that is our first image.
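
The flatten-and-restore round trip described here can be sketched as follows. This is not the code from the stream: it assumes a random array with the same shape in place of the PNG, so it runs without any image file.

```python
import numpy as np

# Random stand-in for the loaded PNG, with the shape read out on stream.
rng = np.random.default_rng(0)
image = rng.random((298, 632, 4)).astype(np.float32)

original_shape = image.shape

# Lay every value out in one long 1-D array.
flat = image.reshape(-1)

# Put the values back into the original (height, width, channels) shape.
restored = flat.reshape(original_shape)

print(original_shape)                   # (298, 632, 4)
print(flat.shape)                       # (753344,)
print(np.array_equal(image, restored))  # True
```

The flat length is just the product of the three dimensions. With a real file, `matplotlib.pyplot.imread` would supply `image` and `matplotlib.pyplot.imshow` would display `restored`.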

  • There's some axes here.

  • And you can remove those if you'd like, but we don't.

  • We don't particularly care. Now, if I wanted to remap this a little bit so that everything was a little bit easier for computers to deal with, I could do something like this.

  • Uh, whips are color mapping doesn't work here because our image pixels are not translated into a one odd scale, but we can do that shortly.

  • So essentially, what's going on here?

  • As an aside, I saw someone ask in the chat:

  • What does channel mean in context of an image?

  • That's a great question.

  • So if we have images that are in black and white, or grayscale, those are single-channel images.

  • So for every pixel, the intensity of gray is represented as some number from 0 to 1 usually, or 0 to 255.

  • And that allows us to just scale across one thing.

  • But if I had an equivalent RGB image, or color image, it's three times the size, because you can think of a color image as really being a composite. There are different ways of organizing this.

  • This is just a very classically taught one, where I have actually three versions of the image overlaid: the first version is the amount of red, the second one is the amount of green, and the third one is the amount of blue.

  • And then when I combine those on a computer screen, I can see what the actual colors originally were.
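
As a small illustration of channels, the sketch below builds a single-channel grayscale array and a three-channel RGB array of the same height and width. The tiny 4 by 6 size and the fill values are assumptions chosen only to keep the example readable.

```python
import numpy as np

height, width = 4, 6

# Grayscale: one intensity per pixel, here mid-gray on a 0-to-1 scale.
gray = np.full((height, width), 0.5, dtype=np.float32)

# RGB: three overlaid planes holding the amounts of red, green and blue.
red = np.ones((height, width), dtype=np.float32)
green = np.zeros((height, width), dtype=np.float32)
blue = np.zeros((height, width), dtype=np.float32)
rgb = np.stack([red, green, blue], axis=-1)  # channels last

print(gray.shape, gray.size)  # (4, 6) 24
print(rgb.shape, rgb.size)    # (4, 6, 3) 72

# Indexing the last axis pulls a single channel back out.
red_again = rgb[..., 0]
```

The color array holds three times as many numbers as the grayscale one, one plane per channel.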

  • Actual colors being an approximation.

  • There are other ways of doing it, using, like, hue, and, uh, I think that one's four-channel.

  • So this is what we're using.

  • But it doesn't really matter in our case. We can kind of shuffle everything over and then readjust accordingly, as needed.

  • So, uh, hopefully that clarifies what reshaping things does.

  • Here I am just going to try and stick to these two libraries.

  • I might also throw in from tqdm import tqdm, which I don't actually know what that stands for, but it's a really pretty way of printing out, as you iterate over something, what it looks like, and it's just kind of a good way of keeping track of stuff.

  • So we know we're working with images.

  • We now have a way to reshape, remake, and readjust images, depending on what we need to do.

  • But we don't necessarily have a script that does anything meaningful.

  • So I'm going to start my script using kind of the if __name__ == "__main__" paradigm.

  • Oh, thank you, someone in the chat says tqdm means progress in Arabic.

  • I don't want to butcher that word, but I would otherwise try and pronounce it.

  • That's really cool.

  • Didn't know that.

  • So it displays pretty progress bars. It's really useful.
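
A minimal sketch of the progress-bar idea: wrapping any iterable in `tqdm` prints a bar as the loop advances. The fallback definition is an assumption added here so the example still runs where tqdm is not installed, and the summing loop is only a placeholder for real work.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Fallback (assumption for this sketch): no bar, same iteration.
    def tqdm(iterable, **kwargs):
        return iterable

total = 0
# The bar advances once per loop iteration.
for value in tqdm(range(1000), desc="summing"):
    total += value

print(total)  # 499500
```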

  • It just helps me mentally keep track of where we are in the execution of the code.

  • Then you'll see that the paradigm that we're using is kind of like: if this is intended to be a script, let's use it as a script.

  • So what we're going to do is, if __name__ is "__main__", then I want to say my data is some method where I call, like, load data, and I'm probably going to have to give this some directory.

  • So if I go into my terminal and list out which directories I have, let's have it load data from the stream all directory.

  • So it's a relative path thing, and that loads in my data.

  • I want to preprocess my data.

  • So we might be very clear about this and say preprocess data, and from there I now have all of my preprocessed data.

  • So what I might want to say is: okay, we have processed data.

  • Let's build a K means classifier.

  • So we'll call that k... we'll call that kmeans.

  • And that's going to be equal to some k-means classifier, which is instantiated with some number of classes, so we'll say k equals, I don't know, two.

  • Well, that would be a binary classifier to start, but we'll change that shortly.

  • We're starting to see the somatic numbers of some functions that we really need.

  • All sorts of interesting things that you need to know.

  • So from there we have this k-means classifier.

  • We're going to use kind of the standard scikit-learn-style methods, and we'll say: have it fit on the processed data. Once it has been fit to that processed data, we should have some set of images.

  • This one deviates a little bit from the standard API, but we're going to call display means on kmeans, so this will display the mean images to us.

  • And then, um, yeah, actually, we'll stop from up there.
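
The script outline above can be sketched roughly as follows. This is not the code written on stream: the class name `KMeans`, the function names, the directory name, the farthest-point starting means, and the synthetic stand-in data are all assumptions made so the sketch is self-contained. A display step would plot the reshaped means; here they are only reshaped and their shape printed.

```python
import numpy as np


class KMeans:
    """Minimal k-means sketch; each row of `data` is one flattened image."""

    def __init__(self, k, iterations=10):
        self.k = k
        self.iterations = iterations
        self.means = None

    def _nearest(self, data):
        # Distance from every row to every mean, shape (n_rows, k).
        diffs = data[:, None, :] - self.means[None, :, :]
        return np.linalg.norm(diffs, axis=2).argmin(axis=1)

    def fit(self, data):
        # Deterministic start (an assumption of this sketch): take the first
        # row, then repeatedly add the row farthest from the means so far.
        means = [data[0]]
        while len(means) < self.k:
            gaps = np.min([np.linalg.norm(data - m, axis=1) for m in means], axis=0)
            means.append(data[gaps.argmax()])
        self.means = np.array(means, dtype=float)

        for _ in range(self.iterations):
            labels = self._nearest(data)
            for j in range(self.k):
                members = data[labels == j]
                if len(members) > 0:  # leave a mean alone if its cluster is empty
                    self.means[j] = members.mean(axis=0)
        return self

    def predict(self, data):
        return self._nearest(data)


def load_data(directory):
    # Stand-in: ignores `directory` and returns 10 dark and 10 bright
    # random 8x8 RGB "images" so the sketch runs without any files.
    rng = np.random.default_rng(0)
    dark = rng.random((10, 8, 8, 3)) * 0.2
    bright = 0.8 + rng.random((10, 8, 8, 3)) * 0.2
    return np.concatenate([dark, bright])


def preprocess_data(data):
    # One row per image.
    return data.reshape(len(data), -1)


if __name__ == "__main__":
    data = load_data("images")  # hypothetical directory name
    processed = preprocess_data(data)
    kmeans = KMeans(k=2)
    kmeans.fit(processed)
    # Reshape each mean back into image form, ready for plotting.
    mean_images = kmeans.means.reshape((-1,) + data.shape[1:])
    print(mean_images.shape)  # (2, 8, 8, 3)
```

Writing the main block first, as a list of calls that do not exist yet, gives a checklist of functions to fill in one at a time.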

  • So let's work through this one line at a time again.

  • So load data, stream all... what is it, loading the images from the stream all directory?

  • Yes, I hand it a directory and it assumes that there's some set of images in that directory and snags all of them for us as NumPy arrays.
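
One way such a loader might look is sketched below. The `reader` and `extensions` parameters are hypothetical additions (not from the stream) that make the function easy to try without real image files; by default it assumes matplotlib is installed and uses its `imread`, and it assumes every image in the directory has the same shape.

```python
import os

import numpy as np


def load_data(directory, reader=None, extensions=(".png",)):
    """Read every matching image file in `directory` into one NumPy array."""
    if reader is None:
        # Assumption: matplotlib is available; imported only when needed.
        from matplotlib.pyplot import imread as reader

    names = sorted(
        name for name in os.listdir(directory)
        if name.lower().endswith(extensions)
    )
    if not names:
        raise ValueError(f"no image files found in {directory!r}")

    images = [np.asarray(reader(os.path.join(directory, name))) for name in names]

    # np.stack needs every image to share one shape; the result has
    # shape (n_images, height, width, channels) for channels-last readers.
    return np.stack(images)
```

A call such as `load_data("some_directory")` would then return one array with a leading axis indexing the images, which can be flattened to one row per image before fitting.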

  • So this, uh, this kind of way of coding things is something that I really like doing.

  • It kind of lets me template out my code.

  • I kind of know how things will work, and it gives me kind of a set of objectives to then complete s o.

  • I say, OK, load data.

  • And basically, it's all documents, uh, our rights, it for me.

  • So this should take in some, like, directory.

  • And it should return to us some NumPy array containing...

  • Sort of Haskell-y syntax.

  • Yeah.

  • So I am writing a little bit in a kind of functional-style language over here in our comments.

  • Make him a little bit easier to read.

  • Will do that.

  • Yeah, that thing is a bunch of about the code time extension.

  • Oh, yeah, that is also an extension.

  • I'm a huge fan of using extensions to kind of I don't make it feel a little bit more homey.

  • And someone asked which IDE that is. That is an incendiary question for some people, but this is Visual Studio Code.

  • I'll leave it to you to hash out how much of an IDE it is.

  • Mine functions very much like an IDE.

  • I would count it, for all intents and purposes, as an IDE.

  • It's definitely more so than a lot of its competitors, like Atom and Sublime, which are more text editors.

  • So some IDE features that it gives you... Yeah, especially.

  • I'm a huge fan of the way the terminal integrates, whereas in Atom, you have to, like, drag the terminal in.

  • And I haven't quite figured out whether that started as an extension; I think this is native. I'm a huge fan of Visual Studio Code; it's very well built.

  • I have not had any problems; it's pretty easy to use.

  • Then we have this pre processing step, and so any time you're working with images or audio files or really any kind of data, but we're going to say just kind of those two in particular, they're easy to contextualize.

  • We might want to manipulate that data somehow. So in the case of images, a lot of times that means extending our data set, because it is kind of difficult to obtain images that are meaningful.

  • But as we mentioned a little bit before, and we're probably going to mention a couple more times: take an image of Colton and I sitting here, as opposed to an image of us in reverse.

  • We're still in the same stream, right?

  • So there are ways for me to extend the amount of data that I got out of something, even though I never acquired new data.

  • And so that's kind of an image augmentation process.

  • With audio files, it's a little bit more difficult for us to imagine what is a meaningful audio file augmentation and what is not, but images are pretty easy for us to understand: if I swap us around, if I make this black and white versus color, they're all still the same.

  • And there are different arguments as to which ones are more the same than others.
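
The two augmentations mentioned here, mirroring and converting to black and white, can be sketched with plain NumPy slicing. The function name `augment` and the assumption of a channels-last RGB input with values on a 0-to-1 scale are illustrative choices, not something shown on stream.

```python
import numpy as np


def augment(image):
    """Return the original image plus a mirrored and a grayscale variant.

    Assumes `image` has shape (height, width, 3), channels last.
    """
    # Reverse the width axis to flip the image left to right.
    mirrored = image[:, ::-1, :]

    # Average the three color channels, then repeat the result so the
    # grayscale variant keeps the same (height, width, 3) shape.
    gray = image.mean(axis=-1, keepdims=True)
    gray_rgb = np.repeat(gray, 3, axis=-1)

    return [image, mirrored, gray_rgb]


rng = np.random.default_rng(0)
photo = rng.random((4, 5, 3))
variants = augment(photo)
print(len(variants))                    # 3
print([v.shape for v in variants])      # [(4, 5, 3), (4, 5, 3), (4, 5, 3)]
```

Each source image yields three training examples here, without collecting any new data.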

  • And so preprocessing is usually a pretty important step on your data. Now, there is a huge need, and kind of a problem, in the data science world at the moment, where we have many different versions of data sets, and it can be really difficult for us to keep track of each of those versions: know who created it, how it was created, what it means.


K-MEANS CLASSIFIER IN PYTHON! - CS50 Live, Ep. 53

  • Posted by 林宜悉 on January 14, 2021