[MUSIC PLAYING] DAVID MALAN: This is CS50 and this is lecture 6. And you'll recall that last week we introduced web programming by way of HTML and CSS, or at least the building blocks because we don't actually have the ability to program yet. It's just markup, HTML and CSS with stylization thereof. But we introduced this metaphor last week of a protocol called TCP/IP. And we related it to, of course, an envelope. And on this envelope, virtually, on the front was at least two pieces of information. And if anyone remembers what were those two pieces of information in the to field? Someone else who we didn't hear from recently? Yeah? AUDIENCE: An IP address. DAVID MALAN: Yeah. An IP address, a numeric address that uniquely identifies your computer and someone else's computer. And one other thing, if you remember. Oh, come on. It was like two minutes ago. OK. Yeah. AUDIENCE: A port number. DAVID MALAN: A port number. So another number, shorter number, that's just a number like 80 or 443 referring to HTTP or HTTPS, or other numbers, like 25 for email and the like. And so together these unique addresses allow you to send information to not only a specific computer, but a specific service running on that computer. And in order to actually request information from that server, there's this other protocol called HTTP, Hypertext Transfer Protocol. This is what's inside of the envelope. So when the server opens it up, metaphorically, looks inside, this is the command that that server reads in order to decide what it should actually respond with. And so this request here is telling the server-- otherwise known as www.example.com in this particular example-- to send back what exactly in its own envelope to me and my laptop if I were to request this? AUDIENCE: A specific web page. DAVID MALAN: A specific web page. And someone else, which web page specifically, presumably? AUDIENCE: Index. 
DAVID MALAN: Yeah, so index.html, which we said last week just tends to be the default file name on a server for a web page that's just selected by default. And it doesn't have to be called this, but it's a human convention. And the rest of this is just a verb saying, literally, get me that file. This is just telling the server what version of HTTP I speak so that humans can improve it and upgrade it over time. But this would tell the server to return index.html. Meanwhile, we saw more sophisticated GET queries when we started talking about Google, and any website that has not just a front end, like HTML and CSS, but also a back end. And a back end is where the logic is, where the server is, and the interesting work, ultimately. And so this /search indicates some kind of software running on Google servers as of last week that simply responds to requests. And what did ?q=cats do or represent in that demonstration? AUDIENCE: User input. DAVID MALAN: Yeah, user input. So the question mark just says, that's it for the file name or the URL. Here comes the user's input. Q is just literally the HTTP parameter or input that Larry and Sergey, founders of Google, 20 years ago decided would represent the user's input, q for query. Equal just means that query that the human typed in was cats. But the human doesn't even have to type this in. Once you understand HTTP, if you really wanted to be kind of a nerd, you could go to www.google.com/search?q=cats and it would induce the search for you because at the end of the day, that's all the browser is doing. When you have these web forms that you now have the ability to create, it's just automating the process of generating these HTTP messages. Now, the server hopefully responds with a message you never, ever actually see, HTTP 200, which literally means OK. Of course, many of us have seen numbers other than 200 appear, like what? 404, which means? File not found. 
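The way a form turns user input into a URL can be sketched in a few lines of Python, the language this lecture goes on to introduce. This is only an illustration using the standard library's urllib.parse module; the https:// scheme is an assumption, since the lecture only spells out www.google.com/search?q=cats.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Build the URL a search form would generate: path, then "?", then key=value.
base = "https://www.google.com/search"  # scheme assumed for illustration
url = base + "?" + urlencode({"q": "cats"})
print(url)

# Take it apart again: the path comes before the "?", the user's input after it.
parts = urlsplit(url)
path = parts.path
params = parse_qs(parts.query)
print(path)
print(params)
```

Nothing here contacts Google; it only shows that the part after the question mark is a list of named inputs, which is what the browser assembles when a form is submitted.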
Now, why the humans decided years ago to tell other humans what that numeric code is, I mean, that is an uninteresting detail. But the world, for whatever reason, has revealed in many web sites 404. But it just means the same thing. Everything is not OK. A file was not found. You might see something else like this. We saw this with Harvard, in fact, curiously, that Harvard had moved permanently. Now, Harvard was responding to certain queries with HTTP 301s in order to achieve what feature or effect? Why? Yeah. AUDIENCE: Redirections. DAVID MALAN: Redirections. So this is kind of a low-level way of describing it. But 301, even though it says moved permanently, that's a more technical hint to the browser saying, Harvard moved not to whatever URL you just came from, but to this URL specifically. And now Harvard was probably, if you recall, redirecting me from what URL? If I wasn't already at that URL, where might I have been? Maybe dot com, if they actually own multiple domains and were redirecting. That could work. What else? Yeah. AUDIENCE: Just HTTP. DAVID MALAN: Yeah. Maybe I just typed in HTTP, and Harvard, in the interest of security, wants to force my browser to request this page again via HTTPS. Sometimes a website might prepend the www if you haven't typed it in, or you can be redirected most anywhere. In fact, if you go to CS50's own website by just typing CS50.harvard.edu, watch the URL. You'll be redirected to a more specific page, depending on the time of year. So we use these tricks, as well. 404 not found might look like this, but inside deeper of that metaphorical envelope is the actual contents of the web page. So you get back not only these HTTP headers, as they're called, in the top of the response, so to speak, but you also get back HTML, yet another language we looked at, this one actually a language, but not a programming language. These tags tell the browser exactly what to do and to render. We introduced this style tag, though. 
What did that allow us to do that HTML alone did not? Yeah. Use CSS to beautify the site and just make it nicer. HTML, for the most part, is about structure and about tagging the contents of your web page in a way that the browser finds helpful. But CSS is really for the user's benefit, at the end of the day, and his or her eyes, because it really lets you control font size and positioning and lower-level stuff that you might have started tinkering with with the most recent problem set. Now, we'd proposed that you probably shouldn't just start typing CSS inside of your HTML page because it's just a little harder to maintain as your examples get more sophisticated. So you might factor it out. And odds are you did this for the problem set because when making a home page, if you have the same CSS styles across multiple files, it would be pretty silly and inefficient to copy and paste them again and again when you can factor them out like this. Lastly, we looked at JavaScript, last time, another programming language that's super similar to C, at least at first glance. But it actually gets rid of a lot of the lower level headaches like pointers and memory addresses and the like that we've struggled with in recent weeks. But most important was how we used it. So you can consider a web page like this as once it's loaded by your browser as just being a tree structure. Thinking back a couple of weeks to our discussion of data structures and each of these nodes in the tree we saw in JavaScript can be manipulated. And via that very simple principle, writing code that modifies this existing tree in the browser's memory, means you can make much more dynamic things like Gmail and Facebook and any number of websites that are constantly changing. You did not do this yet for the problem set. You made static web pages just by hard coding HTML and CSS. 
But starting next week, once we have, thanks to this week, the vocabulary of Python will you start to make things more dynamic and then even bring back into play JavaScript, bringing all of these various threads together. And to include the JavaScript, recall, we used either a script tag at the top or refactored it out to a file. Or in some cases, it's necessary or beneficial to move it down to the bottom of the file or factor it out like that, but more on that down the road. So any questions on last week or on HTTP, HTML, CSS, or TCP/IP? No? Anything at all? Oh, yeah? AUDIENCE: So in what case would you put the script tag up at the top [INAUDIBLE] DAVID MALAN: Good question. So in what cases would you put the script tag up at the top versus at the bottom? If the code you're writing in JavaScript manipulates the DOM, the tree that I had on the screen just a moment ago, the catch is that that tree needs to exist when your code is executed. So if you, for instance, have JavaScript code up here in the head of your page, but the nodes in the tree, the tags that you want to manipulate in changing things to red to green to blue like we did last week, or making things blank, are down here in the page, you can't write your code up here and have it change things in the page down here because it's happening out of order. So similar in spirit to C where things have to happen in the right order, if you want to change something down here, your code needs to at least be down here, or you need to use some fancier techniques to say, I'm going to write my code up here but wait a few seconds before executing it until the whole webpage is loaded. So for most of the examples we looked at, this was not an issue. But we'll come back to this perhaps before long. All right, so let's now take the same approach that we did last time of introducing one language by way of another. 
You'll recall, of course, that we started the whole semester with Scratch and then we transitioned a few weeks back now to C. Last week we made some comparisons with JavaScript. Let's do the same thing briefly with Python but then spend more time at the keyboard comparing the two to see what actually is different about these. So why yet another language, though, first? We have Scratch, C, JavaScript, Python, not to mention HTML and CSS for different purposes. Like, why do we have all of these darn languages already? Why didn't humans just decide, that's it, we're all using Scratch? We're all using C or JavaScript or Python? What's, perhaps, the intuition behind that? Why are there so many damn languages, not to mention in this one course? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Say once more? AUDIENCE: Different ones are good for different things. DAVID MALAN: Yeah, different ones are good for different things. And this probably goes without saying for something like Scratch, right? It's so visual. It's so graphical and animated. It makes sense that the puzzle pieces-- or that the language itself is based on puzzle pieces and dragging and dropping. So maybe languages are tailored to certain applications. But is that true for C, Python, and JavaScript, which are all text-based languages we'll see? AUDIENCE: [INAUDIBLE] for example, they're different levels of abstraction. DAVID MALAN: OK. Different levels of abstraction. AUDIENCE: C is very [INAUDIBLE] actually dealing with a lot of things that you don't have to think about in Python-- DAVID MALAN: Good. AUDIENCE: --where these sort of things are taken care of for you, such as memory allocations and so on. And so depending on what level of abstraction you want to work on and what parts you want to manipulate. DAVID MALAN: OK, good. Bringing it back to abstraction does make sense. C is, indeed, very low level, literally having the ability to manipulate memory and via pointers and so forth. 
And that's great because you can do anything you want with the computer. But it comes at great risk and great cost. One, the cost is human time. It's just painful to write that kind of code sometimes. Two, it's also very risky because if you make a mistake, even a simple mistake, the whole computer can crash. And we didn't see examples of this, but you can make your code vulnerable to a hacker if he or she is able to somehow exploit a memory-related bug and read all of the passwords in your program, or something like that. So with great power comes great responsibility is kind of the mantra of C down here. But JavaScript we saw allows us to do things a little more high-level. There were no pointers. There was no memory. We didn't talk about things at that level. We talked about things at the level of a tree, a DOM in memory and changing colors and positioning of things on the screen. And that's, indeed, a higher level. Now, Python is not necessarily even web-centric. It's more of a multi-purpose language. People use Python to write command-line programs, like we will soon, at the keyboard, like we've been doing with C. You can also, though, use it, as we'll see next week, to generate other languages. So next week we will write code in Python, the language we're about to see, to generate another language, HTML and CSS. Some of you probably noticed in your homepages that you had some redundancy. You probably had similar tags or similar structure, maybe a similar menu across pages. Python and other languages will let us factor that out and generate those commonalities a lot more easily, among many other things. And it's also arguably easier and faster to write because it comes with so many more features, as we will soon see. So in fact-- you know what? Let me do this. Let me go ahead and open up CS50 IDE. Let me go ahead and create a new file. And out of curiosity, of our recent problem sets, what was maybe among the most challenging programs you've written? 
AUDIENCE: Crack. DAVID MALAN: OK, crack was a good one. What else? AUDIENCE: Resize. DAVID MALAN: Resize, recover. Yeah, definitely the forensics ones. And more people probably did recover and resize. So let's take resize, for example. So let me go ahead and write a program in a file called resize.py for Python, instead of .c, and see if we can't spend, what, few hours, couple days, as you probably did in C, implementing resize. Well, let me go ahead and do this. I'm going to go ahead and-- let's see. First I'm going to import some features that just come with Python. And I'm going to go ahead and say from sys import argv. And I'm going to go ahead and also do from PIL import Image. Don't know yet what these are. We'll tease this apart in a moment. But then let me just do a check. If the length of-- rather, if the length of argv does not equal 4, I'm going to go ahead and exit for the user and say the usage of this program is python resize.py n infile outfile. So even though some of this should look cryptic at the moment, there's some commonalities-- argv, you recall, from C, and this usage string that we printed out whenever anything went wrong. That looks very similar in spirit to C. And what did we do in resize? If you implemented resize, like the less comfy version, to increase the size of things, you probably declared a variable like n and got sys-- or rather, argv bracket one to get access to it. I'm going to go ahead and convert that or cast that to an int. You probably had an infile variable that gave you access to argv two. You probably had an outfile variable that gave you access to argv three, and so forth. And it turns out in Python, you know what? I can actually use a library, code that other people have written. Let me come up with a variable called in image, like infile. This is my input image. And that's going to equal Image.open because I want to open this thing called infile. 
And then the width-- let me get the width and the height of the existing image by doing input image.size. And then let me go ahead and make a new image-- out image, I'll call it-- which is going to equal the input image calling a resize function and doing the width times n, which is the number the human probably typed in, and height times n, which is the number the human typed in. Then let me go ahead and just save the outfile as follows. Outfile, OK. Done. Problem set three. Tada. OK, either really exciting or really, really disheartening perhaps. So with the right language, as you say, can you solve problems so much more easily. Now, I'm being a little disingenuous because I'm also leveraging what's called a library. And we had access to these in C. And undoubtedly we could have dug a little deeper on the internet into other people's available code and found maybe a library for bitmap files. But notice that there is no dealing with padding now. There's no dealing with arrays. There's no dealing with memory because I'm using the right tool for the job. And if I wrote this code correctly-- and let me cross my fingers that I didn't make any typos. Let me go ahead here and get myself a copy of smiley, which I brought with me. So that was the tiny little image from last week. Let me go ahead and open this in the IDE. Smiley, super small. Just a few pixels there. And let me go ahead now and run Python, which we'll see why in a moment, resize. Let's increase this by a factor of 10, increasing Smiley, and call it out.bmp. Now let me go ahead and open out.bmp and voila, it indeed seems to work. Right, no funky colors. No weird sizes. No padding. No padding of all things. It's just now Python. So you can probably glean some of the logic that's going on here. But some of it certainly should and probably does look magical. 
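The program dictated above can be reconstructed roughly as follows. This is a sketch, not the file from the lecture: the steps are wrapped in two functions (parse_args and resize are names chosen here, not names from the lecture) so that each piece can be read and tried on its own, and it assumes the third-party Pillow library, which provides the PIL module, is installed.

```python
import sys


def parse_args(argv):
    """Return (n, infile, outfile), or exit with a usage message."""
    if len(argv) != 4:
        sys.exit("Usage: python resize.py n infile outfile")
    return int(argv[1]), argv[2], argv[3]


def resize(n, infile, outfile):
    """Scale the image in infile by a factor of n and save it as outfile."""
    from PIL import Image  # third-party Pillow library; assumed to be installed

    in_image = Image.open(infile)
    width, height = in_image.size
    out_image = in_image.resize((width * n, height * n))
    out_image.save(outfile)


# In resize.py itself, the last line would be: resize(*parse_args(sys.argv))
```

Saved as resize.py with that last line uncommented, it would be run as python resize.py 10 smiley.bmp out.bmp. Depending on the Pillow version, the default resampling filter may smooth the enlarged pixels instead of keeping them blocky.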
So let's use today to tease this apart and appreciate not only what you can do with another language like Python, but how it's similar and different and how it actually is built upon something like C. So let's do some comparisons first so that we can see that it's not a huge stretch to introduce yet another language so quickly. So recall that in Scratch if we wanted to set a variable, like counter, to zero, you might simply do something like this, setting it equal to zero at left. In C, we would do the same thing here at the right. In JavaScript, this instead looked a little different. What did we do in JavaScript? Yeah, we used let instead because we don't specify explicitly the type. But we do need to tell the computer, let me have this variable called counter. In Python, it's going to be that. So we've gotten rid of the type still. We've gotten rid of any mention of let or another keyword. And we've gotten rid of-- perhaps most gratifyingly-- semi-colons are gone. No more semi-colons. And no more curly braces in the way you've seen them thus far. So that was C, JavaScript, and now Python. So how about something like this? In Scratch, if you wanted to increment a counter by one, you would use a block like this. In C, we would do the same on the right here in code. In JavaScript, did it look any different on the right? No. You haven't had occasion to use this yet. But one of the sort of revelations of JavaScript was that's also JavaScript. It was identical. Something like this, though, is Python. So it's almost the same. But I've gotten rid of the semi-colon. But the logic is exactly the same-- set counter on the left equal to whatever it is on the right plus one additional value. What about this? This in C had what effect? Incrementing the variable. So this is exactly the same. It's sort of a nice shorthand notation for doing counter equals counter plus 1, which just gets a little tedious to type. We had that same syntax in JavaScript. 
And you can probably guess in Python, what's it going to look like? AUDIENCE: Same thing without the-- DAVID MALAN: Same thing minus the semi-colon. So pretty nice pattern so far. Languages just keep getting trimmer and trimmer, if you will. In C, recall that we could just do plus plus, which was another trick for automating that same process. JavaScript allows for the same. And if you really like this syntax, I can't show you a slide for Python. Doesn't exist. Can no longer do plus plus. So we're paying a price. The author of Python did not include this in the language. But that's OK. We at least have this one, which is not too horrible. So what else did we look at last time? An if condition like this, comparing if x is less than y, in C it looks like this. In JavaScript it looks like this same thing. In Python, it looks like this. So gone are the curly braces. Added is a colon. And what you don't see yet is that indentation is going to be important. So if any of you have been a little fast and loose with style50 and, like we've seen at office hours, all of your code, however many lines you've written for whatever reason is all aligned on the left and nothing is actually indented. Now Python is not going to tolerate that. Python requires indentation for logic. And so this is actually a stylistic feature of the language. It forces you to adopt good visual stylistic habits because the code just won't run if you haven't indented it properly. So anything that's going to happen if x is less than y needs to be indented, say, four spaces underneath that colon. What else have we seen? In C or in Scratch we had this block for if's and elses. In C it looks like this. In JavaScript it looks like this. In Python it's going to look like this, albeit with indentation below each of those colons. How about this? When we had a three-way fork in the road-- if else, if else-- in C it looks like this. JavaScript looked the same. In Python, looks a little funky. 
It's going to look like this-- elif, with three colons this time, too. What else? We also looked at forever loops in Scratch, in C, and in JavaScript. You could use exactly the same syntax in Python, almost the same. Gone are the curly braces, added is the colon. And the slight subtlety, if you noticed, true and false are now proper nouns, if you will. Capital T capital F is necessary to write. How about a for loop? So in Scratch, we could very easily say, repeat this 50 times. C and JavaScript are a little pedantic in that you have to initialize and increment and check. Both C and JavaScript take that same approach, although in JavaScript we of course use let instead of int. Python is a little more succinct although a little less explicit step by step. You just do this. For i in range of 50 is the way of saying start iterating at 0, count all the way up to but not including 50, thereby giving you a range of values. So this is the one that's perhaps the most weird thus far, but still a little more succinct to write. So in C, we had so many data types-- bool, char, double, float, int, long, string-- the last of which, of course, came from the CS50 library. And there's others that you can use in C, as you might recall, from problem set 3, perhaps. In Python, we're going to shorten this list, at least initially, to just these data types. In Python, we're going to have bools for true-false, floats for real numbers, ints for integers, and then strs for strings. Just a little more succinct, but it does actually exist. str in Python is a real thing. It is not a CS50 addition. There are other data types that come with Python. In fact, this is where the language gets powerful. And those of you who came from a Java background or C++, the subset of you who have programmed before, you have more features in Python just like you do in those other languages that we did not have in C. In Python, you have dictionaries or hash tables. 
You have lists, which are arrays, but that can automatically resize. You don't have to decide in advance how big or small they are. Range we just saw, it's a range of values, like 50 of them. Set, in the mathematical sense, is a collection of things that ensures you don't have duplicates in that collection. And then tuple is a combination of things kind of like for math when you have x comma y or latitude comma longitude. Any time you have pairs or triples or more of things, those are called tuples. And those are common in math courses and higher-level CS theory classes, as well. But we do give you, at least in this first week of our look at Python, a few functions from CS50, among them get_float, get_int, and get_string, which behave exactly like their C counterparts. And this is just going to allow us to start writing code very reminiscent of what we did the last few weeks. But let's consider what's going to change as we're about to start writing our own programs. In C, when you wanted to use the CS50 library, you of course included its header file. That syntax is going to change in Python so that for this first week when you want to use the CS50 library, you're going to instead say from cs50 import and then a comma-separated list of the functions that you want to import or use in your code. So it's a little more precise. This syntax is not saying give me everything. It's saying give me this, this, and this other thing. And if you want to use one or more, you can just separate them by commas. As an aside, especially those of you who have seen Python before, there's other ways to do this. There are several approaches. This is, perhaps, the most comparable for our purposes today. What else are you going to have to know? In C you had to compile your code. And you did so with clang, like this. And then you ran your program with dot slash hello. Or more simply, you did make hello and then we'd figure out the command for you in the IDE or the sandbox or lab. 
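The slide-by-slide comparisons and the data types just listed can be gathered into one short Python sketch. The slides themselves are not in the transcript, so the variable names, the function name compare, and the sample values below are illustrative assumptions rather than the lecture's exact code.

```python
counter = 0            # no type, no let, no semicolon
counter = counter + 1
counter += 1           # shorthand; Python has no counter++


def compare(x, y):
    # Colons and indentation take the place of C's curly braces.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


# range(50) counts from 0 up to, but not including, 50.
total = 0
for i in range(50):
    total += 1

# A forever loop needs a capitalized True; break is here only so the sketch stops.
while True:
    break

scores = [72, 73]          # list: like an array, but it can grow
scores.append(33)
unique = set([1, 1, 2])    # set: duplicates are dropped
point = (42.37, -71.11)    # tuple: a pair such as latitude, longitude (made-up values)
person = {"name": "David"} # dict: key-value pairs, akin to a hash table
```

Run as is, this prints nothing; adding lines such as print(compare(1, 2)) or print(unique) shows the values.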
In Python, you're going to skip the compilation step. When you want to run a program in Python, you're going to do just what I did quickly before. You're just going to run the command Python and then the name of the file that you want to run. And the reason for this is as follows. In the world of C, recall that we had this sort of pipeline process where we have our source code as our input. And then we wanted to get to the point of machine code, the zeros and ones. And what was standing in between source code and machine code, just to be clear? What process? Yeah, so compiling. So we had a compiler in the middle whose purpose in life is by definition to translate one language to another. It happens to be an English-like language to a computer-like language, but a compiler is a general term that just converts one thing to another. And so this pipeline for C looked like this. And that's why you had to run Clang explicitly, or make. You had to induce that middle man operation to convert the language to something the computer understands. Python and other languages are not typically compiled in the same way. They're generally said to be interpreted, whereby you don't compile them into zeros and ones and then run the program. You instead run a program that someone else wrote called Python. And that program is, by definition, an interpreter. And that interpreter's purpose in life, as the word implies, is to read your code top to bottom, left to right, and just do exactly what you tell it to do, step by step by step, without doing the upfront work of converting things to zeros and ones. So in the human world, if I speak English and someone there speaks Spanish and we don't speak each other's language, we might put a third human in between us, obviously a human interpreter. The role is very similar. The interpreter listens to me and then translates that to something the computer understands. But it doesn't get into zeros and ones. 
It just goes from one directly to the other. So the difference here in Python is that you still are going to write source code, like I quickly did for resize. And ultimately, we want to actually get it into a program called an interpreter. And so the step ideally just looks like this. But as an aside, Python is a pretty sophisticated language. And even though we have the pleasure of running it just with one step instead of these two steps, there actually is, as an aside, some magic going on underneath the hood. And for the curious, there actually is, for performance reasons, a compiler built into Python that actually converts it to something intermediary called bytecode. And bytecode is what's actually interpreted. And so this is why Python, while potentially slower than C at certain tasks because you're not going to the low level zeros and ones, can actually be used in business applications and popular websites and such. And that didn't really work very well. And so it can be highly performing, as well. But more on that in a little bit. So with that said, if these are the differences not only syntactically but also mechanically, let's go ahead and actually write a program. So let me go ahead and go into the IDE. Let me close our examples from before. And let's start more simply because resize was a mouthful all at once. Let me go ahead and create a file called hello.py. And instead of writing this program in C, let me go ahead and just write hello world. So let's go ahead and do this. Print hello world. Done. That's my first program in Python, and truly my first program in Python, not sort of coming out swinging with resize. So what is not present in this file that was in something like hello.c? There is no main function necessary here. What else is missing? AUDIENCE: Printf. DAVID MALAN: There is no mention of printf. It's instead print, which is a little more human friendly. AUDIENCE: Libraries. 
DAVID MALAN: There is no mention of header files or libraries at the top of the file. I just dived right in and got to it. Yeah? AUDIENCE: No semi-colons. DAVID MALAN: No semi-colons. What else? What else? Yeah? AUDIENCE: No backslash n. DAVID MALAN: No backslash n. I probably-- I haven't run it yet, but I think I will get that for free this time with Python. I don't have to be so explicit. Was there another hand here? AUDIENCE: There's no f in printf. DAVID MALAN: There's no f in printf, yep. Something else? There's no indentation. Though to be fair, there's only one line. But there's no indentation. That's fair. That's fair. There's no curly braces, as well. There's no mention of int. There's no mention of void. I mean, my God. Why didn't we just do this last time? And so this is why languages evolve. People realized years ago, gee, C is serving us well. Once I understand pointers and the syntax, OK, I got it. But my God, it's just so tedious to write even the simplest of programs because I have to do hash includes, standard io.h, int main void, I mean, all of this syntactic overhead that's getting in the way of you just doing the work you care about, which in simplest form here is just printing hello world. So Python and a lot of more modern languages-- among them, Ruby and PHP and others-- just get rid of a lot of that overhead so that you can just get down to work more quickly right away. So how do I go ahead and run this? In C, recall, I would have done dot slash hello.py. But we just said a moment ago that's not the right approach. How do I go and run this program? Yeah, so I run literally a program that is coincidentally called Python itself. That is the interpreter. That's the man in the middle between me and my Spanish-speaking friend that just has to convert hello.py into whatever the computer itself understands. And so there, indeed, we have hello world. And as you notice, there's no backslash n on my code. But I am moving the cursor to the new line. 
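For the curious, the hidden compile-then-interpret step mentioned earlier can be peeked at from within Python itself. The sketch below assumes the standard CPython interpreter and assumes the one-line program reads print("hello, world"); the built-in compile function and the dis module are standard, but the instructions that dis lists vary from one Python version to the next.

```python
import dis

source = 'print("hello, world")'

# The interpreter first compiles the source text into a code object holding bytecode...
code = compile(source, "hello.py", "exec")
bytecode = code.co_code  # the raw bytecode, as a bytes object

# ...and then executes that bytecode. print supplies the trailing newline itself.
exec(code)

# List the bytecode instructions in readable form (output differs across versions).
dis.dis(code)
```

None of this is needed to run hello.py; typing python hello.py performs both steps automatically.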
So Python just decided, you know what? It's so damn common to have new lines, let's just add those by default. You know, the price we're going to pay is it's a little annoying to get rid of them. But we'll see that in a little bit, too. So just a tradeoff. All right, let's do another one. That's just the simplest of possible programs. Let's go ahead and do, say, something a little fancier that allows us to do something more than that. So let's go ahead, say, and compare not just that, but let's actually go get some user input. So for user input, there's a few ways to do this. We'll do it the CS50 way initially, but these are training wheels this week that we'll use for just a week before we take them off, just bridging us from C to Python. Let me go ahead and call this string0.py because I'm dealing with strings. And let me go ahead and do s to give me a variable. get_string. Let me prompt the human for his or her name like this and then let me go ahead and say hello. And so now I just have to consider how to print out their name. And in Python, I can actually just do this. I don't need to do percent s. I don't need to put a second-- or, I do need to put a second comma here. But I can just do this, which is a little simpler. And this is not correct. I'm not practicing what I preached. Get rid of the f. Just print what you want to print, indeed. So s, notice, is apparently a variable because I'm assigning it a value from right to left. But notice that I'm not specifying the type. So Python does have types. str we said is the string equivalent. But you don't have to mention it. Python, like JavaScript, will just figure it out, even without a keyword like let. But I do need to add one thing. What's that? AUDIENCE: You need to import the get_string? DAVID MALAN: Yeah, get_string is a CS50 thing. And we're only going to use it for a week, but I do need to import it. 
And the syntax with which to do this is to say, from the CS50 library, import a function called get string. I don't need to import any more with commas. That one suffices for this program. Yeah. AUDIENCE: Would you want to-- instead of saying hello your name, would you want to first getName that says [INAUDIBLE]? You're not indicating where the error is [INAUDIBLE].. DAVID MALAN: Sure, let me come back to this in one second. Let's run this program first to demonstrate that it indeed does what we saw it do last week. And let me go ahead here and do this time Python of string 0. Let me go ahead and it's just waiting for my name. So I'll type in David. Hello, David. But as you propose, what if you wanted to flip this around? Well, suppose I wanted to say the person's name and then something like hello because I'm just excited to see them, instead. Let's see what this does. Let me go ahead now and run Python of string 0. Type in my name. And it's almost what I think you intended. But there is a bug-- an aesthetic bug, at least. So it seems with Python's print function you don't need to use the placeholder like percent s. But it would seem to presumptuously add a space for you after everything you're passing in as an input to print itself. So notice print is taking how many arguments according to this highlighted portion? How many arguments might you infer? AUDIENCE: S space and then the thing. DAVID MALAN: Two? Yeah, so two. One is s, comma, and then the rest is what's highlighted in green here. Yes, there's a second comma there, but it's inside of the string. So just like in C, that's sort of a red herring. There's only two arguments here. But it seems that the print function-- and you would know this by reading that documentation-- if you pass in two or three or more arguments, it prints all of them. But separates them with a single space. So this isn't quite right. So this is actually a great motivation for cleaning this up. 
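The spacing behavior just described can be reproduced with a minimal sketch. It hard-codes the name rather than calling the CS50 library's get string function (an assumption, so that the snippet runs without that library), and it uses print's sep keyword argument, which sets the separator placed between arguments.

```python
# Hard-coded stand-in for the name the user would type
# (assumption: the CS50 library is not installed here).
s = "David"

# Two arguments: print joins them with a single space by default,
# which is where the unwanted space before the comma comes from.
print(s, ", hello")

# sep="" tells print to put nothing between its arguments.
print(s, ", hello", sep="")

# Alternatively, build one string first and pass a single argument.
greeting = s + ", hello"
print(greeting)
```

Passing one pre-built string is usually the easier habit, since the spacing is then entirely under the programmer's control.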
If I want to actually improve this program and tidy it up a little bit, let me do that in version one here. Let me create another file called, say, string1.py. Let me start where we started a moment ago. And let me actually use a placeholder akin to C. So if I want to do, for instance, hello so-and-so, it turns out you can actually say, hey Python, put a variable called s right here. However, if I run this as is, there's still going to be a bug. It's not quite solved yet. But when I hit Enter now and type in my name-- all right, this is obviously stupid looking. So it seems that I need to tell Python that this string that I'm passing in, hello comma so and so, is a formatted string. It's a placeholder string that it should make some changes to. And this is a little weird, cryptic syntactically in Python. But the way you do this in Python is you put an f before the string itself. So I'm sorry, we got rid of the f a moment ago. So we just called it print. Now we're reusing a different f here. And it's stupid-looking syntax, admittedly. But this just means hey, Python, the following double quotes or single quotes that you're about to see should be formatted by you in a special way. And it literally goes at the beginning of the string even though that does admittedly look weird. But if I now rerun this Python string one and type in my name now, now it does the substitution. So I can flip it around logically much more flexibly now and do something like hello because now I'm passing in one argument that print will format for me. So when I type in my name now, I'm not going to get that superfluous space. And now I have complete control over the formatting of the string. So you know, sort of two steps forward, one step back, perhaps, syntactically. But it does allow us to do what we want this to do. We could write the same program using ints and floats using getInt and getFloat. Would look exactly the same. 
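As a small sketch of the format-string behavior described above, the snippet below compares the same text with and without the f prefix. The name is hard-coded instead of read from the user, which is an assumption made to keep the example self-contained.

```python
# Hard-coded stand-in for user input (assumption).
s = "David"

# Without the f prefix, the braces are ordinary characters.
plain = "hello, {s}"

# With the f prefix, {s} is replaced by the value of the variable s.
formatted = f"hello, {s}"

# Because it is a single string, the order can be flipped freely.
flipped = f"{s}, hello"

print(plain)
print(formatted)
print(flipped)
```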
You don't need to worry about percent s versus percent i versus percent f. You just type in the variable name inside of those curly braces. All right, let me go ahead and do some quick math. Let me go ahead and do this. Let me go ahead and create a new file. We'll call this ints.py for integers. And let me go ahead and get access to-- how about the CS50 library's get int method or function, which exists. Then let me go ahead and declare a variable called x and get an int from the user and just prompt him or her for x. Then let me go ahead and do the same thing and just get y from them, as well. And then down here, let me just do some simple math. And we did this way back in week one by printing as follows. Let me go ahead and just print out x plus y equals-- and this is what's cool now about this curly brace feature. You can actually do not just variable names, but you can do simple operations in there, too. I can literally do math inside of those curly braces and print out that value. But of course, this alone is just going to literally print the curly braces. What do I have to add? Yeah, so it looks a little weird. But this now will solve that problem. It will print literally x plus y equals whatever the actual sum is. AUDIENCE: Just following up, what does f mean? DAVID MALAN: Format. Format the following string for me. Good question. Let's do just a few copy/pastes but change the operator here. So x minus y, I want to see what this looks like. X, say-- what did we do last time? Multiplying by y. I want to do that math, too. I can divide as well. And then we had one more, which was modulo, or modular arithmetic, which, recall, was the percent sign. So syntactically, it's identical to C. We're just adding this curly brace notation just for the print function right now. Let me go ahead and run this. Python of ints.py. And let me go ahead and do one and say two. So 1 plus 2 is 3. 1 minus 2 is negative 1. 1 times 2 is 2. 1 divided by 2 is 0.5.
And 1 divided by 2, taking the remainder, is 1. So I think this checks out mathematically. But you should be a little surprised by one of these outcomes. Say again? AUDIENCE: You're getting a float. DAVID MALAN: Yeah, I'm getting a float. Like, Python itself seems to have fixed a bug in C itself. What happened in C when you divided 1, an integer, by 2, an integer? You would get another integer. And what's the closest integer you can represent that doesn't have a decimal point? 0, because C would truncate everything after the decimal point. And yet, Python seems to have fixed this problem. And this is actually a somewhat recent phenomenon. And this is a huge religious debate as to whether or not you should just keep the historical definition of division, which is floor division, so to speak, or we should make it truly division, like we all grew up learning in school. Python took the latter approach and made division mean division, true division, where if you divide two ints you get back a float. Of course, this is a problem if people want to write code that assumes that it's going to be truncated. That can actually be a powerful feature. So it turns out, and you won't have terribly many occasions to use this, but the compromise in the world was, all right, if you really want the old behavior of the division in Python, we will give it back to you. You have to use two slashes. So again, another one of these two steps forward, one step back. But it's there, so problems can still be solved in the same way. And this, if I save it and rerun that same code, 1 and 2, now I get back 0, just as I would in C, which does have some applicability. Let's do one other example now involving some numbers. And let me go ahead and call this floats.py. And let me do the same thing, from CS50 import getFloat this time. So I can deal with floating point values. Let me declare a variable x and get a float and we'll ask the user for a variable x.
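The difference between the two division operators can be sketched as follows, using the same inputs of 1 and 2 from the demonstration. The values are hard-coded rather than read from the user, and the variable names are illustrative only.

```python
x = 1
y = 2

# A single slash is true division: two ints in, a float out.
true_quotient = x / y

# A double slash is floor division: the result is rounded down.
floor_quotient = x // y

# The percent sign is the remainder, as in C.
remainder = x % y

print(f"{x} / {y} = {true_quotient}")
print(f"{x} // {y} = {floor_quotient}")
print(f"{x} % {y} = {remainder}")

# One caveat: // rounds toward negative infinity, whereas C's integer
# division truncates toward zero, so the two differ for negative operands.
negative_floor = -1 // 2
print(f"-1 // 2 = {negative_floor}")
```

For non-negative operands the double slash matches C's behavior; the last line shows the one case where a direct translation from C would give a different answer.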
Then let's go ahead and get another float, and just as before, call it y. But this time both of them are, indeed, floats. Then let me go ahead and do some math, x plus y equals z. Let's give myself a third variable. And then let me just go ahead and print out a similar message-- x divided by y equals z. All right, and let me go ahead and save this, clear my terminal, and do Python of floats.py. 1 divided by 10 this time. And I get-- dammit, bug. How do I fix this? All right, so just a simple f. Make it a format string. No big deal. So let's rerun this, 1, 10. OK, hoo, hoo. That's a new one. What is going on there? AUDIENCE: [INAUDIBLE] DAVID MALAN: I did define z in the line above it, and what was your comment? AUDIENCE: You used x plus y. DAVID MALAN: I did use x plus y, but I think I-- oh, wait, OK. I'm sorry. Let's-- OK, so we can fix that. Let's-- sorry. There. OK, so 110. Hmm, still wrong. Good catch, thank you, though. Why is 1 plus 2 11-- or 1 plus 10, 11? Yeah? AUDIENCE: [INAUDIBLE]. DAVID MALAN: Wait, wait, wait. Sorry. AUDIENCE: [INAUDIBLE] [LAUGHTER] DAVID MALAN: This brings me back to my earlier point as to how tired I am. So this is correct. So Python does math correctly. But-- OK, horrifying. All right, so now let's do division and try to make the point I think I meant to make late last night where I if I do 1 divided by 10, OK, 1 divided by 10, as expected, does actually work here. So 0.1, that's correct. But remember in C-- let me dig myself out of this hole-- remember in C what happened if we dug a little deeper and we looked a little past the first decimal point. So how do I do this in Python? It's actually pretty similar. Let me go ahead and not just show myself z but go ahead and print out to, let's say, two decimal places that same value. The syntax here is weird. It's different from C. 
But you literally take the variable that you want to format, you put a colon and then a dot-- because you want to adjust the dot-- and then you want to say something like 2f. So this is saying, hey, Python, format the variable that's to the left of the colon using two decimal places. And by the way, it's a floating point value. So this f has a different meaning. This is f as in float. The f to the left is as in format. So let me go ahead and run this. 1 divided by 10. And OK, still looking pretty good. Let's do maybe three decimal places, save that, rerun it. 1 divided by 10. Still pretty good. Let's get a little ambitious. Let's do it 50 decimal places out, 1 divided by 10, and damn it. Python has not fixed this fundamental problem. So we describe this problem as what? What's the sort of buzzword here to sort of explain or forgive this issue? AUDIENCE: [INAUDIBLE] DAVID MALAN: This isn't integer overflow, though it's related in spirit. Integer overflow literally happens when you're doing lots of addition and something's rolling over from a big value to a small or even a negative. Similar in spirit. Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah. If you want to have an infinite amount of precision all the way out, you need an infinite amount of memory. And no Mac or PC or phone has an infinite amount of memory. At some point, a line is drawn in the sand and you can only be so precise. And so imprecision was the analog in the floating point world to overflow, recall, where if you only have a finite number of bits you can do really well up to a point. But eventually, the computer's got to estimate that value for you because you can't represent an infinite number of values. So this is to say Python is just as limited, fundamentally, as some other languages like C. So we've not gotten rid of all of those problems. But frankly, in the world of data science and analytics, precise mathematics is certainly important. So there are solutions to this problem.
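The format specifier and the imprecision it exposes can be sketched like this. The lecture does not name a particular library for exact arithmetic; the decimal and fractions modules from Python's standard library are used here as one possible choice, not necessarily the one the speaker had in mind.

```python
from decimal import Decimal
from fractions import Fraction

# Ordinary floating point division.
z = 1 / 10

# ":.2f" means: format as a float with two digits after the dot.
two_places = f"{z:.2f}"

# Asking for 50 digits reveals that z is only an approximation of 0.1.
fifty_places = f"{z:.50f}"

print(two_places)
print(fifty_places)

# Decimal performs base-10 arithmetic, so one tenth is stored exactly.
exact_decimal = Decimal(1) / Decimal(10)

# Fraction keeps an exact numerator and denominator.
exact_fraction = Fraction(1, 10)

print(exact_decimal)
print(exact_fraction)
```

Both alternatives trade speed and memory for exactness, which is the same tradeoff the lecture describes.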
But it requires special libraries, typically, importing something that allows you to use as much memory as you want, more than just the default amount of memory. So that problem there still exists. Let me go ahead and open up one other example here. And in fact, in C, you'll recall that we had this example here. In C we had a program called overflow.c. And notice that this code in C from a few weeks back just multiplied i by 2, by 2, by 2. So it was doing exponentiation, so to speak-- 1 to 2 to 4 to 8, 16, 32, 64, and so forth. What happened if we waited long enough and watched this program a few weeks back? AUDIENCE: You go to 5 billion instead of-- DAVID MALAN: Yeah, we hit roughly 5 billion or 4 billion-- or rather, we technically hit, I think, 2 billion, and then it rolled over. And it actually created a problem. So let me actually do this. Let me go ahead and make overflow so we can demonstrate the points that you made earlier about integer overflow, which is, indeed, this one. Let me go ahead now and run overflow. I'll expand my window just so we can fit a little more on the screen. And as this runs-- whoops, let me fix this. Here we go. Let me go ahead and make overflow. And now 1, 2, 4, 8, 16, 32, and so forth. It's a little slow to start, but doubling and doubling is going to get us up to a big value pretty quickly. This is indeed going to overflow once we hit roughly 2 billion. Why? Why two billion, give or take? Why that value in C? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, that's how much an integer can store because, recall, in C an int is typically 32 bits or 4 bytes. And with 32 bits, you can represent four billion possible values. And if half of those values are positive and half of them are negative, it stands to reason that the highest you can count is roughly 2 billion. And indeed, once we try to count up just doubling one billion, we overflow. So to your point earlier, overflow is still an issue, but in the context of integers.
But now let's try a Python version of this. Let me go ahead now and open up overflow.py, which is a program I wrote in advance. It's on the course's website, as always, if you want to take a look more closely. And if I go into this file in weeks one, overflow.py, we see this code. So it's almost the same. But notice I'm using another library that we've not seen before, from time import sleep. It's kind of cute. So this allows me to sleep for a second. That's going to get tedious quickly, but that's OK. Let's do this real fast. If I go into the source six directory, weeks one, and run Python of overflow.py, it's the same function-- or same program, functionally. But honestly, this is getting a little tedious. Let's go ahead and not sleep for a second every time, save and reload. Let's just run the thing. Whew, look at it go. Only up there. Look up there. What's it doing differently? It's counting a lot higher than 2 billion. So what might you infer about integers in Python? AUDIENCE: [INAUDIBLE] DAVID MALAN: Say again? AUDIENCE: An integer is defined to be quite a number of bits. DAVID MALAN: OK, an integer is defined to be quite a number of bits. And indeed, that's the case. Python is not actually this slow. It's because we're running a web based IDE and the internet itself is a little slow. And so what's happening here is just the internet is getting in the way. But suffice it to say that Python is counting up way, way higher than C was. And that's the power you get by just using larger data types. We could have done this in C. We could have used longs, for instance. But notice that with Python you just get more by default out of the box. Let's go ahead and take a five minute break here. And when we resume, we'll introduce some more syntax and solve some more problems. 
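A rough sketch of the doubling experiment follows, without the sleep call so that it finishes immediately. It is not the course's overflow.py; the loop count of 40 and the constant name INT_MAX_32 are illustrative choices. In Python 3 an int is not a fixed-width type at all: it grows as needed and is limited only by available memory.

```python
# Largest value a signed 32-bit int can hold, for comparison with C.
INT_MAX_32 = 2**31 - 1

# Double 40 times, well past the point where a 32-bit int would overflow.
i = 1
for _ in range(40):
    i *= 2

print(i)
print(i > INT_MAX_32)
```

After 40 doublings the value is 2 to the 40th power, roughly a trillion, and further doubling would simply keep producing larger integers.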
All right, so let's take a look at a few other examples that are comparable to what we did back in week one and look at a few from week two and three and really take a look not just at the syntax, ultimately, but some of the features of Python. And of course, we need the ability to express ourselves conditionally or logically with control flow. And so let me propose a quick program here that we'll just call conditions.py, reminiscent of conditions.c some time ago. Let me go ahead and from CS50 import getInt this time and get myself another x with getInt x from the user. Then let me go ahead and ask them for getInt y from the user. And then let me go ahead and just compare them. And so per our comparison with Scratch a bit ago, I can simply say if x is less than y, then go ahead and print out, for instance, print x is less than y, just as we did weeks ago. Elif x is greater than y, we can go ahead and print out x is greater than y. And then we can still have a third condition, else, just like in C, where we print out, for instance, the logical conclusion. x is equal to y. So just to point out some of the differences, indentation is ever so important now. And it's got to be consistent. You can't have four spaces and three. You've got to have, for instance, four all the way. Notice that I've got the colons consistently there. But notice that I don't need the parentheses, either, anymore. And with Python, there's sort of a buzzword, Pythonic. There is a Pythonic way of doing things. You can have parentheses around x, less than y, or x greater than y, just like in C. But it doesn't add anything logically, arguably. And if it doesn't make your code more readable, don't clutter your code with additional characters. And so that's a general rule of thumb now. Python is much more trim when it comes to syntax, only introducing it when it really solves a problem, which in this case, it doesn't really. Yeah?
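The three-way comparison just described might look like the sketch below. It wraps the logic in a hypothetical helper named compare and passes the numbers in directly, instead of prompting with the CS50 library as the lecture does.

```python
def compare(x, y):
    """Return a message describing how x relates to y (hypothetical helper)."""
    # No parentheses around the conditions; the colon and the
    # indentation mark where each branch begins and ends.
    if x < y:
        return "x is less than y"
    elif x > y:
        return "x is greater than y"
    else:
        return "x is equal to y"


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```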
AUDIENCE: Quick question, the lines [INAUDIBLE], those are grouped right together, one to the next, one to the next, and one to the next. If you were to put an additional line between them, would that break the code? DAVID MALAN: No, not at all. I can have as much whitespace vertically as I want. If I want to add some comments, indeed, I can do that. And why don't we do that, in fact, because the commenting syntax for Python is a little different. In C, we were in the habit of doing slash slash. In Python, it's actually a little more succinct. You can just use a single hash. And you can say get x from user here. I can say get y from user here. And then I can say something like compare x and y. And if I really wanted to, I could put comments in here. That is perfectly fine. But I'll just keep it more compact with this particular example. So any questions on the conditional syntax or what we've just done here? All right, let me whip up another example, this time doing some comparisons. This time, let me create a file called answer.py, which is reminiscent of a quick example we did weeks ago called answer.c. Let me go ahead and from CS50 import getString. And this time, let me go ahead and declare a variable, c. And let me go ahead and get a string from the user-- whoops-- get a string from the user for their answer to whatever question it is we care about. And then if it's meant to be a yes/no answer, let's check for that. If c equals equals capital Y or c equals equals little y, then go ahead and say, just for the sake of demonstration, yes, because the human presumably meant that. Elif c equals equals capital N or c equals equals little n, then go ahead and print out, for instance, no. So a short program, but what are some of the takeaways? Well, what's different clearly among these lines, 5 through 8, versus C, weeks ago? Yeah. AUDIENCE: For or you have to do-- DAVID MALAN: Yeah, none of those stupid vertical bars or the ampersand ampersand.
If you want to or or and something together, just say and and or, much like Scratch, actually, some weeks ago. Notice, too-- how are we comparing strings? Turns out Python does not have chars, per se. C did have chars, single characters. Python only has strings. It has strings, ints, floats, and then some fancier things, but it doesn't have chars. So that's why I am deliberately using string. But when we use strings in C, how did we compare two strings? strcmp, right, because of the whole annoying pointer comparison thing. Well, it turns out now in Python if you want to compare two strings character by character by character, equal equals is back. And it does exactly what you expect it to do, even if it's a full word. So if you're actually checking for, for instance, Yes or yes from the human, you can still use equal equals, as well, even though it's now more than one character. So that's a wonderful feature, too. And it just makes the code more readable and a lot easier to write right out of the gate. All right, so now recall that in C we spent a little while, as well as in Scratch, taking a look at a few examples about coughing, of all things. And in fact, in Python and C-- rather, in Scratch and in C-- we did a zero example that looked a little like this. If you want to simulate the notion of Scratch the cat coughing, you might, of course, do this. And then if he's going to cough three times, you might do this. And we ran this and it just did cough, cough, cough on the screen. I won't bother running it because it will just do that. But this was bad design we claimed weeks ago. What was the gist of why this is bad design? I mean, I literally copied and pasted. And the odds are if you're ever doing that in CS50 or in programming more generally, you're probably being a little lazy and there's a better way to do it. And it's a more maintainable way to do it.
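The yes/no check with the word or and with equal equals on strings can be sketched as below. The helper name interpret is hypothetical, the input is passed in as an argument rather than read with the CS50 library, and returning None for anything else is an assumption (the lecture's program simply prints nothing in that case).

```python
def interpret(c):
    """Map a one-letter answer to "yes" or "no" (hypothetical helper)."""
    # The word "or" replaces C's two vertical bars.
    if c == "Y" or c == "y":
        return "yes"
    elif c == "N" or c == "n":
        return "no"
    # Anything else is unrecognized (assumption: signal that with None).
    return None


print(interpret("Y"))
print(interpret("n"))

# == compares string contents, so whole words work too; no strcmp needed.
word = "yes"
print(word == "yes")
```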
So of course, we introduced weeks ago, both in Scratch and in C, the ability to in cough one, this time, do a loop. And I can do a loop slightly differently in Python and in C. But for i in the range of 3, go ahead and print out cough. So the syntax for the for loop is a little different. But it's pretty straightforward, nonetheless, once you remember that you use for, variable name, then the preposition in, and then the word range with a parenthesis and its-- parentheses and the value you want to care about. But then we saw an opportunity, recall, to actually abstract coughing away. Coughing, at least in our textual form, is just the act of printing something. So we introduced in version two some time ago, the following approach in cough two. I instead defined a function called cough that did the coughing for me. And we've not seen this yet in Python. So how do you define a function in Python called cough? Put another way, how do you make your own custom puzzle piece, just as we did in Scratch? Well, you define it with def. And then you have it do exactly what you want it to do by just indenting the lines of code that belong to that function. So there's no return value. There's no need for an input at the moment. But we do have the colon. And we have the indentation. No curly braces, nothing else. How do I now use this function? Well, here's where we have a few options stylistically in the program. The simplest way to call this function would be quite simply like this. Go ahead and for i in range 3, go ahead now and cough. And this should look a little weird. It looks, indeed, a little sloppy. But let's see if it works. So if I go ahead and run Python of coughtwo.py, it seems to cough, cough, cough. But I say this is a little weird because what am I doing that's very different now from C? There's no what? There's no main function. I just have some code right here on the left of the screen. And yet, I do have a function here. And in Python, this is OK. 
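A minimal sketch of the version just described, with the function defined first and the loop written at the top level of the file with no main function:

```python
def cough():
    """Print a single cough; no inputs and no return value."""
    print("cough")


# range(3) yields 0, 1, 2, so the indented body runs three times.
for i in range(3):
    cough()
```

The loop variable i is never used inside the body here; it exists only because the for syntax requires a name.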
Because you're using an interpreter and reading the file top to bottom, left to right, you don't strictly need a function called main. It's just going to interpret all of your code. And when it's seen the definition of a function, OK. It's going to say, OK, got it. I now know what the verb cough means. I will do this anytime I see it down here. But we're going to run into a problem. And if, indeed, I did what my first instinct was, which was to put the logic, the main part of my program at the top and to define cough down here, let's see what happens. Let me zoom out. Let me go ahead and rerun cough2.py. And now we start to see the first of our error messages. And they're going to look just as cryptic at first glance as clang's and make's were. Rest assured that help50 can help with Python error messages, as well. But let's just try to parse what I do understand. cough2.py, line two in module whatever that is, name error. Name cough is not defined. So what's your gut here? What is that really-- what's the explanation for that error? Because cough is clearly defined-- literally with the define def verb-- right there on line four now. What-- AUDIENCE: You're calling cough before it's defined. DAVID MALAN: Yeah, I'm trying to call it before it's defined. Python is trying to take me very literally. And it's going to do top to bottom, left to right. And if it doesn't see until the bottom something it's supposed to be doing at the top, it's just not going to work. So there is a solution to this and it starts to get a little ugly. But it's a more generalized solution. It turns out that even though main is not required in a Python program, many programmers just create one nonetheless to address this particular problem. And they specifically do something like this-- def main-- and then below it they indent everything there. And then you need one specific feature to solve this problem now.
I've now defined main and I've defined cough, which theoretically solves this problem just as it did in C. There is no notion of a prototype in Python. The solution is not to copy/paste the name of the function up above. But when I do this now, literally nothing happens. But I did get rid of the error. So just reason through this, perhaps. Especially if you've never programmed Python before, why might nothing now be happening? AUDIENCE: Not calling main? DAVID MALAN: I'm not calling main, yeah. So whereas in C-- and frankly, in Java, C++, and a few other languages-- main is special. It just gets called by default. In Python, main is not special. I've chosen this name main just because so many other languages use it, but it has no special significance. If you want to call main, you have to do it yourself. And so this is a little weird, admittedly. But you can literally do this down here because your code will be executed top to bottom, left to right. By the time line 10 is reached, both main has been defined and cough has been defined, which means you're good to go. So if I now go down here and run Python of cough2, now it actually works. Now, as an aside, this is not Pythonic, if you will. Most people would actually do this: if the name equals equals main, then do this. This is for lower level reasons that let me wave my hand at for today. But long story short, the addition of this cryptic-looking line solves other problems that we're just not going to trip over this week and probably next. So this is the common way to do it. But if you just ignore that, the effect of this cryptic-looking code is just to call main yourself at the very bottom of your file. So when we start writing more interesting programs, this is just going to become conventional. If you want to start writing functions and so forth, odds are you'll benefit by writing a main function and putting more code in there.
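The convention described above can be sketched as follows. The cryptic-looking line compares the special variable `__name__` with the string `"__main__"`, which holds when the file is run directly rather than imported by another file.

```python
def main():
    # cough is defined further down the file, which is fine:
    # this body does not run until main is actually called.
    for i in range(3):
        cough()


def cough():
    print("cough")


# By this point both functions have been defined, so calling main works.
# The condition is true when the file is run as a program.
if __name__ == "__main__":
    main()
```

Without the final two lines nothing would be printed, because defining a function does not call it.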
So let's do one final example with cough that actually now parameterizes the code, just as we did weeks ago in Scratch and C. This will be cough3.py. Let me start as I did just a little bit ago. But suppose I want to achieve this effect. I want the computer to cough three times by passing in an input. I now do need to modify cough to take an input. And in C, I would have said something like int n. But you don't have to specify data types in Python, you just have to specify the parameter name or the argument name. So that's nice and simple. And now down in here, in cough is where I should probably say for i in the range of 3, do this. But this isn't quite right. What fix do I want to make here? Yeah. Now I can just pass in n. So range is just a function that takes an argument that I've been hard coding as three just because. But you can generalize it with n, as well. So now again, per our discussion of abstraction weeks and weeks ago, do we have a sort of beautiful version of coughing, even though it's looking way more cryptic. But by step by step by step did we get to the point of having a main function that takes an abstraction, cough. Do it this many times. Now the implementation details are hidden in this custom puzzle piece, if you will. And the two lines at the bottom just kick off the whole execution of the program. But that's the only stuff that's really Python-specific now. Yeah? AUDIENCE: Can we use the cough function on line 11 [INAUDIBLE]?? DAVID MALAN: Could use the cough function on line 11? Yes. You could absolutely just do this, for instance, and get rid of main again. It's just a convention. Once you start writing more sophisticated programs with functions, you should probably introduce main just to keep it tidy. AUDIENCE: With the [INAUDIBLE]. DAVID MALAN: You could do that. Then you're starting to be non-Pythonic. Like, yes, you could do cough3 but people would look askew at you because it's just not done that way. That's what Pythonic means. 
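A sketch of the parameterized version: cough now takes the number of repetitions as an input named n, and main says only how many coughs it wants.

```python
def main():
    # The caller states how many times; the details live in cough.
    cough(3)


def cough(n):
    # No data type is declared for n; only the parameter name is given.
    for i in range(n):
        print("cough")


if __name__ == "__main__":
    main()
```

Changing the 3 in main is now the only edit needed to cough a different number of times.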
Yeah, other questions? AUDIENCE: You need to have the [INAUDIBLE] come after the for i in range n so that it knows what the cough is? DAVID MALAN: Not in this case. So the order now is OK because first Python is seeing here's the definition of main. OK, I got it. And then it's saying, here is the definition of cough, OK, I got it. But it's not actually calling those functions yet. The Python errors are thrown only at what's called runtime, the running of the program's time, which means only when main is called does Python actually execute line 4 and then see, ooh, I need to call a function called cough. But that's OK because it saw it earlier when it first read the file top to bottom. So it matters when the functions are called, not where they appear, per se, in the file, the order in which they're called. Other questions? All right, yes? AUDIENCE: I don't know where you [INAUDIBLE] from. How do you define n as an integer? DAVID MALAN: How did I define n as an integer? This is what's nice about Python. If you want a variable or a parameter, just start using it without mentioning its data type. So the fact that I put n in parentheses in this function means, hey, Python, let this function take an input called n. And it can actually be any data type-- int, float, string, or even something else. It's up to me to use it responsibly as a number and to call it responsibly with a number. Good question. Yeah? AUDIENCE: So it's possible for a variable to change type? DAVID MALAN: It is, indeed, possible for a variable to change type, a good observation. So yes, Python is not a strongly typed language, so to speak. C is strongly typed in that if you make something an int, it is staying an int forever. Python is loosely typed, whereby x can be an int initially. But if you really want to turn it into a string, you can. But the convention there would be, yes, you can do that, but don't do that.
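Both points, that a name can be rebound to a value of another type and that a parameter accepts whatever it is given, can be sketched like this. The behavior is often called dynamic typing, though terminology in this area is used loosely. The variable names are illustrative.

```python
# The same name can refer to an int and later to a str.
x = 1
first_type = type(x).__name__
x = "one"
second_type = type(x).__name__
print(first_type, second_type)


def cough(n):
    # Nothing here says n must be an int.
    for i in range(n):
        print("cough")


# Calling it irresponsibly fails only when the line actually runs:
# range rejects a string with a TypeError.
try:
    cough("three")
    error = None
except TypeError as e:
    error = e

print(type(error).__name__)
```

The mistake surfaces at runtime rather than before the program starts, which is the kind of bug the lecture says C would catch earlier.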
So Python has the, frankly, the sort of arrogance of being sort of an adult language. Yes, you could do that, but just don't. Why do we have to protect you from yourselves? And so in that sense, you need to be a little more responsible about it. But again, there are arguments both ways. That induces potential bugs that C would catch for you. And this is where humans start to disagree about the upsides and downsides of languages, whether a language should be strongly or loosely or not even typed at all. A good observation. So let's look at a paradigm that was super common in C when we wanted to do something again and again to see how it actually is a little differently done in Python now. Let me go ahead and create a file called positive.py and go ahead and write a program a little quickly here. So from CS50, let me go ahead and import getInt, so we can get integers from the user. Let me go ahead and define a main function that simply does i, which will be my variable, gets a positive int, and asks the user, just as we did weeks ago, if you'll recall, for a positive integer. And then just goes ahead and very boringly prints it out. So that's all this program does. And let me go ahead and just from recollection-- though it's totally fine to copy/paste this cryptic-looking string, we would just be remiss in not showing you how most people do this. So if I do this, this is a complete program, except for the fact that what does not exist yet? Get positive int probably does not exist, just as it didn't in week one, because we have to invent it ourselves. Get int exists, but get positive int does not. And just for demonstration's sake, let's try this. Python of positive.py, notice we have name error get positive int not defined. OK, so we can fix that. We can literally define, or def, it. So get positive int. It's going to take a prompt from the user, just as it did weeks ago, the string that you want to show to him or her. 
And now let me go ahead and get a positive integer. What type of programming construct did we use in C to do something again and again and again? AUDIENCE: Loop. DAVID MALAN: A loop, for sure, but more specifically, to do something at least once and then maybe again and again and again if they don't cooperate? AUDIENCE: While. DAVID MALAN: Do while. No do while in Python. So that handy feature for user input does not exist. So that's fine. We need to solve this just differently. And honestly, in C, you could have solved that problem differently. You don't need do while. We could have taken it away from you. C could take it away. You could still solve every problem that we have in the past weeks using a for loop or a while loop. Do while just is a nice handy feature. But we can simulate it. And the Pythonic way of doing this is as follows. Deliberately induce an infinite loop, because you do want to loop potentially. But the logic is going to be, give me an infinite loop and I will break out of it when I'm ready to break out of it. This would be the convention. So while the following is true do this. Go ahead and declare a variable called n. Get an int from the user and pass in that same prompt. So get int, we wrote-- the staff-- prompt is whatever I typed in up here. So just copy/paste from the C version. And then under what circumstances do I want to break out of this infinite loop if the function is to be called to get positive int? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, so if n is greater than 0, then I do have the keyword break still, just as I did in C. I can break out of this loop. And then once I do that, I can go ahead and just return n. Or for that matter, I could condense this a little bit. I could just return n immediately and tighten it just a little bit. So multiple ways to do this. Otherwise it's just going to loop and loop forever. 
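The loop just described might be sketched as follows. This is an approximation under stated assumptions: it uses the built-in input and int instead of the CS50 library's get_int, and the read parameter is a hypothetical addition that exists only so the function can be tried without a keyboard.

```python
def get_positive_int(prompt, read=input):
    # Deliberately loop forever; the return below is the only way out.
    while True:
        try:
            n = int(read(prompt))
        except ValueError:
            # Not a number at all (e.g. "cat"): ask again.
            continue
        if n > 0:
            # The condensed form from lecture: return directly
            # instead of break followed by return n.
            return n
```

Calling get_positive_int("Positive integer: ") keeps prompting after inputs like -1 or 0 and returns only once the user types a number greater than zero.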
So let me go ahead now and run positive.py through Python, positive integer like negative 1, maybe negative 2, 0, OK, 1. And now it, indeed, co-operates. So this is just a common paradigm. This is the kind of thing when learning a new language that honestly tends to hang people up initially. You need to learn the JavaScript way of doing things. You need to learn the Python way of doing things. But then you start to notice these so-called design patterns. Anytime in Python you want to do something again and again, yes, you want to loop. But if you want to do something definitely once and maybe again? You still just use a loop, but you deliberately induce, typically, an infinite loop, and just break out of it when you're ready. So a very common approach. So not everything translates literally from C back and forth. Any questions then on that? Yeah, in the back? AUDIENCE: Is that something you just did with the while for loop, is that [INAUDIBLE] initializing a variable called [INAUDIBLE] to a negative number and then do while n is less than 0-- DAVID MALAN: Really good question. Is this approach preferable to instead declaring, maybe in here, a variable that is equal to some known value, like zero or whatnot, and then updating it? Short answer, yes, because your approach, while correct, is not as well-designed, arguably because it's just not necessary. And the Pythonic way, and really the well-designed way to do most things would be use as few lines as you can so long as it's still readable and understandable, which I would argue this is once you're comfortable with the syntax. But this does bring up an interesting point about one other topic in C. Scope has now gone out the window, at least as we previously saw it. Scope referred to where a variable lives. And we defined it essentially casually between two curly braces, the most recently opened curly braces. Well, no curly braces anymore so it turns out that variables by default have function scope here. 
So when you declare n on line 9, you can use it in Python on line 10. And you know what? You can even use it on line 12, even though it was declared inside of this loop higher up. So once you declare a variable on this line, you can use it anywhere on a subsequent line within that same function. So in some sense, it's a little sloppy that you're allowed to do this. But on the other hand, it's very convenient because you don't have to deal with those things like declaring the variable up here just to use it down here. So it's one less thing to think about. All right, let's take a look at just a few examples from week two wherein we introduced arrays and strings more generally to see what has changed now, as well. You'll recall that in week two, perhaps, we had an example about capitalization. And let me go ahead and look at the third version of that, capitalize2, but convert it to Python. The purpose in life was to take input from the user and just capitalize every character therein. So if I type in my name in all lowercase, it should come back as all uppercase. So from the CS50 library, let me go ahead and import getString so that I have some input from the user. Then let me go ahead and just get a string from the user, like their name. And then I want to go ahead and capitalize everything. So let me go ahead and do this. And this is a fancy feature. In C I would have done a for int i is zero i less than strlen. I mean, you perhaps remember the paradigm for iterating over a string. Python is just so much more pleasant. For c in s-- that will induce a loop over the string s, giving you access to every character at a time, calling that variable c. And so what is it I want to do, just as a preliminary step, a baby step, if you will, let's just print out c, just to see what happens. Let me go ahead down here and do Python of capitalize2. Let me go ahead and type in my name, all lowercase.
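To make the point about scope concrete, here is a tiny sketch. The function name scope_demo and the value 42 are invented for illustration; the shape mirrors the loop in positive.py.

```python
def scope_demo():
    while True:
        n = 42  # n is first assigned inside the loop body
        break
    # There are no curly braces to end n's life: it is still visible
    # here, because it belongs to the enclosing function.
    return n


print(scope_demo())
```

The equivalent C, with n declared inside the loop's braces, would not compile once n was used after the loop.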
All right, and why is it showing up vertically like that, one character per line? Yeah, you get the free line-- free new line this time. So let's see how you can disable that. It's stupid looking, honestly. But you say end equals quote unquote, thereby revealing a new feature of Python that C does not have. It turns out that Python has not only positional arguments, as they're called, whereby you just pass in arguments between commas. That's what we've been doing in C. But Python also has named arguments, whereby you can specify the name of the argument, then an equals sign, then the value. And the power of named arguments, even though this is a tiny example, means that you can sometimes pass in your arguments in any order. You don't have to remember. You don't have to pull up the CS50 manual or the man pages to remember what is the order of all these darn arguments. You can pass them in in any order, but by specifying the name of the argument, an equals sign, and its value. And in Python, too, you can have optional arguments. Obviously, in all of the examples thus far, I have never typed the word end and an equals sign yet. But what Python does support is default values for arguments. And so if you look in the documentation for Python, this is equivalent-- this cryptic looking sequence-- this is equivalent to the default behavior, which is to type none of that at all. End implies, for the print function, that you should end every line with that default character. Therefore, if you want to override it, you can just change it to the empty string, quote unquote. So if I now run this again and run it through with my name, now I get it like that, one character at a time. But you can do weird things, like ha ha ha ha ha-- not that you would. I don't know why I went with that. But I mean, that does the exact same thing because you're just changing the line ending. So don't do that, but do something else like this with it, instead.
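A short sketch of named arguments and default values, using print's real end parameter. The helper print_chars is hypothetical, written here only to show a default value being declared and then overridden by name.

```python
def print_chars(s, end="\n"):
    # end has a default value, so callers may omit it entirely.
    for c in s:
        # Pass our own end through to print's named argument of the same name.
        print(c, end=end)


print_chars("david")          # one character per line (the default)
print_chars("david", end="")  # all characters on a single line
print()                       # finish that line with a newline
```

Because the argument is passed by name, the caller never has to remember where in the argument list it belongs.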
So suppose I want to now capitalize the first character. It turns out that strings in Python are more powerful than strings in C. In C, there is no string. That was a lie. It's just a sequence of characters as referenced by an address in memory. In Python, a string is an actual object. It's a data structure. And if you think about C, we had structs toward the very end of our look at C, nodes and structs and student structures and the like. A string in Python is like this container inside of which somewhere are all of those characters. But in that container or structure is also built-in functions, features of a string that you can just call. So in C, we would have said something like toUpper and then passed as input to a function called toUpper the character that we care about. Python kind of flips the logic around. Strings come with built-in functionality that allow you to operate on the given character automatically. So in Python, the syntax is actually the character itself. Use the dot notation because it's a structure. And then you can literally do-- oops. You can literally do upper. So this is to say, built into the string type in Python is a bunch of features, one of which is a function called upper. And the syntax with which you call it is the name of the variable or the name of the string dot name of the function open paren, close paren. And that's just now the paradigm. There's no C type library. There's no to upper or to lower. Those features now built into the strings themselves. And this is an example of encapsulation, or more generally, object oriented programming, something you'll explore if you take a class like CS51 that bakes into the data types itself all of the relevant functionality. It does not relegate them to another library. So if I clean this up by just moving the cursor to the next line, now hopefully you'll indeed see David typed out in all caps, the same idea as before. What about this length of a string? 
This one is pretty trivial, but if I go in here, let me go ahead and create a file called strlen.py. If I want to see the length of a string, from CS50 import getString, just as we did before. Let me go ahead and get a string for myself, like my name again. And then here, if I want to print the length of the string, in Python-- in C, you would say strlen. In Python, it's a little different. You actually just say len for length. So if I go ahead and run this through strlen-- strlen-- type in my name. Hopefully I, indeed, see five. And there's no notion that you need to care about the backslash zero in order to terminate the string. Yeah? AUDIENCE: So this upper [INAUDIBLE] DAVID MALAN: No, in fact. So that's a really good observation. Let's rewind and actually improve upon this rather than just translate it from what was our comparable example in C. Let me go ahead here and actually say, you know what? S gets s upper. And then let me just print s, perhaps. Let's see what happens. Let me go back here and run Python of capitalize 2. Enter David. And it operates on the whole string. Good intuition. And honestly, I don't need to do this. I could just say upper here and really trim this down and do Python of capitalize, type in my name. That still works. And if I really want to be fancy, I don't even need s at all. I can take this, get rid of that, put this here, immediately call upper on the user's input and whittle this down to one line, type in David, and that, too, works. So you just get lots and lots and lots of more expressiveness. Good question. So how do you even know that things like this exist? Well, quick aside. Google will truly be your friend in cases like this. And you'll want to know at this point, there's different versions of Python. The world is kind of holding out and is still using, a lot of people, version 2 of Python, which is older by many years now. We are using version 3. And this is where the world is going.
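The progression above, from uppercasing one character at a time to calling upper on the whole string, can be sketched like this. The name is hardcoded here as an assumption, in place of the input the lecture reads with the CS50 library.

```python
s = "david"  # stand-in for the user's input

# First version: uppercase each character in turn.
for c in s:
    print(c.upper(), end="")
print()

# Improved version: upper works on the whole string at once and
# returns a new string, leaving s itself unchanged.
print(s.upper())

# len replaces C's strlen; there is no terminating \0 to account for.
print(len(s))
```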
And indeed, Python 2 will be officially deprecated or phased out in a couple of years, theoretically. So when you Google, you just want to be mindful of this so that you don't accidentally make your way to old tutorials, old documentation and the like. So let me go ahead and Google Python 3 string, or str, and upper, just to see if I can get to the documentation. Here you have a number of tutorials. But if we focus down here, what you're generally going to want to look for, at least for the official documentation, is docs.python.org. You see in the URL it's version 3, and that's where we want to go. So let me go ahead and click on this, common string operators. And I will disclaim this-- I think, personally, Python's documentation is not terribly newbie-friendly. Like, it's written fairly arcanely and you kind of have to really dig to understand certain things. That's fine. You'll get comfortable with it over time. But if you're feeling a little overwhelmed by, oh my God, I just want to know about upper, everyone feels this way too. So control F or Command F is your friend, upper. Let me go ahead and search for this. And it's not actually on this page, is it? String-- string methods. Here we go. String methods. OK, so under string methods, let me go ahead and search for upper. And down here, indeed, is the documentation. So the convention will be the name of the data type in question-- str for string-- the name of the function here. It would tell you in parentheses if it takes any arguments, but it doesn't. And so it returns a copy of the string with all of the cased characters converted to uppercase-- that just means the letters of the alphabet essentially-- and then some additional documentation, and so forth. It gets pretty low-level pretty quickly. These are the equivalent of the man pages. And there is no CS50 reference for Python. That was just for C. So just realize that there's this documentation available. And you'll notice there's bunches of functions. 
Strip is actually kind of a popular one, or L strip or R strip. If you have whitespace at the beginning or end of a line because your human got a little sloppy or there's new lines in a file, you can call strip on a string and get rid of whitespace to the left and right to kind of clean it up. Terribly useful for things like data science applications and analysis of data where you just kind of clean up messy data. So many functions like that are built in for you. All right, so let's take a look at a few other examples reminiscent of features we did have in C, such as this one here. Suppose I want to write a program that takes command line arguments, much like resize, with which we started today's story. Let's not even use the CS50 library. Let's do this. If you want access to argv, recall in C it looked like this-- int, argc, string, argv. It looked like this in C. Well, unfortunately, if you're not using main, it would be nice if you can still use command line arguments. And you can, but you have to import them. It's a library that provides you with access. From the sys or system library, you can import argv in Python. And that gives you access to command line arguments as a feature. Then you can say something like this. If the length of argv-- which is just an array, recall, in C-- equals equals 2, then go ahead and say hello. And let's go ahead and print out whatever the user typed in, argv 1. Else, let's just by default say hello world. So in English, what's happening? If the user typed in a command line argument-- say, hello so-and-so. Else if the human did not type in exactly one command line argument, just say, by default, hello world. So let me save this. Do Python of argv1, or rather zero. Enter. OK, I didn't type in a word after the command. So now let's do it again and I'll type in Brian's name. Enter, hello Brian. Let's do it again. Veronica, enter. Now, there's something that's not quite the same as C. How many words did I just type at the prompt? 3. 
So that would suggest that this is argv 0, argv 1, and argv 2. And yet, I'm printing argv 1, not argv 2. So how do I think about this? The code is correct, but it's different from C. What does argv technically store when you run a command like these? Remember, let's rewind. In C, argv 0 stored what? AUDIENCE: Name of the file. DAVID MALAN: The name of the file or the name of the program you just ran. Notice, though, the program I just ran is called Python. And so you would think that argv 0 would have Python in it, but it doesn't because notice if I'm printing argv 1, you would think that's 0, 1. You would think I just said hello argv 0 .py, But I didn't. argv 1 clearly prints Veronica or Brian. So it stands to reason argv 0 is this, which means this is, like, argv negative 1. Python is excluded from the argument vector, as it's called. The command line arguments do not include the name of the interpreter. But otherwise, it works exactly the same as it did once upon a time. And notice, too, with this new for construct, notice what you can do whenever you have access to an array of things. If I go into argv1.py and import argv again, let me go ahead now and just-- you know what? For s in argv, go ahead and print out s. It's really succinct. What is this going to do? Let me go ahead and do Python of argv1, enter. And it just prints out the name of the file. If I go ahead and say foo, bar, baz, three random words, it prints out all of those words. And so what's powerful about Python is honestly this for loop. There's no int i, less than, plus plus, any of that. You just say, give me a variable called s and iterate over the entirety of the thing on the right, which is presumed, in this case, to be an array. You can be even more powerful than that. 
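A sketch of the argv examples, assuming a greeting format with a comma that may differ from what was typed in lecture. The function greeting is a hypothetical wrapper so that the logic can be tried with any list of arguments.

```python
from sys import argv


def greeting(args):
    # args[0] is the script's name; the interpreter itself ("python")
    # is not part of the list, so the first real argument is args[1].
    if len(args) == 2:
        return f"hello, {args[1]}"
    return "hello, world"


print(greeting(argv))

# The for loop visits every element without any index variable.
for s in argv:
    print(s)
```

Run as python argv0.py Brian, the list would hold argv0.py and Brian, so the greeting uses index 1.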
If I-- just like in C weeks ago-- look at characters in these strings-- let me do argv2.py-- suppose that this iterate over each string in argv, and then here iterate over each character in s, I can do for c in s and now print out the character. So now when I run this same command but on argv2.py, notice what's going to happen. Let me raise this a little bit. Enter. It prints every character from every word one at a time. But it did so this time based on using these two for loops. So what does this mean? When you have an array, as we've called it, you can iterate over everything in the array. When you have a string, you can iterate over every character in the string. And this is where Python just gets wonderfully flexible to do this again and again. All right, let's take a look at-- let's see-- compared strings already. We copied strings. Let's go ahead and do this in Python. Recall that we ran into a fundamental limitation of C, and it would seem programming, when we had example called swap and no swap back in the day where I was just trying to swap two values, x and y. And recall that I hardcoded something like x is 1 and y is 2. And the whole goal was simply to first say, x is such and such, y is such and such. Let me go ahead and make that a format string. Then I wanted to print this again. But somewhere in here, I wanted to swap x and y. So to punctuate our sort of exploration of just what Python can do, if you want to swap two variables, x and y, that's fine, just do it. And it's this magical shell game that just works in Python. Now, technically these are what are called tuples on the left. It's a x comma y pair. It's latitude comma longitude. So there's an actual underlying mental model for what's going on here. But in effect, you're literally switching them and you don't need the temporary variable. Python the language takes care of that for you. All right, let's look at a more powerful feature still, this time using what's actually called a list. 
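The swap described above might look like the following sketch, with the values 1 and 2 hardcoded as in the lecture's description.

```python
x = 1
y = 2
print(f"x is {x}, y is {y}")

# The right-hand side is evaluated first, as the pair (y, x), and then
# unpacked into x and y, so no temporary variable is needed.
x, y = y, x

print(f"x is {x}, y is {y}")
```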
So a moment ago I was using argv 0, 1, 2, as our examples. And I was calling them arrays. They're not arrays anymore. Python does not have arrays. Python has lists. And lists sounds reminiscent of linked lists. And indeed, they are. In Python, you have lists that are resizable. You don't have to decide in advance how big they are or how small they are. They will just grow and shrink for you just like a linked list will, but you don't have to write the linked list yourself. Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: Sure. AUDIENCE: [INAUDIBLE] DAVID MALAN: Oh, sure. Let me open that file up in argv1. This one here? AUDIENCE: No, it was, like, [INAUDIBLE]. DAVID MALAN: Oh, this one here. AUDIENCE: Yeah. [INAUDIBLE] bracket notation [INAUDIBLE].. DAVID MALAN: Yes, you can still-- so argv, I called it an array, but that was a white lie a moment ago. It's actually a list, a linked list. But whereas a linked list in C does not allow you to use square brackets, you have to use a for loop or a while loop to iterate over the whole thing to find what you're looking for, in Python, if something is in a list, you can just use, yes, the square brackets to get at that specific element. AUDIENCE: Or I'm saying you could use the f right before-- DAVID MALAN: Oh, I could have, yes. I didn't use the F, just because frankly it just gets ugly eventually. But yes, I could have also done this to achieve the exact same effect. It just starts to look cryptic. OK, so let's actually introduce a list, which itself is a data type in Python, as well as in languages like C++ and Java, if some of you have that background, as well. So here, in list.py, let me go ahead and do the following. Let me first import from the CS50 library getInt so that we can get some ints from the user. Let me give myself an array, a.k.a. now a list in Python. So in C you can't really express quite this idea. 
In Python, if you want a variable called numbers and you want to initialize it to an empty list, you just literally do open bracket, close bracket. No number in between them. And as before, no semi-colon. Let's now do the following forever until I break out of this. Let me go ahead and get a number from the user, just by asking them for some number. Then let me say, if not number, go ahead and break out of this. This is going to, as an aside, just let me quit out of this by hitting Control D as we discussed ever so briefly a while back. But that's just a UI feature. So this is what's kind of cool. Suppose I want to implement the notion of checking if the number the user's typed in is in the list already, and if so, not add it. I'm going to go ahead and do that. But first, let's just do this-- numbers.append number. And this is a new feature. So what do I want to do here? For number in numbers-- I'll explain this in a second-- let me go ahead and print number. So what is this program aspiring to do? At the very top, I'm importing getInt. At the very top below that, I'm just giving myself an empty array, now called a list, called numbers. Then I do the following forever. Go ahead and get the number from the user. If he or she did not actually type in a number, just break out of this. The program is done. But here's the new feature. Just as with strings, they are objects, so to speak. They are data structures that have functions built in. So do lists have functions built in. There is literally a function inside of every Python list called append that literally does that. You call append and it appends whatever its input is to whatever the list itself is. So in C, you might have had to use realloc. You might have had to add something to the end of the list. None of that happens anymore. Just at a high level, you say append this to the list and let the language take care of it for you. 
Then down here, left-aligned all the way at the end, is just saying, for number in numbers. Like, iterate over all of the numbers in the list and print out one at a time. So let's try this. Let me go down here and do Python of-- this is list.py-- and let me go ahead and type in a number like 13, 42, 50. And I'm going to hit Control D, which means that's it, I'm done. And there we see the three numbers. It looks a little stupid because you know what? I think I need a print here. Let's fix this. Let me rerun this. 13, 42, 50, Control D, there we go. One per line. But what this program has is honestly kind of a bug, potentially. Suppose I want unique numbers, now I have three 13s. But I'd ideally just want one copy of every number for whatever reason. I want uniqueness. Well, notice how easily you can express that. If my goal is to only conditionally add a number to the numbers list if it's not already there, how would you do this in C? You have an array called numbers and you want to first check is a number in that array. What would you do in English? AUDIENCE: A for loop. DAVID MALAN: A for loop, right? You'd probably start at the left, iterate over the whole array looking for the number and then conclude true or false, it's in there. It's not hard but it's a little annoying. You have to write more code, a couple of lines, four lines for a for loop. In Python, just say what you mean. If number not in numbers, append it. And it reads much more like English. At the end of the day, some human wrote the for loop that does that operation. But we, the more modern programmers, can just now say, if number not in numbers, append it. And so it is meant to read more English-like. So let's try this now. 13, 13, 50, done. Now I just get one copy of the 13 because it's checking that for me. Now, running time is still an issue. Consider this, theoretically, you're still wasting some time looking for a number because someone wrote code that's probably linear search. 
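Here is a sketch of the uniqueness check. As an assumption for the sake of a self-contained example, the numbers come from a fixed list instead of being typed in with get_int and ended with Control D, and add_unique is a hypothetical helper name.

```python
def add_unique(numbers, number):
    # "not in" scans the list for us; for a list this is a linear search.
    if number not in numbers:
        numbers.append(number)


numbers = []  # an empty list that grows as needed

for number in [13, 13, 50]:  # stand-in for the user's input
    add_unique(numbers, number)

for number in numbers:
    print(number)
```

With the inputs 13, 13, 50, the list ends up holding 13 and 50 once each.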
Maybe it's binary search if it's sorted. But someone wrote that code. But the point is, with these higher level languages, these more modern languages like Python, that is not our problem, necessarily. It only becomes our problem if the program is just too slow for some reason and we really need to get into the weeds of why. All right, let's look at a final feature syntactically before we tie this to a more generalized problem. Let me go ahead and save a file called struct0.py, which is reminiscent of struct0.c a few weeks back. And let me go ahead and from the CS50 library import getString. Let me go ahead and give myself an array this time called students that's empty, or a list called students. And then let me just get three students for the sake of discussion. So for i in range 3, that just iterates three times, let me go ahead and ask the user for their name. So getString, ask them for their name. Then let me go ahead and ask them for their dorm and go ahead and get string for dorm. And then that's enough. Let me now go ahead and append the student to my list. So students dot append. But I don't really have a student structure yet. Now, there's many ways we can solve this, but let me propose the simplest one. It turns out in Python you can declare hash tables so wonderfully simply. A hash table is just a collection of key value pairs. And I would argue at this point in my example I have keys and values. I have a name which is a key and the value, like David or whatever, another key called dorm, and then a value which is like Matthews or wherever. And so keys and values. So it would be kind of nice if I could create for myself a hash table-- or even a trie, for that matter-- that allows me to store this data. Well, it turns out in Python, I can do just that. I can go ahead and create an object called student using curly bracket notation. And you can literally do this. The name shall be one key. And now it's going to take on that value.
Dorm shall be another key and it's going to take on that value. So I could call this anything I want-- x and y and have the values David and Matthews or whatever it is I'm going to type in. But if you want a very generalized data structure that isn't just a list of values from left to right, but has metadata-- a key, or if you think of a spreadsheet, a column name called name and a column name called dorm, each of which has values-- you just use curly braces. And you put the keys in quotes and then a colon. And then if you've got multiple keys, you just put a comma. So it's a little cryptic, but this is just like a container, a hash table, that contains words and values. Now, in p set 4, when you implemented speller, you actually just said yes or no, is the word in the dictionary? But you certainly could have stored more information instead of just Boolean values. You just tended to not need to do that. So what does this mean for me? At this point in the story, I have an object, as it's called in Python, that stores these keys and these values. So if later on I want to iterate over them, I can do this. For student in-- oh, you have to append it-- so students.append student. Let's add the student to the list. So for student in students, which is just how you iterate over every one of the things in that list. Let me just go ahead and say a sentence like, I want to say so and so is in this dorm. So how do I express that? Well, so and so, I need to get access to the student's name. And the way I can do this is as follows. I could say, let's go ahead and say curly brace student bracket name close bracket. And then here, I can go ahead and say-- oops, let me put quotes in here-- and then here I can say student bracket quote unquote dorm. So this is admittedly the most cryptic example we've done thus far. But let's tease it apart as a format string. So if I zoom in on this, what am I doing? The curly braces and the f just means format this string.
So you can ignore the curly braces as part of our story from earlier. Student is the name of the variable in the for loop. So it's the current student. The square brackets are new. In C, the only time we used square brackets was in what context? AUDIENCE: Arrays. DAVID MALAN: Arrays. And what did we always put in those square brackets? A number. Yeah, so 0, 1, 2. You can index into an array. What's cool about an object-- or a hash table more generally, as we're now defining it-- is you can index into the variable using not numbers, but words. So you could think of student as being like a list or an array with two values-- name and dorm. But it's nice to be able to refer to those not as zero and one or some stupid arbitrary number, but rather by keys-- name and dorm. So this syntax here, though cryptic, says go inside the student object and get me the value of the key called name. And this says the same thing about dorm. So an object in Python-- or more generally a hash table-- allows you to associate keys with values. And this is quite simply the syntax you use for that. So let me go ahead and run this. Struct0.py, type in my name. Let's say Matthews. Let's do, like, Veronica, Weld. Let's do Brian. Brian, where did you live? AUDIENCE: Which year? DAVID MALAN: Freshman year. AUDIENCE: Pennypacker. DAVID MALAN: Pennypacker, enter. Not that these specifics really matter, but now we have expressed all of these sentences. So the short of it now is we didn't quite see this in C, but we did see a hint of this when we implemented our own hash table in C so that we can actually access keys and values arbitrarily. So let's do a-- actually, let me pause here for any questions before we bring back Mario. All right. So let's now not just do examples for the sake of demonstration, but rewind to an old friend that we've seen a few times and just look at a few different screens. 
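The struct0.py idea might be sketched as below. As an assumption, the names and dorms are supplied as a fixed list of pairs instead of being typed in via the CS50 library, and the variable rows is invented for that purpose.

```python
# Stand-in for the three rounds of user input from lecture.
rows = [("David", "Matthews"), ("Veronica", "Weld"), ("Brian", "Pennypacker")]

students = []
for name, dorm in rows:
    # Curly braces build a set of key-value pairs: keys in quotes,
    # a colon before each value, commas between pairs.
    student = {"name": name, "dorm": dorm}
    students.append(student)

for student in students:
    # Index by key (a word) instead of by position (a number).
    print(f"{student['name']} is in {student['dorm']}")
```

Note the single quotes around the keys inside the format string: they keep the key names from colliding with the double quotes that delimit the string itself.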
So in Super Mario Bros, running left to right you might recall or have seen that there's stuff like this in the sky. And Mario's supposed to run under it and jump up and he gets coins or whatever by jumping up and hitting these question marks. So this is mostly a very contrived way of saying, suppose we want to print out four question marks on the screen just like Super Mario Bros, how could we do it? It's going to be a little black and white, a little textual, but how do I print out four question marks? Well, let me go over here and let me create a file called, let's say, Mario0.py. And how do I do this? What's the simplest way to do this, print four question marks? OK, I heard print. OK, four question marks. Very good. So let's go ahead and run Mario0. Correct, that's right. So this is not bad. It's one string, not a huge deal. Let's do it at least with a loop, as we've been often doing, just to improve the design, even though this is a very tiny, tiny, tiny example. So Mario1.py, let's go ahead and print this out with a loop, for instance. So how do I do this? How do I print four question marks, but one at a time? For i in range four, print, question mark. Save, all right. So Python, Mario. Does anyone want to yell out, no, don't do that? OK, thanks. That's great. All right, so why did you not want me to do that? Because they're all vertical. So we did have a fix for this. How do I tell print, don't end your lines with the default new line? So end equals just quote unquote to override the default backslash n value. So now I can rerun this. All right, it's a little buggy. So how can I fix this and only put a newline after the last one? AUDIENCE: [INAUDIBLE] DAVID MALAN: Yeah, honestly, just do print nothing. And that will have the effect of printing a new line for free. So let's do this. OK. Now we've got a good example there. All right, so it turns out we actually printed along the way a separate example, which looked like this, albeit with four blocks.
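The fixed version of the loop could be sketched as follows. Wrapping it in a function named print_row is an assumption made here for reuse; the lecture's file simply runs the loop at the top level.

```python
def print_row(n):
    for i in range(n):
        # Suppress the default newline so the marks stay on one line.
        print("?", end="")
    # A bare print() emits just the newline, once, after the last mark.
    print()


print_row(4)
```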
So we won't-- let's go ahead and do this now vertically, not with question marks, but with hashes like bricks. So if we want to print out those three hashes, allow me to draw some inspiration from this and let's say in Mario2.py, let me go ahead and just say for i in range of three, go ahead and print out just one block. And as you've been advising, just do this-- or rather, no, let's use the default to print out a vertical bar of three blocks. So this is Mario2.py. And now we've done something reminiscent of that. But now things get a little interesting if we go underground. And let's focus on this square. So three by three, for instance, because we've not quite seen something like this. So in our last example here, let's see. Could we get maybe a brave volunteer to come on up, tie some of these ideas together? Is that a hand back there? Come on down. So this will be Mario3.py, the goal of which is to print a brick, a bigger brick-- it's like 3 by 3-- hello again. ANDREA: Hello. DAVID MALAN: For the audience, what's your name? ANDREA: Andrea. DAVID MALAN: Andrea, nice to see you. ANDREA: Nice to see you. DAVID MALAN: All right, so the goal at hand is to print a three by three grid of just hashes reminiscent of those bricks. All right, you're in charge. ANDREA: All right. Should I do, like, a loop or something? DAVID MALAN: Whatever gets the job done. All right, for. OK, good. OK, interesting. OK, print, quote unquote, print, yeah, OK. ANDREA: OK. Oh, right. DAVID MALAN: Key detail. ANDREA: What was it, a hash? DAVID MALAN: A hash is fine, yeah. ANDREA: OK. DAVID MALAN: All right. And before we do this, does everyone want her to run this program and be correct? AUDIENCE: Don't do it. DAVID MALAN: No, why? Someone who claims no, what? What's your concern? AUDIENCE: N equals-- it'll do it [INAUDIBLE] DAVID MALAN: Good, OK. So you fixed that. Good. Any other concerns? Yeah? AUDIENCE: [INAUDIBLE] DAVID MALAN: OK. Is it going to go up and down? Well, let's see. 
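Editor's sketch: the three-by-three grid under discussion could be written with nested loops roughly as follows. This is an illustration of the approach, not a transcription of what was typed on screen; the function name `print_grid` and its `size` parameter are assumptions.

```python
def print_grid(size):
    """Print a size-by-size square of hashes, one row per line."""
    for i in range(size):          # one iteration per row
        for j in range(size):      # one iteration per column in that row
            print("#", end="")     # stay on the current line
        print()                    # end the row with a newline


print_grid(3)
```

The inner loop builds a single row; the bare `print()` belongs to the outer loop, so it runs once per row rather than once per hash.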
Can you walk us through verbally-- do we have-- can you walk us through what the program does? [LAUGHTER] ANDREA: For i in range 3, so this will happen three times, then j in range three, the next thing will also happen three times. So we print a hash. And then we print another hash and another hash because the end is the empty quotation marks. DAVID MALAN: OK. ANDREA: And then that happens and then we print a new line. And then it should execute that three times. DAVID MALAN: All right. What do you think? Do you-- the duck is convinced. All right, why don't you go ahead and save the file. Let's try. No harm in trying, so right or wrong, let's see. This is called mario3.py, and I think we have a round of applause if we could. Very nicely done. All right. So let's-- and if you'd like one more. So let's take a look at one final example, coming full circle from where we began. We of course looked at resize. And let's open that up, just to see how I got away with writing so little code and actually getting that job done. So in resize.py, which is where we began, notice that I had a few lines that hopefully look a little more familiar now. But we didn't exactly introduce all of these features ourselves. So it turns out in line one and line two we have one unfamiliar and one familiar line. Line two just gives us access to command-line arguments, which we needed for resizing the bitmap. Line one is where a lot of the power is coming from. It turns out there's a library in Python called Pillow that you can install by typing a certain command at your terminal. It doesn't necessarily come with your Mac or PC. You have to download it and install it with a command. And then if you read its documentation, it will say, from PIL, for Pillow, import Image. Now, that's not a specific image. That's the name of a library called the Image library that comes with that software that someone freely made available. So that's just saying, give me access to an image-related library.
And undoubtedly, there could exist similar things in C. But we of course did things very hands-on low-level. All right, if the length of argv is not 4, yell at the user with the usage. And that's just if they don't cooperate by typing in as they should, this. It's a little more verbose now because we have Python and we have the file extension. But we could technically clean that up if we really wanted. Lines 7, 8, and 9, there's nothing really new there. I'm just declaring three variables implicitly typed. I don't have to bother saying int or string. I'm accessing argv at indices 1, 2, and 3. And then I'm doing one thing on line 7. What is line 7 doing that's important? AUDIENCE: [INAUDIBLE] DAVID MALAN: I'm changing the argument from what is technically a string by default-- because indeed, it came from the human hands at a keyboard-- and converting it into a number. Now, as an aside, if the user does not provide a number like 2 or 10, this code could break. To be fair, I should really have some error checking to make sure if the user typed in hello and not 2 or 10, I need to catch that error. So I'm being a little sloppy. But it was really meant to demonstrate succinct code. So now we have infile and outfile defined exactly as before. So we have just three lines left that actually implement most of the magic. Yeah. AUDIENCE: [INAUDIBLE] DAVID MALAN: Wait, say the last part again. AUDIENCE: [INAUDIBLE] DAVID MALAN: Yes. AUDIENCE: There was almost [INAUDIBLE] DAVID MALAN: Good observation. So this is not just converting the user's input to the equivalent ASCII value because that's not what we want. This int used here is actually converting it the way atoi does, a function that you probably used a couple of weeks ago; it's just named a little more succinctly. There is a function via which you could convert a character or a string to its ASCII equivalent. But that's not what's going on here.
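Editor's sketch: the missing error checking mentioned above, and the difference between converting a string to an integer and looking up a character's code, could be illustrated as follows. The helper name `parse_factor` and the choice to return `None` on bad input are assumptions for this example, not part of the lecture's resize.py.

```python
def parse_factor(text):
    """Convert a command-line string to an int, or return None if it isn't one."""
    try:
        return int(text)       # "2" becomes the integer 2, much like atoi in C
    except ValueError:
        return None            # e.g. the user typed "hello"


print(parse_factor("2"))       # the integer 2
print(parse_factor("hello"))   # None, instead of a crash
print(ord("2"))                # 50: the character's code, which is not what we want here
```

A caller could print a usage message and exit when `parse_factor` returns `None`; whether to return `None` or let the `ValueError` propagate is a design choice.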
Python's int does the more intuitive thing, turning this into an integer, without using a cryptically named function like atoi. So let's scroll down just a little further to these last few lines and see what's going on. Some of them you would only know how to do from having read the documentation just as we did. This says give me a variable called in image. Could have called it anything. I'm just trying to be consistent with in file. This says, use the Image library. Use its open function that comes with it. So Image is some kind of structure, inside of which is some useful image-related functionality. So call its open function on the name of the file, then go ahead and extract its width and height. So turns out this is another tuple, if you will. Tuples, again, are like x comma y, latitude comma longitude. You'd only know that it is a tuple from the documentation. So when I say width comma height, this is taking what's technically a list of size two-- or really, a tuple-- and it's just extracting for me the width and the height. But let me wave my hands at that particular syntax. The rest of this just says the following. Give me a new variable called out image. Call the input image's resize function, another piece of functionality built into it, just like open, and change it by this width and this height-- the original width times n, the original height times n. No padding manipulation, that's all the responsibility of the library. Some other human dealt with all of that for us. And this last line, perhaps not surprisingly, saves the output image to that file name. So in just, what, 15 lines of code and fewer if we get rid of some of the whitespace can you implement the entirety of resize. But really focusing on the logic of the problem, I want to take an input from the user. I want to scale it up by a factor of n. And I want to save out the file. That's what you care about.
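Editor's sketch: the three steps just described (open, resize, save) might look roughly like this with Pillow. This is not the lecture's resize.py verbatim; the function name `resize_image` and the variable names are assumptions, and Pillow is a third-party package that must be installed separately.

```python
def resize_image(n, infile, outfile):
    """Scale the image in infile by a factor of n and save it to outfile."""
    # Imported inside the function so this file still loads where Pillow
    # is not installed; resize.py in lecture imports it at the top instead.
    from PIL import Image

    in_image = Image.open(infile)

    # size is a (width, height) tuple, unpacked into two variables.
    width, height = in_image.size

    # resize returns a new image; the library handles the pixel-level details.
    out_image = in_image.resize((width * n, height * n))
    out_image.save(outfile)


# Hypothetical usage, with file names chosen only for illustration:
# resize_image(2, "small.bmp", "large.bmp")
```

Note that `resize` takes the new dimensions as a single tuple, which is why the call has two sets of parentheses.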
You don't necessarily care about getting into the weeds of exactly what it was you had to do when you did it in C. So let's do one final example here. You'll recall from problem set four you implemented your own spell checker. And odds are you did a trie or a hash table or the like. And it turns out that is non-trivial, certainly in C. And it's non-trivial certainly for the first time in any language. But let me take a stab at doing this now in Python. Let me go into source 6 where I have a speller example. And notice that in this folder today I've brought a few files with me. So I've brought a copy of the dictionaries from p set four, a copy of the text files, like La La Land and the like in texts. And then I brought two files-- dictionary.py and speller.py-- the latter of which is an implementation of speller.c in Python. And I'm not going to pull that one up because we wrote that one entirely for you. But let me go ahead and write, for instance, just my own dictionary. So dictionary.py is the analog of dictionary.c. And let's go ahead and set this up. Let me go ahead and create this file in a separate folder for now, so dictionary.py. And there's a few functions in dictionary.c which we should probably get around to implementing. What are those functions? AUDIENCE: Load. DAVID MALAN: Load was one, and load takes the name of a file or a dictionary. So let's do this. And I'll just say TODO. Come back to that. What other functions were in dictionary.c? Check, so def check. And what did check take as an input? A word, yep. So we'll come back to this and just come back to that TODO. What other functions? AUDIENCE: Size. DAVID MALAN: Size was one, so def size. This did not take input, but it just returned the size of the structure. So we'll come back to that. And lastly? AUDIENCE: Unload. DAVID MALAN: OK, so unload. All right, so this is the Python version of the distribution code for speller for your dictionary file. So unload also didn't take an argument.
So that's something for us to do, too. So what's the gist of making a spell checker? You are loading words in your load function from a dictionary file. And the goal is to load those somehow into memory. You had a design decision for the p set in C, where you could make a hash table or a trie or even a linked list or even an array. But odds are the first two of those were probably more efficient. So it turns out that in Python, you have the ability to store words pretty readily in any number of data structures. You have not just ints and floats and strings, but you clearly have lists, as we've seen. You have dicts-- call them objects or hashes, hash tables. And there's other things, too, even called sets, where a set is kind of just a collection of words which would be very nicely searchable. And so you know what? If I want to ultimately load some words, let me give myself a global variable called words and just initialize it to an empty set. So I have a global variable called words and nothing is in it just yet. But it's a set of words. How do I go about loading words into that dictionary? Well, let's go ahead and implement load here. So let me go ahead and declare a variable called file and open this dictionary in read mode, just as in C. And then how do I iterate over the lines in a file? We've not seen that. But I do know how to iterate over the strings in an array and the characters in a string. So let me go with my instinct, for line in file. Indeed, this will do exactly what you want it to do. Then let me go ahead and add to my words data structure the following line. And then let me close the file. And then let me just say return true because all is well. Done. All right, so I'm cutting a few corners, technically. Let me use that function I alluded to earlier. Let me go ahead and call rstrip and strip off the new line because in the file, technically, when you're reading in those words, every line ends with a backslash n. That's now part of the word.
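Editor's sketch: the load function described above might be written like this. It follows the steps narrated in lecture (open, iterate line by line, strip the newline, add to the set, close), but it is a reconstruction rather than the exact file shown on screen, and passing "\n" explicitly to `rstrip` is an assumption.

```python
# Global set of dictionary words; empty until load is called.
words = set()


def load(dictionary):
    """Read one word per line from the file named dictionary into words."""
    file = open(dictionary, "r")
    for line in file:
        # Each line read from the file still ends with "\n"; strip it
        # so the stored word is just the word itself.
        words.add(line.rstrip("\n"))
    file.close()
    return True
```

Because `words` is a set, adding the same word twice has no effect, and later membership tests do not need to scan every word.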
So a minor correction there that I'm stripping off the line. But that's it for load. How do I now check if a given word is in that set? Well, I can just say, if word in words return true. Else, return false. Done with check. How do I return the size of this data structure? How about I just return the length of that structure, words, and then unload-- heck, Python's doing this all for me-- done. Let me shrink this. And you know what? This is a little verbose. I don't actually need to do this if else. I could just return word in words and that will return a Boolean for me. And honestly, if I want to lower case it, that's easy. I can just do this and take care of that. Now it's even better. That's p set 4. Excited? Wish we had done this in C? So what is the whole point of all of this, because the goal wasn't to create sort of great angst and wonder now. But the whole point of having introduced C over these past few weeks is to, one, none of this now do you take for granted. I mean, you might be longing for having implemented this in Python. And you might have had to read some documentation and figure out the various syntax. But my God. We whittled down what probably took most of you hours into just seconds once you're more comfortable with the language. But also, to our very earliest point today, once you have the right language and the right tool for the job. Now, it's not to say that this is perfect, because in fact, let's go ahead and do some tests. Let me go into my terminal window here. And I actually brought my own solution in my C folder here. Let's see. I have my own code to speller implemented in C here. And let me go ahead and run a test. Let me go ahead and run speller on, say, the text Shakespeare. That's a pretty big input. Let's go ahead and hit Enter. And this is my spell checker running. And all the words are outputting. And the time total to run speller in C was, say, 0.9 seconds. So that's actually pretty good. 
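Editor's sketch: returning to dictionary.py, the remaining three functions described above could look like the following. The sample contents of `words` are an assumption so the block runs on its own; in the lecture the set is filled by load.

```python
# Sample data standing in for a loaded dictionary (illustration only).
words = {"cat", "dog", "caterpillar"}


def check(word):
    """Return True if word, lowercased, is in the dictionary."""
    # "in" on a set already evaluates to a Boolean, so no if/else is needed.
    return word.lower() in words


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free by hand; Python manages the memory."""
    return True


print(check("Cat"), check("cats"), size())
```

Lowercasing the word before the lookup assumes the dictionary itself contains only lowercase words.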
In a second window, let me go up here in another terminal window. And let me go into today's code and into the speller folder where I have a Python version that I'm going to run as follows-- speller.py-- let me go ahead and run it on Shakespeare. So we've not looked at speller.py. But it is essentially line for line a port, a translation, from C to Python. But you're welcome to look at that online. And it's using my dictionary.py file. Let me go ahead and run that. It's running through all the words. Top is Python, bottom is C. Here we go. Here we go. Here we go. Now, this is a bit misleading because again, the internet is in the way. We're using a web-based IDE, and so it's funny that that appears so many times. And you'll see it's not 10, 20 seconds, however long that was. That was just the internet being slow. And all we're timing is your functions in both C and Python. But what's the takeaway between Python and C? Same inputs. What do you see? Yeah? AUDIENCE: Be more concise [INAUDIBLE]. DAVID MALAN: Yeah, I wouldn't say concise. That's more aesthetic. It's more-- AUDIENCE: Specific [INAUDIBLE]. DAVID MALAN: Well, not even that, I think. These are correct. Both of them are correct. All the important numbers at the top are identical. But what is clearly different, though? It's slower. So Python seems to be slower, right? It takes in total-- if we just look at two numbers-- 1.55 seconds in Python, if you ignore the internet speed and just look at the code performance, versus 0.9. So it's almost twice as slow as C. So what's the takeaway there? Well, yes, it took me, what, 10, 20, 30 seconds to write the code. But it's taking me twice as long to run it. Now, not a big deal, of course, when we're talking a few seconds here and there.
But if this were a big data set that you're analyzing for some project or for work or for any kind of analysis project and the data is much larger than even this-- especially in the medical field or the like-- maybe you don't want to use Python. Sure, you can bang out the code in just a few minutes, maybe a few hours. But once you run it, damn, it's slower than using something like C. Whereas in C, it might take you more time upfront. And you might not even have the comfort with C anymore so it's going to take even longer because you have to go relearn the language. But when you run it, wow, it runs twice as fast. You therefore need less RAM, potentially, less hardware or less expensive hardware because you can get away with more. So again, this theme we keep seeing in data structures and algorithms is trade-offs. Like, developer time is a resource and it is wonderful that I and now you would be able to write code so much more quickly. But you do have to pay a price somewhere. And there's clearly a price with Python. And it's not because Python is poorly implemented. But what is the fundamental difference between the paradigm of programming in C versus in Python as we've seen it today? What's different? Yeah? AUDIENCE: [INAUDIBLE] line by line, whereas C, it essentially-- [INAUDIBLE] optimize running it, it will run [INAUDIBLE]. DAVID MALAN: Indeed. And let me flip it around. So with C, you're compiling down to zeros and ones. And that compiler is super smart. And it's going to move things around in memory. It's going to talk the computer's native language of zeros and ones. Python is, indeed, reading your code, by contrast, line by line, top to bottom, left to right. And even though technically underneath the hood there is a compilation step, there is nonetheless some overhead involved. The mere fact that we're no longer running clang and then getting zeros and ones or running make and getting zeros and ones, that's great. But we have to pay the price somewhere.
So this is going to be thematic. Like, there is no holy grail among languages or tools or techniques. There's going to be trade-offs among your comfort, your familiarity or recollection of a language, how easy it is to use, how succinctly you can type it, and then how efficiently you can actually run it on the screen. And with C, hopefully now-- we will not write any more C-code-- you have an appreciation in Python of when you create a hash-- or a list, rather-- or if you create a set or a hash table or the like, what you're really getting access to is someone else's implementation of p set four and p set three and p set two and p set one, in some form, but now exposed to you in a more powerful and more modern language. So let's end there officially today. And next week, we'll do the same thing, but in the context of web programming.
CS50 2018 - Lecture 6 - Python