  • >> [MUSIC PLAYING]

  • >> DAVID J. MALAN: All right.

  • This is CS50 and this is the start of Week 2.

  • And you'll recall that over the past couple of weeks,

  • we've been introducing computer science and, in turn, programming.

  • >> And we started the story by way of Scratch, that graphical language

  • from MIT's Media Lab.

  • And then most recently, last week, did we

  • introduce a higher-- a lower-level language known

  • as C, something that's purely textual.

  • And, indeed, last time we explored within that context

  • a number of concepts.

  • >> This, recall, was the very first program we looked at.

  • And this program, quite simply, prints out, "hello, world."

  • But there's so much seeming magic going on.

  • There's this #include with these angle brackets.

  • There's int.

  • There's (void).

  • There's parentheses, curly braces, semi-colons, and so much more.

  • >> And so, recall that we introduced Scratch

  • so that we could, ideally, see past that syntax, the stuff that's really not

  • all that intellectually interesting but early on

  • is, absolutely, a bit tricky to wrap your mind around.

  • And, indeed, one of the most common things early on in a programming class,

  • especially for those less comfortable, is to get frustrated by

  • and tripped up by certain syntactic errors, not to mention logical errors.

  • And so among our goals today, actually, will

  • be to equip you with some problem-solving techniques for how

  • to better solve problems themselves in the form of debugging.

  • And you'll recall, too, that the environment that we introduced

  • last time was called CS50 IDE.

  • This is web-based software that allows you to program in the cloud,

  • so to speak, while keeping all of your files together, as we again will today.

  • And recall that we revisited these topics here,

  • among them functions, and loops, and variables, and Boolean expressions,

  • and conditions.

  • And actually a few more that we translated from the world of Scratch

  • to the world of C.

  • >> But the fundamental building blocks, so to speak,

  • were really still the same last week.

  • In fact, we really just had a different puzzle piece, if you will.

  • Instead of that purple say block, we instead

  • had printf, which is this function in C that

  • allows you to print something and format it on the screen.

  • We introduced the CS50 Library, where you

  • have now at your disposal get_char, and get_int, and get_string,

  • and a few other functions as well, via which you can get input

  • from the user's own keyboard.

  • And we also took a look at things like these: bool, and char,

  • and double, float, int, long long, string.

  • And there's even other data types in C.

  • >> In other words, when you declare a variable to store some value,

  • or when you implement a function that returns some value,

  • you can specify what type of value that is.

  • Is it a string, like a sequence of characters?

  • Is it a number, like an integer?

  • Is it a floating point value, or the like?

  • So in C, unlike Scratch, we actually began to specify what kind of data

  • we were returning or using.

  • >> But, of course, we also ran into some fundamental limits of computing.

  • And in particular, this language C, recall

  • that we took a look at integer overflow, the reality

  • that if you only have a finite amount of memory

  • or, specifically, a finite number of bits, you can only count so high.

  • And so we looked at this example here whereby a counter in an airplane,

  • actually, if running long enough, would overflow and result in a software--

  • an actual, physical, potential error.

  • >> We also looked at floating point imprecision, the reality

  • that with only a finite number of bits, whether it's 32 or 64,

  • you can only specify so many numbers after a decimal point, after which you

  • begin to get imprecise.

  • So for instance, one-third in the world here, in our human world,

  • we know is just an infinite number of 3s after the decimal point.

  • But a computer can't necessarily represent an infinite number of numbers

  • if you only allow it some finite amount of information.

  • >> So not only did we equip you with greater power in terms

  • of how you might express yourself at a keyboard in terms of programming,

  • we also limited what you can actually do.

  • And indeed, bugs and mistakes can arise from those kinds of issues.

  • And indeed, among the topics today are going to be topics like debugging

  • and actually looking underneath the hood at how things that were introduced last week

  • are actually implemented so that you better

  • understand both the capabilities of and the limitations of a language like C.

  • >> And in fact, we'll peel back the layers of the simplest of data structures,

  • something called an array, which Scratch happens to call a "list."

  • It's a little bit different in that context.

  • And then we'll also introduce one of the first of our domain-specific problems

  • in CS50, the world of cryptography, the art of scrambling

  • or enciphering information so that you can send secret messages

  • and decode secret messages between two persons, A and B.

  • >> So before we transition to that new world,

  • let's try to equip you with some techniques with which you can eliminate

  • or reduce at least some of the frustrations

  • that you have probably encountered over the past week alone.

  • In fact, ahead of you are such-- some of your first problems in C. And odds are,

  • if you're like me, the first time you try to type out a program,

  • even if you think logically the program is pretty simple,

  • you might very well hit a wall, and the compiler is not going to cooperate.

  • Make or Clang is not going to actually do your bidding.

  • >> And why might that be?

  • Well, let's take a look at, perhaps, a simple program.

  • I'm going to go ahead and save this in a file deliberately called buggy0.c,

  • because I know it to be flawed in advance.

  • But I might not realize that if this is the first or second or third program

  • that I'm actually making myself.

  • So I'm going to go ahead and type out, int main(void).

  • And then inside of my curly braces, a very familiar printf("hello, world--

  • backslash, n")-- and a semi-colon.

  • >> I've saved the file.

  • Now I'm going to go down to my terminal window

  • and type make buggy0, because, again, the name of the file today is buggy0.c.

  • So I type make buggy0, Enter.

  • >> And, oh, gosh, recall from last time that no error messages is a good thing.

  • So no output is a good thing.

  • But here I have clearly some number of mistakes.

  • >> So the first line of output after typing make buggy0, recall,

  • is Clang's fairly verbose output.

  • Underneath the hood, CS50 IDE is configured

  • to use a whole bunch of options with this compiler

  • so that you don't have to think about them.

  • And that's all that first line means that starts with Clang.

  • >> But after that, the problems begin to make their appearance.

  • Buggy0.c on line 3, character 5, there is a big, red error.

  • What is that?

  • Implicitly declaring library function printf with type int (const char *,

  • ...) [-Werror].

  • I mean, it very quickly gets very arcane.

  • And certainly, at first glance, we wouldn't

  • expect you to understand the entirety of that message.

  • And so one of the lessons for today is going

  • to be to try to notice patterns, or similar things,

  • to errors you might have encountered in the past.

  • So let's tease apart only those words that look familiar.

  • The big, red error is clearly symbolic of something being wrong.

  • >> Implicitly declaring library function printf.

  • So even if I don't quite understand what implicitly declaring library function

  • means, the problem surely relates to printf somehow.

  • And the source of that issue has to do with declaring it.

  • >> Declaring a function is mentioning it for the first time.

  • And we used the terminology last week of declaring a function's prototype,

  • either with one line at the top of your own file or in a so-called header file.

  • And in what file did we say last week that printf is quote,

  • unquote, declared?

  • In what file is its prototype?

  • >> So if you recall, the very first thing I typed, almost every program last time--

  • and accidentally a moment ago started typing myself-- was this one here--

  • hash-- #include <stdio.h>-- for standard input/output. And indeed,

  • if I now save this file, I'm going to go ahead and clear my screen,

  • which you can do by typing clear, or you can hold Control-L,

  • just to clear your terminal window just to eliminate some clutter.

  • >> I'm going to go ahead and re-type make buggy0, Enter.

  • And voila, I still see that long command from Clang,

  • but there's no error message this time.

  • And indeed, if I do ./buggy0, just like last time,

  • where dot means this directory, slash just means,

  • here comes the name of the program and that name of the program is buggy0,

  • Enter, "hello, world."

  • >> Now, how might you have gleaned this solution

  • without necessarily recognizing as many words

  • as I did, certainly, having done this for so many years?

  • Well, realize per the first problem set, we introduce you to a command

  • that CS50's own staff wrote called help50.

  • And indeed, see the specification for the problem set as to how to use this.

  • >> But help50 is essentially a program that CS50's staff

  • wrote that allows you to run a command or run a program,

  • and if you don't understand its output, to pass its output to help50,

  • at which point the software that the course's staff wrote

  • will look at your program's output line by line, character by character.

  • And if we, the staff, recognize the error message that you're experiencing,

  • we will try to provoke you with some rhetorical questions, with some advice,

  • much like a TF or a CA or myself would do in person at office hours.

  • >> So look to help50 if you don't necessarily recognize a problem.

  • But don't rely on it too much as a crutch.

  • Certainly try to understand its output and then learn from it

  • so that only once or twice do you ever run help50 for a particular error

  • message.

  • After that, you should be better equipped yourself

  • to figure out what it actually is.

  • >> Let's do one other here.

  • Let me go ahead, and in another file we'll call this buggy1.c.

  • And in this file I'm going to deliberately--

  • but pretend that I don't understand what mistake I've made.

  • >> I'm going to go ahead and do this-- #include <stdio.h>, since I've

  • learned my lesson from a moment ago.

  • Int main(void), as before.

  • And then in here I'm going to do string s = get_string().

  • And recall from last time that this means, hey, computer,

  • give me a variable, call it s, and make the type of that variable a string

  • so I can store one or more words in it.

  • >> And then on the right-hand side of the equal sign

  • is get_string, which is a function in the CS50 Library

  • that does exactly that.

  • It gets a string and then hands it from right to left.

  • So this equal sign doesn't mean "equals" as we might think in math.

  • It means assignment from right to left.

  • So this means, take the string from the user and store it inside of s.

  • >> Now let's use it.

  • Let me go ahead now and as a second line, let me go ahead and say printf("hello"--

  • not "world," but "hello, %s"-- which is our placeholder, comma s,

  • which is our variable, and then a semi-colon.

  • So if I didn't screw up too much here, this looks like correct code.

  • >> And my instincts now are to compile it.

  • The file is called buggy1.c.

  • So I'm going to do make buggy1, Enter.

  • And darn-it, if there isn't even more errors than before.

  • I mean, there's more error messages it would

  • seem than actual lines in this program.

  • >> But the takeaway here is, even if you're overwhelmed

  • with two or three or four more error messages,

  • focus always on the very first of those messages.

  • Looking at the top-most one, scrolling back up as need be.

  • So here I typed make buggy1.

  • Here's that Clang output as expected.

  • >> And here's the first red error.

  • Use of undeclared identifier string, did I mean standard in?

  • So standard in is actually something else.

  • It refers to the user's keyboard, essentially.

  • >> But that's not what I meant.

  • I meant string, and I meant get_string.

  • So what is it that I forgot to do this time?

  • What's missing this time?

  • I have my #include <stdio.h>, so I have access to printf.

  • >> But what do I not have access to just yet?

  • Well, just like last time, I need to tell the compiler

  • Clang what these functions are.

  • Get_string does not come with C. And in particular, it

  • doesn't come in the header file, <stdio.h>.

  • It instead comes in something the staff wrote,

  • which is a different file name but aptly named <cs50.h>.

  • >> So simply by adding that one line of code-- recall from last time

  • that when Clang runs, it's going to look at my code top to bottom,

  • left to right.

  • It's going to notice, oh, you want <cs50.h>.

  • Let me go and find that, wherever it is on the server,

  • copy and paste it, essentially, into the top of your own file

  • so that at this point in the story, line 1, the rest of the program

  • can, indeed, use any of the functions therein, among them get_string.

  • So I'm going to ignore the rest of those errors,

  • because I, indeed, suspect that only the first one actually mattered.

  • And I'm going to go ahead and rerun, after saving my file, make buggy1.

  • And voila, it did work.

  • And if I do ./buggy1 and type in, for instance, Zamyla, I now will get hello,

  • Zamyla, instead of hello, world.

  • >> All right.

  • So the takeaways here then are to, one, try to glean as much as you can

  • from the error messages alone, looking at some of the recognizable words.

  • Barring that, use help50 per the problem set specification.

  • But barring that, too, always look at the top error only, at least

  • initially, to see what information it might actually yield.

  • But it turns out there's even more functionality built

  • into the CS50 Library to help you early on in the semester

  • and early on in programming figure out what's going wrong.

  • So let's do another example here.

  • I'm going to call this buggy2, which, again, is going to be flawed out

  • of the gate, by design.

  • >> And I'm going to go ahead and do #include <stdio.h>.

  • And then I'm going to do int main(void).

  • And then I'm going to do a for loop.

  • for (int i = 0;

  • i is less than or equal to 10.

  • i++, and then in curly braces, I'm going to print out just a hashtag symbol here

  • and a new line character.

  • >> So my intent with this program is quite simply

  • to iterate 10 times and on each iteration

  • of that loop each time through the cycle,

  • print out a hashtag, a hashtag, a hashtag.

  • One per line because I have the new line there.

  • And recall that the for loop, per last week--

  • and you'll get more familiar with the syntax

  • by using it with practice before long-- this gives me

  • a variable called i and sets it to 0.

  • >> This increments i on every iteration by 1.

  • So i goes to 1 to 2 to 3.

  • And then this condition in the middle between the semi-colons

  • gets checked on every iteration to make sure that we are still within range.

  • So I want to iterate 10 times, so I have sort of very intuitively just

  • put 10 as my upper bound there.

  • >> And yet, when I run this, after compiling it with make buggy2--

  • and it does compile OK.

  • So I don't have a syntax error this time.

  • Let me go ahead now and run buggy2, Enter.

  • And now scroll up.

  • And let me increase the size of the window.

  • >> I seem to have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11.

  • So there's 11 hashtags, even though I clearly put 10 inside of this loop.

  • Now, some of you might see immediately what the error is because, indeed, this

  • isn't a very hard error to make.

  • But it's very commonly made very early on.

  • >> What I want to point out, though, is, how might I figure this out?

  • Well, it turns out that the CS50 Library comes

  • with not only get_string and get_int and get_float and other functions.

  • It also comes with a special function called eprintf, or, error printf.

  • And it exists solely to make it a little bit easier for you

  • when debugging your code to just print an error message on the screen

  • and know where it came from.

  • >> So for instance, one thing I might do here with this function is this--

  • eprintf, and then I'm going to go ahead and say i is now %i, backslash, n.

  • And I'm going to plug in the value of i.

  • And up top, because this is in the CS50 Library,

  • I'm going to go ahead and include <cs50.h>

  • so I have access to this function.

  • But let's consider what line 9 is supposed to be doing.

  • I'm going to delete this eventually.

  • This has nothing to do with my overarching goal.

  • But eprintf, error printf, is just meant to give me some diagnostic information.

  • When I run my program, I want to see this on the screen temporarily

  • as well just to understand what's going on.

  • >> And, indeed, on each iteration here of line 9

  • I want to see, what is the value of i?

  • What is the value of i?

  • What is the value of i?

  • And, hopefully, I should only see that message, also, 10 times.

  • >> So let me go ahead and recompile my program,

  • as I have to do any time I make a change. ./buggy2.

  • And now-- OK.

  • There's a lot more going on.

  • So let me scroll up in an even bigger window.

  • >> And you'll see that each of the hashtags is still printing.

  • But in between each of them is now this diagnostic output formatted as follows.

  • The name of my program here is buggy2.

  • The name of the file is buggy2.c.

  • The line number from which this was printed is line 9.

  • And then to the right of that is the error message that I'm expecting.

  • >> And what's nice about this is that now I don't have to necessarily count

  • in my head what my program is doing.

  • I can see that on the first iteration i is 0,

  • then 1, then 2, then 3, then 4, then 5, then 6, then 7, then 8, then 9, then

  • 10.

  • So wait a minute.

  • What's going on here?

  • I still seem to be counting as intended up to 10.

  • >> But where did I start?

  • 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.

  • So 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10-- the 11th finger

  • is indicative of the problem.

  • I seem to have counted incorrectly in my loop.

  • Rather than go 10 iterations, I'm starting at 0,

  • I'm ending at and through 10.

  • But because, like a computer, I'm starting counting at 0,

  • I should be counting up to, but not through, 10.

  • >> And so the fix, I eventually realized here, is one of two things.

  • I could very simply say count up to less than 10.

  • So 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, which is, indeed, correct,

  • even though it sounds a little wrong.

  • Or I could do less than or equal to 9, so long as I start at 0.

  • Or if you really don't like that, you can count up through 10 but start at 1.

  • But again, this just isn't that common.

  • In programming-- albeit not so much in Scratch--

  • but in programming in C and other languages,

  • like JavaScript and Python and others, it's

  • just very common, per our discussion of binary,

  • to just start counting at the lowest number you can, which is 0.

  • All right.

  • So that's eprintf.

  • And again, now that I've figured out my problem, and I'm going to go back to 0

  • through less than 10, I'm going to go in and delete eprintf.

  • >> It should not be there when I ship my code or submit my code

  • or show it to anyone else.

  • It's really just meant to be used temporarily.

  • But now I've fixed this particular problem as well.

  • >> Well, let's do one more example here that I'm going to whip up as follows.

  • I'm going to go ahead and #include <cs50.h>.

  • And I'm going to go ahead and #include <stdio.h>.

  • >> And I'm going to save this file as buggy3.c.

  • And I'm going to go ahead and declare int main(void).

  • And then inside of there I'm going to do int i = --

  • I want to implement a program with a get_negative_int.

  • This is not a function that exists yet.

  • So we're going to implement it in just a moment.

  • But we're going to see why it's buggy at first pass.

  • And once I've gotten an int from the user,

  • I'm just going to print %i is a negative integer, backslash, n, comma, i.

  • In other words, all I want this program to do

  • is get a negative int from the user and then print out

  • that such and such is a negative int.

  • >> Now I need to implement this function.

  • So later in my file, I'm going to go ahead and declare a function called

  • get_negative_int(void)-- and we'll come back to what that line means again

  • in a moment-- int n; do-- do the following-- printf("n is: ").

  • And then I'm going to do n = get_int(), and do this while n is greater than 0.

  • And then return n;.

  • >> So there's a lot going on in this but none of which we didn't

  • look at last week, at least briefly.

  • So on line 10 here I've declared a function called get_negative_int,

  • and I've put (void), in parentheses, the reason being this

  • does not take an input.

  • I'm not passing anything to this function.

  • I'm just getting something back from it.

  • >> And what I'm hoping to get back is an integer.

  • There is no data type in C called negative_int.

  • It's just int, so it's going to be on us to make sure

  • that the value that's actually returned is not only an int

  • but is also negative.

  • >> On line 12 I'm declaring a variable called n and making it of type int.

  • And then in line 13 through 18 I'm doing something while something is true.

  • I'm going ahead and printing n is, colon, and then a space,

  • like a prompt for the user.

  • >> I'm then calling get_int and storing its so-called return value

  • in that variable n.

  • But I'm going to keep doing this while n is greater than 0.

  • In other words, if the user gives me an int and that number is greater than 0,

  • ergo, positive, I'm going to just keep reprompting the user,

  • keep reprompting, by forcing them to cooperate and give me a negative int.

  • >> And once n is actually negative-- suppose the user finally types -50,

  • then this while loop is no longer true because -50 is not greater than 0.

  • So we break out of that loop logically and return n.

  • >> But there's one other thing I have to do.

  • And I can simply do this by copying and pasting

  • one line of code at the top of the file.

  • I need to teach Clang, or promise to Clang,

  • explicitly that I will, indeed, go and implement

  • this function get_negative_int.

  • It might just be lower in the file.

  • Again, recall that Clang reads things top to bottom,

  • left to right, so you can't call a function if Clang

  • doesn't know it's going to exist.

  • >> Now, unfortunately, this program, as some of you might have noticed,

  • is already buggy.

  • Let me go ahead and make buggy3.

  • It compiles, so my problem now is not a syntax error, like a textual error,

  • it's actually going to be a logical error that I've deliberately

  • made as an opportunity to step through what's going on.

  • >> I'm going to go ahead now and run buggy3.

  • And I'm going to go ahead and not cooperate.

  • I'm going to give it the number 1.

  • It didn't like it, so it's prompting me again.

  • >> How about 2?

  • 3?

  • 50?

  • None of those are working.

  • How about -50?

  • And the program seems to work.

  • >> Let me try it once more.

  • Let me try -1, seems to work.

  • Let me try -2, seems to work.

  • Let me try 0.

  • Huh, that's incorrect.

  • Now, we're being a little pedantic here.

  • But it's, indeed, the case that 0 is neither positive nor negative.

  • And so the fact that my program is saying that 0 is a negative integer,

  • that's not technically correct.

  • >> Now, why is it doing this?

  • Well, it might be obvious.

  • And, indeed, the program is meant to be fairly simple

  • so we have something to explore.

  • >> But let's introduce a third debugging technique here called debug50.

  • So this is a program that we've just created

  • this year called debug50 that will allow you

  • to use what's called a built-in graphical debugger in CS50 IDE.

  • And a debugger is just a program that generally lets you run your program

  • but step by step by step, line by line by line, pausing, poking

  • around, looking at variables so that the program doesn't just blow past you

  • and quickly print something or not print something.

  • It gives you an opportunity, at human speed, to interact with it.

  • >> And to do this, you simply do the following.

  • After compiling your code, which I already did, buggy3,

  • you go ahead and run debug50 ./buggy3.

  • So much like help50 has you run help50 and then the command,

  • debug50 has you run debug50 and then the name of the command.

  • >> Now watch what happens on my screen, on the right-hand side in particular.

  • When I hit Run, all of a sudden this right-hand panel

  • opens up on the screen.

  • And there's a lot going on at first glance.

  • But there's not too much to worry about yet.

  • >> This is showing me everything that's going on inside of my program

  • right now and via these buttons up top is then

  • allowing me to step through my code ultimately step by step by step.

  • But not just yet.

  • Notice what happens.

  • At my terminal window I'm being prompted for n.

  • And I'm going to go ahead and cooperate this time and type in -1.

  • And albeit a little cryptically, -1 is a negative integer, as expected.

  • >> And then child exited with status 0. GDBserver exiting.

  • GDB, GNU Debugger, is the name of the underlying software

  • that implements this debugger.

  • But all this really means, the debugger went away because my program quit

  • and all was well.

  • If I want to truly debug my program, I have to preemptively tell debug50,

  • where do I want to start stepping through my code?

  • >> And perhaps the simplest way to do that is as follows.

  • If I hover over the gutter of my editor here,

  • so really just in the sidebar here, to the left of the line number,

  • notice that if I just click once, I put a little red dot.

  • And that little red dot, like a stop sign, means, hey,

  • debug50, pause execution of my code right there when I run this program.

  • >> So let's do that.

  • Let me go ahead and run my program again with debug50 ./buggy3, Enter.

  • And now, notice, something different has happened.

  • I'm not being prompted yet in my terminal window

  • for anything, because I haven't gotten there yet in my program.

  • Notice that on line 8 which is now highlighted,

  • and there's a little arrow at left saying, you are paused here.

  • This line of code, line 8, has not yet executed.

  • >> And what's curious, if I look over here on the right-hand side,

  • notice that i is a local variable, local in the sense

  • that it's inside the current function.

  • And its value, apparently by default, and sort of conveniently, is 0.

  • But I didn't type 0.

  • That just happens to be its default value at the moment.

  • >> So let me go ahead and do this now.

  • Let me go ahead and on the top right here, I'm

  • going to go ahead and click this first icon which

  • means step over which means don't skip it but step over this line of code,

  • executing it along the way.

  • >> And now, notice, my prompt has just changed.

  • Why is that?

  • I've told debug50, run this line of code.

  • What does this line of code do?

  • Prompts me for an int.

  • OK.

  • Let me cooperate.

  • Let me go ahead now and type -1, Enter.

  • And now notice what has changed.

  • On the right-hand side, my local variable i

  • is indicated as being -1 now.

  • And it's still of type int.

  • >> And notice, too, my so-called call stack, where did I pause?

  • We'll talk more about this in the future.

  • But the call stack just refers to what functions are currently in motion.

  • Right now it's just main.

  • And right now the only local variable is i with a value of -1.

  • >> And when I finally step over this line here, with that same icon at top right,

  • -1 is a negative integer.

  • Now it's pausing over that curly brace.

  • Let's let it do its thing.

  • I step over that line, and voila.

  • >> So not all that terribly enlightening yet,

  • but it did let me pause and think through logically

  • what this program is doing.

  • But that wasn't the erroneous case.

  • Let's do this again as follows.

  • >> I'm going to leave that breakpoint on line 8 with the red dot.

  • I'm going to rerun debug50.

  • It's automatically paused here.

  • But this time, instead of stepping over this line,

  • let me actually go inside of get_negative_int and figure out,

  • why is it accepting 0 as a valid answer?

  • >> So instead of clicking Step Over,

  • I'm going to go ahead and click Step Into.

  • And notice that the line 8 that's now highlighted now suddenly

  • becomes line 17.

  • >> Now, it's not that the debugger has skipped lines 14 and 15 and 16.

  • It's just there's nothing to show you there.

  • Those are just declaring variables, and then there's the word Do

  • and then an open curly brace.

  • The only functional line that's juicy really is this one here, 17.

  • And that's where we've paused automatically.

  • >> So printf("n is: ");, so that hasn't happened yet.

  • So let's go ahead and click Step Over.

  • Now my prompt, indeed, changed to ("n is: ").

  • Now get_int, I'm not going to bother stepping into,

  • because that function was made by CS50 in the Library.

  • It's presumably correct.

  • >> So I'm going to go ahead and sort of cooperate by giving it

  • an int, but not a negative int.

  • So let me go ahead and hit 0.

  • And now what happens here when I get down to line 21?

  • I've not iterated again.

  • I don't seem to be stuck in that loop.

  • In other words, this yellow bar did not keep going around,

  • and around, and around.

  • >> Now, why is that?

  • Well, n, what is n right now?

  • I can look at the local variables in the debugger.

  • n is 0.

  • All right, what was my condition?

  • >> 20-- line 20 is, well, 0 is greater than 0.

  • That is not true.

  • 0 is not greater than 0.

  • And so I broke out of this.

  • >> And so that's why on line 21, if I actually continue,

  • I'm going to return 0, even though I should have rejected 0

  • as not actually being negative.

  • So now, I don't really even care about the debugger.

  • Got it, I don't need to know what more is going on.

  • >> So I'm going to go ahead and just click the Play button,

  • and let this finish up.

  • Now, I've realized that my bug is apparently on line 20.

  • That's my logical error.

  • >> And so what do I want to do to change this?

  • If the problem is that I'm not catching 0, it's just a logical error.

  • And I can say while n is greater than or equal to 0,

  • keep prompting the user again and again.

  • >> So, again, simple mistake, perhaps even obvious when you saw me

  • write it just a few minutes ago.

  • But the takeaway here is that with debug50,

  • and with debugging software more generally,

  • you have this newfound power to walk through your own code, and look

  • via that right-hand panel at what your variables' values are.

  • So you don't necessarily have to use something

  • like eprintf to print those values.

  • You can actually see them visually on the screen.

  • >> Now, beyond this, it's worth noting that there's another technique that's

  • actually super common.

  • And you might wonder why this little guy here has been sitting on the stage.

  • So there's this technique, generally known as rubber duck debugging,

  • which really is just a testament to the fact

  • that often when programmers are writing code,

  • they're not necessarily collaborating with others,

  • or working in a shared environment.

  • >> They're sort of at home.

  • Maybe it's late at night.

  • They're trying to figure out some bug in their code.

  • And they're just not seeing it.

  • >> And there's no roommate.

  • There is no TF.

  • There is no CA around.

  • All they have on their shelf is this little rubber ducky.

  • >> And so rubber duck debugging is just this invitation

  • to think of something as silly as this as a real creature,

  • and actually walk through your code verbally to this inanimate object.

  • So, for instance, if this is my example here--

  • and recall that earlier the problem was this,

  • if I delete this first line of code, and I go ahead and make buggy0 again,

  • recall that I had these error messages here.

  • So the idea here, ridiculous though I feel at the moment doing this publicly,

  • is to talk through that error.

  • >> OK, so my problem is that I've implicitly declared a library function.

  • And that library function is printf.

  • Declare-- OK, declare reminds me of prototypes.

  • >> That means I need to actually tell the compiler in advance what

  • the function looks like.

  • Wait a minute.

  • I didn't include stdio.h.

  • Thank you very much.

  • >> So just this process of-- you don't need to actually have a duck.

  • But this idea of walking yourself through your own code

  • so that you even hear yourself, so that you

  • realize omissions in your own remarks, is generally the idea.

  • >> And, perhaps more logically, not so much with that one but the more involved

  • example we just did in buggy3.c, you might walk yourself through it

  • as follows.

  • So all right, rubber ducky, DDB, if you will.

  • Here, in my main function, I'm calling get_negative_int.

  • >> And I am getting the return value.

  • I'm storing it on the left hand side on line 8 in a variable called i.

  • OK, but wait, how did that get that value?

  • Let me look at the function in line 12.

  • >> In line 12, we have get_negative_int.

  • Doesn't take any inputs, does return an int, OK.

  • I declare on line 14 a variable n.

  • It's going to store an integer.

  • That's what I want.

  • >> So do the following while n is-- let me undo the fix I already made.

  • So while n is greater than 0, print out n is, OK.

  • And then call get_int and store the result in n.

  • And then check if n is 0, n is not-- there it is.

  • So, again, you don't need the actual duck.

  • But just walking yourself through your code as an intellectual exercise

  • will often help you realize what's going on,

  • as opposed to just doing something like this, staring at the screen,

  • and not talking yourself through it, which honestly is not

  • nearly as effective a technique.

  • So there you have it, a number of different techniques

  • for actually debugging your code and finding fault, all of which

  • should be tools in your toolkit so that you're not, late at night

  • especially, in the dining halls or at office hours,

  • banging your head against the wall, trying to solve some problem.

  • Realize that there are software tools.

  • There are rubber duck tools.

  • And there's a whole staff of support waiting to lend a hand.

  • >> So now, a word on the problem sets, and on what we're hoping you

  • get out of them, and how we go about evaluating.

  • Per the course's syllabus, CS50's problem sets

  • are evaluated on four primary axes, so to speak-- scope, correctness, design,

  • and style.

  • And scope just refers to how much of the pset have you bitten off?

  • How much of a problem have you tried?

  • What level of effort have you manifested?

  • >> Correctness is, does the program work as it's supposed to per the CS50 specification?

  • When you provide certain inputs, are certain outputs coming back?

  • Design is the most subjective of them.

  • And it's the one that will take the longest to learn

  • and the longest to teach, in so far as it boils down to,

  • how well written is your code?

  • >> It's one thing to just print the correct outputs or return the right values.

  • But are you doing it as efficiently as possible?

  • Are you using divide and conquer, or binary

  • search, as we did two weeks ago with the phone book and as we'll soon see again?

  • Are there better ways to solve the problem than you currently have here?

  • That's an opportunity for better design.

  • >> And then style-- how pretty is your code?

  • You'll notice that I'm pretty particular about indenting my code,

  • and making sure my variables are reasonably named. n,

  • while short, is a good name for a number, i for a counting integer,

  • s for a string.

  • And we can have longer variable names too.

  • Style is just how good does your code look?

  • And how readable is it?

  • >> And over time, what your TAs and TFs will do in the course

  • is provide you with that kind of qualitative feedback

  • so that you get better at those various aspects.

  • And in terms of how we evaluate each of these axes,

  • it's typically with very few buckets so that you, generally,

  • get a sense of how well you're doing.

  • And, indeed, if you receive a score on any of those axes-- correctness, design

  • and style especially-- that number will generally be between 1 and 5.

  • And, literally, if you're getting 3's at the start of the semester,

  • this is a very good thing.

  • It means there's still room for improvement,

  • which you would hope for in taking a class for the first time.

  • There's hopefully some bit of ceiling to which you're aspiring.

  • And so getting 3's on the earliest psets,

  • if not some 2's and 4's, is, indeed, a good thing.

  • It's well within range, well within expectations.

  • >> And if your mind is racing, wait a minute, three out of five.

  • That's really a 6 out of 10.

  • That's 60%.

  • My God, that's an F.

  • >> It's not.

  • It's not, in fact, that.

  • Rather, it's an opportunity to improve over the course of the semester.

  • And if you're getting some poors, these are an opportunity

  • to take advantage of office hours, certainly sections and other resources.

  • >> Best is an opportunity, really, to be proud of just how far you've

  • come over the course of the semester.

  • So do realize, if nothing else, three is good.

  • And it allows room for growth over time.

  • >> As to how those axes are weighted, realistically you're

  • going to spend most of your time getting things to work, let alone correctly.

  • And so correctness tends to be weighted the most, as with

  • this multiplicative factor of three.

  • Design is also important, but something that you don't necessarily

  • spend all of those hours on trying to get things just to work.

  • >> And so it's weighted a little more lightly.

  • And then style is weighted the least.

  • Even though it's no less important fundamentally,

  • it's just, perhaps, the easiest thing to do right,

  • mimicking the examples we do in lecture and section,

  • with things nicely indented, and commented,

  • and so forth is among the easiest things to do and get right.

  • So as such, realize that those are points

  • that are relatively easy to grasp.

  • >> And now a word on this-- academic honesty.

  • So per the course's syllabus, you will see

  • that the course has quite a bit of language around this.

  • And the course takes the issue of academic honesty quite seriously.

  • >> We have the distinction, for better or for worse,

  • of having sent each year more students for disciplinary action

  • than most any other course, that I am aware of.

  • This is not necessarily indicative of the fact

  • that CS students, or CS50 students, are any less honest than your classmates.

  • But the reality is that in this world, electronically, we just

  • have technological means of detecting this.

  • >> It is important to us for fairness across the class

  • that we do detect this, and raise the issue when we see things.

  • And just to paint a picture, and really to help something like this sink in,

  • these are the numbers of students over the past 10 years

  • that have been involved in some such issues of academic honesty,

  • with some 32 students from fall 2015, which

  • is to say that we do take the matter very seriously.

  • And, ultimately, these numbers compose, most recently, about 3%, 4% or so

  • of the class.

  • >> So for the supermajority of students it seems that the lines are clear.

  • But do keep this in mind, particularly late

  • at night when struggling with some solution to a problem set,

  • that there are mechanisms for getting yourself better

  • support than you might think, even at that hour.

  • Realize that when we receive student submissions, we

  • cross-compare every submission this year against every submission last year,

  • against every submission from 2007, and since, looking at, as well,

  • code repositories online, discussion forums, job sites.

  • And we mention this, really, all for the sake

  • of full disclosure, that if someone else can find it online,

  • certainly, so can we the course.

  • But, really, the spirit of the course boils down

  • to this clause in the syllabus.

  • It really is just, be reasonable.

  • >> And if we had to elaborate on that with just a bit more language,

  • realize that the essence of all work that you submit to this course

  • must be your own.

  • But within that, there are certainly opportunities, and encouragement,

  • and pedagogical value in turning to others-- myself, the TFs, the CAs,

  • the TAs, and others in the class, for support, let alone friends

  • and roommates who have studied CS and programming before.

  • And so there is an allowance for that.

  • And the general rule of thumb is this-- when asking for help,

  • you may show your code to others, but you may not view theirs.

  • So even if you're at office hours, or in the D hall, or somewhere else

  • working on some pset, working alongside a friend, which

  • is totally fine, at the end of the day your work

  • should ultimately belong to each of you respectively, and not

  • be some collaborative effort, except for the final project where

  • it's allowed and encouraged.

  • >> Realize that if you are struggling with something

  • and your friend just happens to be better at this than you,

  • or better at that problem than you, or a little farther ahead than you,

  • it's totally reasonable to turn to your friend and say, hey,

  • do you mind looking at my code here, helping me spot what my issue is?

  • And, hopefully, in the interest of pedagogical value

  • that friend doesn't just say, oh, do this, but rather,

  • what are you missing on line 6, or something like that?

  • But the solution is not for the friend next to you

  • to say, oh, well, here, let me pull this up, and show my solution to you.

  • So that is the line.

  • You show your code to others, but you may not

  • view theirs, subject to the other constraints in the course's syllabus.

  • >> So do keep in mind this so-called regret clause

  • in the course's syllabus as well, that if you commit some act that

  • is not reasonable, but bring it to the attention of the course's heads

  • within 72 hours, the course may impose local sanctions that

  • may include an unsatisfactory or failing grade for the work submitted.

  • But the course will not refer the matter for further disciplinary action,

  • except in cases of repeated acts.

  • In other words, if you do make some stupid, especially late night, decision

  • that the next morning or two days later, you wake up and realize,

  • what was I thinking?

  • You do in CS50 have an outlet for fixing that problem

  • and owning up to it, so that we will meet you halfway and deal

  • with it in a manner that is both educational and valuable for you,

  • but still punitive in some way.

  • And now, to take the edge off, this.

  • >> [VIDEO PLAYBACK]

  • >> [MUSIC PLAYING]

  • >> [END PLAYBACK]

  • DAVID J. MALAN: All right, we are back.

  • And now we look at one of the first of our real world domains

  • in CS50, the art of cryptography, the art of sending and receiving

  • secret messages, encrypted messages if you will,

  • that can only be deciphered if you have some key ingredient that the sender has

  • as well.

  • So to motivate this we'll take a look at this thing here,

  • which is an example of a secret decoder ring that

  • can be used in order to figure out what a secret message actually is.

  • In fact, back in the day in grade school,

  • if you ever sent secret messages to some friend or some crush in class,

  • you might have thought you were being clever

  • by on your piece of paper changing, like, A to B, and B to C, and C to D,

  • and so forth.

  • But you were actually encrypting your information, even

  • if it was a little trivial. It wasn't that hard for the teacher to realize,

  • well, if you just change B to A and C to B,

  • you actually figure out what the message was.

  • But you were enciphering information.

  • >> You were just doing it simply, much like Ralphie here

  • in a famous movie that plays pretty much ad nauseam each winter.

  • [VIDEO PLAYBACK]

  • -Be it known to all that Ralph Parker is hereby

  • appointed a member of the Little Orphan Annie Secret Circle

  • and is entitled to all the honors and benefits occurring thereto.

  • >> -Signed, Little Orphan Annie, counter-signed Pierre Andre, in ink.

  • Honors and benefits, already at the age of nine.

  • >> [SHOUTING]

  • -Come on.

  • Let's get on with it.

  • I don't need all that jazz about smugglers and pirates.

  • >> -Listen tomorrow night for the concluding adventure

  • of the black pirate ship.

  • Now, it's time for Annie's secret message

  • for you members of the Secret Circle.

  • Remember, kids, only members of Annie's Secret Circle

  • can decode Annie's secret message.

  • >> Remember, Annie is depending on you.

  • Set your pins to B2.

  • Here is the message.

  • 12, 11--

  • >> -I am in, my first secret meeting.

  • >> -14, 11, 18, 16.

  • >> -Pierre was in great voice tonight.

  • I could tell that tonight's message was really important.

  • >> -3, 25, that's a message from Annie herself.

  • Remember, don't tell anyone.

  • >> -90 seconds later, I'm in the only room in the house where a boy of nine

  • could sit in privacy and decode.

  • Aha, B!

  • I went to the next, E.

  • >> The first word is be.

  • S, it was coming easier now, U, 25--

  • >> -Oh, come on, Ralphie, I gotta go!

  • >> -I'll be right down, Ma!

  • Gee whiz!

  • >> -T, O, be sure to-- be sure to what?

  • What was Little Orphan Annie trying to say?

  • Be sure to what?

  • >> -Ralphie, Randy has got to go, will you please come out?

  • >> -All right, Ma!

  • I'll be right out!

  • >> -I was getting closer now.

  • The tension was terrible.

  • What was it?

  • The fate of the planet may hang in the balance.

  • >> -Ralphie!

  • Randy's gotta go!

  • >> -I'll be right out, for crying out loud!

  • >> -Almost there, my fingers flew, my mind was a steel trap, every pore vibrated.

  • It was almost clear, yes, yes, yes.

  • >> -Be sure to drink your Ovaltine.

  • Ovaltine?

  • A crummy commercial?

  • Son of a bitch.

  • [END PLAYBACK]

  • DAVID J. MALAN: OK, so that was a very long way

  • of introducing cryptography, and also Ovaltine.

  • In fact, from this old advert here, why is Ovaltine so good?

  • It is a concentrated extraction of ripe barley malt, pure creamy cow's milk,

  • and specially prepared cocoa, together with natural phosphatides and vitamins.

  • It is further fortified with additional vitamins B and D, yum.

  • And you can still get it, apparently, on Amazon, as we did here.

  • >> But the motivation here was to introduce cryptography, specifically

  • a type of cryptography known as secret key cryptography.

  • And as the name suggests, the whole security of a secret key crypto system,

  • if you will, a methodology for just scrambling

  • information between two people, is that only the sender and only the recipient

  • know a secret key-- some value, some secret phrase, some secret number, that

  • allows them to both encrypt and decrypt information.

  • And cryptography, really, is just this from week 0.

  • >> It's a problem where there's inputs, like the actual message in English

  • or whatever language that you want to send to someone in class,

  • or across the internet.

  • There is some output, which is going to be the scrambled message that you

  • want the recipient to receive.

  • And even if someone in the middle receives it too,

  • you don't want them to necessarily be able to decrypt it,

  • because inside of this black box, or algorithm,

  • is some mechanism, some step by step instructions, for taking that input

  • and converting it into the output, in hopefully a secure way.

  • >> And, in fact, there is some vocabulary in this world as follows.

  • Plaintext is the word a computer scientist would

  • use to describe the input message, like the English

  • or whatever language you actually want to send to some other human.

  • And then the ciphertext is the scrambled, enciphered, or encrypted

  • version thereof.

  • >> But there's one other ingredient here.

  • There's one other input to secret key cryptography.

  • And that is the key itself, which is, generally,

  • as we'll see, a number, or letter, or word, whatever

  • the algorithm actually expects.

  • >> And how do you decrypt information?

  • How do you unscramble it?

  • Well, you just reverse the outputs and the inputs.

  • >> In other words, once someone receives your encrypted message,

  • he or she simply has to know that same key.

  • They have received the ciphertext.

  • And by plugging those two inputs into the crypto system,

  • the algorithm, this black box, out should come the original plaintext.

  • And so that's the very high level view of what cryptography is actually

  • all about.
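
  • To make that black box concrete, here is a rough sketch of the grade-school scheme mentioned earlier (A to B, B to C, and so on), with the size of the shift as the secret key. The names shift_char, encrypt, and decrypt are hypothetical, the key is assumed to be between 0 and 25, and a cipher this simple is for illustration only, not for real secrecy.

```c
#include <ctype.h>
#include <stddef.h>

// Shift one letter `key` places through the alphabet, wrapping from Z back to A.
// Non-letters are returned unchanged. Assumes 0 <= key <= 25.
char shift_char(char c, int key)
{
    if (isupper((unsigned char) c))
    {
        return (char) ('A' + (c - 'A' + key) % 26);
    }
    if (islower((unsigned char) c))
    {
        return (char) ('a' + (c - 'a' + key) % 26);
    }
    return c;
}

// Inputs: plaintext and key. Output: ciphertext, which must have room for
// as many characters as plaintext plus the terminating '\0'.
void encrypt(const char *plaintext, int key, char *ciphertext)
{
    size_t i = 0;
    for (; plaintext[i] != '\0'; i++)
    {
        ciphertext[i] = shift_char(plaintext[i], key);
    }
    ciphertext[i] = '\0';
}

// Inputs: ciphertext and the same key. Output: the original plaintext.
// Shifting forward by 26 - key undoes a forward shift by key.
void decrypt(const char *ciphertext, int key, char *plaintext)
{
    encrypt(ciphertext, (26 - key) % 26, plaintext);
}
```

  • With a key of 1, encrypting Zamyla gives Abnzmb, and decrypting Abnzmb with that same key gives back Zamyla.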

  • >> So let's get there.

  • Let's now look underneath the hood of something

  • we've been taking for granted for the past week, and for this session

  • here-- the string.

  • A string at the end of the day is just a sequence of characters.

  • >> It might be hello world, or hello Zamyla, or whatever.

  • But what does that mean to be a sequence of characters?

  • In fact, the CS50 library gives us a data type called string.

  • >> But there is actually no such thing as a string in C.

  • It really is just a sequence of character, character, character,

  • character, back, to back, to back, to back, to back inside

  • of your computer's memory, or RAM.

  • And we'll look deeper into that in the future when we look at memory itself,

  • and the utilization, and the threats that are involved.

  • >> But let's consider the string Zamyla.

  • So just the name of the human here, Zamyla,

  • that is a sequence of characters, Z-A-M-Y-L-A.

  • And now let's suppose that Zamyla's name is being stored inside of a computer

  • program.

  • >> Well, it stands to reason that we should be able to look at those characters

  • individually.

  • So I'm just going to draw a little box around Zamyla's name here.

  • And it is the case in C that when you have a string, like Zamyla-- and maybe

  • that string has come back from a function like get_string--

  • you can actually manipulate it character by character.

  • >> Now, this is germane for the conversation at hand, because

  • in cryptography if you want to change A to B, and B to C, and C to D,

  • and so forth, you need to be able to look at the individual characters

  • in a string.

  • You need to be able to change the Z to something else, the A

  • to something else, the M to something else, and so on.

  • And so we need a way, programmatically, so

  • to speak, in C to be able to change and look at individual letters.

  • And we can do this as follows.

  • >> Let me go ahead back into CS50 IDE.

  • And let me go ahead and create a new file

  • that I'll call this time string0.c, as our first such example.

  • And I'm going to go ahead and whip it up as follows.

  • >> So include cs50.h, and then include stdio.h,

  • which I'm almost always going to be using in my programs, at least

  • initially.

  • int main(void), and then in here I'm going to do string s gets get_string().

  • And then I'm going to go ahead and do this.

  • I want to go ahead and, as a sanity check,

  • just say, hello, %s, semicolon. make string0.

  • Uh oh, what did I do here?

  • Oh, I didn't plug it in.

  • So lesson learned, that was not intentional.

  • >> So error, more percent conversions than data arguments.

  • And this is where, in line 7-- OK, so I have,

  • quote unquote, that's my string to printf.

  • I've got a percent sign.

  • But I'm missing the second argument.

  • >> I'm missing the comma s, which I did have in previous examples.

  • So a good opportunity to fix one more mistake, accidentally.

  • And now let me run string0, type in Zamyla.

  • OK, hello Zamyla.

  • >> So we've run this kind of program a few different times now.

  • But let's do something a little different this time.

  • Instead of just printing Zamyla's whole name out with printf,

  • let's do it character by character.

  • >> I'm going to use a for loop.

  • And I'm going to give myself a counting variable, called i.

  • And I'm going to keep iterating, so long as i is less than the length of s.

  • >> It turns out, we didn't do this last time,

  • that C comes with a function called strlen.

  • Back in the day, and in general still when implementing functions,

  • humans will often choose very succinct names that kind of sound

  • like what you want, even though it's missing a few vowels or letters.

  • So strlen is the name of a function that

  • takes an argument between parentheses that should be a string.

  • And it just returns an integer, the length of that string.

  • >> So this for loop on line 7 is going to start counting at i equals 0.

  • It's going to increment i on each iteration

  • by 1, as we've been doing a few times.

  • But it's going to only do this up until the point

  • when i is the length of the string itself.

  • >> So this is a way of, ultimately, iterating over the characters

  • in the string, as follows.

  • I'm going to print out not a whole string, but %c,

  • a single character followed by a new line.

  • And then I'm going to go ahead, and I need

  • to say I want to print the ith character of s.

  • >> So if i is the variable that indicates the index of the string, where

  • you are in it, I need to be able to say, give me the ith character of s.

  • And C has a way of doing this with square brackets.

  • You simply say the name of the string, which in this case is s.

  • Then you use square brackets, which are usually just above your Return or Enter

  • key on the keyboard.

  • And then you put the index of the character that you want to print.

  • So the index is going to be a number-- 0, or 1, or 2, or 3, or dot,

  • dot, dot, some other number.

  • >> And we ensure that it's going to be the right number, because I

  • start counting at 0.

  • And the first character in a string is, by convention, bracket 0.

  • And the second character is bracket 1.

  • And the third character is bracket 2.

  • And you don't want to go too far, but we won't because we're

  • going to only increment i until it equals the length of the string.

  • At which point, this for loop will stop.

  • >> So let me go ahead and save this program, and run make string0.

  • But I screwed up.

  • Implicitly declaring library function strlen with type such and such-- now,

  • this sounds familiar.

  • But it's not printf.

  • And it's not get string.

  • >> I didn't screw up in the same way this time.

  • But notice down here, a little further down: include the header string.h,

  • or explicitly provide a declaration for strlen.

  • So there is actually a clue in there.

  • >> And indeed it turns out there's another header file

  • that we've not used in class yet, but it's

  • among those available to you, called string.h.

  • And in that file, string.h, is strlen declared.

  • So let me go ahead and save this, make string0--

  • nice, no error messages this time.

  • >> ./string0 Zamyla, and I'm about to hit Enter,

  • at which point get_string is going to return the string, put it in s.

  • Then that for loop is going to iterate over s's characters one at a time,

  • and print them one per line, because I had that backslash n at the end.

  • So I could omit that backslash n, and then just print Zamyla all

  • in the same line, effectively reimplementing

  • printf, which isn't all that useful.

  • But in this case, I've not done that.

  • I've actually printed one character at a time, one per line,

  • so that we actually see the effect.
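
  • The loop just described can be sketched as below. This is an approximation rather than the lecture's file: the string arrives as a parameter instead of from get_string, so the sketch needs no CS50 library, and the function name print_chars and its return value are additions for illustration.

```c
#include <stdio.h>
#include <string.h>

// Print each character of s on its own line, as string0 does.
// Returns how many characters were printed (an addition for illustration).
int print_chars(const char *s)
{
    int printed = 0;
    // strlen returns an unsigned size_t; the cast keeps the comparison int-to-int
    for (int i = 0; i < (int) strlen(s); i++)
    {
        printf("%c\n", s[i]); // s[i] is the ith character, counting from 0
        printed++;
    }
    return printed;
}
```

  • Calling print_chars("Zamyla") prints Z, a, m, y, l, a on six separate lines. Dropping the \n from the format string would put them all on one line.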

  • >> But I should note one thing here.

  • And we'll come back to this in a future week.

  • It turns out that this code is potentially buggy.

  • >> It turns out that get_string and some other functions in life

  • don't necessarily always return what you're expecting.

  • We know from class last time that get_string

  • is supposed to return a string.

  • But what if the user types out such a long word, or paragraph, or essay

  • that there's just not enough memory in the computer to fit it?

  • >> Like, what if something goes wrong underneath the hood?

  • It might not happen often, but it could happen once

  • in a while, very infrequently.

  • And so it turns out that get_string and functions like it don't necessarily

  • always return strings.

  • They might return some error value, some sentinel value so to speak,

  • that indicates that something has gone wrong.

  • And you would only know this from having learned it in class now,

  • or having read some more documentation.

  • It turns out that get_string can return a value called NULL.

  • NULL is a special value that we'll come back to in a future week.

  • But for now, just know that if I want to be really proper in moving forward

  • using get_string, I shouldn't just call it,

  • and blindly use its return value, trusting that it's a string.

  • >> I should first say, hey, wait a minute, only

  • proceed if s does not equal NULL, where NULL, again,

  • is just some special value.

  • And it's the only special value you need to worry about for get_string.

  • get_string is either going to return a string or NULL.

  • >> And as for this exclamation point and equals sign, you might know from math class

  • that you'd draw an equal sign with a line through it to indicate not equal.

  • That's not generally a character you can type on your keyboard.

  • And so in most programming languages, when you want to say not equal,

  • you use an exclamation point, otherwise known as bang.

  • So you say bang equals, which means not equals, logically.

  • It's just like there's no greater than or equal to, or less than

  • or equal to, key on your keyboard that does it all in one symbol.

  • So that's why, in past examples, you typed an angle bracket, and then

  • an equal sign, in order to say greater than or equal to or, say, less than or equal to.
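
  • A small sketch of that check, with assumptions flagged: the string is passed in as a parameter rather than obtained from get_string, and the name print_chars_checked and its -1 return value are hypothetical. What it is meant to show is the s != NULL guard wrapped around the loop.

```c
#include <stdio.h>
#include <string.h>

// Print s one character per line, but only proceed if s is not NULL.
// Returns the number of characters printed, or -1 if s was NULL
// (the return value is an addition for illustration).
int print_chars_checked(const char *s)
{
    if (s != NULL) // != reads as "bang equals", that is, not equal
    {
        int printed = 0;
        for (int i = 0; i < (int) strlen(s); i++)
        {
            printf("%c\n", s[i]);
            printed++;
        }
        return printed;
    }
    return -1; // nothing safe to print
}
```

  • Without the guard, passing NULL to strlen would be an error, which is why the check comes before the loop rather than inside it.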

  • >> So what's the takeaway here?

  • This is simply a way now of introducing this syntax, this feature,

  • iterating over individual characters in a string.

  • And just like those square brackets allow you to get at them,

  • consider those square brackets as kind of hinting at this underlying

  • design, whereby every character inside of a string

  • is kind of boxed in somewhere underneath the hood in your computer's memory.

  • >> But let's make a variant of this.

  • It turns out that this program is correct.

  • So per CS50's axes for evaluating code, this is correct now.

  • Especially now that I'm checking for NULL, this program should never crash.

  • And I just know that from experience.

  • There's nothing else that can really go wrong here.

  • But it's not very well-designed, so let's go back to basics.

  • >> First principles-- what does a for loop do?

  • A for loop does three things.

  • It initializes some value, if you ask it to.

  • It checks a condition.

  • And then after each iteration, after each cycle,

  • it increments some value, or values, here.

  • >> So what does that mean?

  • We initialize i to 0.

  • We check and make sure i is less than the length of s, which is Z-A-M-Y-L-A,

  • so 6.

  • And, indeed, 0 is less than 6.

  • >> We print out Z from Zamyla's name.

  • Then we increment i from 0 to 1.

  • We then check, is 1 less than the length of s?

  • The length of s is 6.

  • Yes, it is.

  • >> So we print A in Zamyla's name, ZA.

  • We increment i from 0, to 1, to 2.

  • We then check, is 2 less than the length of Zamyla's name?

  • It's 6, so 2 is less than 6.

  • Yes, let's print out now M in Zamyla's name, the third character.

  • >> The key here is that on each iteration of the story, I'm checking,

  • is i less than the length of Zamyla?

  • But the catch is that strlen is not a property.

  • Those of you who have programmed before in Java or other languages

  • might know the length of a string is a property, just some read only value.

  • >> In C, in this case, it is a function that is literally

  • counting the number of characters in Zamyla every time

  • we call that function.

  • Every time you ask the computer to use strlen, it's taking a look at Zamyla,

  • and saying Z-A-M-Y-L-A, 6.

  • And it returns 6.

  • The next time you call it inside that for loop,

  • it's going to look at Zamyla again, say Z-A-M-Y-L-A, 6.

  • And it's going to return 6.

  • So what's stupid about this design?

  • >> Why is my code not a 5 out of 5 for design right now, so to speak?

  • Well, I'm asking a question unnecessarily.

  • I'm doing more work than I need to.

  • >> So even though the answer is correct, I am

  • asking the computer, what is the length of Zamyla again,

  • and again, and again, and again?

  • And that answer is never going to change.

  • It's always going to be 6.

  • >> So a better solution than this would be this next version.

  • Let me go ahead and put it in a separate file called string1.c,

  • just to keep it separate.

  • And it turns out in a for loop, you can actually

  • declare multiple variables at once.

  • >> So I'm going to keep i and set it to 0.

  • But I'm also going to add a comma, and say,

  • give me a variable called n, whose value equals the string length of s.

  • And now, please make my condition so long as i is less than n.

  • >> So in this way, the logic is identical at the end of the day.

  • But I am remembering the value 6, in this case.

  • What is the length of Zamyla's name?

  • And I'm putting it in n.

  • >> And I'm still checking the condition every time.

  • Is 0 less than 6?

  • Is 1 less than 6?

  • Is 2 less than 6, and so forth?

  • >> But I'm not asking the computer again, and again, what's

  • the length of Zamyla's name?

  • What's the length of Zamyla's name?

  • What's the length of Zamyla's name?

  • I'm literally remembering that first and only answer in this second variable n.

  • So this now would be not only correct, but also well-designed.
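
  • A minimal sketch of that string1.c loop, assuming plain C without the CS50 library. The helper name print_chars is hypothetical, and it takes the string as a parameter instead of calling get_string, so it is a variation on the program rather than the exact file from lecture.

```c
#include <stdio.h>
#include <string.h>

// Prints each character of s on its own line and returns how many were printed.
// print_chars is a hypothetical name used only for this sketch.
int print_chars(const char *s)
{
    int printed = 0;

    // strlen is called once, in the initializer; its answer is remembered in n,
    // and every later check compares i against n instead of recounting
    for (int i = 0, n = strlen(s); i < n; i++)
    {
        printf("%c\n", s[i]);
        printed++;
    }
    return printed;
}
```

  • Calling print_chars("Zamyla") asks for the length exactly once, no matter how many times the condition is checked.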

  • >> Now, what about style?

  • I've named my variables pretty well, I would say.

  • They're super succinct right now.

  • And that's totally fine.

  • >> If you only have one string in a program,

  • you might as well call it s for string.

  • If you only have one variable for counting in a program,

  • you might as well call it i.

  • If you have a length, n is super common as well.

  • But I haven't commented any of my code.

  • >> I've not informed the reader-- whether that's my TF, or TA,

  • or just colleague-- what is supposed to be going on in this program.

  • And so to get good style, what I would want to do

  • is this-- something like ask user for input.

  • And I could rewrite this any number of ways.

  • >> Make sure s-- make sure get_string returned a string.

  • And then in here-- and this is perhaps the most important comment-- iterate

  • over the characters in s one at a time.

  • And I could use any choice of English language

  • here to describe each of these chunks of code.

  • >> Notice that I haven't put a comment on every line of code,

  • really just on the interesting ones, the ones that

  • have some meaning that I might want to make super clear to someone

  • reading my code.

  • And why comment the call to get_string with "ask user for input"?

  • Even that one is not necessarily all that descriptive.

  • But it helps tell a story, because the second line in the story is, make sure

  • get_string returned a string.

  • >> And the third line in the story is, iterate over the characters in s one

  • at a time.

  • And now just for good measure, I'm going to go ahead and add

  • one more comment that just says print i-th character in s.

  • Now, what have I done at the end of the day?

  • >> I have added some English words in the form of comments.

  • The slash slash symbol means, hey, computer this is for the human,

  • not for you, the computer.

  • So they're ignored logically.

  • They're just there.

  • >> And, indeed, CS50 IDE shows them as gray, as being useful, but not key

  • to the program.

  • Notice what you can now do.

  • Whether you know C programming or not, you

  • can just stand back at this program, and skim the comments.

  • Ask user for input, make sure get_string returned a string,

  • iterate over the characters in s one at a time, print the

  • i-th character in s-- you don't even have to look at the code

  • to understand what this program does.

  • And, better yet, if you yourself look at this program in a week or two,

  • or a month, or a year, you too don't have

  • to stare at the code, trying to remember,

  • what was I trying to do with this code?

  • >> You've told yourself.

  • You've described it for yourself, or some colleague, or TA, or TF.

  • And so this would now be correct, and good design,

  • and ultimately good style as well.

  • So do keep that in mind.

  • >> So there's one other thing I'm going to do here

  • that can now reveal exactly what's going on underneath the hood.

  • So there's this feature in C, and other languages,

  • called typecasting that either implicitly

  • or explicitly allows you to convert from one data type to another.

  • We've been dealing so far today with strings.

  • >> And strings are characters.

  • But recall from week 0, what are characters?

  • Characters are just an abstraction on top of numbers-- decimal numbers,

  • and decimal numbers are really just an abstraction on top of binary numbers,

  • as we defined it.

  • >> So characters are numbers.

  • And numbers are characters, just depending on the context.

  • And it turns out that inside of a computer program,

  • you can specify how you want to look at the bits inside of that program.

  • >> Recall from week 0 that we had ASCII, which is just this code

  • mapping letters to numbers.

  • And we said, capital A is 65.

  • Capital B is 66, and so forth.

  • >> And notice, we essentially have chars on the top row here, as C would call them,

  • characters, and then ints on the second row.

  • And it turns out you can convert seamlessly between the two, typically.

  • And if we want to do this deliberately, we

  • might want to tackle something like this.

  • >> We might want to convert upper case to lower

  • case, or lower case to upper case.

  • And it turns out there's actually a pattern here

  • we can embrace in just a moment.

  • But let's look first at an example of doing this explicitly.

  • >> I'm going to go back into CS50 IDE.

  • I'm going to create a file called ascii0.c.

  • And I'm going to go ahead and add my stdio.h at the top, int main(void)

  • at the top of my function.

  • And then I'm just going to do the following-- a for loop from i equals,

  • let's say, 65.

  • >> And then i is going to be less than 65, plus 26 letters in the alphabet.

  • So I'll let the computer do the math for me there.

  • And then inside this loop, what am I going to print?

  • >> %c is %i\n.

  • And now I want to plug in two values.

  • I've temporarily put question marks there to invite the question.

  • >> I want to iterate from 65 onward for 26 letters of the alphabet,

  • printing out on each iteration that character's integral equivalent.

  • In other words, I want to iterate over 26 numbers printing

  • what the ASCII character is, the letter, and what the corresponding number is--

  • really just recreating the chart from that slide.

  • So what should these question marks be?

  • >> Well, it turns out that the second one should just be the variable i.

  • I want to see that as a number.

  • And the middle argument here, I can tell the computer

  • to treat that integer i as a character, so as

  • to substitute it here for %c.

  • >> In other words, I, the human programmer, know

  • these are just numbers at the end of the day.

  • And I know that 65 should map to some character.

  • With this explicit cast, with a parenthesis,

  • the name of the data type you want to convert to, and a closed parenthesis,

  • you can tell the computer, hey, computer,

  • convert this integer to a char.

  • >> So when I run this program after compiling,

  • let's see what I get-- make ascii0.

  • Darn it, what did I do wrong here?

  • Use of undeclared identifier, all right, not intentional,

  • but let's see if we can't reason through this.

  • >> So line five-- so I didn't get very far before screwing up.

  • That's OK.

  • So line 5 for i equals 65-- I see.

  • So remember that in C, unlike some languages if you have prior programming

  • experience, you have to tell the computer,

  • unlike Scratch, what type of variable it is.

  • >> And I forgot a key phrase here.

  • In line five, I've started using i.

  • But I haven't told C what data type it is.

  • So I'm going to go in here and say, ah, make it an integer.

  • >> Now I'm going to go ahead and recompile.

  • That fixed that.

  • ./ascii0 Enter, that's kind of cool.

  • Not only is it super fast to ask the computer this question,

  • rather than looking it up on a slide, it printed out one per line, A is 65,

  • B is 66, all the way down-- since I did this 26 times-- to the letter Z,

  • which is 90.

  • And, in fact, slightly more intelligent would

  • have been for me not to rely on the computer to add 26.

  • I could have just done 90 as well, so long

  • as I don't make the same mistake twice.

  • I want to go up through Z, not just up through Y.

  • >> So that's an explicit cast.
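
  • As a sketch of ascii0.c after the int fix, here is the same loop wrapped in a function. The name print_uppercase_codes and the returned line count are assumptions added for illustration; in lecture the loop sits directly inside main. It also assumes the machine uses ASCII.

```c
#include <stdio.h>

// Prints "A is 65" through "Z is 90", one per line, and returns the line count.
// print_uppercase_codes is a hypothetical name; the lecture puts this loop in main.
int print_uppercase_codes(void)
{
    int lines = 0;

    // 65 + 26 lets the computer do the math: i runs from 65 up to and including 90
    for (int i = 65; i < 65 + 26; i++)
    {
        // (char) i is the explicit cast: treat this integer as a character for %c
        printf("%c is %i\n", (char) i, i);
        lines++;
    }
    return lines;
}
```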

  • It turns out that this isn't even necessary.

  • Let me go ahead and recompile, and rerun ascii0.

  • It turns out that C is pretty smart.

  • >> And printf, in particular, is pretty smart.

  • If you just pass an i twice for both placeholders, printf

  • will realize, oh, well I know you gave me an integer-- some number,

  • like 65, or 90, or whatever.

  • But I see that you want me to format that number like a character.

  • And so printf can implicitly cast the int to a char for you as well.

  • So that's not a problem at all.

  • >> But notice, because of this equivalence we can actually do this as well.

  • Let me go ahead and make one other version of this-- ascii1.c.

  • And instead of iterating over integers, I can really blow your mind

  • by iterating over characters.

  • For char c gets capital A, I want to go ahead and do this,

  • so long as c is less than or equal to capital Z. And on each iteration

  • I want to increment c. I can now in my printf line here

  • say, %c is %i again, comma c.

  • >> And now, I can go the other direction, casting the character explicitly

  • to an integer.

  • So, again, why would you do this?

  • It's a little weird to sort of count in terms of characters.

  • >> But if you understand what's going on underneath the hood,

  • there's really no magic.

  • You're just saying, hey, computer give me a variable called c of type char.

  • Initialize it to capital A. And notice single quotes matter.

  • >> For characters in C, recall from last week, you use single quotes.

  • For strings, for words, phrases, you use double quotes.

  • OK, computer, keep doing this, so long as the character is less than

  • or equal to Z.

  • And I know from my ASCII table that all of these ASCII codes are contiguous.

  • >> There's no gaps.

  • So it's just A through Z, separated by one number each.

  • And then I can increment a char, if I really want.

  • At the end of the day, it's just a number.

  • I know this.

  • So I can just presume to add 1 to it.

  • >> And then this time, I print c, and then the integral equivalent.

  • And I don't even need the explicit cast.

  • I can let printf and the computer figure things out,

  • so that now if I run make ascii1, ./ascii1,

  • I get the exact same thing as well.
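
  • A sketch of the ascii1.c idea, again assuming ASCII. The function name print_by_char and its return value are hypothetical additions; the loop itself follows what was just described, with a char as the counter and no explicit cast.

```c
#include <stdio.h>

// Prints the same table as before, but counts with a char instead of an int.
// print_by_char is a hypothetical name used only for this sketch.
int print_by_char(void)
{
    int lines = 0;

    // Single quotes make 'A' and 'Z' chars; c++ just adds 1 to the underlying number
    for (char c = 'A'; c <= 'Z'; c++)
    {
        // The same variable fills both placeholders: %c shows the letter, %i the number
        printf("%c is %i\n", c, c);
        lines++;
    }
    return lines;
}
```

  • Writing (int) c for the second argument would make the conversion explicit, but it isn't required here.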

  • >> Useless program, though-- no one is going to actually write software

  • in order to figure out, what was the number that maps to A, or B, or Z?

  • You're just going to Google it, or look it up online, or look it up

  • on a slide, or the like.

  • So where does this actually get useful?

  • >> Well, speaking of that slide, notice there's

  • an actual pattern here between uppercase and lowercase that was not accidental.

  • Notice that capital A is 65.

  • Lowercase a is 97.

  • And how far away is lower case a?

  • >> So 65 is how many steps away from 97?

  • So 97 minus 65 is 32.

  • So capital A is 65.

  • If you add 32 to that, you get lowercase a.

  • And, equivalently, if you subtract 32, you get back to capital A-- same with B

  • to little b, big C to little c.

  • >> All of these gaps are 32 apart.
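
  • One way to check that claim, as a small sketch that assumes ASCII. The function name gaps_are_all_32 is made up for this example.

```c
#include <stdbool.h>

// Returns true if every lowercase letter is exactly 32 above its uppercase partner.
// gaps_are_all_32 is a hypothetical name used only for this sketch.
bool gaps_are_all_32(void)
{
    const char *upper = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
    const char *lower = "abcdefghijklmnopqrstuvwxyz";

    // Compare the letters pairwise: a with A, b with B, and so on
    for (int i = 0; i < 26; i++)
    {
        if (lower[i] - upper[i] != 32)
        {
            return false;
        }
    }
    return true;
}
```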

  • Now, this would seem to allow us to do something like a Microsoft Word

  • or Google Docs feature, where you can select everything and then say,

  • change all to lowercase, or change all to upper case,

  • or change only the first word of a sentence to upper case.

  • We can actually do something like that ourselves.

  • >> Let me go ahead and save a file here called capitalize0.c.

  • And let's go ahead and whip up a program that does exactly that as follows.

  • So include the CS50 library.

  • And include standard I/O.

  • >> And I know this is coming soon.

  • So I'm going to put it in there already, string.h,

  • so I have access to things like strlen,

  • and then int main(void), as usual.

  • And then I'm going to go ahead and do string s gets get_string,

  • just to get a string from the user.

  • And then I'm going to do my sanity check.

  • If s does not equal NULL, then it's safe to proceed.

  • And what do I want to do?

  • I'm going to iterate from i equals 0, with n equal to the string length of s.

  • >> And I'm going to do this so long as i is less than n, and i plus plus.

  • So far, I'm really just borrowing ideas from before.

  • And now I'm going to introduce a branch.

  • >> So think back to Scratch, where we had those forks in the road,

  • and last week in C. I'm going to say this, if the i-th character in s

  • is greater than or equal to lower case a,

  • and-- in Scratch you would literally say and, but in C you say ampersand,

  • ampersand-- and the i-th character in s is less than or equal to lower case z,

  • let's do something interesting.

  • Let's actually print out a character with no newline

  • that is the character in the string, the i-th character in the string.

  • >> But let's go ahead and subtract 32 from it.

  • Else if the character in the string that we're looking at

  • is not between little a and little z, go ahead

  • and just print it out unchanged.

  • So we've introduced this bracketed notation

  • for our strings to get at the i-th character in the string.

  • >> I've added some conditional logic, like in Scratch and in last week's Week 1, where

  • I'm just using my fundamental understanding of what's

  • going on underneath the hood.

  • Is the i-th character of s greater than or equal to a?

  • Like, is it 97, or 98, or 99, and so forth?

  • >> But is it also less than or equal to the value of lowercase z?

  • And if so, what does this line mean?

  • Line 14, this is sort of the germ of the whole idea,

  • capitalize the letter by simply subtracting 32 from it,

  • in this case, because I know, per that chart, how my numbers are represented.

  • So let's go ahead and run this, after compiling capitalize0.c,

  • and run capitalize0.

  • >> Let's type in something like Zamyla in all lowercase, Enter.

  • And now we have Zamyla in all uppercase.

  • Let's type in Rob in all lowercase.

  • Let's try Jason in all lowercase.

  • And we keep getting the forced capitalization.

  • There's a minor bug that I kind of didn't anticipate.

  • Notice my new prompt is ending up on the same line as their names,

  • which feels a little messy.

  • >> So I'm going to go here, and actually at the end of this program

  • print out a newline character.

  • That's all.

  • With printf, you don't need to pass in variables or format codes.

  • You can literally just print something like a newline.

  • >> So let's go ahead and make capitalize0 again, rerun it, Zamyla.

  • And now it's a little prettier.

  • Now, my prompt is on its own new line.

  • So that's all fine and good.
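
  • A sketch of the capitalize0.c logic, assuming ASCII and plain C without the CS50 library. The names capitalize_char and print_capitalized are hypothetical, and the string arrives as a parameter rather than from get_string.

```c
#include <stdio.h>
#include <string.h>

// Returns the uppercase form of c if c is a lowercase letter, otherwise c unchanged.
// capitalize_char is a hypothetical name used only for this sketch.
char capitalize_char(char c)
{
    // Is c somewhere between 'a' (97) and 'z' (122), inclusive?
    if (c >= 'a' && c <= 'z')
    {
        // Lowercase and uppercase letters are 32 apart in ASCII
        return c - 32;
    }
    else
    {
        return c;
    }
}

// Prints s with every lowercase letter capitalized, followed by a newline.
void print_capitalized(const char *s)
{
    for (int i = 0, n = strlen(s); i < n; i++)
    {
        printf("%c", capitalize_char(s[i]));
    }

    // The trailing newline keeps the next prompt on its own line
    printf("\n");
}
```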

  • So that's a good example.

  • But I don't even necessarily need to hard code the 32.

  • You know what?

  • I could say-- I don't ever remember what the difference is.

  • >> But I know that if I have a lower case letter,

  • I essentially want to subtract off whatever the distance is between little

  • a and big A, because if I assume that all of the other letters are the same,

  • that should get the job done.
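
  • That idea, sketched under the same ASCII assumption. The name capitalize_by_offset is hypothetical; the only change from hard-coding 32 is that the compiler computes the distance from the characters themselves.

```c
// Returns the uppercase form of c if c is a lowercase letter, otherwise c unchanged.
// capitalize_by_offset is a hypothetical name used only for this sketch.
char capitalize_by_offset(char c)
{
    if (c >= 'a' && c <= 'z')
    {
        // 'a' - 'A' is the distance between the two cases, so nobody
        // has to remember the number
        return c - ('a' - 'A');
    }
    return c;
}
```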

  • But rather than do that, you know what?

  • There's another way still.

  • >> That would be capitalize1.c, if I were to put that into a separate file.

  • Let's do capitalize2.c as follows.

  • I'm going to really clean this up here.

  • And instead of even having to know or care about those low level

  • implementation details, I'm instead just going to print a character,

  • quote unquote, %c, and then call another function that

  • exists that takes an argument, which is a character, like this.

  • >> It turns out in C, there's another function called

  • toupper, which as its name suggests takes a character

  • and converts it to its uppercase equivalent, and then returns it

  • so that printf can plug it in there.

  • And so to do this, though, I need to introduce one other file.

  • It turns out there's another file that you would only know from class,

  • or a textbook, or an online reference, called ctype.h.

  • >> So if I add that up top among my header files, and now re-compile this program,

  • capitalize2, ./capitalize2 Enter.

  • Let's type in Zamyla in all lowercase, still works the same.

  • But you know what?

  • It turns out that toupper has some other functionality.

  • >> And let me introduce this command here, sort of awkwardly

  • named, but man for manual.

  • It turns out that most Linux computers, as we are using here-- Linux operating

  • system-- have a command called man, which says,

  • hey, computer, give me the computer's manual.

  • What do you want to look up in that manual?

  • >> I want to look up the function called toupper, Enter.

  • And it's a little cryptic to read sometimes.

  • But notice we're in the Linux programmer's manual.

  • And it's all text.

  • And notice that there's the name of the function up here.

  • It turns out it has a cousin called tolower, which does the opposite.

  • And notice under synopsis, to use this function the man page, so to speak,

  • is telling me that I need to include ctype.h.

  • And I knew that from practice.

  • >> Here, it's showing me the two prototypes for the function,

  • so that if I ever want to use this I know what they take as input,

  • and what they return as output.

  • And then if I read the description, I see

  • in more detail what the function does.

  • But more importantly, if I look under return value,

  • it says the value returned is that of the converted letter,

  • or c, the original input, if the conversion was not possible.

  • >> In other words, toupper will try to convert a letter to upper case.

  • And if so, it's going to return it.

  • But if it can't for some reason-- maybe it's already upper case,

  • maybe it's an exclamation point or some other punctuation--

  • it's just going to return the original c,

  • which means I can make my code better designed as follows.

  • >> I don't need all of these darn lines of code.

  • All of the lines I've just highlighted can

  • be collapsed into just one simple line, which is this-- printf, %c,

  • toupper of s bracket i.

  • And this would be an example of better design.

  • >> Why implement in 7 or 8 lines of code, whatever it was I just

  • deleted, when you can instead collapse all of that logic and decision making

  • into one single line, 13 now, that relies on a library function--

  • a function that comes with C, but that does exactly what you want it to do.

  • And, frankly, even if it didn't come with C,

  • you could implement it yourself, as we've seen, with get_negative_int

  • and get_positive_int last week as well.

  • >> This code now is much more readable.

  • And, indeed, if we scroll up, look how much more compact

  • this version of my program is.

  • It's a little top heavy now, with all these includes.

  • But that's OK, because now I'm standing on the shoulders of programmers

  • before me.

  • And whoever it was who implemented toupper really

  • did me a favor, much like whoever implemented strlen really

  • did me a favor some time ago.

  • And so now we have a better-designed program

  • that implements the exact same logic.
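
  • A sketch of the toupper version. One deliberate difference from lecture: instead of printing each converted character, this hypothetical capitalize function stores it back into the string, which makes the result easy to check. The cast to unsigned char is a common precaution when handing a char to the ctype.h functions.

```c
#include <ctype.h>
#include <string.h>

// Converts every lowercase letter in s to uppercase, in place.
// capitalize is a hypothetical name; the lecture prints instead of storing.
void capitalize(char *s)
{
    for (int i = 0, n = strlen(s); i < n; i++)
    {
        // toupper returns the uppercase equivalent, or the original character
        // if no conversion is possible (already uppercase, punctuation, etc.)
        s[i] = toupper((unsigned char) s[i]);
    }
}
```

  • No if and no else: the decision about which characters to leave alone now lives inside the library function.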

  • >> Speaking of strlen, let me go ahead and do this.

  • Let me go ahead and save this file as strlen.c.

  • And it turns out, we can peel back one other layer pretty simply now.

  • I'm going to go ahead and whip up another program in main

  • here that simply re-implements string length as follows.

  • So here's a line of code that gets me a string from the user.

  • We keep using this again and again.

  • Let me give myself a variable called n of type int that stores a number.

  • >> And let me go ahead and do the following logic.

  • While the n-th character in s does not equal backslash 0, go ahead

  • and increment n.

  • And then print it out: printf, %i, n.

  • I claim that this program here, without calling string length,

  • figures out the length of a string.

  • >> And the magic is entirely encapsulated in line 8

  • here with what looks like new syntax, this backslash 0 in single quotes.

  • But why is that?

  • Well, consider what's been going on all this time.

  • >> And as an aside before I forget, realize too, that in addition to the man pages

  • that come with a typical Linux system like CS50 IDE,

  • realize that we, the course's staff, have also

  • made a website version of this same idea called

  • reference.cs50.net, which has all of those same man pages,

  • all of that same documentation, as well as

  • a little box at the top that allows you to convert all of the fairly

  • arcane language into less comfortable mode, where we, the teaching staff,

  • have gone through and tried to simplify some of the language to keep things

  • focused on the ideas, and not some of the technicalities.

  • So keep in mind, reference.cs50.net as another resource as well.

  • >> But why does string length work in the way I proposed a moment ago?

  • Here's Zamyla's name again.

  • And here's Zamyla's name boxed in, as I keep doing,

  • to paint a picture of it being, really, just a sequence of characters.

  • But Zamyla does not exist in isolation in a program.

  • >> When you write and run a program, you're using your Mac's or your PC's

  • memory, or RAM so to speak.

  • And you can think of your computer as having

  • lots of gigabytes of memory these days.

  • And a gig means billions, so billions of bytes.

  • >> But let's rewind in time.

  • And suppose that we're using a really old computer that

  • only has 32 bytes of memory.

  • I could, on my computer screen, simply draw this out as follows.

  • >> I could simply say that my computer has all of this memory.

  • And this is like a stick of memory, if you recall our picture from last time.

  • And if I just divide this up enough times,

  • I claim that I have 32 bytes of memory on the screen.

  • >> Now, in reality, I can only draw so far on this screen here.

  • So I'm going to go ahead, and just by convention,

  • draw my computer's memory as a grid, not just as one straight line.

  • Specifically, I claim now that this grid, this 8 by 4 grid,

  • just represents all 32 bytes of memory available in my Mac,

  • or available in my PC.

  • And they're wrapping on to two lines, just

  • because it fits more on the screen.

  • But this is the first byte.

  • This is the second byte.

  • This is the third byte.

  • >> And this is the 32nd byte.

  • Or, if we think like a computer scientist, this is byte 0, 1, 2, 3, 31.

  • So you have 0 to 31, if you start counting at 0.

  • >> So if we use a program that calls get string,

  • and we get a string from the human like I did called Zamyla, Z-A-M-Y-L-A,

  • how in the world does the computer keep track of which byte,

  • which chunk of memory, belongs to which string?

  • In other words, if we proceed to type another name into the computer,

  • like Andi, calling get_string a second time,

  • A-N-D-I has to end up in the computer's memory as well.

  • But how?

  • >> Well, it turns out that underneath the hood, what C does when storing strings

  • that the human types in, or that come from some other source, is it

  • delineates the end of them with a special character-- backslash

  • 0, which is just a special way of saying eight 0 bits in a row.

  • >> So a-- this is the number 97, recall.

  • So some pattern of 8 bits represents decimal number 97.

  • This backslash 0 is literally the number 0, a.k.a. nul, N-U-L, unlike NULL,

  • N-U-L-L, which we talked about earlier.

  • But for now, just know that this backslash 0 is just eight 0 bits in a row.

  • >> And it's just this line in the sand that says anything to the left

  • belongs to one string, or one data type.

  • And anything to the right belongs to something else.

  • Andi's name, meanwhile, which just visually

  • happens to wrap on to the other line, but that's just an aesthetic detail,

  • similarly is nul terminated.

  • >> It is a string of A-N-D-I characters, plus a fifth secret character,

  • all 0 bits, that just demarcates the end of Andi's name as well.

  • And if we call get_string a third time in the computer to get a string like

  • Maria, M-A-R-I-A, Maria's name is similarly nul terminated with backslash 0.
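
  • To see that secret extra byte, here is a sketch with a made-up helper, dump_bytes, that walks a string and prints every byte it occupies, including the terminator. The output format is an assumption chosen for readability.

```c
#include <stdio.h>

// Prints each byte of s, including the terminating backslash 0,
// and returns the total number of bytes the string occupies.
// dump_bytes is a hypothetical name used only for this sketch.
int dump_bytes(const char *s)
{
    int i = 0;

    // Visible characters: print the position, the letter, and its number
    while (s[i] != '\0')
    {
        printf("byte %i: %c (%i)\n", i, s[i], s[i]);
        i++;
    }

    // The loop stopped at the terminator; show it too (its number is 0)
    printf("byte %i: \\0 (%i)\n", i, s[i]);
    return i + 1;
}
```

  • So dump_bytes("Andi") reports five bytes for a four-letter name: A, n, d, i, and then the backslash 0.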

  • >> This is fundamentally different from how a computer would typically

  • store an integer, or a float, or other data types still, because recall,

  • an integer is usually 32 bits, or 4 bytes, or maybe even 64 bits,

  • or eight bytes.

  • But many primitives in a programming language

  • have a fixed number of bytes underneath the hood--

  • maybe 1, maybe 2, maybe 4, maybe 8.

  • >> But strings, by design, have a dynamic number of characters.

  • You don't know in advance, until the human types in Z-A-M-Y-L-A,

  • or M-A-R-I-A, or A-N-D-I. You don't know how many times the user is going to hit

  • the keyboard.

  • Therefore, you don't know how many characters in advance

  • you're going to need.

  • >> And so C just kind of leaves like a secret breadcrumb underneath the hood

  • at the end of the string.

  • After storing Z-A-M-Y-L-A in memory, it also just puts the equivalent

  • of a period at the end of a sentence.

  • It puts eight 0 bits, so as

  • to remember where Zamyla begins and ends.

  • >> So what's the connection, then, to this program?

  • This program here, strlen, is simply a mechanism

  • for getting a string from the user, line 6.

  • Line 7, I declare a variable called n and set it equal to 0.

  • >> And then in line 8, I simply asked the question, while the n-th character does

  • not equal all 0 bits-- in other words, does not

  • equal this special character, backslash 0, which

  • was just that special nul character-- go ahead and just increment n.

  • >> And keep doing it, and keep doing it, and keep doing it.

  • And so even though in the past we've used i,

  • it's perfectly fine semantically to use n,

  • if you're just trying to count this time deliberately,

  • and just want to call it n.

  • So this just keeps asking the question, is the n-th character of s all 0s?

  • If not, look to the next, look to the next, look to the next,

  • look to the next.

  • >> But as soon as you see backslash 0, this loop-- line 9 through 11-- stops.

  • You break out of the while loop, leaving inside of that variable n

  • a total count of all of the characters in the string you saw,

  • which we then print out.

  • So let's try this.

  • >> Let me go ahead and, without using the strlen function,

  • but just using my own homegrown version here called strlen, let me go ahead

  • and run strlen, type in something like Zamyla, which I know in advance

  • is six characters.

  • Let's see if it works.

  • Indeed, it's six.

  • Let's try with Rob, three characters, three characters as well, and so forth.

  • So that's all that's going on underneath the hood.
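
  • The counting loop from that program, sketched as a standalone function. The name string_length is hypothetical (chosen so it doesn't collide with the real strlen), and it returns the count rather than printing it.

```c
// Counts the characters in s by walking forward until the terminating backslash 0.
// string_length is a hypothetical name used only for this sketch.
int string_length(const char *s)
{
    int n = 0;

    // Is the n-th character all 0 bits? If not, look at the next one
    while (s[n] != '\0')
    {
        n++;
    }

    // n stopped at the terminator's position, which equals the number of characters
    return n;
}
```

  • With this in place, printf("%i\n", string_length("Zamyla")); prints 6.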

  • And notice the connections, then, with the first week

  • of class, where we talked about something like abstraction,

  • which is just this layering of ideas, or complexity, on top of basic principles.

  • Here, we're sort of looking underneath the hood of strlen,

  • so to speak, to figure out, how would it be implemented?

  • >> And we could re-implement it ourselves.

  • But we're never again going to re-implement strlen.

  • We're just going to use strlen in order

  • to actually get some string's length.

  • >> But there's no magic underneath the hood.

  • If you know that, underneath the hood, a string

  • is just a sequence of characters,

  • and that sequence of characters can all be numerically addressed

  • with bracket 0, bracket 1, bracket 2, and you

  • know that at the end of a string is a special character, you can figure out

  • how to do most anything in a program, because all it boils down to

  • is reading and writing memory.

  • That is, changing and looking at memory, or moving things

  • around in memory, printing things on the screen, and so forth.

  • >> So let's now use this newfound understanding of what strings actually

  • are underneath the hood, and peel back one other layer

  • that up until now we've been ignoring altogether.

  • In particular, any time we've implemented a program,

  • we've had this line of code near the top declaring main.

  • And we've specified int main(void).

  • >> And that void inside the parentheses has been saying all this time that main

  • itself does not take any arguments.

  • Any input that main is going to get from the user

  • has to come from some other mechanism, like get_int,

  • or get_float, or get_string, or some other function.

  • But it turns out that when you write a program,

  • you can actually specify that this program shall

  • take inputs from the human at the command line itself.

  • >> In other words, even though we thus far have been running just ./hello

  • or similar programs, all of the other programs that we've been using,

  • that we ourselves didn't write, have been taking, it seems,

  • command line arguments-- things like make.

  • You say something like make, and then a second word.

  • Or clang, you say clang, and then a second word, the name of a file.

  • >> Or even rm or cp, as you might have seen or used already

  • to remove or copy files.

  • All of those take so-called command line arguments--

  • additional words at the terminal prompt.

  • But up until now, we ourselves have not had

  • this luxury of taking input from the user when he or she actually runs

  • the program itself at the command line.

  • >> But we can do that by re-declaring main moving forward, not as having

  • void in parentheses, but these two arguments

  • instead-- the first an integer, and the second something

  • new, something that we're going to call an array, something similar in spirit

  • to what we saw in Scratch as a list, but an array of strings, as we'll soon see.

  • But let's see this by way of example, before we

  • distinguish exactly what that means.

  • >> So if I go into CS50 IDE here, I've gone ahead

  • and declared in a file called argv0.c the following template.

  • And notice the only thing that's different so far

  • is that I've changed void to int argc, string argv, open bracket, close

  • bracket.

  • And notice for now, there's nothing inside of those brackets.

  • >> There's no number.

  • And there's no i, or n, or any other letter.

  • I'm just using the square brackets for now,

  • for reasons we'll come back to in just a moment.

  • >> And now what I'm going to do is this.

  • If argc equals equals 2-- and recall that equals equals

  • is the equality operator comparing the left and right for equality.

  • It's not the assignment operator, which is

  • the single equal sign, which means copy from the right to the left some value.

  • >> If argc equals equals 2, I want to say, printf, hello, %s, newline,

  • and then plug in-- and here's the new trick-- argv bracket 1, for reasons

  • that we'll come back to in a moment.

  • Else if argc does not equal 2, you know what?

  • Let's just go ahead and, as usual, print out hello world with no substitution.

  • >> So it would seem that if argc, which stands for argument count, equals 2,

  • I'm going to print out hello something or other.

  • Otherwise, by default, I'm going to print hello world.

  • So what does this mean?

  • >> Well, let me go ahead and save this file, and then do make argv0,

  • and then ./argv0, Enter.

  • And it says hello world.

  • Now, why is that?

  • >> Well, it turns out anytime you run a program at the command line,

  • you are filling in what we'll generally call an argument vector.

  • In other words, automatically the computer, the operating system,

  • is going to hand to your program itself a list of all of the words

  • that the human typed at the prompt, in case you

  • the programmer want to do something with that information.

  • And in this case, the only word I've typed at the prompt is ./argv0.

  • >> And so the number of arguments that is being passed to my program is just one.

  • In other words, the argument count, otherwise known as argc

  • here as an integer, is just one.

  • One, of course, does not equal two.

  • And so this is what prints, hello world.

  • >> But let me take this somewhere.

  • Let me say, ./argv0.

  • And then how about Maria?

  • And then hit Enter.

  • >> And notice what magically happens here.

  • Now, instead of hello world, I have changed the behavior of this program

  • by taking the input not from get string or some other function,

  • but from, apparently, my command itself, what I originally typed in.

  • And I can play this game again by changing it to Stelios, for instance.

  • >> And now I see another name still.

  • And here, I might say Andi.

  • And I might say Zamyla.

  • And we can play this game all day long, just plugging in different values.

  • So long as I provide exactly two words at the prompt,

  • such that argc, my argument count, is 2,

  • >> do I see that name plugged into printf, per this condition here.

  • So we seem to have now the expressive capability

  • of taking input from another mechanism, from the so-called command line,

  • rather than having to wait until the user runs the program,

  • and then prompt him or her using something like get string.

  • >> So what is this?

  • Argc, again, is just an integer, the number of words-- arguments--

  • that the user provided at the prompt, at the terminal window,

  • including the program's name.

  • So our ./argv0 is, effectively, the program's name,

  • or how I run the program.

  • >> That counts as a word.

  • So argc would be 1.

  • But when I write Stelios, or Andi, or Zamyla, or Maria,

  • that means the argument count is two.

  • And so now there's two words passed in.

  • >> And notice, we can continue this logic.

  • If I actually say something like Zamyla Chan,

  • a full name, thereby passing three arguments in total,

  • now it says the default again, because, of course, 3 does not equal 2.

  • >> And so in this way, I have access to those words via argv, this new argument

  • that we could technically call anything we want.

  • But by convention, it's argv and argc, respectively.

  • Argv, argument vector, is kind of a synonym for a programming

  • feature in C called an array.

  • >> An array is a list of similar values back, to back, to back, to back.

  • In other words, if one is right here in RAM, the next one is right next to it,

  • and right next to it.

  • They're not all over the place.

  • And that latter scenario, where things are all over the place in memory,

  • can actually be a powerful feature.

  • But we'll come back to that when we talk about fancier data structures.

  • For now, an array is just a chunk of contiguous memory,

  • each of whose elements are back, to back, to back, to back,

  • and generally the same type.

  • >> So if you think about, from a moment ago, what is a string?

  • Well, a string, like Zamyla, Z-A-M-Y-L-A, is, technically,

  • just an array.

  • It's an array of characters.

  • >> And so if we really draw this, as I did earlier, as a chunk of memory,

  • it turns out that each of these characters takes up a byte.

  • And then there's that special sentinel character, the backslash 0,

  • or all eight 0 bits, that demarcates the end of that string.

  • So a string, it turns out, quote unquote string,

  • is just an array of chara-- char being an actual data type.

  • >> And now argv, meanwhile-- let's go back to the program.

  • Argv, even though we see the word string here, is not a string itself.

  • Argv, argument vector, is an array of strings.

  • >> So just as you can have an array of characters, you can have higher level,

  • an array of strings-- so, for instance, when I typed a moment ago ./argv0,

  • space, Z-A-M-Y-L-A, I claimed that argv had two strings in it-- ./argv0,

  • and Z-A-M-Y-L-A. In other words, argc was 2.

  • Why is that?

  • >> Well, effectively, what's going on is that each of these strings

  • is, of course, an array of characters as before, each of whose characters

  • takes up one byte.

  • And don't confuse the actual 0 in the program's name with the backslash 0,

  • which means all eight 0 bits.

  • And Zamyla, meanwhile, is still also an array of characters.

  • >> So at the end of the day, it really looks like this underneath the hood.

  • But argv, by nature of how main works, allows me to wrap all of this

  • up into, if you will, a bigger array that, if we slightly over simplify

  • what the picture looks like and don't quite draw it to scale up there,

  • this array is only of size 2, the first element of which contains a string,

  • the second element of which contains a string.

  • And, in turn, if you kind of zoom in on each

  • of those strings, what you see underneath the hood

  • is that each string is just an array of characters.

  • >> Now, just as with strings, we were able to get access

  • to the i-th character in a string using that square bracket notation.

  • Similarly, with arrays in general, we can

  • use square bracket notation to get at any number of strings in an array.

  • For instance, let me go ahead and do this.

  • >> Let me go ahead and create argv1.c, which is a little different this time.

  • Instead of checking whether argc equals 2, I'm going to instead do this.

  • For int i gets 0, i is less than argc, i plus plus,

  • and then print out inside of this, percent s, new line, and then

  • argv bracket i.

  • >> So in other words, I'm not dealing with individual characters at the moment.

  • Argv, as implied by these empty square braces to the right of the name argv,

  • means argv is an array of strings.

  • And argc is just an int.

  • >> This line here, 6, is saying set i equal to 0.

  • Count all the way up to, but not including, argc.

  • And then on each iteration, print out a string.

  • What string?

  • >> The i-th string in argv.

  • So whereas before I was using the square bracket

  • notation to get at the ith character in a string, now

  • I'm using the square bracket notation to get at the ith string in an array.

  • So it's kind of one layer above, conceptually.

  • >> And so what's neat about this program now, if I compile argv1,

  • and then do ./argv1, and then type in something like foo bar baz,

  • which are the three default words that a computer scientist reaches for any time

  • he or she needs some placeholder words, and hit Enter, each of those words,

  • including the program's name, which is in argv at the first location,

  • ends up being printed one at a time.

  • And if I change this, and I say something like argv1 Zamyla Chan,

  • we get all three of those words, which are argv bracket 0,

  • argv bracket 1, argv bracket 2, because in this case argc, the count, is 3.

  • >> But what's neat is if you understand that argv is just an array of strings,

  • and you understand that a string is an array of characters,

  • we can actually kind of use this square bracket notation multiple times

  • to choose a string, and then choose a character within the string,

  • diving in deeper as follows.

  • In this example, let me go ahead and call this argv2.c.

  • And in this example, let me go ahead and do the following-- for int i gets 0,

  • i is less than argc, i plus plus, just like before.

  • So in other words-- and now this is getting complicated enough.

  • Then I'm going to say iterate over strings in argv,

  • as a comment to myself.

  • And then I'm going to have a nested for loop, which you probably

  • have done, or considered doing, in Scratch, where

  • I'm going to say int-- I'm not going to use i again,

  • because I don't want to shadow, or sort of overwrite the existing i.

  • >> I'm going to, instead, say j, because that's my go to variable after i,

  • when I'm just trying to count simple numbers.

  • For j gets 0-- and also, n is going to get the string length of argv bracket i--

  • so long as j is less than n, j plus plus, do the following.

  • And here's the interesting part.

  • >> Print out a character and a new line, plugging in argv bracket i, bracket j.

  • OK, so let me add some comments here.

  • Iterate over characters in current string,

  • print j-th character in i-th string.

  • So now, let's consider what these comments mean.

  • >> Iterate over the strings in argv-- how many

  • strings are in argv, which is an array?

  • Argc many, so I'm iterating from i equal 0 up to argc.

  • Meanwhile, how many characters are in the i-th string in argv?

  • >> Well, to get that answer, I just call string length

  • on the current string I care about, which is argv bracket i.

  • And I'm going to temporarily store that value in n, just for caching purposes,

  • to remember it for efficiency.

  • And then I'm going to initialize j to 0, keep going so long as j is less than n,

  • and on each iteration increment j.

  • >> And then in here, per my comment on line 12,

  • print out a character, followed by a new line,

  • specifically argv bracket i gives me the i-th string

  • in argv-- so the first word, the second word, the third word, whatever.

  • And then j dives in deeper, and gets me the j-th character of that word.

  • And so, in effect, you can treat argv as a multi-dimensional,

  • as a two-dimensional, array, whereby every word kind of looks

  • like this in your mind's eye, and every character

  • is kind of composed in a column, if that helps.

  • >> In reality, when we tease this apart in future weeks,

  • it's going to be a little more sophisticated than that.

  • But you can really think of that, for now,

  • as just this two-dimensional array, whereby one level of it

  • is all of the strings.

  • And then if you dive in deeper, you can get at the individual characters

  • therein by using this notation here.

  • >> So what is the net effect?

  • Let me go ahead and make argv2-- darn it.

  • I made a mistake here.

  • Implicitly declaring the library function strlen.

  • So all this time, it's perhaps appropriate

  • that we're sort of finishing exactly where we started.

  • >> I screwed up, implicitly declaring library function strlen.

  • OK, wait a minute.

  • I remember that, especially since it's right here.

  • I need to include string.h in this version of the program.

  • >> Let me go ahead and include string.h, save that, go ahead

  • and recompile argv2.

  • And now, here we go, make argv2, Enter.

  • And though it's a little cryptic at first glance,

  • notice that, indeed, what is printed is dot slash argv2.

  • >> But if I type some words after the prompt, like argv2 Zamyla Chan,

  • Enter, also a little cryptic at first glance.

  • But if we scroll back up, ./argv2 Z-A-M-Y-L-A C-H-A-N.

  • So we've iterated over every word.

  • And, in turn, we've iterated over every character within a word.

  • >> Now, after all of this, realize that there's

  • one other detail we've been kind of ignoring this whole time.

  • We just teased apart what main's inputs can be.

  • What about main's output?

  • >> All of this time, we've been just copying and pasting

  • the word int in front of main, though you may see online,

  • sometimes incorrectly in older versions of C and compilers, that they say void,

  • or nothing at all.

  • But, indeed, for the version of C that we're using,

  • C11, from 2011, realize that it should be int.

  • And it should either be void or argc and argv here.

  • >> But why int main?

  • What is it actually returning?

  • Well, it turns out all of this time, any time you've written a program main

  • is always returning something.

  • But it's been doing so secretly.

  • >> That something is an int, as line 5 suggests.

  • But what int?

  • Well, there's this convention in programming,

  • whereby if nothing has gone wrong and all is well,

  • programs and functions generally return-- somewhat counterintuitively--

  • 0.

  • 0 generally signifies all is well.

  • So even though you think of it as false in many contexts,

  • it actually generally means a good thing.

  • >> Meanwhile, if a program returns 1, or negative 1, or 5, or negative 42,

  • or any non-0 value, that generally signifies

  • that something has gone wrong.

  • In fact, on your own Mac or PC, you might have actually seen

  • an error message, whereby it says something or other, error

  • code negative 42, or error code 23, or something like that.

  • That number is generally just a hint to the programmer, or the company

  • that made the software, what went wrong and why,

  • so that they can look through their documentation or code,

  • and figure out what the error actually means.

  • It's generally not useful to us end users.

  • >> But when main returns 0, all is well.

  • And if you don't specify what main should return,

  • it will just automatically return 0 for you.

  • But returning something else is actually useful.

  • >> In this final program, let me go ahead and call this exit.c,

  • and introduce the last of today's topics, known as an error code.

  • Let me go ahead and include our familiar files up top, do int main.

  • And this time, let's do int argc, string argv, and then my brackets

  • to imply that it's an array.

  • And then let me just do a sanity check.

  • This time, if argc does not equal 2, then you know what?

  • Forget it.

  • I am going to say that, hey, user, you are missing command line argument

  • backslash n.

  • >> And then that's it.

  • I want to exit.

  • I am going to preemptively, and prematurely really, return

  • something other than the number 0.

  • The go to value for the first error that can happen is 1.

  • If you have some other erroneous situation that might occur,

  • you might say return 2 or return 3, or maybe even negative 1 or negative 2.

  • >> These are just exit codes that are, generally,

  • only useful to the programmer, or the company that's shipping the software.

  • But the fact that it's not 0 is what's important.

  • So if in this program, I want to guarantee that this program only

  • works if the user provides me with an argument count of two,

  • the name of the program, and some other word, I can enforce as much as follows,

  • yell at the user with printf saying, missing command line argument,

  • return 1.

  • That will just immediately quit the program.

  • >> Only if argc equals 2 will we get down here, at which point I'm going to say,

  • hello percent s, backslash n, argv bracket 1.

  • In other words, I'm not going after argv 0,

  • which is just the name of the program.

  • I want to print out hello, comma, the second word that the human typed.

  • And in this case on line 13, all is well.

  • >> I know that argc is 2 logically from this program.

  • I'm going to go ahead and return 0.

  • As an aside, keep in mind that this is true in Scratch as well.

  • >> Logically, I could do this and encapsulate these lines

  • of code in this else clause here.

  • But that's sort of unnecessarily indenting my code.

  • And I want to make super clear that no matter what,

  • by default, hello something will get printed,

  • so long as the user cooperates.

  • >> So it's very common to use a condition, just an if,

  • to catch some erroneous situation, and then exit.

  • And then, so long as all is well, not have an else,

  • but just have the code outside that if, because it's

  • equivalent in this particular case, logically.

  • So I'm returning 0, just to explicitly signify all is well.

  • >> If I omitted the return 0, it would be automatically assumed for me.

  • But now that I'm returning one in at least this case,

  • I'm going to, for good measure and clarity, return 0 in this case.

  • So now let me go ahead and make exit, which is a perfect segue to just leave.

  • >> But make exit, and let me go ahead and do ./exit, Enter.

  • And the program yelled at me, missing command line argument.

  • OK, let me cooperate.

  • >> Let me instead do ./exit, David, Enter.

  • And now it says, hello David.

  • And you wouldn't normally see this.

  • >> But it turns out that there's a special way in Linux to actually see

  • with what exit code a program exited.

  • Sometimes in a graphical world like Mac OS or Windows,

  • you only see these numbers when an error message pops up on the screen

  • and the programmer shows you that number.

  • But if we want to see what the exit code is, we can do it here--

  • so ./exit, Enter, print missing command line argument.

  • >> If I now do echo $?, which is ridiculously cryptic looking.

  • But $?

  • is the magical incantation that says, hey, computer,

  • tell me what the previous program's exit code was.

  • And I hit Enter.

  • I see 1, because that's what I told my main function to return.

  • >> Meanwhile, if I do ./exit David, and hit Enter, I see, hello David.

  • And if I now do echo $?, I see 0.

  • And so this will actually be valuable information

  • in the context of the debugger, not so much that you, the human, would care.

  • But the debugger and other programs we'll use this semester

  • will often look at that number, even though it's sort of hidden away

  • unless you look for it, to determine whether or not a program's

  • execution was correct or incorrect.

  • >> And so that brings us to this, at the end of the day.

  • We started today by looking at debugging, and in turn at the course

  • itself, and then more interestingly, technically underneath the hood

  • at what strings are, which last week we just took for granted,

  • and certainly took them for granted in Scratch.

  • >> We then looked at how we can access individual characters in a string,

  • and then again took a higher level look at things, looking at how well--

  • if we want to get at individual elements in a list like structure,

  • can't we do that with multiple strings?

  • And we can with command line arguments.

  • But this picture here of just boxes is demonstrative of this general idea

  • of an array, or a list, or a vector.

  • And depending on the context, all of these words

  • mean slightly different things.

  • So in C, we're only going to talk about an array.

  • And an array is a chunk of memory, each of whose

  • elements are contiguous, back, to back, to back, to back.

  • >> And those elements are, generally, of the same data type, character,

  • character, character, character, or string, string, string, string, or int,

  • int, int, whatever it is we're trying to store.

  • But at the end of the day, this is what it looks like conceptually.

  • You're taking your computer's memory or RAM.

  • And you're carving it out into identically sized boxes, all of which

  • are back, to back, to back, to back in this way.

  • >> And what's nice about this idea, and the fact

  • that we can express values in this way with the first of our data structures

  • in the class, means we can start to solve problems with code

  • that came so intuitively in week 0.

  • You'll recall the phone book example, where

  • we used a divide and conquer, or a binary search algorithm,

  • to sift through a whole bunch of names and numbers.

  • But we assumed, recall, that that phone book was already sorted,

  • that someone else had already figured out-- given a list of names

  • and numbers-- how to alphabetize them.

  • And now that in C we, too, have the ability

  • to lay things out, not physically in a phone book

  • but virtually in a computer's memory, will we be able next week

  • to introduce again this-- the first of our data structures in an array--

  • but more importantly, actual computer science algorithms implemented

  • in code, with which we can store data in structures like this,

  • and then start to manipulate it, and to actually solve problems with it,

  • and to build on top of that, ultimately, programs in C,

  • in Python, in JavaScript, querying databases with SQL?

  • >> And we'll see that all of these different ideas interlock.

  • But for now, recall that the domain that we introduced today

  • was this thing here, and the world of cryptography.

  • And among the next problems you yourself will solve is the art of cryptography,

  • scrambling and de-scrambling information, and ciphering

  • and deciphering text, and assuming ultimately

  • that you now know what is underneath the hood

  • so that when you see or receive a message like this, you

  • yourself can decipher it.

  • All this, and more next time.

  • >> [VIDEO PLAYBACK]

  • >> -Mover just arrived.

  • I'm going to go visit his college professor.

  • Yep.

  • Hi.

  • It's you.

  • Wait!

  • David.

  • I'm just trying to figure out what happened to you.

  • Please, anything could help.

  • You were his college roommate, weren't you?

  • You were there with him when he finished the CS50 project?

  • >> [MUSIC PLAYING]

  • >> -That was CS50.

  • >> I love this place.

  • >> -Eat up.

  • We're going out of business.

  • >> [END PLAYBACK]

>> [MUSIC PLAYING]


CS50 2016 - Week 2 - Arrays

  • Posted by 洪鈺翔 on January 14, 2021